[Note: I have expanded on the material below at a talk given at the Conference The Split at the University of Copenhagen in September 2017. Slides and handout available at the conference homepage. An article from the proceedings is due to appear in 2019 (Bjørn forthc.(b)). As data continue to gather, the hypothesis is similarly developing, and the treatment below reflects the thesis of early 2017.]
While a couple of items have unmistakable external connections and established loan etymologies (which is why these items are also formally included in the main part of the wordlist, § 4.1), the following treatment of the IE numerals will also take the form of an excursus discussing the strictly internal arguments for the traditionally reconstructed system in PIE.
- ‘One’ (item 37 and § 5.1 in Bjørn 2017)
Two PIE roots produce the first cardinal in the attested languages, viz. *(h1)oi- and *sem- (cf. Ringe et al. 2002: 74f.), but there is some reason to assume that the former may have been the first choice, given that it is by far the most widespread base (different derivations abound, e.g. *-no- and *-ko-); that Anatolian, Tocharian, Greek, and Albanian employ the latter may thus be ascribed to semantic innovation (cf. Gamkrelidze & Ivanov 1995: 740f.). Martínez holds that, strictly speaking, the numeral ‘one’ is not really counting (1999: 211), and other language families, including Uralic, Kartvelian, and Semitic, are similarly without a single ‘one’ (Mallory & Adams 1997: 398).
NW Caucasian: Abkhaz *ajə́ba ‘orphan’
Bomhard suggests the connection of PIE *(h1)oi- ‘one’ with the Northwest Caucasian item as a sign of an ancient adstrate relation (2015: 17), but for multiple obvious reasons the proposition has to be rejected, most saliently because the Abkhaz form can convincingly be connected with the Northeast Caucasian stock treated under *h3orbh– ‘to change allegiance’ (item 62).
The PIE reconstruction of the second numeral is wholly unproblematic and all branches are securely attested. It may be noted that no convincing external comparanda have been proposed for this item.
As with the preceding numeral, there is no reason to question the ancient status of ‘three’ within PIE, and there are no obvious external comparanda for this numeral either. Martínez has, however, suggested that it is transparently derived from *ter ‘beyond’ (1999: 207), but in the absence of proposed loan etymologies this has no consequence for the present inquiry, and the numerical value was established by the time the Anatolian branch split from the rest of PIE.
PIE (1) *kwetwor
Attestations: Toch. A śtwar, Toch. B śtwer; Lat. quattuor; OIr. cethair; Goth. fidwor; Lith. keturì; OCS četyre; Alb. katër; Myc. qe-to-ro-pi ‘four-footed’, Gr. τέσσαρες; Arm. cՙorkՙ; Ved. catvára-, Av. čaθvārō
Notes: Entirely consistent in all branches except for Anatolian, this PIE form may be a compound *kwe-twor and, following Villar (1996: 158), the result of a missegmentation in which *kwe originally belonged to the numeral ‘three’, which thus shared its desinence with ‘five’, cf. the simpler attestations in Ved. turīya– ‘fourth’ (Martínez 1999: 214), YAv. tūiriia– ‘id.’, and possibly also Greek τράπεζα (Myc. to-pe-za) ‘four-leg’, the standard explanation for which is zero grade *kwtuṛ– without realization of the initial consonant (cf. Mayrhofer 1991: 657 and Beekes 2010: 1499). The assumption, then, is that the sequence went ‘three-and-four-(and)-five-and’, viz. *tres-kwe-twor-pen-kwe (cf. also Bammesberger 1995: 218f.); the fact that ‘four’ does not have a separate *-kwe nonetheless remains suspicious. It does not amount to much of a counterargument to accept *pen(kw)-sti- ‘fist’ (item 104) as the base word for the numerical derivation, since the vital *kw is eclipsed in a consonant cluster (but note finger).
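The missegmentation hypothesis can be illustrated with a minimal sketch. It assumes ASCII stand-ins for the reconstructed morphemes (kwe for *kwe) and a hypothetical helper, segment, that merely groups the counting string into words at chosen boundaries; it makes no claim about the actual historical process.

```python
# Illustrative only: ASCII stand-ins for the reconstructed morphemes
# of the counting string *tres-kwe-twor-pen-kwe.
COUNTING_STRING = ["tres", "kwe", "twor", "pen", "kwe"]


def segment(morphemes, boundaries):
    """Group morphemes into words.

    `boundaries` lists the indices at which a new word starts
    (hypothetical helper, not part of any published analysis).
    """
    words, start = [], 0
    for b in list(boundaries) + [len(morphemes)]:
        words.append("-".join(morphemes[start:b]))
        start = b
    return words


# Assumed original parsing: 'three-and', 'four', 'five-and'.
original = segment(COUNTING_STRING, [2, 3])

# Reanalysis: the enclitic is attached to the following numeral.
reanalysed = segment(COUNTING_STRING, [1, 3])

print(original)
print(reanalysed)
```

Under these assumptions the same string of morphemes yields either a ‘three’ and ‘five’ that share the final element or a ‘four’ that begins with it, which is the only difference between the two readings.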
PIE (2) *méh1-u-
Hit. miyu-, CLuw. maauua-, Lyc. mupm̃m- ‘fourfold?’; (?)Myc. mi-we-jo ‘less’
Notes: This item introduces problems for the otherwise consistent IE decimal system with Anatolian discontinuation. Under the binary paradigm methodologically employed here the Anatolian forms may have a profound impact on the traditionally reconstructed numerical system of early PIE, if, indeed, the difference cannot simply be ascribed to loss and substitution, which would be unique within IE where, as far as the attestations show, all other branches retain all numerals from ‘two’ through ‘ten’ (cf. Fortson 2010: 145f.). This form may well be an innovation, likely derived from an adjective also continued in Mycenaean, as suggested by Martínez (1999: 207), but this says nothing about whether the traditionally reconstructed root (1) was discarded. Indeed, there are dozens of uninterrupted modern continuations of *kwetwor (in Danish, Spanish, Kurdish, Hindi, Welsh, etc.) and substitutions are demonstrably exceptionally rare, if not phantasmal. The point that is being driven at here is, of course, that a claim of such an exceptional substitution in the Anatolian languages of an ostensibly inherited word for ‘four’ requires solid evidence; tantalizingly, according to Kloekhorst, the common IE root may actually be continued in the legal term kutruuan ‘witness’, literally the fourth part, after defendant, plaintiff, and judge (2008a: 499ff., see also Eichner 1992: 80ff.), yet this claim is disputed and other likely cognates exist (cf. Puhvel 1997: 299f.). This dichotomy is a fundamental differentiation in the stratification of PIE and adds plausibility and urgency to the internal (or, theoretically, external, cf. the numbers 6 ‘six’ through 8 ‘eight’, § 5.6-5.8) derivation of the other root (1). Ultimately this analysis elucidates a stage of innovative morphology that was disrupted by the branching of the ancestor of the
PIE (3) *(h3)ok̂t–
(?)PIE *(h3)ok̂tṓ- ‘eight’ (du.); Av. ašti– ‘breadth of four fingers’
Notes: See § 5.8 ‘eight’ that appears to be a dual form. Widespread loss of the original form is required, only directly attested in Avestan.
(?)Luw. 5-w(a) /panku-/; Toch. A päñ, Toch. B piś; Lat. quīnque; OIr. cōic; Goth. fimf; Lith. penki; OCS pętĭ; Alb. pesë; Gr. πέντε; Arm. hing; Ved. páñca, Av. panča
Notes: Italic, Celtic, and Germanic all require altogether unproblematic assimilations of the inherited stops, although Germanic does so in the opposite direction from the rest. According to Carruba an Anatolian alternative may possibly be attested in Lycian cm̃ne ‘five’ (1979: 192), but this is emphatically denied by Melchert (1994: 32), the form is not mentioned by Eichner in his treatment of Anatolian numerals (1992), and the phonetic complement in Luwian does seem to hint at the common root. Treating *-kwe as ‘and’ (cf. § 5.4) is certainly favored by the otherwise aberrant desinence *-e. The strongest hypothesis considering the numeral a complete root is Polomé’s connection to Hittite panku– ‘all, whole’ (1969: 99-101), although this is questioned by Kloekhorst (2008a: 624ff.). Sequentially it is worth noting that Anatolian here appears to agree with all other branches, rendering ‘four’ an isolated case of noncompliance. Treating this evidence with some degree of consequence, it may be surmised that the numeral ‘five’ was established at an earlier stage in PIE, which does seem intuitive glancing at the number of fingers on each authoring hand. Further, the suggested derivation of Anatolian ‘four’ is from the meaning ‘less’ which fittingly necessitates the greater number as its referent.
Uralic: *piηз ‘palm of the hand’
Trombetti (1923: 549) suggests a connection with Uralic *piηз ‘palm of the hand’ (UEW: 384), which would constitute a lexical rather than numerical argument, and is consequently treated as such, see *pen(kw)-sti- ‘fist’ (item 104); a typological parallel is the origin of the Semitic numeral ‘five’ ḫamš– (Lipiński 2001: 295). Regardless of the many different proposals, it seems beyond contention that the lexeme is an internal innovation that likely predates the split of Anatolian, although this cannot be securely established.
- ‘six’ (item 120 and § 5.6 in Bjørn 2017)
PIE *(s)wek̂s
Attestations: Toch. A ṣäk, Toch. B ṣkas; Lat. sex; Welsh chwech; Goth. saíhs; Lith. šešì, OPrus. us(ch)ts ‘sixth’; OCS šestĭ; Myc. we-pe-za ‘six-footed’, Gr. ἓξ; Arm. vec; Ved. ṣáṣ, Av. xšvaš.
Formally the union of all of the IE comparanda is problematic, parts of which may be explainable with s-mobile, or, perhaps better yet, as sequential assimilation of initial *s- from ‘seven’ (§ 5.7, cf. Szemerényi 1996: 222), but this still fails to explain all of the variation in the onset. Moreover, there is no Anatolian data for this item (Eichner 1992: 83).
Semitic: *šidt- > Akk. šiššet
[Add. Berber: *sa, cf. Lipinski]
NW Caucasian: *səx̑cə (Colarusso 1997: 144)
NE Caucasian: Khinalug zäk
Kartvelian: *ekws > Georg. ekws-, Ming. amšw–, Laz a(n)š, Svan usgwa (Klimov 1985: 206)
Discussion: The Kartvelian material requires metathesis akin to the one seen in IE to make all forms fit (Fähnrich 2007: 151f.), making a foreign origin probable in that family. This is usually ascribed to Armenian (cf. Kaiser & Shevroshkin 1986: 369f.), but Klimov is unwavering in positing a PIE loan and rejects a later Armenian source on chronological grounds; the case is thus comparable to ‘seven’ (§ 5.7). The solitary attestation in Northeast Caucasian (Blažek 1999c: 83) makes an old relation highly unlikely, and is, if related, more likely a later cultural transfer. Note, however, that Nikolayev & Starostin unproblematically include this and the Northwest Caucasian comparandum (suggested by Colarusso 1997: 144) into the North Caucasian material, thus from *ʔrǟnƛ_E (NCED s.v. ‘six’) and evidently not related to the PIE form. For the IE material, Levin (1995: 402) invokes the initial sibilant in Vedic, otherwise only a consequence of the RUKI rule, i.e. minimally requiring a preceding phoneme, or in peripheral lexical items such as onomatopoetic ṣthu ‘spit’, as evidence of a foreign element; this anomaly could, however, also be explained as assimilation to the internal sibilant (cf. Sihler 1995: 413 and, more elaborately, Lubotsky 2008: 357), while Martínez (1999: 208-209) and Mallory & Adams (1997: 402) all accept the loan hypothesis. A question of chronology still remains, however, to explain how the original borrowed sound was retained to produce distinct outcomes in the different IE dialects; perhaps by late diffusion, which may be tacitly supported by the non-evidence from Anatolian. The Semitic form is formally removed from the PIE reconstruction with a different occlusive quality and without the initial glide, and the later Akkadian form is evidently assibilated and thus difficult to posit as primary to the hard velar treatment continued in the IE centum languages.
Considering the strong case for a Semitic origin of ‘seven’, it is difficult to discard this proposition, but the connection is not as obvious and may require intermediate languages, possibly in the Caucasus, but more probably in the Balkans, to fit the picture.
- ‘seven’ (item 109 and § 5.7 in Bjørn 2017)
PIE *septm̥
Attestations: Hit. sipta-; Toch. A ṣpät, Toch. B ṣukt; Lat. septem; OIr. sechtN; Goth. sibun; Lith. septynì; OCS sedmĭ; Gr. ἑπτά; Arm. ewtՙn; Ved. sápta, Av. hapta.
Notes: The PIE provenance of ‘seven’ is indisputable, since a Hittite cognate has been demonstrated in si-ip-ta-mi-ya ‘drink of seven’ conclusively mirrored in VII-mi-ya (cf. Neu 1999). There is little internal variation, although obvious external comparanda make a loan etymology unavoidable.
Kartvelian: *šwid- > Georg. šwid-, Ming. škwit-, Laz šk(w)it, Svan išgwid-.
HU: Hurrian sitta [Add: this form is a hapax and with disputed meaning, see also below; secure is, however, sindi- ‘seven’]
The number ‘seven’ has convincing cognates transcending most other linguistic divides, thus PIE *septm, Finnish seitsemän, as well as Arabic sa’ba and Georgian šwid– (Dolgopolsky 1987: 15, Katlev 2004). It was borrowed into Uralic from IE languages at up to three different stages (Dolgopolsky 1995); Janhunen suggests that Proto-Samoyedic *sejtɜwe be a loan from Proto-Tocharian (1983: 5), but the formal resemblance with Finnish seitsen leaves some internal chronology unaccounted for (cf. Joki 1973: 313); these occurrences clearly are of a secondary nature and the scope of the present investigation does not allow further scrutiny. Note that, according to Napolskikh, the Tocharian form travelled even further afield, cf. Old Chinese sjɛt ‘seven’ and Turkic *jetti (2001: 373). [Add. The Chinese connection would even bring the item further afield, as far as the Southwestern Tai languages, incl. Modern Thai, into which it was borrowed (Pittayaporn 2014: 55-56, 61)]. Klimov entertains the idea that Hurro-Urartian could be a center for much of this cultural distribution (1985: 209), but its simpler structure questions this hypothesis, and the form is, indeed, considered a secondary borrowing by Diakonoff & Starostin (1986: 20); a different, and supposedly native form, is shared with the North Caucasian languages that do not attest the root presently under scrutiny. The Kartvelian forms are generally assumed to stem from Semitic, cf. Klimov (1985: 206) and even Fähnrich, who agrees that the item exhibits obvious traits of a loan character (2007: 531), most likely from a form closely related to Akkadian sibittu. The saliency of this particular numeral no doubt emanates from the measure of the important seven-day week (Nichols 1997: 127), probably associated with the spread of agricultural practices and cultic rituals. With sound quasi-Afro-Asiatic cognates in Egyptian, Semitic, and Berber, an internal genesis for the form has been proposed by Blažek as deriving from a numeral ‘three’, cf. East Chadic *sab̩u/sub̩a (1997b: 18f.).
It is thus highly probable that PIE similarly received the item from Semitic, and although the PIE desinence *-m has been explained as an internal sequential phenomenon from ‘nine’ and ‘ten’ (Mallory & Adams 1997: 402), the definite form in Semitic (*šab’a-t-Vm, where V = *-u- in the nominative, and *-a- in the accusative), too, could provide the coda (Dolgopolsky 1993: 243), wrapping the whole package that all but technically proves the origin of the term. A theoretical later borrowing directly into Anatolian from the, probably, Semitic origin seems unlikely considering the close phonological affinity with the rest of the IE stock. The unvoiced PIE reflex of the voiced Semitic origin is probably due to voicing assimilation to the following *-t-, although an argument could be made as to the exact phonetic nature of the earliest PIE phoneme system. Adding the sequential correspondence for the same families also for the preceding numeral ‘six’ (§ 5.6) renders Diakonoff’s criticism (1985: 124) insufficient to reject these very convincing comparanda.
- ‘eight’ (item 97 and § 5.8 in Bjørn 2017)
(?)HLuw. (see notes); Toch. A okät, Toch. B okt; Lat. octō; OIr. ochtN; Goth. ahtau; Lith. aštuonì; OCS osmĭ; Alb. tetë; Gr. ὀκτώ; Phr. *o(t)tuos; Arm. utՙ; Ved. aṣṭá-, Av. ašta–
Notes: There is ample evidence to suggest that the desinence is the grammatical dual marker, perhaps most obviously with the Avestan singular ašti– ‘breadth of four (4) fingers’ (cf. § 5.4). Though hard Anatolian evidence must be considered lost, there is a slight suggestion in Hieroglyphic Luwian, 8-wa-a-ī for *(h)ak(?)-tauanzi, where the IE dual ending might be discerned (Eichner 1992: 85), and the form may likely be ascribed to the oldest strata of PIE. The initial laryngeal is not attested but may be posited to meet the expectations of the PIE root structure.
External comparanda:
Kartvelian: *otχo ‘four’
Discussion: The dual in PIE is essential for the comparison with the Kartvelian form, which then represents the basic meaning lost in most branches of IE. Already Bopp noted the similarity (1847), and the unmistakable parallels are upheld by Klimov (1994) and Dolgopolsky (1987: 21). The Kartvelian fricative may thus relate to the palatal nature of PIE, albeit metathesized. Curiously, the Kartvelian numeral ‘eight’ *arwa is likely a borrowing directly from Semitic ‘four’, cf. Arabic arbaʔ, and the connection between eight and four appears to be typologically common (cf. Klimov 1985: 206); Kartvelian likely also got its number ‘10’ *a(š)t from Semitic, cf. Arab. ašr ‘ten’ (Nichols 1997: 142). Semantically the transfer requires the basic meaning to have been alive, but with the attestation in Avestan this can safely be posited for, at least, the Iranian branch, which blunts the imperative for a particularly old phenomenon. The conditions favorable for a transfer of the numeral ‘four’ remain elusive, and a borrowing in the opposite direction may justifiably be posited as an alternative, especially given the confusing state of the numeral in PIE (§ 5.4). There are clearly some interesting stratificational consequences of the numeral exchange between PIE, Kartvelian, and Semitic.
With cognates attested in Anatolian this numeral may safely be reconstructed for the earliest strata of PIE; unmistakable traces of the common IE root are thus found in Lycian nuñtãta ‘a number (with nine as a component)’. Internal derivation has been proposed as ‘the new one’ from *neu– (cf. Martínez 1999: 212). Greek and Armenian require special attention for this etymology to work, and it was suggested already by Pedersen that Graeco-Armenian represents the innovation of prefixing *en– (1893: 272), then literally ‘anew’, possibly after ‘eight’, which, too, helps explain the conundrum of the Greek geminate in the classical reconstruction *h1neu– (cf. e.g. Beekes 2010: 427-428). No obvious external comparanda have been proposed.
- ‘ten’ (item 9 and § 5.10 in Bjørn 2017)
Attestations: Toch. A śäk; Lat. decem; Welsh deg; Goth. taíhun; Lith. dẽšimt; OCS desętĭ; Alb. dhjetë; Gr. δέκα; Arm. tasn; Ved. dáśa, Av. dasa.
Notes: Although Anatolian attestations are lacking (Eichner 1992: 88), this item is well represented in all other branches. Mallory & Adams convincingly argue in favor of a derivational connection with *dek̂-s ‘right hand’ continued in Latin dexter (1997: 403).
Uralic: Fenno-Volgaic *-tVksVn > e.g. Fin. –deksan
Discussion: The use of the number in Finnish is of some interest as it is confined to compounds, thus kah-deksan ‘eight’ and yh-deksän ‘nine’, literally ‘two (and one, respectively) from ten’ (the standard term for ‘ten’ is kymmenen). It is tempting to assume that this expression of the numeral ‘ten’ in some western branches of Uralic is of IE origin (Hakulinen 1946: 33), but an internal collocation may also be posited (Itkonen 1973: 337ff.), and is preferred in both SKES (1978: 1856, s.v. yhdeksan) and UEW (1988: 643f.). If a transfer did occur, it must have happened earlier than that of *śata– ‘hundred’ (§ 5.12). Note that the existence of extensive trade relations between stages of PIE and Uralic is uncontroversial, and seeing that at least early middle PIE had a concrete number ten, very likely tied to a decimal system, while Uralic demonstrably did not, it is by no means inconceivable that the numeral could have transferred in certain collocations where the exact place in the native system was undetermined, allowing a different root for ten to be adopted when the system finally was established.
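The subtractive reading of the Finnish compounds can be spelled out as simple arithmetic. The sketch below assumes the ‘X from ten’ analysis reported above (which, as noted, competes with an internal etymology); the names BASE, SUBTRAHENDS, and subtractive_value are hypothetical labels chosen for illustration only.

```python
# Assumption: the second element of the compound stands for 'ten'.
BASE = 10

# First elements of kah-deksan and yh-deksän, glossed 'two' and 'one'.
SUBTRAHENDS = {"kah": 2, "yh": 1}


def subtractive_value(prefix):
    """Return the value of a compound read as 'prefix from ten'."""
    return BASE - SUBTRAHENDS[prefix]


for prefix in ("kah", "yh"):
    print(prefix, "from ten ->", subtractive_value(prefix))
```

On this reading the element for ‘ten’ never has to occupy a slot of its own in the counting sequence, which is compatible with a different root (kymmenen) serving as the free-standing numeral.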
Seeing that the numeric sequence even through to ‘ten’ is rife with internal inconsistencies, it can be no wonder that even more complex formations pick up the mantle in more or less idiosyncratic ways in the various dialects. One particular, and, from the looks of it, odd, system that spans a wide geographical as well as linguistic area is the “left-over” teens of Proto-Germanic (only eleven and twelve), Lithuanian, and the Samoyedic language Tundra Nenets (Martínez 1999: 212). The Indo-European forms are very likely to be cognate (Germ. *-lif and Lith. –lika, despite the problem of labiovelar reflexes in Germanic), but are formally different from Tundra Nenets yəŋk°nʹa ‘separate’ (Nikolaeva 2014: 52). The very limited geographical distribution renders this suggested correspondence highly speculative.
- A short note on higher order numerals
Just as Fenno-Ugric borrowed ‘hundred’ from Indo-Iranian *śata– → FU *śata, so did Kartvelian adopt its *as1ir ‘hundred’ from Semitic, cf. Akkadian ‘esr ‘ten’ (Klimov 1985: 208). A similar derivation from ‘ten’ is likely found in PIE *(d)k̂m̥tóm ‘100’ ~ *dek̂m̥ ‘10’ (Mallory & Adams 1997: 404).