Professional Documents
Culture Documents
I S O L AT E S
Language Isolates explores this fascinating group of languages that surprisingly com-
prise a third of the language families (linguistic lineages) of the world.
Individual chapters written by experts on these languages examine the world’s major
language isolates by geographic regions, with up-to-date descriptions of many, including
previously unrecognized language isolates. Each language isolate represents a unique
lineage and a unique window on what is possible in human language, making this an
essential volume for anyone interested in understanding the diversity of languages and
the very nature of human language.
Language Isolates is key reading for professionals and students in linguistics and
anthropology.
List of figures vi
List of maps vii
List of tables viii
List of contributors x
Introduction Lyle Campbell xi
Index 373
FIGURES
6.16 Nominal case suffixes in the two main variants of Yukaghir 150
6.17 Kusunda vowel phonemes 153
6.18 Kusunda consonant phonemes 153
6.19 Kusunda case markers 153
6.20 Kusunda person and number marking in class I verbs (realis mode) 155
6.21 Kusunda person and number marking in class II verbs 155
6.22 Possibly autochthonous Nahali lexical items 156
6.23 Nahali nominal case suffixes 156
6.24 Nahali personal pronouns (singular only) 157
6.25 Personal pronouns in Jarawa, Onge, and Great Andamanese 158
7.1 African language isolates not generally disputed 167
7.2 African isolates: reported, suggested, controversial 167
7.3 Bangi Me consonants 170
7.4 Bangi Me vowels 170
7.5 Hadza consonants 171
7.6 Hadza vowels 172
7.7 Hadza number and gender marking 173
7.8 Jalaa consonants 173
7.9 Jalaa vowels 174
7.10 Jalaa number marking in comparison to Cham 174
7.11 Laal consonants 175
7.12 Laal vowels 175
7.13 Laal nominal number marking 175
7.14 Laal subject pronouns 176
7.15 Laal verb forms 177
7.16 Koman, Shabo and Gumuz shared lexical items 178
7.17 Hypotheses concerning the classification of Ongota 178
7.18 Kwadi-Khoe lexical correspondences 180
7.19 Mpra-Dompo resemblances 181
9.1 Transitive vs. intransitive morphology in Seri 240
9.2 Kinship vs. other inalienable possessive prefixes 240
10.1 Numbers on South American isolates and lineages compared
to other continents 260
10.2 Itonama classifiers 271
11.1 Numbers on isolates and families in the New Guinea region
compared to other continents 288
12.1 Complex predicates shared by Wubuy and Anindilyakwa 327
12.2 Phoneme inventory of Anindilyakwa 327
12.3 Harvey’s Non-Pama-Nyungan bound pronominal reconstructions
compared with Tiwi 330
13.1 Language Endangerment Index (LEI) 346
13.2 Language isolates and endangerment statuses 348
13.3 Language isolates by region and endangerment status 362
13.4 Endangered isolates vs. non-isolate endangered languages by region 362
13.5 Least documented isolates and speaker numbers 367
CONTRIBUTORS
Blench, Roger
University of Cambridge
Bowern, Claire
Yale University
Campbell, Lyle
University of Hawai‘i at Mānoa
Dougherty, Thomas
University of Hawai‘i at Mānoa
Georg, Stefan
University of Bonn, Germany
Hammarström, Harald
Max Planck Institute for the Science of Human History
Heaton, Raina
University of Oklahoma
Lakarra, Joseba A.
University of the Basque Country (La Universidad del País Vasco/Euskal Herriko
Unibertsitatea)
Michalowski, Piotr
University of Michigan
Mithun, Marianne
University of California Santa Barbara
Okura, Eve
University of Hawai‘i at Mānoa
Seifart, Frank
University of Amsterdam, University of Cologne
Smith, Alexander D.
University of Hawai‘i at Mānoa
INTRODUCTION
Lyle Campbell
It may strike many of us as ironic for a volume on language isolates to appear in Rout-
ledge’s Language Family series. However, as this book makes clear, language isolates are
also language families – they just happen to be language families that have only a single
member language, a language with no known relatives. And, to be extra clear, it could be
added that by “relatives” we mean other languages related genetically, i.e. phylogeneti-
cally via descent from a common ancestor: a language isolate is a language which has not
been shown to be the descendant of any ancestral language which has other descendants
(daughters) in addition to the language isolate in question. The chapters of this volume
show, moreover, that language isolates have more in common with other language fami-
lies than has usually been realized.
The chapters presented here survey several of the world’s most famous language iso-
lates and the language isolates of various regions of the world. There are vastly more
language isolates than most people, including most linguists, are aware of – c.159. That
is 39% of the world’s c.407 independent language families!
The absolute number of language isolates, is, however, uncertain and in fact will never
be known, for several reasons. There are many unclassified, or unclassifiable, languages,
where the documentation is so poor or non-existent that it is not possible to compare
the language meaningfully to others to determine whether it may be related to any other
language or not (see Campbell, this volume, for discussion). Moreover, in the case of
some languages with some but limited attestation, opinion among scholars has differed
with respect to whether the documentation is sufficient to determine that the language in
question is an isolate because it cannot be shown to be related to any other or whether
the available corpus is simply too scant to make such a determination, leaving the lan-
guage unclassified. This raises an interesting but mostly unaddressed research question
for language isolates: how much documentation, and of what sort, is necessary in order
to determine whether a language is best considered unclassified or a language isolate?
That is, when is sufficient information available so that when compared, no discernible
relatives can be found for that language, leaving it a language isolate?
Another reason that the total number of language isolates is uncertain has to do with the
vexed question of separate languages versus dialects. A language with dialects (variants
of a single language) can be a language isolate if it has no known relatives. However, if
there is sufficient diversification that the variants are not mutually intelligible and hence
are considered distinct languages, then it is a matter of a family of related languages
rather than a language with multiple dialects – and a family of multiple languages cannot
by definition be considered a language isolate. Since the boundary between relatively
divergent dialects of a single language and closely related language is sometimes difficult
xii Introduction
to determine, there are instances of disagreement among specialists where some see a lan-
guage isolate, a single language with dialects, but others see a family of related languages
and hence no language isolate. The language-versus-dialect issue poses another import-
ant research question, how to distinguish dialects from languages in such circumstances,
one much addressed but so far without results that make the determination in uncertain
or borderline cases any clearer.
Another reason that we do not know the exact number of language isolates involves
“uncontacted” languages. In Brazil alone there are officially at least 40 “uncontacted”
isolated indigenous groups, some count nearly 70, with around 90 for South America as
a whole (see, for example, https://en.wikipedia.org/wiki/Uncontacted_peoples#South_
America, accessed 9–6–2016). For many of these cases, it is not known whether the peo-
ple speak a variety of an already identified language, a language currently unknown but
which belongs to a known language family, a language that represents an as yet unknown
language family, or a language with no other relatives in the world, an isolate.
With regard to linguistic theory and language typology, language isolates are of mon-
umental importance. As languages unrelated to other languages, each language isolate
represents an independent lineage among the world’s languages, a unique development.
Given the goal of linguistics to comprehend the full range of what is possible and impos-
sible in human languages, the investigation of language isolates should hold a privileged
position in linguistic research. Each language isolate constitutes an independent window
on what can be found in the languages of the world, on what is possible in human lan-
guage, and, through the study of languages on the potentials and limitations of the human
cognition. Collectively, the language isolates of the world constitute an extremely rich
and important laboratory for striving to achieve this goal to discover the full scope of
what is possible in the human language.
Given this importance, endangered language isolates deserve extremely high prior-
ity in decisions about which languages to document and how to deploy resources and
research efforts. (See Okura, this volume.) Already 59 of the 159 language isolates of
the world are extinct (37%). Put in a larger perspective, all the languages belonging to
c.96 of the world’s c.407 independent language families (including language isolates)
are extinct – 24% of the linguistic diversity of the world, calculated in terms of language
families, is gone forever. However, nearly two-thirds of all extinct languages families
are language isolates, 59 of those 96 (61%). Most of the remaining language isolates are
endangered, many severely so. Truly, the loss of language isolates is taking an alarming
toll on the linguistic diversity of the world. The loss of a language isolate represents the
loss of an entire linguistic lineage. The loss of any language that has relatives constitutes a
monumental loss of scientific information and cultural knowledge, comparable in gravity
to the loss of a species, say, the Bengal tiger or the right whale. However, the extinction
of a language isolate or of other whole families of languages is a tragedy comparable in
magnitude to the loss of whole branches of the animal kingdom, similar to the loss of all
felines or all cetaceans – a catastrophic loss of information and knowledge unparalleled
and unimagined in biology. Yet this is what confronts us.
Because languages reflect the world’s knowledge and wisdom, the loss of a language
isolate means loss to our understanding of the range of potential ways of experiencing
and understanding the world. Great reservoirs of historical information are recovered
from the study of languages. Historical linguistic investigation gives us history of human
groups, information about their contacts and migrations, their homeland, and past culture.
All of this information and these insights into human experiences are irretrievably lost
when a language is lost without adequate documentation.
Introduction xiii
Another research question is how can we find out about history of language isolates?
Campbell (this volume) discusses a number of techniques for getting at the history of lan-
guage isolates, showing that claims that these languages have no history are misguided.
Not only can research on language isolates help us to comprehend what is possible in
human languages and to understand the history of individual language isolates, it contrib-
utes to understanding of the history of humankind.
An important part of the history of many language isolates involves language con-
tact. The identification of loanwords and of structural influences from other languages
contributes to understanding the history of the speakers of these languages, but it
also helps us eliminate similarities due to borrowing from considerations of possible
genetic relationships with other languages. The methodological issue expressed in
Swadesh’s (1951) title “Diffusional Cumulation and Archaic Residue as Historical
Explanation” is of crucial importance in attempts to find distant relatives of language
isolates (and of larger language families, as well). We need to distinguish between
the inherited and the diffused/borrowed, eliminating the latter from considerations of
potential kinship.
Language contact must be dealt with effectively in the methodology for evaluating pro-
posals of distant genetic relationship. Efforts no doubt will continue to attempt to show
that various language isolates are not really isolates but are related to other languages as
members of broader language families. This is to be encouraged and positive results to
be hoped for. Nevertheless, proposals of such relatedness will gain little sympathy unless
careful methods are followed. Most proposals attempting to show language isolates are
related to other languages have not been accepted. We should not say, however, that we
have learned nothing from the attempts. Indeed, we found in several cases that the evi-
dence that certain languages are isolates is very strong, precisely because of the failure of
many attempts to find relatives for them. In particular, the methodology for investigating
distant genetic relationship has become much clearer, and we understand better why cer-
tain proposals fail and we know what considerations are important if a plausible, defen-
sible hypothesis of a family relationship is to be found to have merit. (See Campbell and
Poser 2008.) In assessing the evidence brought forward for these hypotheses, in several
cases we have discovered numerous loanwords and have come to understand influences
from language contact much better.
What does the future hold? We expect documentation of language isolates to be
given priority, especially documentation of endangered language isolates, for the
contributions they can make to linguistic theory and to understanding the history of
human beings on the planet. We can anticipate that attempts to seek possible relatives
of various language isolates will continue, occasionally with some success. However,
it should not be expected that the total number of language isolates is going to shrink
markedly. Possibly, though less likely, additional language isolates will be discovered.
Where appropriate and possible, it is to be hoped and encouraged that effective lan-
guage revitalization efforts will be undertaken to conserve these language isolates. It is
hoped that the language isolates of the world will become better understood and their
potential contributions better appreciated. Hopefully they will be analyzed carefully
for the contributions they will make to linguistic theory and language typology and to
the overall history of human language. The chapters of this book contribute towards
those ends.
This volume, then – with 159 of the world’s c.407 language families, albeit families
with but a single member language – takes a rightful place in the Routledge’s Language
Family series. It involves 39% of all the language families of the world.
xiv Introduction
REFERENCES
Campbell, Lyle and William J. Poser. 2008. Language Classification: History and Method.
Cambridge: Cambridge University Press.
Swadesh, Morris. 1951. Diffusional cumulation and archaic residue as historical explana-
tion. Southwestern Journal of Anthropology 7: 1–21.
CHAPTER 1
1 INTRODUCTION
How many language isolates are there in the world? (And, how many language families
are there?) Most linguists do not know, and opinions vary greatly. To answer these ques-
tions is complicated because of differing views about fundamental issues in historical lin-
guistics and because of the limited amount of information that is available to us on a good
number of these languages. This chapter attempts to answer the questions: how many
language isolates are there? How can we advance knowledge of the history of language
isolates? What lessons does the study of specific isolates offer for understanding better
the history of language isolates in general and of other specific isolates? What are the
prospects for finding relatives for some language isolates, for showing that they belong
together with other languages in a family of related languages?1
So, what is a language isolate? The standard definition is that a language isolate is
a language that has no known relatives, that is, that has no demonstrable phylogenetic
relationship with any other language. It is a language which has not been shown to be
the descendent of any ancestral language which has other descendants (other daughter
languages). Thus, language isolates are in effect language families that have only one
member. The best-known and most cited language isolates are Basque, Burushaski, and
Ainu, though there are many others not so well known represented in this book.
Since language isolates are often contrasted with families made up of related lan-
guages, we also need to ask, what is a language family? A language family is a set of
languages for which there is sufficient evidence to show that they descend from a single
common ancestral language and are therefore phylogenetically related to one another.
The total number of language families in the world is the set of independent families for
which no genealogical relationship can be demonstrated with any other language family.
A “family” can be composed of but a single language in the case of language isolates,
languages with no known relatives.
The number of the world’s language isolates, as we shall see, comes to c.159, but this
is far from a secure answer to the question of how many there are. And, how many inde-
pendent language families (including isolates) are there in the world? There are approx-
imately 407 (cf. Campbell 2013:159). This number is relevant in considerations to come
later in this chapter.
relatives in the past that have disappeared without coming to be known, leaving these
languages isolated.
For example, Ket in Siberia is the only surviving language of the Yeniseian family (see
Georg, this volume). Nevertheless, there were other Yeniseian languages, now extinct:
Arin, Asan, Kott, Pumpokol, and Yugh (Vajda 2001). If these languages had disappeared
without a trace, Ket would now be considered an isolate. However, since data from these
extinct languages was registered before they disappeared, Ket was not left an isolate,
rather a member of a family of languages, albeit the only surviving member. Examples
such as Ket show that language isolates could well have once been members of lan-
guages families whose other relatives disappeared before they could come to be known.
This shows one way in which language isolates are not so very different from languages
families.
of Germanic. Their differences and similarities, when compared with those between
Aquitanian and Basque, turn out to be quite similar. Therefore, it should be asked, could
the relationship between Basque and Aquitanian be that of related languages as exists
between Gothic and English (sister languages), and not like that between Latin and Span-
ish, where an ancestral language and one of its descendants are involved?
family. The 2005 edition of Ethnologue (Ethnologue.com) listed three Basque languages,
though later editions have only one, yet noting still “some inherent intelligibility among
regional varieties except Souletin” (Lewis et al. 2016 [accessed 5–14–2016]).3
The main point here is not to insist that Basque once had relatives (questioning the
nature of its relationship with Aquitanian) or that Basque has relatives now (if some
varieties are mutual unintelligible); rather what is important in this discussion is to show
that Basque easily could cease to be considered a language isolate and, therefore, that
language families and language isolates are not so different.
Aranama-Tamique, Texas
Baenan, Brazil
Camunico, Northeast Italy (survived to 2nd half of 1st millennium bce)
Eteocretan, Crete, 7th–3rd centuries bce (see Michalowski, this volume)
Gamela, Brazil
Iberian, spoken on the Iberian Peninsula (5th–1st centuries bce or a bit later) (see
Michalowski, this volume)
Kara, possible language of Korea, only from 13 toponyms
Kaskean, Northeast Anatolia 2nd millennium bce
Koguryo possible language, NE China, Manchuria, Korea, 1st–8th centuries ce,
known only from toponyms and a few words
Ligurian, Northeast Italy, few words, 300 bce–100 ce
Maratino, Northeast Mexico
Minoan Linear A, undeciphered, 1800–1450 bce
Mysian, Western Anatolia, before the 1st century bce
Naolan, Tamaulipas, Mexico
Northern Picene, Adriatic coast of Italy, 1st millennium bce
Pictish, Scotland 7th–10th centuries ce, few inscriptions
Puyo, Manchuria (with Koguryo?), few attested words
Quinigua, Northeast Mexico
Raetic, Northern Italy, Switzerland, Austria, 1st millennium bce
Sicanian, Central Sicily, pre-Roman epoch
Solano, Texas, Northeast Mexico
Sorothaptic, Iberian Peninsula, pre-Celtic, Bronze Age
Tarairiú, Brazil
Wamoe (Huamoé), Brazil
Language isolates and their history 5
Tartessian (Spain, 1st millennium bce) is an interesting case, a language that until recently
was considered unclassified, but for which Kaufman (2015) presents strong evidence that
it was in fact a Celtic language.
Indus Valley (Harappan) (India and Pakistan, 2600–1900 bce) is another interesting
case, often listed as an unclassified language, based on undeciphered inscriptions. How-
ever, its status as a real writing system is disputed (cf. Farmer et al. 2004; Michalowski,
this volume). (For several other unclassified languages of Asia and Europe, see Micha-
lowski, this volume.)
The second kind of unclassified languages are the extant languages which cannot
be classified for lack of data, languages not yet described sufficiently to compare them
meaningfully with other languages in order to determine whether they may have rela-
tives. Examples include:
• In Africa: Bung, Lufu, Kujargé, perhaps Mpre (Mpra) (Blench, this volume), and
Rer Bare. Oropom is sometimes listed among unclassified African languages, though
Blench (this volume) reports that it is probably a spurious language. Weyto is another
“speculative” language sometimes listed among unclassified languages of Africa.
• In Asia and the Pacific: Sentinelese (Andaman Islands), Bhatola (India), Waxianghua
(China), Doso (Papua New Guinea), Kembra (Indonesia Papua), and Lepki (Indonesia
Papua).
• In South America there are many, for example: Ewarhuyana, Himarimã (perhaps
Arawan?), Iapama (uncontacted, possibly speakers of a known neighboring lan-
guage), Kaimbé, Kambiwá, Kapinawá, Korubo (maybe Panoan?), Pankararé, Truká,
Tremembé, Wakoná, Wasu, etc. (See Campbell 2012, Zamponi in press; cf. Seifart and
Hammarström, this volume.)
It should be noted that some of these unclassified languages could also be language
isolates, but without evidence, we cannot know.
by which certain languages are known. In South America, for example, most languages
are known by more than one name, and often a single name has been applied to more than
one language (see Campbell 2012). When comparing different accounts, it is important
to be careful with the problem of multiple names for single languages, so as not to extend
lists of isolates falsely by the inclusion of the same thing under different names. For all
these reasons, the number of language isolates reported in recent publications varies and
is uncertain.
In the list here, extinct languages are indicated by an asterisk (*) after the name of the
language.
Blench (this volume) list 12 other languages that have been reported/suggested as isolates,
but which are controversial: Bēosi (Madagascar), Dompo, Guanche (Canary Islands),
Gumuz, Kujargé (unclassified), Kwadi (unclassified),4 Meroitic (now generally seen
as a relative of Nubian), Mpra (perhaps unclassified), Ongota, Oropom (unclassified),
Sandawe, and Shabo.
Greater Andamanese (if it is a single language) may be another isolate, but this is uncer-
tain (see Georg, this volume).
Hurrian (Hurro-Urartean) (in Northeast Anatolia) is now known to be a family of two
separate languages, though earlier Urartean was thought to be a late form of Hurrian
(see Michalowski, this volume).
Koreanic is a small family, with at least two members, Korean and Jejueo; Korean is
often listed as a language isolate, but this is not accurate.
Sentinelese, on North Sentinel Island, one of the Andaman Islands, is sometimes listed
as an isolate, but its status is unknown. As Georg (this volume) reports, “it may, thus,
be just another Andamanese language, it may be exclusively related to Great Anda-
manese, it may be related to some other language of the greater region, or, then, to
none at all.”
Itel’men (Kamchadal), in southwest Kamchatka has sometimes been treated as a lan-
guage isolate, but the evidence shows that it is related to Chukchi-Koryak (Chukchi,
Koryak, Kerek, Alyutor) in the Chukotko-Kamchatkan language family (Georg, this
volume).
Europe: [1]
Basque
Some would include also Iberian, an extinct language of Spain, but it is better consid-
ered unclassified due to insufficient information.
As mentioned above, Etruscan, long considered an isolate, is related to Lemnian
(Tyrsenian family) and so is not a true language isolate.
Some of these may eventually turn out to have relatives (Hammarström, this volume).
North America: [22] (cf. Mithun, this volume, Campbell et al. in press)
Adai*(?) Beothuk*
Calusa*(?) Cayuse*
8 Lyle Campbell
Chimariko* Chitimacha*
Coahuilteco* Cotoname*
Esselen* Haida (?)
Karankawa* Karuk
Kootenai Natchez*
Siuslaw* Takelma*
Tonkawa* Tunica*
Washo Yuchi (Euchee)
Yana* Zuni
The number of languages listed as isolates in North America varies considerably in dif-
ferent publications, and the differences illustrate some of the difficulties in determining
whether something is or is not a language isolate and, consequently, what the total num-
ber of language isolates in the world is.
For example, Alsean is most commonly considered a small family of two closely
related languages, Alsea and Yaquina. However, some consider these to be dialects of
a single language and treat this “Alsea” as a language isolate.
Aranama is often listed as an ‘isolate,’ but the documentation is so poor it should
probably better be considered unclassified/unclassifiable. As Mithun (this volume)
says:
Our entire documentation of the language consists of one single-word and one two-
word phrase: himiyána ‘water’ and Himiána tsýi! ‘Give me water!’ These were
recorded by Albert Gatschet in 1884 from a Tonkawa man known as Old Simon,
who also provided a short vocabulary of Karankawa, another Texas language. Old
Simon himself identified the language as Hanáma or Háname (Gatschet 1884). The
only people indigenous to that area with a similar name were those known as the
Aranama, Saranames, or Jaranames.
Calusa is usually listed as an isolate, although, as Mithun (this volume) explains, the lan-
guage is known from only about a dozen words from 1575 from a Spanish captive among
the Calusa and from 50–60 place names. Early accounts report that Calusa was distinct
from other languages of the area. But as Mithun says, “with such a small record, however,
its [Calusa’s] status cannot be confirmed.”
Solano is also usually listed as an isolate. Mithun (this volume) also explains the Solano
corpus: “a sheet with 21 words. . . the page bore the following description: ‘Near the end
of the original book of baptisms of the San Francisco Solano Mission, 1703–1708, is a
brief vocabulary, presumably of the Indians of that mission’.” This documentation prob-
ably is not sufficient to permit the language to be classified.
In the case of Keres(an), there is uncertainty about the Keresan dialect continuum.
Most think it has enough diversity that at least two separate languages must be distin-
guished: Acoma-Laguna and Rio Grande Keresan. Some, however, believe Keres is a
single language, an isolate, with several dialects.
Some consider Atakapa a single language which is an isolate; others see it as a small
family, Atakapan, that includes Atakapa and Akokisa, and perhaps Eastern Atakapa,
though it is uncertain from the documentation whether these are two or three separate
languages or are all variants of a single one (see Mithun, this volume).
Language isolates and their history 9
Adai has sometimes been considered ‘unclassified’ because of the limited documenta-
tion, 275 words recorded, c.1802, though usually it is listed as a language isolate.
Molala and Klamath-Modoc, formerly considered language isolates, together with
the Sahaptian languages, belong to the Plateau (Plateau Penutian) language family
(Campbell et al. in press).
Salinan, often listed as a language isolate, is a small family, composed of the two lan-
guages, Antoniano and Migueleño (see Campbell et al. in press).
Timucua may be a language isolate and is often considered one, but this is unclear.
Some assign Timucua and Tawasa together as members of a Timucuan language family.
However, the status of Tawasa is disputed (based on a list of 60 words from 1707). As
Mithun (this volume) explains, a number of the Tawasa forms are so similar to Timucua
that they may represent the same language. She says, “if Tawasa was actually Timucua
itself or a dialect of that language, Timucua would remain an isolate.” Also, if Tawasa
was a separate but unrelated language, Timucua would be an isolate; however, if Timucua
and Tawasa were separate but related languages, then Timucuan would be a language
family. The status is just unclear.
There are three small language families in Central America that are sometimes listed as
language isolates:
(Languages that had speakers in both southeastern Texas and northeastern Mexico are
treated with North American languages; see the isolate Coahuilteco and several unclas-
sified languages.)
South America [53] (see Seifart and Hammarström, this volume, and Zamponi in press)
Aikanã (Brazil) Andaquí* (Colombia)
Andoque (Brazil, Peru) Arutani (Awaké, Uruak) (Venezuela,
Brazil)
Atacameño* (Cunza) (Chile) Betoi* (Betoi-Jirara) (Colombia)
Camsá (Sibundoy) (Colombia) Candoshi (Candoshi-Shapra) (Peru)
Canichana* (Bolivia) Cayubaba (Cayuvava)* (Bolivia)
Chiquitano (Bolivia) Chono* (Chile)
Cofán (A’ingaé) (Colombia, Ecuador) Culli (Culle)* (Peru; unclassified?)
Esmeralda (Atacame) (Ecuador) Fulniô (Yaté) (Brazil)
Guachí* (Brazil) Guamó* (Venezuela)
Guató (Brazil) Irantxe (Iranche, Mynky) (Brazil)
Itonama* (Bolivia) Jeikó (Brazil)
Jotí (Yuwana) (Venezuela) Kanoé (Kapixaná) (Brazil)
10 Lyle Campbell
Seifart and Hammarström (this volume) and others list several additional South Ameri-
can languages as isolates, though several of these have relatives, making them members
of language families, and others are better considered unclassified.
• Gününa-Küne (Puelche) belongs to the Chonan family (see Viegas Barros 2005).
• Jirajaran is a small family in Venezuela composed of Jirajara, Ayomán (Ayamán),
and Gayón (Coyón) (all extinct).
• Mosetén-Chinamé is sometimes considered a small family (Mosetenan) of two very
closely related languages, Mosetén and Chimané.
• Otomacoan is a small family in Venezuela, with Otomaco and Taparita.
• Pirahã is considered a language isolate by some and by others as a member of the
Muran family, together with Mura, Bohurá (Buxwaray), and Yahahí (in Brazil). At
issue is whether these are all varieties of a single language or are distinct but related
languages.
• Purí-Coroado (Brazil) is a language isolate, though traditionally Purí and Coroada,
together with Koropó, were considered members of a Purían language family; how-
ever, recent work reveals that Purí and Coroado are variants of the same language
and that Koropó is not related to it, but seems to be related instead to the Maxakalían
languages (see Zamponi in press).
• Sechura is often listed as a language isolate, but there is evidence for a Sechuran-
Catacoan language family, with Sechura (Atalán, Sec) related to the Catacaoan
languages (Catacao and Colán) (all in Peru). While these languages are extinct
and poorly attested, Adelaar and Muysken (2004:400) find the data sufficient to
support this family classification, while Seifart and Hammarström (this volume)
consider Sechura a language isolate, not finding Adelaar and Muysken’s evidence
convincing.
• Timote-Cuica, often considered a language isolate, is probably a small family of
languages (Timotean) in Venezuela, all extinct, composed of Timote-Cuica (Miguri,
Cuica) and Mucuchí-Maripú (Mocochí, Mirripú). It is not clear whether Timote and
Cuica were separate languages or were dialects of a single language. Timote may
survive as Mutú (Loco, Mutús), an unstudied language (cf. Adelaar and Muysken
2004:125).
Language isolates and their history 11
Though clearly related, members of the Lule-Vilelan family, Vilela and Lule (of Argen-
tina) are often listed as isolates (see Viegas Barros 2001; cf. Campbell 2012 and Zamponi
in press, for additional discussion of these).
Thus, the total number of isolates in the world comes to c.159 (as precise a figure as
current circumstance permit). There are c.407 independent language families (including
isolates), for which it is not possible to demonstrate a genetic relationship with any other
language family. Isolates thus make up 39% of all ‘language families’, of the world’s
linguistic diversity, calculated in terms of language families. Seen from this perspective,
isolates are not at all weird; they have as their ‘cohorts’ well over one-third of the ‘lan-
guage families’ of the world.
So, to what do we owe the general attitude that language isolates are weird and sus-
picious, that languages with no relatives should not be tolerated? I believe this is due to
scholars generally not knowing that there are many language isolates and not knowing
that isolates really differ little from other languages families, as seen above, so there
should be no particular motive to feel driven to try to get rid of them by assigning them
as members to some higher-order language family.
Internal reconstruction
Evidence from loanwords
Comparative reconstruction based on dialects
Toponyms and other proper names
Philological study of attestations and historical reports
Language contact and areal linguistics
Wörter und Sachen
5.4 Loanwords
Loanwords are another source of evidence on the history of language isolates. For exam-
ple, from the semantic content of the more than 300 ancient loanwords from Latin into
Basque, it is clear that the Romans had much influence in the areas of laws, administra-
tion, technology, religion, and refined culture. Moreover, the relative age of many of these
loanwords in Basque is known from phonological considerations. Many were borrowed
before the changes in Romance that transformed the ten vowels, five long and five short,
to a system of just seven vowels, and were borrowed before the palatalization of velar
consonants before front vowels. These older loans in Basque reflect the pronunciations
before these changes in Romance had taken place (Michelena 1988, 1995, Trask 1997).
In another case, from Mesoamerica, we know something of the history of Huave (isolate)
and its speakers from words borrowed from Mixe-Zoquean (MZ). Some examples are:
Several of these loans show early cultural influence from Mixe-Zoquean on Huave,
reflecting cultural concepts in ancient Mesoamerica. They support the hypothesis that the
ancient Olmecs – the first highly successful agricultural civilization in Mesoamerica –
spoke a Mixe-Zoquean language, as seen also in the Mixe-Zoquean influence on many
other languages in the area (Campbell and Kaufman 1976).
These areal facts also provide information about the history of Basque, of Basque
contacts.
in it. Similarly, Basque gari ‘wheat’ is inferred to be older than garagar ‘barley’, since
garagar involves a reduplicated from of the word for ‘wheat’ and thus is morphologically
analyzable. And, the word for ‘wheat’ must also be older than that for ‘beer’, since the
‘barley’ component of ‘beer’ is morphologically complex, with ‘wheat’ in it. Basque
janarbi ‘radish’ is analyzable as jan ‘eat + arbi ‘turnip’; it is thus inferred that the ‘turnip’
word is older than the ‘radish’ word.
Another Wörter und Sachen strategy involves the analyzability of toponyms. It is
held that place names that can be analyzed into component parts probably are more recent
in a language than those which have no such internal analysis. For example, it is inferred
that York is older in English than New York, since the latter is composed of identifiable
pieces, but not the former. In Basque, since the names of several rivers in the French
Basque area have no clear etymology (not analyzable into parts), it is inferred that they
are old names. However, the names of several rivers of Biscaya are analyzable, for exam-
ple Ibaizabal from ibai ‘river’ + zabal ‘wide’, and Artibai from arte ‘between(?’) + ibai
‘river’. It is thus inferred that these latter, analyzable names are not as old in the language
as the former.
A third Wörter und Sachen strategy involves words that bear non-productive (irreg-
ular) morphemes; these words are assumed to be possibly older than words composed
only of productive (regular) morphemes. For example, the Basque morph -di is frozen,
not productive; its presence in the animal names ardi ‘sheep’, zaldi ‘horse’, idi ‘ox’, and
ahardi ‘sow’ suggests that these animals have been known for a long time.8 However, it is
possible to conclude only that words containing the non-productive morphology are old,
but nothing can be inferred about the age of words lacking such forms. For example, for
otso ‘wolf’ and ahuntz ‘goat’, lacking the non-productive morpheme, it is not possible to
conclude anything about their age in the language based on this criterion.
In sum, based on the resources just seen, much is known of the history of Basque.
This demonstrates, in turn, that we can learn about the history of isolates using these
techniques.
Judging from these successful instances, it can be expected that with more data and
following adequate methods (see Campbell and Poser 2008), more cases of family
Language isolates and their history 15
7 CONCLUSIONS
The conclusions that follow from the discussion here include:
1 There is nothing unusual about isolates; there are c. 159 language isolates in the world.
2 Language isolates make up over one-third (39%) of the world’s c. 407 independent
families (including isolates).
3 Language isolates are not very different from language families that have languages
with relatives. Isolates could easily have had relatives that are now lost and unknown,
or an isolate’s dialects can diversify further into related languages, members of a lan-
guage family of multiple languages.
4 Language isolates have descriptive data; they are not to be confused with unclassi-
fied languages, which are not classified for lack of adequate data.
5 Progress has been made in a sense in the search for relatives of Basque and some
other language isolates in that it has been demonstrated that many hypotheses of
distant genetic relationship involving them are not supported by the evidence, and
much more is known now of the methods necessary to demonstrate a phylogenetic
relationship among languages (see Campbell and Poser 2008).
6 In spite of claims that nothing can be discovered about the history of language iso-
lates, there are several tools (techniques) that can help to recover considerable histor-
ical information about these languages. These tools include: internal reconstruction,
philological investigation of earlier attestations, comparative reconstruction based
on the dialects, evidence from loanwords, language contact and areal linguistics, and
Wörter und Sachen strategies.
7 It can be expected that with more data and dedication, and by employing adequate
methods, new phylogenetic relationships may be discovered for some language iso-
lates. However, it is not to be expected that there will be many such cases, and this
is highly improbable in the case of Basque.
NOTES
1 Portions of this chapter are based on Campbell (2011), and parts of it appeared in
Campbell (2016, a paper presented in 2010). I thank Joseba Lakarra for valuable feed-
back on an earlier version of this chapter.
2 Mitxelena (in various publications) dated Old Common Basque (the origin of all the
moden dialects) to the fifth-sixth centuries CE, and this date is often cited. However,
Proto-Basque dates to before contact with Romans (second century BCE). (Joseba
Lakarra, personal communication.)
3 Joseba Lakarra (personal communication) points out that no investigation has been
undertaken and no evidence presented to demonstrate the assumed non-intelligibity of
Zuberoa (Suletino) with other Basque dialects, but that the now-extinct Roncalés dia-
lect (closely linked with Suletino) was even more difficult to understand for almost all.
4 Kwadi is extinct and “its affiliation cannot be resolved” (Blench, this volume), although
it is treated by some as Khoe.
5 Nivkh (Gilyak) has two fairly divergent varieties, and opinion has varied concerning
whether they should viewed as separate languages or dialects of a single language.
16 Lyle Campbell
Under the view that multiple languages are involved, the putative small family has
been called Amuric (Georg, this volume).
6 For understanding the history of a single language, Meillet also accepted the evidence
of changes in the history of that language (as seen in his historical treatments of Greek
and Latin). In the case of Basque, however, he was uncertain or misinformed about the
quantity and relevance of the changes Basque had undergone. (I thank Joseba Lakarra
for pointing this out to me.)
7 This etymology of *ardano is an old, popular one. Lakarra (this volume) goes beyond
this, with *ardano < *e-da-ra-dan-o, with the prefixes da- ‘locative/dative’ and ra-
‘causative’ before a root dan ‘to drink’ plus the suffix -o ‘completive’, from something
like ‘that which is made to drink for’ or ‘that with which toasting is made’.
8 Joseba Lakarra (personal communication) indicates that the -di of these words is an
allomorph of the archaic suffix -ti from the root *din ‘to come, be converted into’; this
suffix served to derive ablative-prospectives, adjectives, and future-potentials, a case
of the grammaticalization of a word meaning ‘to come’ in these functions.
REFERENCES
Adelaar, Willem F.H. 2000. Propuesta de un Nuevo Vínculo Genético entre dos Gru-
pos Lingüísticos Indígenas de la Amazonia Occidental: Harakmbut y Katukina.
Actas I Congreso de Lenguas Indígenas de Sudamérica, ed. by Luis Miranda, Lima,
2:219–236.
Adelaar, Willem F.H. and Pieter C. Muysken. 2004. The Languages of the Andes. Cam-
bridge: Cambridge University Press.
Bowern, Claire. 2007. Australian Models of Language Spread. Paper presented at the
International Conference on Historical Linguistics, 6–12 August 2007, University of
Quebec, Montreal.
Campbell, Lyle. 2011. La Investigación Histórica de las Lenguas Aisladas, o ?es Raro el
Vasco? II. Congreso de la Cátedra Luis Michelana, ed. by Joseba A. Lakarra, Joaquín
Gorrochategui, and Blanca Urgell, 23–40. Vitoria Gasteiz: Editorial de la Universidad
del País Vasco.
Campbell, Lyle. 2012. The Classification of South American Indigenous Languages. The
Indigenous Languages of South America: A Comprehensive Guide, ed. by Lyle Camp-
bell and Verónica Grondona, 59–166. Berlin: Mouton de Gruyter.
Campbell, Lyle. 2013. Historical Linguistics: An Introduction (3rd edition). Edinburgh:
Edinburgh University Press, and Cambridge, MA: MIT Press.
Campbell, Lyle. 2016. Language Isolates and Their History, or, What’s Weird, Anyway?
Berkeley Linguistics Society 36: 16–31.
Campbell, Lyle, Victor Golla, Ives Goddard, and Marianne Mithun. In press. Languages
of North America. Atlas of the World’s Languages, ed. by J. Moseley and Ronald E.
Asher. London: Routledge.
Campbell, Lyle and Terrence Kaufman. 1976. A Linguistic Look at the Olmecs. Ameri-
can Antiquity 41:80–89.
Campbell, Lyle and William J. Poser. 2008. Language Classification: History and
Method. Cambridge: Cambridge University Press.
Farmer, Steve, Richard Sproat, and Michael Witzel. 2004. The Collapse of the Indus-
Script Thesis: The Myth of a Literate Harappan Civilization. Electronic Journal of
Vedic Studies 11:19–57.
Language isolates and their history 17
Suárez, Jorge. 1975. Estudios Huaves. (Colección Científica, Lingüística, 22.) México:
Departamento de Lingüística, INAH.
Trask, Robert Lawrence. 1995. Origin and Relatives of the Basque Language: Review of
the Evidence. Towards a History of the Basque Language, ed. by José Ignacio Hualde,
Joseba A. Lakarra, and Robert Lawrence Trask, 65–99. Amsterdam: Benjamins.
Trask, Robert Lawrence. 1997. History of Basque. London: Routledge.
Vajda, Edward. 2001. Yeniseian Peoples and Languages: A History of Yeniseian Studies.
London: Curzon.
Viegas Barros, J. Pedro. 2001. Evidencias del Parentesco de las Lenguas Lule y Vilela.
(Colección Folklore y Antropología 4.) Santa Fe: Subsecretaría de la Provincia de
Santa Fe.
Viegas Barros, J. Pedro. 2005. Voces en el Viento. Raíces Lingüísticas de la Patagonia.
Buenos Aires: Mondragón Ediciones.
Vovin, Alexander. 1993. A Reconstruction of Proto-Ainu. Leiden: Brill.
Zamponi, Raoul. In press. Extinct Isolates, Unclassified Languages, and Families. Ama-
zonian Languages, ed. by Patience Epps and Lev Michael. Berlin: Mouton de Gruyter.
CHAPTER 2
1.1 Introduction
Information on many ancient languages has been preserved in hundreds of thousands of
cuneiform documents recovered from the remains of Near Eastern habitations, spanning
a time frame from ca. 3300 bce to at least the first century ce. The quantity and quality
of this information varies: some languages are documented by vast amounts of literary,
scholarly and administrative texts, some are known only from personal names, and still
others merely by name. Because scribes often wrote in foreign languages, traces of inter-
ference from their native vernaculars can sometimes be detected. One such instance is
documented in the fourteenth century bce Akkadian language correspondence between
the Egyptian Crown and its vassals in the Levant, in places such as Megiddo and Jeru-
salem, containing occasional glosses that reveal small glimpses of the correspondent’s
native Semitic Canaanite dialects that are otherwise undocumented. Slightly later, per-
sonal names and glosses texts from the Syrian city of Emar, not far from Aleppo, also
written in imported scribal Akkadian, show traces of a local Semitic tongue that may
have affinities with Ugaritic and ancient South Arabic (Arnaud 1995, del Olmo Lete
2012). Multilingualism was often the norm. This is well illustrated by a quote from an
inscription written in Indo-European Luwian in the name of Yararis, a ninth/eighth cen-
tury regent who ruled the city of Karkemish in what is now southeastern Turkey (Payne
2012: 87): “in the city’s writing (=hieroglyphic Luwian), in Surean (=Phoenician) writ-
ing, in Assyrian writing (=Mesopotamian cuneiform), in Taimani (=South Arabic) writ-
ing. And I knew twelve languages.” During the eighth and seventh centuries bce, the
Assyrians deported tens of thousands of people throughout their empire, and their wars
resulted in population movements that created novel sociolinguistic situations. In one
mid-seventh century letter, the writer reports that “there are many languages (being spo-
ken) in the (Babylonian) city of Nippur under the protection of the king, my master”
(Frame 2013: 88). These examples illustrate the complexities of language use, geography
and identity of those times.
Scholars working with ancient written languages face particular challenges. The texts
they work with were encoded in writing systems that underrepresented phonological
detail to various degrees, but one also has to accept the fact that written language has
its own norms and is often limited by rhetorical restrictions. Most important, many of
the surviving texts were written in foreign male languages or dialects that were taught –
mainly to young boys – by adults for whom these were not mother tongues.
20 Piotr Michalowski
Native sources also bear witness to the diversity of languages in early Western Asia and
surrounding areas. Thus, the annals of Assurbanipal (668–627 bce), the last great king of
Assyria, include mention of a unique situation: a messenger had arrived at court, but “of
all the languages of East and of West, over which the god Ashur has given me control,
there was no interpreter of his tongue. His language was foreign, so that his words were
not understood” (Cogan and Tadmor 1977: 68; the language was probably Indo-European
Lydian). A late first-millennium bce cuneiform tablet, likely from Babylon, contains 22
words in an unidentified language with their Babylonian equivalents (Lambert 1987).
Recent excavations at an Assyrian provincial capital in present-day southwestern Turkey
revealed a document from ca. 800 bce that includes the names of 59 women; only 15 of
these names can be classified as Akkadian, Hurrian or possibly Luwian, leaving us with
44 names in still another unidentified language or languages (MacGinnis 2012).
In historical times, much of Western Asia was dominated by speakers and writers of
languages that belonged to the Afroasiatic (Ancient Egyptian, Akkadian [Babylonian/
Assyrian], Eblaite, Aramaic Canaanite, Phoenician, Hebrew etc.); Indo-European (Hittite,
Luwian, etc.); and Hurrian-Urartean families. In addition, there is evidence for unclas-
sified languages and isolates. There is some measure of disagreement among scholars
concerning the isolate status of specific languages, and many attempts have been made
to link all of them to larger families or phyla. For many years, Urartean was considered
a late form of Hurrian, but it is now recognized that the two were separate languages that
belonged to a larger family used in Western Asia beginning with the middle of the third
millennium (Benedict 1960). The fact that we can classify so many ancient languages as
isolate or unclassifiable is obviously a consequence of two factors: modern vagaries of
chance discovery and ancient sociopolitical choices that privileged specific registers of
certain tongues for writing, ignoring their relatives whose preservation might have struck
them off this specific list. It is equally obvious that numerous ancient languages and lan-
guage families thrived in the oral sphere but disappeared into the mists of time.
1.2 Sumerian
Sumerian was the first written language in Western Asia, with the longest written his-
tory in the area. Beginning with the late fourth or early third millennium bce, Sumerian
was used for administrative records, school materials, eventually for literary, historical
and scientific texts (see Section 2.1 on proto-cuneiform). The language died out some
time before ca. 1900 bce, but the date and processes that led to its extinction as a spo-
ken vernacular are difficult to determine. Although in everyday records, inscriptions and
eventually in literature it was replaced by Akkadian (a member of the Semitic family),
the old language, Sumerian, continued to be used as a literary tongue to the very end of
cuneiform literacy in the first century of the Common Era. The classic early grammar is
Poebel (1923); more recent are Thompsen (1984), Attinger (1993), Edzard (2003), Zóly-
omi (2006), Kaneva (2006), Rubio (2007a), Michalowski (2008, in press) and Jagersma
(2010); for a bibliography of studies of Sumerian grammar, see Peust (2011).
The basic word order of Sumerian was strongly SOV (AOV/SV); the morphology
was agglutinative with marking of both heads and dependents. Except for one or two
prefixed derivational morphemes, nouns bore only suffixes while verbs had both prefixes
and suffixes, although more of the former than the latter. As a rule adjectives and numer-
als followed nouns, and main clauses followed subordinate clauses, although such order
could be manipulated for semantic and stylistic purposes. Possessors were marked and
Ancient Near Eastern and European isolates 21
followed the unmarked head in possessive (genitive) constructions. There was no overt
marking of definite/indefinite properties. Sumerian was a pronoun-drop language: free
pronouns were used only for focus or special accentuation. Alignment of nouns was erga-
tive regardless of aspect while independent pronouns made no distinction between transi-
tive and intransitive subject. The indexing patterns in verbs, however, reveal an aspectual
split: ergative marking in the perfective and nominative/accusative in the imperfective.
1.2.1 Phonology
In view of the way in which the cuneiform writing system underdetermined the phono-
logical inventory of the language and the very long history of Sumerian, much of it writ-
ten by speakers of very different languages, it is virtually impossible to provide a credible
description of its phonology. From the second millennium onward, the vowel inventory
very much seems to resemble that of Akkadian (there are disagreements concerning the
existence of /o/), as shown in Table 2.1.
For a variety of reasons, including the evidence of limited vowel harmony in verbal
prefixes in one part of Babylonia during a relatively short time in the third millennium,
scholars have proposed various larger vocalic inventories, e.g., Poebel (1923, six vow-
els); Keetman (2005, seven vowels); and Smith (2007, seven vowels). Smith proposed
that early Sumerian included an ATR (advanced tongue root) feature, and his reconstruc-
tion is reproduced in Table 2.2. In recent years certain scholars have proposed that Sume-
rian vowels could be long or short, but the evidence for phonemic long vowels is highly
dubious.
The consonantal inventory of Sumerian is equally contested. Once again, from the
eighteenth century bce on the consonants very much resemble what one finds in Akka-
dian, as shown in Table 2.3.
p t k
b d g
s š h(?)
z
r l
m n ŋ(?)
22 Piotr Michalowski
1.2.2 Morphology
Sumerian nouns were divided between animate and inanimate gender and two numbers,
singular and plural. Number was marked on pronouns and verbal indexes but only on
animate nouns.
Nouns could be followed by other nouns in possessive relationship and by adjectives
and/or numerals. Marking of number and case came at the end so that a Sumerian nom-
inal phrase could contain several morphemes in a specific order: NOUN ADJECTIVE
NUMERAL NOUN-Genitive-Number-Case, as in for example:
As seen in this example, in nominal possession an unmarked possessor (head) noun was
followed by the possessed noun marked with the genitive morpheme -ak, but the dative
case occupied a different slot. This is usually described as case displacement, as Suffix-
aufnahme, or as dependent-noun case preemption (Aristar 1995: 432). In pronominal
possession, the possessed noun was followed by a pronominal suffix, according to the
rules shown in Table 2.5, marking the possessor lugal-ŋu, “my king.”
The same stems were used to form independent pronouns that could be marked by
nominative, dative, comitative or similative case markers. Sumerian also had interroga-
tive and demonstrative pronouns.
Sumerian nouns were marked by suffixed case markers that were sensitive to gender:
as a rule, the spatial cases were restricted to inanimates, as evidenced by the following
paradigm (with lugal, “king,” uru (iri), “city,” tukul, “weapon”), as shown in Table 2.6.
Ancient Near Eastern and European isolates 23
1 -ŋu -me
2 -zu -zu(e)ne(ne)
3 animate -ani -anene
inanimate -bi -bi(ene)
The sole dedicated nominal plural marker (-ene) was restricted to animates. Plurality
of both animate and inanimate nouns could be signaled by reduplication of a following
adjective; even more restricted was the reduplication of the noun itself to mark plurality:
lugal-ene, “kings”
lugal gal-gal-e, “great kings (erg.)”
uru gal-gal, “great cities”
lugal-lugal, “(all?) kings”
Sumerian had a relatively rich inventory of adjectives but no regular morphological means
for signaling comparative and superlative degrees. These regularly followed nouns (for
example, uru kug “sacred/sanctified city”), but a small number of them could be preposed
in poetic contexts (kug inana “sacred goddess Inana”).
Few Sumerian adverbs are attested. These could be formed from nominal, adjectival,
pronominal or verbal bases, most commonly with a suffix -bi, originally probably an inan-
imate deictic, added to adjectival roots, e.g. gal-bi “greatly.” Another derivational mor-
pheme -eš formed adverbs out of nouns and adverbs, e.g. ud-eš “daily,” gal-eš “greatly”;
sometimes, mostly in later texts, the two could be combined, e.g., gal-bi-eš, “greatly.”
Verbs were either simple or complex, consisting of a noun that was most often marked
as absolutive (often a body part) and an inflected verb. Sumerian had a complex agglu-
tinative verbal morphology with five prefix and four suffix slots (the parsing of these
morphemes differs from author to author). The prefixes marked mood, conjunction, inner
aspect or voice (the matter is debated), multiple applicatives and main argument index-
ing. The applicatives included benefactive, comitative and locational/instrumental, and
these could be utilized for several purposes (e.g. comitative and locative prefixes could
be used to mark causatives). The suffixes marked aspect, main argument affixing and
future tense for imperfectives. The verb (and the whole clause) could be followed by a
24 Piotr Michalowski
1.3 Kassite
The word that we render as “Kassite” referred to an ethnic group, in Babylonian terms, as
well to the language associated with these people (Brinkman 1976–1980, Sassmannshausen
Ancient Near Eastern and European isolates 25
1999, Zadok 2015). The modern term derives from the Akkadian ethnicon kaššû, which
is thought to have been an adaptation of Kassite galdu, or the like. There is no trace of
anyone named thus or bearing an identifiable Kassite personal name before ca. 1800 bce,
when they first appear in northern Babylonia and towards the west, settling in rural areas
and fortresses. People referred to as Kassites then begin to appear in Babylonian texts,
usually as hostiles living in groups referred to as “houses” associated with eponymous
ancestors or totemic names (e.g., the place name Bīt-Hašmar “House of Falcon,” Beaulieu
1988; the word bītu is Babylonian). As Babylonia descended into chaos during the late
seventeenth century, some of these groups established local power centers. In the wake of
a political vacuum after a Hittite army raided Babylon in 1595 bce, one such Kassite group
was able to take the throne of the city. This Kassite royal family would rule the land until
ca. 1155, and even after that, Kassite officials were prominent in some parts of the polity
until the eighth century, and some of the “houses” continued to serve as administrative
units (Brinkman 1976–1980: 466). They appear sporadically in texts until the end of the
Assyrian empire ca. 612 bce. Classical writers described a warlike group/people named
Cossaei (and variants) in western Iran, but it is only speculation that this had anything to
do with the earlier Kassites.
The members of the Kassite dynasty – the longest such reign in Babylonian history –
may have carried Kassite names, but like so many foreign ruling families in the area,
they quickly took on local identities. Contemporary writings, be they epistolary, inscrip-
tional, administrative or literary, bear no traces of any new Kassite influence, although a
few Kassite gods were added to the pantheon. As noted by Brinkman (1976–1980: 467),
“there is no obvious trace of either a Kassite ruling caste of officials or even of a dispro-
portionately large Kassite population within Babylonia.” Unlike the rulers, many people
identified as Kassites in administrative texts bore Babylonian names, even if their fathers
and grandfathers had Kassite names.
Where the Kassites came from before they appeared in Babylonia is unknown. After
the twelfth century bce, they are primarily attested in the eastern highlands (Reade 1978).
This, and their mastery of equid breeding and handling, suggests an origin in the eastern
mountains, but that is only speculation.
including names of Mesopotamian kings, translated into Babylonian, mainly from Sume-
rian, but also including nineteen Kassite names. The second is a unique first-millennium
glossary from Babylonia of 48 words, 16 of which are divine names, translated into
Akkadian (Balkan 1954: 2–3, Brinkman 1977). The date of the original composition of
these texts is difficult to establish, and there are discrepancies between them concerning
the rendition of certain words, but the glossary seems to be the more reliable source
(Balkan 1954: 4–11, Brinkman 1969: 242).
and Ivanov 1995: 631). This may indicate that at some time certain “Kassites” may have
lived in proximity to an early Indo-European speaking population.
1.4 Hattic
The Hattic (Hattian) language is documented in almost 360 texts from the Hittite archives
in Hattusha (modern Boghazköy) and Sapinuwa (modern Ortaköy), both in central Tur-
key. The Hittite polity was a major power in the Near East, ruling most of central and
eastern Anatolia, often expanding into parts of Syria. Its archives, totaling more than
30,000 cuneiform tablets in seven languages (Hittite, Luwian, Palaic, Hurrian, Akkadian,
Sumerian, Hattic), span a period from ca. 1650–1200 bce. The majority of texts were
written in Indo-European Hittite, which is well understood, and therefore the 15 extant
Hattic-Hittite bilingual texts provide important data for the reconstruction of the Hattic
language. The rest of the documentation consists of monolingual Hattic texts and of pas-
sages in the language embedded in Hittite rituals without translation. The modern name
of the language derives from Hittite hatt-ili, “in Hattic,” with the Hittite comparative and
language designation adverbial suffix -ili.
Almost all extant Hattic texts or textual passages relate to the cult, and there can be no
doubt that early Hittite (Old Kingdom) religion was strongly influenced by Hattic prac-
tices: this is evident in the mythology and in the names of deities, many of which were
Hattic. It has often been assumed that Hattic was an older language that served as the
substrate for Hittite and related languages. Goedegebuure (2008) has argued that it was the
other way around: Indo-European Luwian served as a substrate for Hattic within the com-
plex sociolinguistic context of early second millennium bce Anatolia. It is difficult to know
who still spoke Hattic in Hittite times, but it is possible that the language was still alive as
late as the fourteenth century bce. For descriptions of the grammar, see Diakonoff (1967b),
Kammenhuber (1969), Dunajevskaja and Diakonoff (1979), Girbal (1986), Soysal (2004),
Klinger (2005), Kassian (2010), and the succinct description by Goedegebuure (2013).
indexing morphology, that Hattic was “an active, or semantically aligned language with
an ergative base” that moved more towards ergative alignment in middle Hittite times
(fourteenth century).
1.4.2 Phonology
Information on Hattic phonology must be retrieved from the simplified syllabic/logo-
graphic system of Babylonian cuneiform adopted by the Hittites and which may not have
had the resources to represent the full phonological inventory of Hattic. Moreover, it is
apparent that Hittite scribes who copied or wrote down Hattic passages often had prob-
lems with the representation of the language in writing, as evidenced by much variation
in the writing of the same words. The minimal inventory, as reconstructed by Soysal
(2004: 70), is shown in Table 2.7 and Table 2.8, although some would recognize vowel
length in Hattic.
Dunajevskaja and Diakonoff (1979) also recognized the existence of glides (/w/, /j/)
and possibly /c/; Kassian (2010: 312–313) would add the affricates /ts/ and /č/, or an
interdental fricative /θ/; rather than /ḫ/, he and others suggest a velar/uvular spirant /h/.
1.4.3 Morphology
Hattic had two genders, masculine and feminine, as well as two numbers, singular and
plural. As already noted, the language was almost exclusively prefixing. Nouns had at
least five prefix slots and one for case suffixes (nominative, accusative, genitive, dative
and locative, although there are disagreements about the actual morphology). Verbal
roots carried prefixes and one or two suffix slots. Soysal (2004: 189), for example, recon-
structed six slots before the root, marking person, number, various spatial categories and
main argument indexing, as well as one suffix slot, containing tense or modal morphs,
which could also be followed by clitic particles such as the conjunctive. The imperative
consisted of the stem followed by -a, as in miš-a “take,” but could also take locative
prefixes. There is no consensus on the number of morphological slots or on the meaning
of all morphemes. For an example of such differing opinions, see the review of Soysal
(2004) by Braun and Taracha (2007) or compare the charts of infixes in Dunajevskaja
(1959: 26) and Soysal (2004: 189).
Nominal possession was marked by the case ending -(V)n. Most occurrences of nomi-
nal possession resemble left-dislocated, topicalized constructions in other languages, e.g.,
i u
(e)
a
Labiale p P’ (b) f or v m,
m’(?)
Dentale t t’(d) n,
n’(?)
Vélaire k k’ (g)
Palato- š č
Alvéolaire
Alvéolaire s s’ (z) ts l r
Laryngale (h)
Latérale
Rétroflexe ll rr
tabarna-n le-wuur, “king-OBL PRO-land,” “of the king, his land=king’s land.” Nouns
were apparently not flagged for subject and object, but these were indexed in the verb.
1.5 Elamite
Elamite was a language of southwestern Iran, attested in writing from ca. 2200 to 300
bce, in administrative texts, building and dedicatory inscriptions, a diplomatic treaty, per-
sonal names, items marked as Elamite in wordlists and a few loanwords in Mesopotamian
writings. The geographical range of spoken Elamite is difficult to establish based on the
written record and undoubtedly changed over time, but may have encompassed the area
delimited as Fars, Khuzestan and southern Luristan in modern Iran. The documentation
is discontinuous and is documented in relatively limited registers, known mainly from
administrative texts and royal inscriptions, but also from a handful of entries in Meso-
potamian wordlists and from personal names in Mesopotamian documents (Zadok 1987:
1–16). Traditionally, the language has been divided into four phases: Old Elamite (OE,
a treaty from ca. 2200, a somewhat later text from southern Mesopotamia, and then two
economic texts from Iran and a handful of magical charms from Mesopotamia from ca.
1800 bce, and a few, mostly unpublished, royal inscriptions); Middle Elamite (ME, ca.
1500–1000 bce, approximately 175 royal building and dedicatory inscriptions, mainly
from Susa and Choga Zanbil, but also from other sites, administrative and legal texts
30 Piotr Michalowski
from Susa and Anshan); Neo-Elamite (mainly seventh/sixth centuries bce, royal inscrip-
tions, 25 fragmentary letters from Nineveh, a few from Susa and one from Armavir
Blur in Armenia, as well as some scattered unprovenanced ones, ca. 300 early sixth-cen-
tury administrative texts, an omen text and a hemerological [favorable and unfavorable
days for activities] text from Susa, some seal inscriptions); and Achaemenid Elamite
(AE, 539–300 bce, administrative texts, Elamite versions of mainly multilingual royal
inscriptions).
The latest phase is the best attested, documented by royal inscriptions, some of them
multilingual, and by large administrative archives. Such archives, written in a number
of languages, but mainly in Elamite, were probably kept in many places at the time.
They are known today from two large collections from Persepolis (Stolper 2014) and
from a single tablet from Susa and one or two fragments from Kandahar, in Afghanistan
(Fisher and Stolper 2015), demonstrating that the use of written Elamite extended over
an enormous territory but apparently only for administrative purposes. There are many
differences between the linguistic remains from these periods, with particular changes
observable in AE, some of them attributable to language interference from Old Persian.
Apparently, most of the administrative scribes were Iranophone who had learned Elamite
with various degrees of accuracy; as a result AE acquired not just Iranian loanwords but
also important contact induced morpho-syntactic calques and other forms of restructuring
(Henkelman 2011).
There are disparate and conflicting opinions on many aspects of Elamite grammar
that cannot be easily summarized here. The more comprehensive studies of the language
include Diakonoff (1967a), Reiner (1969), Grillot-Susini and Roche (1987), Khačikjan
(1998, 2010), Stolper (2008), Krebernik (2006), Tavernier (2011) and Bavant (2014:
235–336). For the Elamite lexicon, see Hinz and Koch (1987) and Zadok (1984, 1995).
1.5.1 Phonology
The language was written with a syllabic version of Mesopotamian cuneiform script and
therefore the phonology at first glance very much resembles what is found in Akkadian,
but this, once again, is probably an illusion founded on phonetic underrepresentation in
the script. The only full study of the subject is found in Paper’s (1955) analysis of AE,
often referenced with some minor additions or subtractions. There are differing opinions
on the nature of Elamite stops, as already noted by Labat (1951: 28). Paper recognized
only voiceless /p/, /t/, /k/; Reiner (1969: 111–115) observed that spelling conventions
indicated some distinction between two series of stops, possibly a tense/lax opposition,
perhaps realized intervocalically as voiced/voiceless. Tavernier (2002, 131–138, 2011:
320) built on this analysis. He agreed that Elamite had no voiced stops but recognized a
tense/lax (fortis/lentis) distinction, expressed in writing by reduplication of consonants,
reconstructing /p’/, /t’/, /k’/ rather than /b/, /d/, /g/ and /s’/ rather than /z/, but also raising
the possibility of tense /m’/ and /n’/, following Reiner and Khačikjan, but this is doubtful.
He also recognized the existence of an alveolar affricate /ts/ and palato-alveolar affricate
/č/ and retroflex sonorants /ll/ and /rr/. See also Stolper (2008: 58–59), who provides this
inventory of consonants, observing that the status of possible glides, a vocalic /r/, the true
phonetic nature of /h/ (which is lost in AE), etc. are not clear at present.
It is generally agreed that Elamite had four vowels, /a/, /e/, /i/ and /u/, no diphthongs
and that vowel length was not phonemic. Tavernier (2011: 320) posited the nasalization
of vowels and also the possibility of an additional vowel /ǝ/.
Ancient Near Eastern and European isolates 31
1.5.2 Morphology
Elamite morphology was agglutinative. There were two genders, animate and inanimate,
and two numbers, singular and plural. Most words consisted of one or two syllables (CV,
VC, VCV, CVCV), with only a very small number of CVCCV lexemes (Tavernier 2011:
320). Basic word order was verb final; SOV, in the earlier phases, but more flexible in AE.
Nouns were not marked for case but could be followed by locational particles or clitics (it
is often difficult to determine word boundary), but pronouns were signaled with suffixes
that separately marked subject/indirect object and transitive object. Adjectives followed
nouns. The morphology was exclusively suffixing, and it was postpositional, although
there was a preposition kuš, “to(ward), until,” in ME and AE (Stolper 2008: 71).
Because of the lack of direct marking of arguments, there are conflicting opinions on
the basic typological description of alignment. Diakonoff (1967a: 99), and others claimed
it was ergative, but this has been strongly critiqued by Wilhelm (1978, 1982). According
to Khačikjan (1998: 65), it was an accusative language with traces of ergative features.
Elamite had a series of suffixes that were attached to nominals (including numerals),
nominalized verbal forms, pronouns, and even an independent negative particle; entire
phrases could be nominalized (“grain for women-who-have-been-sent-hither-from-
Susa”). These suffixes have been the subject of much debate and have been described as
class, or as gender/person/number suffixes (Reiner 1969: 77, Stolper 2008: 60, Tavernier
2011: 321–322); the animate distinguished three persons and two numbers, but there
was only one person (most likely only third) in the inanimate, which was not marked for
number (the terminology was introduced by Reiner 1969: 77):
The second person is attested only with certain verbal forms, never with nouns; indeed,
there are clear examples in which the animate third person delocutive was used for sec-
ond person address. The variants of the inanimate are somewhat problematical; -n may
simply be an allomorph of -me, and -t may have been an older collective marker (W. Hen-
kelman, personal communication). A bewildering array of functions has been ascribed to
these suffixes: possessive, appositive, derivational, syntactic agreement, etc. Thus, for
example, with sunki “king,” the forms are: sunki-k “I, the king,” sunki-r “he, the king,”
sunki-p “they, the kings,” but sunki-me “kingship, kingdom.” Bavant (2014: 249, 314),
who has thoroughly analyzed these suffixes, confirms only an attributive possessive func-
tion, as in e.g.:
siyan pinikir-me
temple Pinigir-SUFF
“temple of (divine) Pinigir”
taki-me u-me
life-3.IN I-SUFF
“my life”
32 Piotr Michalowski
menik hatamti-k
ruler Elam-SUFF
“(I) ruler of Elam”
menik hatamti-r
ruler Elam-SUFF
“ruler of Elam”
napip hatamti-p
gods Elam-SUFF
“the gods of Elam”
In this reconstruction, -me was used when the possessed was inanimate; -k, when it was the
speaker; -r, when it was an animate noun different from the speaker; and -p, when it was
a plural animate (Bavant 2014: 314). The derivational function of some of these markers
would follow from the possessive function, e.g. hatamti-r, “of Elam” = “Elamite.” More-
over, because these class morphemes were used with adjectives, Elamite was a language
in which adjectives were essentially denominal, e.g. napi-r riša-r, “god-SUFF great-
SUFF” = “great god” (Tavernier 2011: 323). The fact remains that these suffixes were
used to mark a variety of attributive relations: adjectival, dative, locative, etc.
Possession in Elamite took many forms and in some cases may have distinguished
alienable versus inalienable possession, as described by Bavant (2014: 303). There were
two series of possessive suffixes (Tavernier 2011: 325–326). In NE and AE, there was a
suffix -na that is generally described as a genitive, but may have been a composite of n +
relative a, that replaced the entire suffix system, hence a sign of reduction of syntactical
complexity under Iranian influence (W. Henkelman, personal communication). Elamite
verb forms made a distinction between perfective and imperfective aspects as well as
distinguishing indicative, imperative, prohibitive and optative moods.
A poet had the Mesopotamian King Shulgi of Ur (Rubio 2006) claim to speak at least
four languages, among them Sumerian, Elamite and the language of the “Black Land,”
which is generally assumed to refer to Marhashi and a later text mentions a “Marhashian
translator,” but the few personal names of people designated as coming from there seem
to be Elamite, Akkadian, Hurrian, or unclassifiable (Glassner 2005), and this tells us
nothing about the language(s) of Marhashi, now thought to be in the general area of Jiroft
in the Kerman province of southeastern Iran (Steinkeller 2006, 2014, see Section 2.3).
The consequences of such ambiguity are perhaps best illustrated by the case of the
Kashka “people” or language. Hittite sources from the second millennium bce mention
a group or polity labeled as Kashka, who occupied the northern frontier in north-central
Anatolia (Singer 2007). A study of the ethnogenesis of the Kashka shows that, while they
had their own political and military role to play, they were not otherwise distinct from
the Hittites, and there is no convincing evidence for a separate Kashka language (Gerçek
2012: 55–56). Singer (2007) argued that they were essentially Hattians.
Mesopotamian documents as well as historical and literary texts include words that
referred to human groups in a manner that may be roughly described as ethno-linguistic.
They also preserve an enormous number of personal names in a variety of languages,
many of which cannot be ascribed to any known tongue. Moreover, persons from other
lands sometimes took on Sumerian or Babylonian names, and in certain times and places,
visitors have been registered in Mesopotamian documents using local appellations rather
than their original native names. Terms used for geographical areas did not necessarily
describe languages, so that even mentions of translators for people of a certain area did
not necessarily define separate languages. It is also important to keep in mind that people
in the ancient world often wrote in languages that were not their mother tongues and that
the practical use of writing was often linked with issues of prestige, power and display.
2.1 Proto-Cuneiform
The earliest notational system of Western Asia that is considered, by some scholars at
least, as writing was proto-cuneiform, known from more than 6,000 clay tablets from
southern Mesopotamia, most of them from the metropolis of Uruk, dating to ca. 3350–
3000 bce (Englund 1998, Woods 2010). The script was invented towards the end of a
period of rapid multifarious economic expansion reflected in massive urban growth in
Uruk itself, which had expanded to an area of ca. 250 hectares and may have housed
between 25,000 and 50,000 inhabitants (Algaze 2012, Nissen 2016). All the Uruk tablets
were discovered in trash deposits and fill associated with majestic architectural remains
in the very heart of the city in an area later occupied by temples and named Eanna. Thus,
the tablets have no precise functional provenienve. This complex script was developed
solely for the recording of economic transactions and used a fluctuating number of signs,
600 in its earliest documented phase, at some points numbering more than 900. Meaning
was conveyed through signs standing for specific goods, for semantic classifiers, account-
ing activities, as well as complex numerical and metrological signs, but also through
use of distinct tablet shapes and sizes as well as rulings and divisions used to format the
tablets. The signs were made with a reed stylus on clay tablets. During the earliest docu-
mented phase, labeled Uruk IV, the signs were curvilinear, drawn on the tablets with mul-
tiple stokes, but in the later Uruk III stage the contours became straight as each line was
impressed with a limited repertoire of strokes using a differently designed reed stylus,
leading to more abstract stylized representations, and this was the writing technique that
would be used for cuneiform for millennia, down to the first centuries of the Common Era
34 Piotr Michalowski
(the Roman numbers are used to designate archaeological levels in the Eanna district).
The second phase is also characterized by an increased number of goods and transac-
tion types as well as new information such as time notation, but also by more complex
organization of information on tablets. The fact that the Uruk III documents registered
more data and more complex information loads may be due to changes in administration,
internal developments of the registration system or both. The distinction between earlier
and later text groups is not stratigraphic but epigraphic because the discarded tablets were
not associated with any archaeological levels. To date, Uruk IV type tablets have not been
discovered outside of the city, but during later phases, the use of proto-cuneiform is docu-
mented from at least seven cities from both southern and northern Babylonia.
The earlier history of the script is unknown. Some of the sign shapes and numerical
concepts were influenced by or even borrowed from earlier accounting techniques such
as small clay tokens or tablets with purely numerical signs, but proto-cuneiform as a com-
plex system was undoubtedly a novel invention. Among the 600 signs of the earlier tablet
group, more than half of the signs were either combinations or graphically altered ver-
sions of simple signs. Among the simple signs, 98 were totally abstract and 114 were nat-
uralistic drawings, such as a picture of a head or a boat, or somewhat abstracted versions
of such depictions (Nissen 2016: 44). Therefore, while pictography played an important
role in the creation of proto-cuneiform, it cannot be classified as a pictographic system.
This bookkeeping system was used to register movement of various dry goods such as grain,
flour, bread, dried fruits, beer, oils, textiles, as well as animals and their products, rationing
and organization of laborers and/or slaves etc. Distinctions were made between kinds of ani-
mals and humans, marking various terminological categories such as age or gender.
A small percentage of proto-cuneiform tablets consists of multiple copies of standard-
ized lists of words of various classes such as names of professions, of metals or of types
of pottery that were used to teach the writing system to prospective scribes; while most
of these were already in use during the earlier phase, most of the extant manuscripts date
from Uruk III times. In essence, this system, which was devised only for bookkeeping,
consisted of things and ways of counting them, with rudimentary notations concerning
the actors and activities involved and did not encode any morphological information.
Some signs or sign combinations were used to convey personal names, but these have
proven difficult to analyze. Many of the proto-cuneiform symbols resemble later cunei-
form word signs that can be read in Sumerian or Akkadian, and therefore it is possible to
understand the recorded transactions without recourse to any specific language.
Thus, to a large degree proto-cuneiform was an autonomous semiotic system based
on and parallel to natural language and therefore cannot be classified as a writing system
in the narrow sense of the term. One schooling composition may have been an attempt
to record the outlines of a story, but it did so without any overt notation of morphologi-
cal information (Civil 2013). Most of the signs can be defined as logograms, with very
limited recourse to phonetic values used almost exclusively as phonetic complements
(Woods 2010: 43). There are conflicting opinions on the language that may have been
associated with the documents and wordlists. Because of certain gloss-like phonetic ele-
ments, many Sumerologists assume that it was Sumerian (e.g. Rubio 2005, Wilcke 2005),
but Englund (2008: 81, 2009: 7–27), who has worked on this material more than anyone
else, has been agnostic on the matter but has also insisted on occasion that there are no
traces of the Sumerian language in any Mesopotamian inscriptional material prior to the
so-called Archaic Texts from the city of Ur that are approximately one or two centuries
younger than the last proto-cuneiform tablets.
Ancient Near Eastern and European isolates 35
2.2 Proto-Elamite
The late fourth millennium bce Proto-Elamite tablets evidence the earliest writing in Iran,
although simpler commodity-recording devices were used there prior to the invention
of this script. More than 1,700 Proto-Elamite clay tablets have been discovered from 8
sites, spanning most of the Iranian plateau, from Susa in the west to a single tablet from
Shahr-e Sokhta in the east, all but approximately 200 from Susa (Desset 2016: 69), with a
distribution over a much larger area than Proto-Sumerian script ever achieved. The script
remains undeciphered.
The known tablets recorded economic activities using 17 numerical and approximately
1,400 (or 1,900, depending on interpretation) non-numerical signs drawn with a stylus on
clay tablets dating from the end of the fourth and the very beginning of the third millen-
nium bce (Dahl 2009, 2013). It is generally recognized that initially, at least, the inventors
of Proto-Elamite were influenced by fourth millennium Mesopotamian proto-cuneiform
and even borrowed specific signs (including the numerical ones) from the older system
or from some common source. Desset (2016: 90) suggested that the beginnings of Proto-
Cuneiform and Proto-Elamite were roughly contemporary, but Pittman (2013: 328–329)
has persuasively argued on stratigraphic, iconographic and other grounds that the earliest
examples of an early phase of Proto-Elamite writing were discovered in contexts that are
later than the time of the first Uruk tablets.
With the possible exceptions of two mathematical exercises, all the known Proto-
Elamite texts recorded complex bookkeeping activities and included headings and sub-
scripts as well as commodities (foodstuffs, animals, types of workers, tools etc.), followed
by numerical signs, and the context appears to be restricted to agricultural production,
labor and animal management (Dahl 2013: 252). A small group of signs may have desig-
nated proper names (owners, geographical names, professional designations etc.) using
syllabic signs (Dahl 2005), but it is difficult to ascertain the levels of linguistic encoding
in Proto-Elamite, and it is impossible to know what language or languages may have been
involved, as is the case with contemporary proto-cuneiform. The modern name is a mis-
apprehension: it was coined in conjunction with the geographical term Elam, originally in
the labeling of archaeological strata in the city of Susa, without any linguistic connotation
and no links have been established between recordings in Proto-Elamite script and the
Elamite language. In addition to the scholarly resources cited here, see now the detailed
survey by Desset (2012: 3–92).
of dubious authenticity. Because of differences in the repertoire and form of signs in some
texts, it is difficult to establish the levels of synchronic and/or diachronic variation within
the system.
Ten of the eighteen monumental texts from Susa can be attributed to the reign of King
Puzur-Inshushinak, who ruled in Susa ca. 2150 bce (André and Salvini 1989). Almost
half are closely related pseudo-duplicates of a royal inscription of some kind, two of
which are accompanied by perfectly understandable Akkadian inscriptions, but this has
proven to be of little help, and Linear Elamite remains undeciphered. Even though this
was a time when Elamite was presumably spoken in Susa, there are no clues to identify
the language of these inscriptions and it is possible that the Linear Elamite could have
been used for different languages in places such as Shadad or Konal Sandal.
Nevertheless, most attempts at deciphering Linear Elamite have identified the under-
lying language as Elamite. Desset (2012: 104–127, with description of earlier attempts at
decipherment) has recently proposed, more narrowly, that the Puzur-Inshushinak inscrip-
tions, at least, were written in a form that language, but seemingly does not exclude the
possibility that some of the other Linear Elamite inscriptions might be in Elamite as well.
The fact that so many of the Susa texts were associated with the reign of this king has led
many scholars to assume that Linear Elamite was used for a very short time, perhaps even
limited to the reign of one ruler. Some recent work suggests that only a few generations
of scribes were involved (Dahl, Petrie and Potts 2013: 375), but Desset (Desset 2012: 98,
2016: 93–95) argued that the script was in use for several centuries, from ca. 2500 to ca.
1900 bce if the somewhat problematical Konar Sandal texts are taken into consideration.
The total inventory of signs in Linear Elamite numbered between 61 to over 200,
depending on how one counts them, suggesting a syllabary mixed with logograms
(Moqaddam 2009: 55–56, Desset 2012: 99–100). Dahl (2009: 30) has suggested that
the non-display texts, which often have very different glyphs and patterns, were a form
of pseudo-writing, mimicking a script that could no longer be understood for various
cultural reasons.
The Linear Elamite enigma has become even more complicated with the discovery
of four inscribed clay tablets in Konar Sandal South (Moqaddam 2009, Madjidzadeh
2011, Desset 2014), in the southeastern part of Iran that the Mesopotamians designated
as Marhashi (Steinkeller 2006, 2014). The tablets, dating from the second half of the
third millennium bce, seem to contain at least two different scripts, although it must be
noted that the circumstances of their discovery have prompted some skepticism, not all of
which has dissipated (Lawler 2007; for a more precise description of the circumstances
and context of their discovery, see Madjidzadeh 2011: 236–241). Two of them are cov-
ered on the obverse with Linear Elamite, with a small number of symbols in another sys-
tem, described as “geometric,” on the reverse. A third tablet is also covered with Linear
Elamite on the obverse, but with a line of “geometric” symbols at the end. The fourth
tablet is only a fragment with only remnants of six symbols remaining and while these
might be Linear Elamite, it is also possible that it only contains ornamental designs. If
these two sets of designs were in fact writing, then at least two different systems, and
perhaps two different languages were involved.
In addition, objects with a few designs that have been identified as Proto-Elamite have
been found in Gonur Tepe in Turkmenistan (a potsherd with two incised designs, Kloc-
hkov 1999) and at the site of Ra’as al-Junays in Oman (two stamp seals, a seashell ring;
Glassner 1999: 137–140, 2002: 363–368). It is highly doubtful that these designs are in
any way related to Proto-Elamite writing (Desset 2012: 94 n 4).
Ancient Near Eastern and European isolates 37
2.4 Gutian
People described as Guti/Gutium first make an appearance in Mesopotamian texts from
ca. 2200 bce during a time of weakening central control and appear sporadically in texts
into the late first millennium bce (Foster 2013). Some documents mention the names
of kings of Gutium who presumably held some form of hegemony over local rulers in
certain cities. The longest lists of such names are found in a literary text known as The
Sumerian King List that describes, often in fictional terms, the kings of cities that held
hegemony over Mesopotamia in early periods. The earliest known manuscript, from ca.
2100 bce (Steinkeller 2003), refers to this as a time when a horde, or army (Akkadian
ummānum) ruled the land. Only six names of such rulers are preserved before a break,
two of which are probably Akkadian, and only three have some resemblance to those
enumerated in eighteenth-century manuscripts, which refer to this period as the time of
the “troops/horde of the land of Gu-tu-um,” but even in these, no two versions have the
same list of rulers in the same order (Michalowski 1983: 247–248). There are also names
of persons, mostly from the Zagros Mountains to the east that some have identified as
Gutian, although only rarely labeled as such in texts.
It is impossible to know if all these names belonged to one language, how many of
them were garbled and which may have been invented in antiquity. Nevertheless, Hen-
ning (1978) and Gamkrelidze and Ivanov (1989, 1995: 786–787 n. 30, 2013: 119–120)
have proposed links between Gutian and the Tocharian branch of Indo-European, known
from sixth to eighth century ad texts found in the Tarim Basin in China, but much of this
spurious (Zadok 1987: 21).
2.5 Lullubean
Much the same can be said about groups such as the Lullubi or Lul(l)u, who are first
documented in the western Zagros Mountains in the general area around Suleimani-
yah, in present-day Kurdistan, but also moved into northern Mesopotamia, between the
Syrian Jebel Sinjar and Tur Abdin ranges in the second millennium bce (Eidem 1992:
50–54, Schrakamp 2013) but were still attested in the Zagros during the first millennium
bce (Zadok 2005). The language of these groups – and it is unlikely that they were lin-
guistically homogeneous – is once again known only from a small number of personal
names (some of which are Mesopotamian) and an entry in a late Assyrians wordlist
claims that their word for “god” was kiurum, or the like (Zadok 2005). The terms Guti
(Quti) and Lullu were sometimes used in Mesopotamian writings as general terms for
eastern/mountain dwelling/foreign plunderer or barbarian. Indeed, it is possible that the
name Lullu(bi) might have been related to Hurrian/Urartean lul(l)u, “foreigner” (Rubio
2007b: 103).
urban phase of the civilization (Possehl 1990: 273), but more recent excavations (Meadow
and Kenoyer 2005) have revealed inscriptions from contexts that go back 200 years or
so (Early Indus script), and it is possible that the origins of this symbolic system can
be traced even earlier in the city of Harappa (Kenoyer and Meadow 2000: 68, Kenoyer
2006). The use of the script on seals and prestige objects disappears after ca. 1900 but
seems to have been used occasionally on pottery in some parts of the civilization. More
than a hundred published attempts at decipherment or identification of the putative lan-
guage of these markings as Indo-European or Dravidian have not borne fruit (for a history
of many of these, see LeBlanc 2013). While most scholars argue that the Indus symbols
were used to convey some aspect of Indus language (Kenoyer and Meadow 2010: xliv,
n. 1), Farmer et al. (2004) and Sproat (2014, 2015) have used statistical methods to argue
that the Indus symbols were not used for the registration of natural language. This idea
has been met with much resistance by those who believe that the Indus system was used
to convey linguistic segments (Vidale 2007, Rao et al. 2010, 2015).
Many Indus script writings consist of a single sign; the average inscription runs to 5
signs, and the longest has 28. Seals provide the largest body of data for study because
they were relatively well preserved and contain longer script sequences. However, recent
excavations at Harappa have shown that they comprise only around 3% of the total
sample of inscribed objects (J. M. Kenoyer, personal communication), with many more
inscriptions preserved on incised tablets, molded terracotta tablets and inscribed pottery.
The number of discrete signs is disputed, but probably around 400 were in use, many of
them compound (A. Parpola 1996: 169–169).
Stamp seals and sealings resembling Harappan types have also been found in western
Iran, Mesopotamia and, most important, on the Persian Gulf island of Bahrain, which
was an important trade entrepôt in antiquity. There were innovations in the Harappan
symbols used on the locally made Gulf Style seals from ca. 2100; Laursen (2010: 130)
suggests that acculturated Harappans adjusted their script to accommodate writing in
another language.
2.8 Philistine
One other ancient Near Eastern unclassifiable language requires mention here, namely
Philistine. The Philistines (Plšt) were first mentioned in Egyptian texts from the reign of
Ramses III (1186–1155 bce) as one element of a group of “sea peoples” with whom the
Egyptians claimed to have fought a large naval battle and who then appear in various
books of the Hebrew Bible (Pelešet) as outsiders who settled in southwestern Israel on
the Mediterranean coast and created polities that acted as major forces in the area down
to the sixth century bce (Machinist 2000). Philistine ethnic identity was transcultural in
origin, resulting from the entanglement of various sub-groups of the “sea peoples” with
local populations and underwent dynamic changes once they settled on the Mediter-
ranean coast (Davis, Maeir and Hitchcock 2015, Knapp and Manning 2016), some of
whom might have been pirate in origin (Hitchcock and Maeir 2014). Recent excavations
at Tell Tayınat in southwestern Turkey (Harrison 2014), probably the central city of a
kingdom named P/Walastin or Patin (Weeden 2015), have revealed large amounts of
locally made pottery that has been associated with Philistines and have strengthened sug-
gestions (e.g., Singer 2013) that at least some of these people first settled north of Canaan
and assimilated with local Luwian (Indo-European) speakers before moving south.
No texts in a putative Philistine language have survived, and the only traces that remain
are personal names and a handful of loanwords that are assumed to be Philistine only
because they have no good Semitic etymology, even though in the Bible Philistines – and
their gods – have good Semitic names (the most recent collection of Philistine loanwords
in Hebrew is Niesiołowski-Spanò [2012: 427–428; see now 2016]).
The six known inscriptions from the area they controlled are difficult to classify (Davis,
Maeir and Hitchcock 2015: 142–143); one from Ashdod has been identified as Cypro-
Minoan and another from Tell Aphek has been interpreted as similar to Linear A, but
these ascriptions are not quite secure (Maeir, Hitchcock and Horwitz 2013: 11). Indeed,
it has been suggested that the Philistines used a variety of written scripts and languages
to negotiate public identity, but at least some of them at some point spoke “one or more
creoles based on Late Bronze age and/or Iron Age trade language(s)” (Davis, Maeir and
Hitchcock 2015: 157). Some inscriptions may have been written in a modified version of
the Hebrew alphabet (Rollston 2014: 204 n. 5).
Machinist (2000: 64) has suggested that, whatever language(s) these people may have
spoken earlier, by the time they came in contact with Israel, they had largely assimilated
into the surrounding Semitic language context (see also Schneider 2011). Even though
the remains of Philistine language(s) are scarce and uncertain, there is a long tradition
of attempts to identify it as Indo-European on the basis of the etymology of a few loans
such as Hebrew seren (attested as plural serenim), which reflected Philistine trn, “ruler.”
Recent studies (Giusfredi 2009, Davis, Maeir and Hitchcock 2015) have plausibly linked
this term (and later Greek tyrannos) with the Luwian royal title tarwanis and therefore
such words are of little help in identifying the languages of Phoenicia.
Ultimately, it is difficult to reconstruct a unified Philistine identity; the origins of var-
ious groups in the area were heterogeneous, from various foreign and local populations
and the resulting new groups and identity formations were in constant flux (Maeir and
Hitchcock 2017), and this suggests that the linguistic situation was equally mixed, fluid
and complex.
Scholarly research on issues relating to “Philistines” continues at a rapid pace. Among
the most recent publications one may mention Stockhammer (2017) and Ben-Dor Evian
(2015, 2017). The former, who in other publications has argued for Cypriot connections
of the “Philistines,” provides evidence requiring a rethinking of the use of Aegean-type
40 Piotr Michalowski
pottery to identify specific groups of actors in late 13th and early 12th century BCE South-
ern Levant. The latter has undermined the utility of the modern term “sea peoples,” which
has been used to render an Egyptian word thr that must probably be rendered as “allied
troops.” She suggests that these were warrior groups that were displaced in the wake of
the collapse of the Hittite empire in Anatolia; some of them settled in northern Syria,
while others moved further south. She concludes (Ben-Dor Evian 2015: 71): “it is from
the Levant itself, and not from the ‘sea’ that some of the ‘Sea-Peoples’ actually came.”
this volume; Campbell, this volume pp. 2–4); together they are sometimes referred to as
Vasconic. There have been many speculations on the languages of Europe before the spread
of Indo-European; in recent years, Vennemann (2003) has proposed a complex relationship
between Vasconic and Afroasiatic speakers in prehistoric Europe, based principally on the
analysis of hydronyms and various place names. For informed critiques of such hypotheses,
see the essays in Udolph (2013) and the comments of Baldi and Page (2006).
Etruscan has long been considered an isolate, but it is now generally recognized that it
was related to two other poorly attested ancient European languages, Raetic (Schumacher
1998c) and Lemnian (known from a single stele with two different texts from the island
of Lemnos; Schumacher 1998b, Eichner 2012, 2013) sometimes subsumed under the
general term Tyrrhenian. Still another language, known as Camunic, may belong with
these three, but some would identify it as Indo-European (Schumacher 1998a, Zavaroni
2005). This language, written in a script derived from the Etruscan alphabet, is docu-
mented by more than 125 very short rock inscriptions (most of them 1 or 2 words, the
longest had 8) from the Valcamonica Valley in northwestern Italy that have been dated
between ca. sixth/fifth century bce and the turn of the millennium.
3.4.1 Tartessian
One such hypothetical language is Tartessian, documented in inscriptions from southern
Portugal and southwestern Spain (Untermann 1997) that date to ca. 750–500 bce. These
were written in a Paleohispanic script based on the Phoenician alphabet. Some would
ascribe almost a hundred such inscriptions on stele, rocks and potsherds to this language
or language group, but not everyone agrees as to which of these actually belong to Tartes-
sian. Opinions are divided on the language of these inscriptions: most view it as unclas-
sified, but some have insisted that it was an Indo-European Celtic language. See, most
recently, the debate published in the Journal of Indo-European Studies between Koch
(2014), the main proponent of the Celtic hypothesis, and Eska (2014), as well as Valério
42 Piotr Michalowski
(2014a), who provide strong critiques of this ascription. T. Kaufman (2015), however,
presented compelling evidence that Tartessian is an otherwise undocumented/unattested
Celtic language in conjunction with an analysis of all unbroken texts.
3.4.2 Iberian
Another group of inscriptions from northeastern Spain and southern France, dating per-
haps from the fifth to the first centuries bce, or even perhaps somewhat later, written in
Paleohispanic scripts may bear witness to still another language or group of languages
labeled as Iberian (Untermann 1980, 1990). The inscriptions are found on a broad range
of objects and materials, including stone monuments, funerary stele and blocks, ceramics,
tiles, loom weights, coins, mosaics and bronzes as well as lead strips. The latter often con-
tain the longer texts, but many of these may be forgeries (Doménech-Carbó et al. 2015).
3.5.2 Linear A
The Linear A system consisted of 65 signs, written on tablets and on a variety of objects
numbering almost 1500. As noted by Kober (1948: 88), “this is the only one of the Minoan
scripts found regularly on objects which must have had a religious function, like libation
tables and votive ladles, etc.,” although many of the tablets are obviously administrative
records. Linear A was mostly written on tablets and sealings, less often on pottery, stucco,
stone and metal objects (Olivier 1986: 384). Because two-thirds of the symbols were also
utilized as consonant-vowel syllabograms in the similar Linear B script that was used
to write an early Greek dialect, it is assumed that this was also a syllabary. The oldest
exemplar is from ca. 1800 bce, but the first group comes from the palace at Knossos
Ancient Near Eastern and European isolates 43
from a century later; it was used until ca. 1450 bce. An outlier is a painted inscription
from ca. 1350 (Perna 2014: 254). Although the use of Linear A overlapped with Cretan
Hieroglyphics – indeed there are deposits that contain writings in both systems – unlike
the latter it was never used on seals, with one possible exception (Perna 2014: 256). There
is some evidence to suggest that the script – and possibly also Cretan Hieroglyphic – was
also used on parchment and therefore our picture of Minoan writings is skewed by the
chance survival of more permanent surfaces (Perna 2014: 258).
The similarity with Linear B notwithstanding, Linear A has not been deciphered, and it
is not possible to establish whether the inscriptions were used for one language or more.
Nevertheless, according to Davis (2013), there are indications that it utilized VSO word
order. There have been many suggestions to as to its identity, often focusing on Indo-
European (see discussion in Egetmeyer 2004: 235–236).
3.5.4 Eteocretan
The evidence for this language consists of five incomplete stone inscriptions excavated at
Dreroes and Praisos on eastern Crete, ranging in date from the late seventh century/early
sixth to the third centuries bce. There are three more fragmentary texts from Praisos, but
some of them may be in Greek (Duhoux 1982). The monuments were written in various
forms of the Greek alphabet and the two early items from Dreros, cut into the wall of
the temple of Apollo, together with Greek texts, were Eteocretan-Greek bilingual texts,
but only a few lines of each have survived, and the language has defied interpretation.
Gordon (1975 and elsewhere) insisted that Eteocretan (and Linear A) was North-West
Semitic, but this is hardly a mainstream view, even if has been revived recently by Mag-
nelli and Petrantoni (2013). It is highly unlikely, however, that the languages encoded in
Linear A and Eteocretan writings would have been the same.
44 Piotr Michalowski
While most of the Cypriot Syllabic inscriptions were written in Greek, at least 26
of these, give or take a few, cannot be understood and are presumed to be in another
language (or languages), which, for lack of a better word has been named Eteocypriot,
although any connection between this putative language and the language(s) of the
Cypro-Minoan texts is a matter of speculation. Four of them are bilingual, accompanied
by a Greek version. Even though the remains of “Eteocypriot” are so sparse, unsuccess-
ful attempts have been made to connect this putative language with Indo-European (Lyd-
ian, Illyrian, etc.); Semitic (Akkadian, Phoenician); and Hurrian/Urartean languages
(Steele 2013: 103).
Egetmeyer (2008, 2009, 2012, 2013) through an analysis of both vocabulary and gram-
mar has made a strong case for a hypothetical distinction between the languages of the
non-Greek Cypriot Syllabic texts from the southwestern part of the island, which he calls
Amathousian, and those from the east, which would be Golgian.
REFERENCES
Akulov, Alexander. 2016. Whether Sumerian Language Is Related to Munda? Cultural
Anthropology and Ethnosemiotics 2: 23–29.
Algaze, Guillermo. 2012. The End of Prehistory and the Uruk Period. The Sumerian
World, ed. by Harriet Crawford, 68–94. London: Routledge.
Ancillotti, Augusto. 1981. La lingua dei Cassiti. (Unicoplil Univesitario, 103). Milan:
Unicopli.
André, Beatrice and Mirjo Salvini, 1989. Réflexions sur Puzur-Inšušinak. Iranica Anti-
qua 29:53–78.
Aristar, Anthony Rodrigues. 1995. Binder-Anaphors and the Diachrony of Case Dis-
placement. Double Case: Agreement by Suffixaufnahme, ed. by Frans Plank, 431–447.
Oxford: Oxford University Press.
Arnaud, Daniel. 1995. Les traces des ‘Arabes’ dans les textes syriens du début du IIe
millénaire à l’époque néo-assyrienne: esquisse de quelques themes. Présence arabe
dans le croissant fertile avant l’Hégire. Actes de la table ronde internationale organ-
isée par l’Unité de recherche associée 1062 du CNRS, études sémitiques, au Collège
de France, le 13 novembre 1993, ed. by Hélène Lozachmeur, 19–22. Paris: Editions
Recherche sur les Civilisations.
Attinger, Pascal. 1993. Eléments de linguistique sumérienne. La construction de
du11/e/di “dire.” (Orbis biblicus et orientalis, Sonderband). Fribourg: Academic
Press/Göttingen: Vandenhoeck & Ruprecht.
Baldi, Philip and B. Richard Page. 2006. Review of Vanneman 2003. Lingua 116:
2183–2220.
Balkan, Kemal. 1954. Kassitenstudien 1. Die Sprache der Kassiten. (American Oriental
Series, 37). New Haven: American Oriental Society.
Ball, Charles James. 1913. Chinese and Sumerian. London: Humphrey Milford.
Ball, Charles James. 1918. The Relation of Tibetan to Sumerian. Proceedings of the Soci-
ety of Biblical Archaeology 40: 95–105.
Bavant, Marc. 2014. A Case Study of Basque, Old Persian and Elamite. Doctoral disser-
tation, University of Amsterdam.
Beaulieu, Paul-Alain. 1988. Swamps as Burial Places for Babylonian Kings. N.A.B.U.:
Nouvelles Assyriologiques Brèves et Utilitaires 1988: 36–37.
46 Piotr Michalowski
Benedict, Warren C. 1960. Urartians and Hurrians. Journal of the American Oriental
Society 80: 100–104.
Ben-Dor Evian, Shirly. 2015. “They were thr on land, others at sea . . .” The Etymology
of the Egyptian Term for “Sea-Peoples.” Semitica 57: 57–75.
Ben-Dor Evian, Shirly. 2017. Ramesses III and the ‘Sea-peoples’: Towards a New Philis-
tine Paradigm. Oxford Journal of Archaeology 36: 267–285.
Blažek, Václav. 2002. Elam: A Bridge between Ancient Near East and Dravidian India?
Mother Tongue 7: 123–146.
Blažek, Václav. 2008–2009. On the North Picenian Language. Talanta 40–41: 173–180.
Bleichsteiner, Robert. 1928. Beiträge zur Kenntnis der elamischen Sprache. Anthropos
23: 167–198.
Boese, Johannes. 2010. Ḫašmar-galšu. Ein kassitischer Fürst in Nippur. Festschrift für
Gernot Wilhelm anläßlich seines 65. Geburtstages am 28. Januar 2010, ed. by Janette
C. Fincke, 71–78. Dresden: ISLET.
Bomhard, Allan R. 1997. On the Origin of Sumerian. Mother Tongue 3: 1–16.
Bomhard, Allan R. 2008. Reconstructing Proto-Nostratic: Comparative Phonology,
Morphology, and Vocabulary. (Leiden Indo-European Etymological Dictionary Series,
6). Leiden: Brill.
Bomhard, Allan R. 2015. A Comprehensive Introduction to Nostratic Comparative Lin-
guistics with Special Reference to Indo-European, vol. 1 (2nd ed.). Charleston.
Bomhard, Allan R. and John C. Kerns. 1994. The Nostratic Macrofamily: A Study in
Distant Linguistic Relationship. Berlin: Mouton de Gruyter.
Braun, Jan. 1994. Chattskij i abchazo-adygskij. Rocznik Orientalistyczny 49: 15–23.
Braun, Jan. 2001. Sumerian and Tibeto-Burman. Warsaw: Agade.
Braun, Jan. 2004. Sumerian and Tibeto-Burman: Additional Studies. Warsaw: Agade.
Braun, Jan. 2009. Kassite and Dravidian. Warsaw: Agade.
Braun, Jan and Piotr Taracha. 2007. Review of Soysal 2004. Bibliotheca Orientalis 114:
193–200.
Brinkman, John A. 1969. The Names of the Last Eight Kings of the Kassite Dynasty.
Zeitschrift für Assyriologie und Vorderasiatische Archäologie 59: 231–246.
Brinkman, John A. 1976–1980. Kassiten. Reallexikon der Assyriologie und Vorderasi-
atischen Archäologie, vol. 5, ed. by E. Ebeling et al., 464–473. Berlin: Harrassowitz.
Brinkman, John A. 1977. Notes on the Kassite-Akkadian Vocabulary (BM 92005 =
82–89–18,5637). N.A.B.U.: Nouvelles Assyriologiques Brèves et Utilitaires 1977: 102.
Chirikba, Viacheslav A. 1996. Common West Caucasian: The Reconstruction of Its Pho-
nological System and Parts of Its Lexicon and Morphology. Leiden: Research School
CNWS.
Civil, Miguel. 2013. Remarks On AD-GI4 (A.K.A. “Archaic Word List C” or “Tribute”).
Journal of Cuneiform Studies 65: 13–67.
Cogan, Mordechai and Haim Tadmor. 1977. Gyges and Ashurbanipal: A Study in Literary
Transmission. Orientalia, Nova Series 46: 65–85.
Colless, Brian. 1998. The Canaanite Syllabary. Abr Nahrain 35: 26–46.
Dahl, Jacob L. 2005. Complex Graphemes in Proto-Elamite. Cuneiform Digital Library
Journal 2005/3: 1–15.
Dahl, Jacob L. 2009. Early Writing in Iran: A Reappraisal. Iran 47: 23–31.
Dahl, Jacob L. 2013. Early Writing in Iran. Oxford Handbook of Ancient Iran, ed. by
Daniel T. Potts, 233–263. Oxford University Press.
Dahl, Jacob L., Cameron A. Petrie, and Daniel T. Potts. 2013. Chronological Parame-
ters of the Earliest Writing System in Iran. Ancient Iran and Its Neighbours: Local
Ancient Near Eastern and European isolates 47
Developments and Long-Range Interactions in the 4th Millennium bc, ed. by Cameron
A. Petrie. (British Institute of Persian Studies Archaeological Monographs Series, 3),
353–378. Havertown: Oxbow Books.
Daniels, Peter T. 1996. The Byblos Syllabary. The World’s Writing Systems, ed. by Peter
T. Daniels and William Bright, 29–30. Oxford: Oxford University Press.
Davies, Anna Morpurgo and Jean-Pierre Olivier. 2012. Syllabic Scripts and Languages in
the Second and First Millennia bc. Parallel Lives: Ancient Island Societies in Crete and
Cyprus: Papers Arising from the Conference in Nicosia Organized by the British School at
Athens, the University of Crete and the University of Cyprus, in November-December 2006,
ed. by Gerald Cadogan, Maria Iacovou, Katerina Kopaka, and James Whitley. (British
School at Athens Studies, 20), 105–118. London: British School at Athens.
Davis, Brent. 2011. Cypro-Minoan in Philistia? Kubaba 2: 40–74.
Davis, Brent. 2013. Syntax in Linear A: The Word-Order of the ‘Libation Formula.’ Kad-
mos 52: 35–52.
Davis, Brent, Aren M. Maeir, and Louise A. Hitchcock. 2015. Disentangling Entangled
Objects: Iron Age Inscriptions from Philistia as a Reflection of Cultural Processes.
Israel Exploration Journal 65: 140–166.
Del Olmo Lete, Gregorio. 2012. Ugaritic and Old(-South)-Arabic: Two WS Dialects?
Dialectology of the Semitic Languages: Proceedings of the IV Meeting on Compara-
tive Semitics, Zaragoza, 06/9–11/2010, ed. by Federico Corriente, Gregorio del Olmo
Lete, Ángeles Vicente and Juan-Pablo Vita. (Aula Orientalis, Supplementa, 27), 5–23.
Sabadell: AUSA.
Desset, François. 2012. Premières écritures iraniennes: les systèmes proto-élamite et
élamite linéaire. (Dipartimento di Studi Asiatici, Università degli Studi di Napoli
“L’Orientale,” Series Minor, 76). Naples: Università degli Studi di Napoli “L’Orientale.”
Desset, François. 2014. A New Writing System Discovered in 3rd Millennium bce Iran:
The Konar Sandal ‘Geometric’ Tablets. Iranica Antiqua 49: 83–109.
Desset, François. 2016. Proto-Elamite Writing in Iran. Archéo-Nil 26: 67–104.
Di Carlo, Pierpaolo. 2006. L’enigma nord-piceno: saggi sulla lingua delle stele di
Novilara e sul loro constesto culturale. (Università degli Studi di Firenze: Studi,
Quaderni del Dipartimento di Linguistica, 7). Florence: Dipartimento di Linguistica –
Università di Firenze.
Diakonoff, Igor M. 1967a. Elamskij jazyk (The Elamite Language). Jazyki drevnej pered-
nej Azii (Languages of the Ancient Near East), ed. by Igor M. Diakonoff, 85–112.
Moscow: Nauka.
Diakonoff, Igor M. 1967b. Chattskij (“Protochettskij”) jazyk (The Hattic [“Protohattic”]
Language). Jazyki drevnej perednej Azii (Languages of the Ancient Near East), ed. by
Igor M. Diakonoff, 166–176. Moscow: Nauka.
Diakonoff, Igor M. 1997. External Connections of the Sumerian Language. Mother
Tongue 3: 54–62.
Diakonoff, Igor M. 1999. More on External Connections of the Sumerian Language.
Mother Tongue 5: 141–144.
Doménech-Carbó, Antonio, María Teresa Doménech-Carbó, Monserrat Lastras Pérez,
Miquel Herrero-Cortell. 2015. Detection of Archaeological Forgeries of Iberian Lead
Plates Using Nanoelectrochemical Techniques: The Lot of Fake Plates from Bugarra
(Spain). Forensic Science International 247: 79–88.
Duhoux, Yves. 1982. L’Étéocrétois: les textes – la langue. Amsterdam: J. C. Gieben.
Duhoux, Yves. 2000. How Not to Decipher the Phaistos Disc: A Review. American Jour-
nal of Archaeology 104: 597–600.
48 Piotr Michalowski
Duhoux, Yves. 2009. Eteocypriot and Cypro-Minoan 1–3. Kadmos 48: 39–75.
Dunajevskaja, Irina M. 1959. Porjadok razmeschtschenija prefiksov chattskogo glagola
(The Order of the Prefixes of the Hattic Verb). Vestnik Drevnej Istorii 67: 20–37.
Dunajevskaja, Irina M. and Igor M. Diakonoff. 1979. Chattskij (“Protochettskij”) jazyk
(The Hattic [“Protohattic”] Language). Jazyki Azii i Afriki, 3. Jazyki drevnej pered-
nej Azii: nesemitskie, iberijsko-kavkazskie jazyki, paleoaziatskie jazyki, ed. by G.D.
Sanžeev, 79–83. Moscow: Nauka.
Dunand, Maurice. 1945. Byblia Grammata. Documents et recherches sur le dévelope-
ment de l’écriture en Phénicie. Beirut: République Libanaise, Ministère de l’Éducation
National des Beaux-Arts.
Dunand, Maurice. 1978. Nouvelles inscriptions pseudo-hiéroglyphiques découvertes à
Byblos. Bulletin de Musée de Beyrouth 30: 52–58.
Edzard, Dietz Otto. 2003. Sumerian Grammar. (Handbuch der Orientalistik, Erste Abtei-
lung, Nahe und der Mittlere Osten, 71). Leiden: Brill.
Egetmeyer, Markus. 2004. À propos des inscriptions égéennes découvertes au Levant.
Antiquus Oriens. Mélanges offerts au professeur René Lebrun, vol. I, ed. by Michael
Mazoyer and Olivier Casabonne. (Collection Kubaba, Série Antiquité, 5), 229–248.
Paris: L’Harmattan.
Egetmeyer, Markus. 2008. Langues et écritures chypriotes: nouvelles perspectives.
Comptes rendus des séances de l’Académie des inscriptions et Belles-Lettres 152:
997–1020.
Egetmeyer, Markus. 2009. The Recent Debate on Eteocypriot People and Language.
Pasiphae 3: 69–90.
Egetmeyer, Markus. 2012. “Sprechen Sie Golgisch?” Anmerkungen zu einer überseh-
enen Sprache. Études mycéniennes 2010. Actes du XIIIe colloque international sur les
textes égéens, Sèvres, Paris, Nanterre, 20–23 septembre 2010, ed. by Pierre Carlier,
Charles De Lamberterie, Markus Egetmeyer, Nicole Guilleux, Françoise Rougemont,
Julien Zurbach. (Biblioteca di “Pasiphae,” 10), 427–434. Pisa: Fabrizio Serra editore.
Egetmeyer, Markus. 2013. From the Cypro-Minoan to the Cypro-Greek Syllabaries: Lin-
guistic Remarks on the Script Reform. Syllabic Writing on Cyprus and Its Context, ed.
by Philippa M. Steele, 107–131. Cambridge: Cambridge University Press.
Eichner, Heiner. 2012. Neues zur Sprache der Stele von Lemnos (Erster Teil). Journal of
Language Relationship 7: 9–32.
Eichner, Heiner. 2013. Neues zur Sprache der Stele von Lemnos (Zweiter Teil). Journal
of Language Relationship 10: 1–42.
Eidem, Jesper. 1992. The Shemshāra Archive 2: The Administrative Texts. (Historisk-
filosofiske Skrifter, 15). Copenhagen: Munksgaard.
Englund, Robert K. 1998. Texts from the Late Uruk Period. Mesopotamien: Späturuk-
Zeit und Frühdynastische Zeit, ed. by Pascal Attinger and Marcus Wafler. (Orbis Bib-
licus et Orientalis, 160/1), 15–233. Freiburg: Academic Press/Göttingen: Vandenhoeck
and Ruprecht.
Englund, Robert K. 2008. The Smell of the Cage. Cuneiform Digital Library Journal
2009/4: 1–27.
Eska, Joseph F. 2014. Comments on John T. Koch’s Tartessian-as-Celtic Enterprise. Jour-
nal of Indo-European Studies 42: 428–438.
Facorellis, Yorgos, Marina Sofronidou, and Giorgos Hourmouziadis. 2014. Radiocarbon
Dating of the Neolithic Lakeside Settlement of Dispilio, Kastoria, Northern Greece.
Radiocarbon 56: 511–528.
Fähnrich, Heinz. 1981. Das Sumerische und Kartwelsprachen. Georgica 4: 89–101.
Ancient Near Eastern and European isolates 49
Hooker, James. 1992. Early Balkan ‘Scripts’ and the Ancestry of Linear A. Kadmos 31:
97–112.
Høyrup, Jens. 1992. Sumerian: The Descendant of a Proto-Historical Creole? An Alter-
native Approach to the Sumerian Problem. AIΩN. Annali del Dipartimento di Studi del
Mondo Classico e del Mediterraneo Antico. Sezione linguistica. Istituto Universitario
Orientale, Napoli 14: 21–72.
Iosad, Pavel. 2010. Kassistkii jazyk (“The Kassite Language”). Jazyki Mira. Drevnie
reliktovye jazyki Perednej Azii (Languages of the World: Ancient Languages of the
Near East), ed. by N.N. Kazanskii, A.A. Kibrik and J.B. Korjakov, 184–187. Moscow:
Academia.
Izre’el, Shlomo. 1988. Review of Mendenhall 1986. Journal of the American Oriental
Society 108: 519–521.
Jagersma, Bram. 2010. A Descriptive Grammar of Sumerian. Doctoral dissertation, Uni-
versiteit Leiden.
Jaritz, Kurt. 1957. Die kassitische Sprachreste. Anthropos 52: 850–898.
Kammenhuber, Annalies. 1969. Das Hattische. Altkleinasiatische Sprachen, ed. by
Johannes Friedrich. (Handbuch der Orientalistik, Erste Abteilung, II. Band: Keilschrift-
forschung und alte Geschichte Vorderasiens, 1. Und 2. Abschnitt, Lieferung 2), 428–
546, 584–588. Leiden: E. J. Brill.
Kaneva, Irina Trofimovna. 2006. Shumerskii iazyk (“The Sumerian Language”) (2nd ed.).
St. Petersburg: Tcentr Peterburgskoe Vostokovedenie.
Kassian, Alexei. 2009. Hattic as a Sino-Caucasian Language. Ugarit-Forschungen 41:
309–447.
Kassian, Alexei. 2010. Chattski jazyk (The Hattic Language). Jazyki Mira. Drevnie relik-
tovye jazyki Perednej Azii (Languages of the World: Ancient Relict Languages of the
Near East), ed. by N.N. Kazanskii, A.A. Kibrik and J.B. Korjakov, 168–184. Moscow:
Academia.
Kaufman, Stephen A. 1989. Review of Mendenhall 1986. Bulletin of the American
Schools of Oriental Research 276: 85–86.
Kaufman, Terrence. 2015. Notes on the Decipherment of Tartessian as Celtic. (Journal
of Indo-European Studies Monograph Series, 62). Washington, DC: Institute for the
Study of Man.
Keetman, Jan. 2005. Die altsumerische Vokalharmonie und die Vokale des Sumerischen.
Journal of Cuneiform Studies 57: 1–16.
Kenoyer, Jonathan Mark. 2006. The Origin, Context and Function of the Indus Script:
Recent Insights from Harappa. Proceedings of the Pre-Symposium of RIHN and 7th
ESCA Harvard-Kyoto Roundtable, ed. by Toshiki Osada and Noriko Hase, 9–27.
Kyoto: Research Institute for Humanity and Nature.
Kenoyer, Jonathan Mark and Richard H. Meadow. 2000. The Ravi Phase: A New Cultural
Manifestation at Harappa. South Asian Archaeology, 1997: Proceedings of the Four-
teenth International Conference of the European Association of South Asian Archae-
ologists, Held in the Istituto italiano per l’Africa e l’Oriente, Palazzo Brancaccio,
Rome, 7–14 July 1997, vol. 1. ed. by Maurizio Taddei and Giuseppe De Marco. (Serie
Orientale Roma, 90), 55–76. Rome: Istituto italiano per l’Africa e l’Oriente.
Kenoyer, Jonathan Mark and Richard H. Meadow. 2010. Inscribed Objects from Harappa
Excavations: 1986–2007. Corpus of Indus Seals and Inscriptions, Vol. 3: New Mate-
rial, Untraced Objects, and Collections Outside India and Pakistan, ed. by Asko Par-
pola, B.M. Pande and Petteri Koskikallio. (Memoirs of the Archeological Survey of
India, 96), xliv-lviii. Helsinki: Suomalainen Tiedeakatemia.
52 Piotr Michalowski
Khačikjan, Margaret. 1998. The Elamite Language. (Documenta Asiana, 4). Rome: Con-
siglio Nazionale delle Ricerche, Istituto per gli Studi Micenei ed Egeo-anatolici.
Khačikjan, Margaret. 2010. Elamskij jazyk (The Elamite Language). Jazyki Mira. Drevnie
reliktovye jazyki Perednej Azii (Languages of the World: Ancient Relict Languages of
the Near East), ed. by N.N. Kazanskii, A.A. Kibrik, and J.B. Korjakov, 95–118. Mos-
cow: Academia.
Klinger, Jörg. 1994. Hattisch und Sprachverwandtschaft. Hethitica 12: 23–40.
Klinger, Jörg. 2005. Hattisch. Sprachen des Alten Orients, ed. by Michael P. Streck, 128–
134. Darmstad: Wissenschaftliche Buchgesellschaft.
Klochkov, I. S. 1999. Signs on a Potsherd from Gonur (On the Question of the Script
Used in Margiana). Ancient Civilizations from Scythia to Siberia 5: 165–175.
Knapp, A. Bernard and Ioannis Voskos. 2008. Cyprus at the End of the Late Bronze
Age: Crisis and Colonization, or Continuity and Hybridization? American Journal of
Archaeology 112: 659–684.
Knapp, A. Bernard and Stuart W. Manning. 2016. Crisis in Context: The End of the
Late Bronze Age in the Eastern Mediterranean. American Journal of Archaeology 120:
99–149.
Kober, Alice E. 1948. The Minoan Scripts: Fact and Theory. American Journal of Archae-
ology 52: 82–103.
Koch, John T. 2014. On the Debate over the Classification of the Language of the
South-Western (SW) Inscriptions, also Known as Tartessian. Journal of Indo-
European Studies 42: 335–427.
Komoróczy, Géza. 1976a. On the Idea of Sumero-Hungarian Linguistic Affiliation: Crit-
ical Notes on a Pseudo-Scholarly Phenomenon. Annales Univerisitatis Scientarium
Budapestinensis de Rolando Eötvös nominatae. Sectio historica 17: 259–303.
Komoróczy, Géza. 1978. Das Rätsel der sumerischen Sprache als Problem der Frühges-
chichte Vorderasiens. Festschrift Lubor Matouš, I. Teil, ed. by Blahoslav Hruška and
Géza Komoróczy. (Assyriologia, 4–5), 225–252. Budapest.
Komoróczy, Géza. 1976b. Sumer és magyar? Budapest: Magvetö Kiadó.
Krebernik, Manfred. 2006. Elamisch. Sprachen des Alten Orients, ed. by Michael P.
Streck, 159–182. Darmstad: Wissenschaftliche Buchgesellschaft.
Labat René. 1951. Structure de la langue élamite (état présent de la question). Con-
férences de l’Institut de Linguistique de Paris 9: 23–42.
Lambert, Wilfred G. 1987. A Vocabulary of an Unknown Language. M.A.R.I. Annales de
Recherches Interdisciplinaires 5: 409–413.
Laursen, Steffen Terp. 2010. The Westward Transmission of Indus Valley Sealing Tech-
nology: Origin and Development of the ‘Gulf Type’ Seal and Other Administrative
Technologies in Early Dilmun, c.2100–2000 bc. Arabian Archaeology and Epigraphy
21: 96–134.
Lawler, Andrew. 2007. Ancient Writing or Modern Fakery? Science 317, no. 5838:
587–589.
LeBlanc, Paul D. 2013. Indus Epigraphic Perspectives: Exploring Past Decipherment
Attempts and Possible New Approaches. MA Thesis, University of Ottawa.
MacGinnis, John. 2012. Evidence for a Peripheral Language in a Neo-Assyrian Tablet
from the Governor’s Palace in Tušhan. Journal of Near Eastern Studies 71: 13–20.
Machinist, Peter. 2000. Biblical Traditions: The Philistines and Israelite History. The
Sea Peoples and Their World: A Reassessment, ed. by Eliezer D. Oren. (University
Museum Monograph, 108), 53–83. Philadelphia: University Museum of the University
of Pennsylvania.
Ancient Near Eastern and European isolates 53
Madjidzadeh, Youssef. 2011. Jiroft Tablets and the Origin of the Linear Elamite Writ-
ing System. Cultural Relations between the Indus and the Iranian Plateau during the
Third Millennium bce: Indus Project, Research Institute for Humanities and Nature,
June 7–8, 2008, ed. by Toshiki Osada and Michael Witzel (Harvard Oriental Series,
Opera Minora, 7), 217–243. Cambridge: Department of South Asian Studies, Harvard
University.
Maeir, Aren M., Louise A. Hitchcock, and Liora Kolska Horwitz. 2013. On the Con-
stitution and Transformation of Philistine Identity. Oxford Journal of Archaeology
32:1–38
Maeir, Aren M., Louise A. Hitchcock. 2017. The Appearance, Formation and Trans-
formation of Philistine Culture: New Perspectives and New Finds. The Sea Peoples
Up-To-Date: New Research on the Migration of Peoples in the 12th Century BCE, ed.
by Peter M. Fischer and Teresa Bürge, 149–162. Vienna.
Magnelli, Adalberto and Giuseppe Petrantoni. 2013. L’eteocretese di Dreros e il semitico:
nuove considerazioni. Myrtia 28: 17–29.
McAlpin, David W. 1974. Toward Proto-Elamo-Dravidian. Language 50: 89–101.
McAlpin, David W. 1975. Elamite and Dravidian: Further Evidence of Relationship. Cur-
rent Anthropology 16: 105–115.
McAlpin, David W. 1981. Proto-Elamo-Dravidian: The Evidence and Its Implications.
(Transactions of the American Philosophical Society, 71/3). Philadelphia: American
Philosophical Society.
McAlpin, David W. 2003. Velars, Uvulars, and the North Dravidian Hypothesis. Journal
of the American Oriental Society 123: 521–546.
McAlpin, David W. 2015. Brahui and the Zagrosian Hypothesis. Journal of the American
Oriental Society 135: 551–586.
Meadow, Richard H. and Jonathan Mark Kenoyer. 2005. Excavations at Harappa 2000–
2001: New Insights on Chronology and City Organization. South Asian Archaeology
2001: Proceedings of the Sixteenth International Conference of the European Associ-
ation of South Asian Archaeologists, Held in Collège de France, Paris, 2–6 July 2001,
ed. by Catherine Jarrige and Vincent Lefèvre, 207–225. Paris: Editions Recherche sur
les Civilisations.
Mendenhall, George E. 1985. The Syllabic Inscriptions from Byblos. Beirut: American
University of Beirut.
Merlini, Marco and Gheorghe Lazarovici. 2008. Settling Discovery Circumstances, Dat-
ing and Utilization of the Tărtăria Tablets. Acta Terrae Septemcastrensis 7: 111–195.
Michalowski, Piotr. 1983. History as Charter: Some Observations on the Sumerian King
List. Journal of the American Oriental Society 103: 237–248.
Michalowski, Piotr. 2008. Sumerian. The Ancient Languages of Mesopotamia, Egypt,
and Aksum, ed. by Roger D. Woodard, 19–46. Cambridge: Cambridge University
Press.
Michalowski, Piotr. In press. The Sumerian Language. Handbook of Ancient Mesopota-
mia, ed. by Gonzalo Rubio. De Gruyter.
Moqaddam, Azhideh. 2009. Ancient Geometry and “*Proto-Iranian” Scripts: South
Konar Sandal Mound Inscriptions, Jiroft. From Daēna to Dîn. Religion, Kultur und
Sprache in der iranischen Welt. Festschrift für Philip Kreyenbroek zum 60. Geburt-
stag, ed. by Christine Allison, Anke Joisten-Prushke and Antje Wendtland, 53–103.
Wiesbaden: Harrasowitz Verlag.
Moran, William L. 1988. Review of Mendenhall 1986. The Catholic Biblical Quarterly
50: 508–510.
54 Piotr Michalowski
Long-Range Interactions in the 4th Millennium bc, ed. by Cameron A. Petrie. (British
Institute of Persian Studies Archaeological Monographs Series, 3), 293–336. Haver-
town: Oxbow Books.
Poebel, Arno. 1923. Grundzüge der sumerischen Grammatik. Rostock: Selbstverlag des
Verfassers.
Possehl, Gregory L. 1990. Revolution in the Urban Revolution: The Emergence of Indus
Urbanization. Annual Review of Anthropology 19: 261–282.
Rao, Rajesh P.N., Nisha Yadav, Mayank N. Vahia, Hrishikesh Joglekar, Ronojoy Adhikari,
and Iravatham Mahadevan. 2010. Entropy, the Indus Script, and Language: A Reply to
R. Sproat. Computational Linguistics 36: 795–805.
Rao, Rajesh P.N., Rob Lee, Nisha Yadav, Mayank Vahia, Philip Jonathan, and Pauline Ziman.
2015. On Statistical Measures and Ancient Writing Systems. Language 91: e189–e205.
Reade, Julian E. 1978. Kassites and Assyrians in Iran. Iran 16: 137–143.
Reiner, Erica. 1969. The Elamite Language. Altkleinasiatische Sprachen, ed. by Johannes
Friedrich. (Handbuch der Orientalistik, Erste Abteilung, II. Band: Keilschriftforschung
und alte Geschichte Vorderasiens, 1. Und 2. Abschnitt, Lieferung 2), 54–118. Leiden:
E. J. Brill.
Rollston, Christopher A. 2014. Northwest Semitic Cursive Scripts of Iron II. “An Eye for
Form”: Epigraphic Essays in Honor of Frank Moore Cross, ed. by Jo Ann Hackett and
Walter E. Aufrecht, 202–234. Winona Lake: Eisenbrauns.
Rubio, Gonzalo. 1999. On the Alleged Pre-Sumerian Substratum. Journal of Cuneiform
Studies 51: 1–16.
Rubio, Gonzalo. 2005. On the Linguistic Landscape of Early Mesopotamia. Ethnicity in
Ancient Mesopotamia: Papers Read at the 48th Rencontre Internationale, Leiden, 1.-4.
July 2002, ed. by Wilfred H. van Soldt. (PIHANS, 102), 316–332. Leiden: Nederlands
Instituut voor het Nabije Oosten.
Rubio, Gonzalo. 2006. Shulgi and the Death of Sumerian. Approaches to Sumerian Liter-
ature: Studies in Honour of Stip, ed. by Piotr Michalowski and Niek Veldhuis. (Cune-
iform Monographs, 35), 167–179. Leiden: Brill.
Rubio, Gonzalo. 2007a. Sumerian Morphology. Morphologies of Asia and Africa, vol. 2,
ed. by Alan S. Kaye, 1327–1379. Winona Lake: Eisenbrauns.
Rubio, Gonzalo. 2007b. The Languages of the Ancient Near East. A Companion to the
Ancient Near East, ed. by Daniel Snell, 79–109. Oxford: Blackwell.
Sassmannshausen, Leonhard. 1999. The Adaptation of Kassites to the Babylonian Civi-
lization. Languages and Cultures in Contact: At the Crossroads of Civilizations in the
Syro-Mesopotamian Realm; Proceedings of the 42th RAI, ed. by Karel van Lerberghe
and Gabriela Voet. (Orientalia Lovaniensia Analecta, 96), 409–424. Leuven: Peeters.
Sassmannshausen, Leonhard. 2014. Kassitische herrscher und ihre Namen. He Has
Opened Nisaba’s House of Learning: Studies in Honor of Åke Waldemar Sjöberg on
the Occasion of His 89th Birthday on August 1st 2013, ed. by Leonhard Sassmann-
shausen. (Cuneiform Monographs, 46), 165–199. Leiden: Brill.
Sathasivam, Arumugam. 1965. Sumerian: A Dravidian Language (Sumerian Studies, 1).
Berkeley.
Sathasivam, Arumugam. 1969. Linguistics in Ceylon (II): Tamil. Linguistics in South
Asia: Current Trends in Linguistics, Volume 5, ed. by Thomas Albert Sebeok, 752–759.
The Hague: Mouton.
Schmandt-Besserat, Denise. 1984. Review of Shann W.W. Winn, Pre-Writing in South-
eastern Europe: The Sign System of the Vinca Culture CA. 4000 B.C. American Jour-
nal of Archaeology 88: 71–72.
56 Piotr Michalowski
Valério, Miguel. 2014a. The Interpretative Limits of the Southwestern Script. Journal of
Indo-European Studies 42: 439–467.
Valério, Miguel. 2014b. Seven Uncollected Cypro-Minoan Inscriptions. Kadmos 53:
111–127.
Van Soldt, Wilfred H. 1980. MA and ḪUR in Kassite Texts. Revue d’Assyriologie et
d’archéologie orientale 74: 77–80.
Vennemann, Theo. 2003. Europa Vasconica, Europa Semitica, ed. Patrizia Noel and
Aziz Hanna (Trends in Linguistics, Studies and Monographs, 18). Berlin: Mouton de
Gruyter.
Vidale, Massimo. 2007. The Collapse Melts Down: A Reply to Farmer, Sproat and Wit-
zel. East and West 57: 333–366.
Weeden, Mark. 2015. The Land of Walastin at Tell Tayınat. N.A.B.U. Nouvelles Assyri-
ologiques Brèves et Utilitaires 2015: 65–66.
Whittaker, Gordon. 2008. The Case for Euphratic. Bulletin of the Georgian National
Academy of Sciences 2: 156–168.
Whittaker, Gordon. 2012. Euphratic: A Phonological Sketch. The Sound of Indo-European:
Phonetics, Phonemics, and Morphophonemics, ed. by Benedicte Nielsen Whitehead,
Tomas Olander, Birgit Anette Olsen, and Jens Elmegård Rasmussen. (Copenhagen Stud-
ies in Indo-European, 4), 577–606. Copenhagen: Museum Tusculanum Press.
Wilcke, Claus. 2005. ED Lú A und die Sprache(n) der archaischen Texte. Ethnicity in
Ancient Mesopotamia: Papers Read at the 48th Rencontre Internationale, Leiden, 1.-4.
July 2002, ed. by Wilfred H. van Soldt. (PIHANS, 102), 430–445. Leiden: Nederlands
Instituut voor het Nabije Oosten.
Wilhelm, Gernot. 1978. Ist das Elamische eine Ergativsprache? Archäologische Mittei-
lungen aus Iran 11: 7–12.
Wilhelm, Gernot. 1982. Noch einmal zur behaupteten Ergativität des Elamischen.
Archäologische Mitteilungen aus Iran 15: 7–8.
Woods, Christopher. 2010. The Earliest Mesopotamian Writing. Visible Language: Inven-
tions of Writing in the Ancient Middle East and Beyond, ed. by Christopher Woods.
(Oriental Institute Museum Publications, 32), 33–50. Chicago: The Oriental Institute
of the University of Chicago.
Zadok, Ron. 1984. The Elamite Onomasticon. (ANNALI supplemento, 40). Naples: Isti-
tuto universitario orientale.
Zadok, Ron. 1987. Peoples from the Iranian Plateau in Babylonia during the Second
Millennium B.C. Iran 25: 1–26.
Zadok, Ron. 1995. On the Current State of Elamite Lexicography. Studi Epigrafici e Lin-
guistici sul Vicino Oriente Antico 12: 241–252.
Zadok, Ron. 2005. Lulubi. Encyclopedia Iranica. www.iranicaonline.org/articles/lulubi
(31 January 2016).
Zadok, Ron. 2015. Kassites. Encyclopedia Iranica. www.iranicaonline.org/articles/
kassites (31 January 2016).
Zanotti, David G. 1983. The Position of the Tărtăria Tablets within the Southeast Euro-
pean Copper Age. American Journal of Archaeology 87: 209–213.
Zavaroni, Adolfo. 2005. The Camunic Inscriptions: A Phonological Framework. General
Linguistics 43: 87–105.
Zólyomi, Gábor. 2006. Sumerisch. Schriften und Sprachen des Alten Orients (2nd ed.),
ed. by Michael Streck, 11–43. Darmstadt: Wissenschaftliche Buchgesellschaft.
CHAPTER 3
1 INTRODUCTION1
I think it’s appropriate to ask what the purpose of our genetic classification is.
I believe that most historical linguists value the classifications because they help us
find out about the histories of the languages in a family. We reconstruct parts of their
common protolanguage and then use those reconstructions to study and compare the
changes that have occurred in the various daughter languages. In other words, to be
useful to a historical linguist, a hypothesis of genetic relationship must be fruitful: a
valid genetic grouping will permit reconstruction and thus lead to be a better under-
standing of the member languages and their histories. If a genetic hypothesis does
not lead to new insights of these kinds, therein it is sterile and, within linguistics,
useless.
(Thomason 1993, p. 494; emphasis added [JAL])
The classification and comparison of languages is not the ultimate goal of diachronic
linguists. Their main task is to describe and explain the development of languages or
families studied in their different phases (whether documented or not). Comparison2 is of
no scientific interest except when, undertaken in strict conditions, its objective is to illu-
minate the structure of the languages under study and the changes produced within them,
especially irregularities, exceptions, fases sparitas (stages with little or no attestation,
impossible or difficult to investigate in the language (or dialect) itself).
In language families with a long developed tradition of diachronic research, such
as Indo-European (IE), Uralic, or Semitic, demonstration of genetic relationships has
not signified the culmination of comparativists’ work3 but rather the start of their true
vocation as historical linguists. Their work must be based on the regularity of phonetic
change and on homologies, not on analogies, similarities, and superficial or casual resem-
blances, such as the spurious “similarities”, that, as Trask (1996: 220) shows, one could
find between Ancient Greek and Hawaiian or between Hungarian and Basque (1997:
412–415), which only make for an amateur entertainment.
That the Basque language is genetically isolated has been an obvious statement for a
long time now: its structural differences with respect to other languages, whether geo-
graphically close or not (Romance, Germanic, Semitic, and so on), are clear for all to see.
In the past, when Bascophiles or Basque apologists have sought to transform historical,
political, or religious issues into linguistic questions they often claimed that in ancient
times Basque (B) was spoken in the whole of the Iberian Peninsula (more so than in Gaul
60 Joseba A. Lakarra
Here we enter into protohistory, a period without any texts written in Basque but with
abundant information, especially in the medieval era, thanks to the large number of names
and toponyms included in Latin and Romance (Navarrese, Gascon, Castillian) texts; here,
Luchaire was a pioneer, as Mitxelena (1964) points out.
Thus, for example, some phonetic changes in composition and derivation and the pres-
ence of the article in the Middle Ages separate (2) and (3); the nature of the (secondary/
primary) corpus distinguish (3) and (4); archaic verb forms (aorist and other extinct
tenses and moods, more numerous synthetic forms) serve to separate (4) and (5); the con-
sequences of Larramendi’s work – 1729, Grammar, 1745 Sp-B-Latin Dictionary – mark
the boundary between (5) and (6); and the unification or standardization of the language
is the landmark between (7) and (8).
Dialectal differences,11 which to uneducated enthusiasts and speakers may appear
great, are few for comparativists (cf. Mitxelena 1964, 1981), so that the origins of the
initial dialectal divergence must be dated close to its early documentation. Mitxelena
dated Old Common Basque (OCB), toward the 5th–6th centuries (see 9th section), a
thousand years after Late Proto-Basque (LPB), which is a stage of language defined as
“the language that the Romans encountered”.
3 CRITIQUE OF COMPARISONS
The Basque language has been the object of many attempts to link it genetically to languages
nearby and distant in both space and time. However, none of them has achieved the standards
demanded by the comparative method, and above all, they have not achieved the objectives
of diachronic comparison; namely, such attempts have been of no use when it comes to illu-
minating aspects of the structure and evolution of the language, and therefore, they are inad-
missible by the comparative method as it has been developed in truly established language
families (see, inter alia, Campbell 2013, Trask 1996, Watkins 1990 and Meillet 1925 and
Mitxelena 1963); see Campbell 2011 and supra chapter 1 about Basque-Aquitanian relation.
No standard evidence of genetic relationship has ever been provided (nor attempted)
discovering phonetic rules (sound correspondences) relating – for example – Iberian and
Basque or elaborating the historical grammar, and the few “promising” cognates (homo-
phones) have dwindled to such an extent that they have all but disappeared from the liter-
ature (cf. de Hoz 2010–2011), either as a result of changes in reading and/or interpretation
within Iberian or because of advances in Basque linguistics and philology that make them
impossible; See also chapter 2, section 3.4.
The different hypotheses regarding the genetic relationships of Basque – classic ones
such as Basque-Iberian, Basque-Caucasian, more recent ones as Basque-Uraloaltaic, Vas-
conic, Basque-Indo-European, etc. (see Mitxelena 1964, Trask 1997 and Lakarra 2017:
Section 2) – share multiple characteristics that discredit them immediately:
1) They start from the simple yet false idea according to which, given that neither
Basque nor other isolate language belongs to IE, Semitic, Uralic, or other well-stablished
families – with known histories and acceptably reconstructed protolanguages, i.e.,
impossible to be manipulated on the whim of the amateur of the moment – all of them,
and particularly B and one or another of the remaining languages, must belong to the
Basque and the reconstruction of isolated languages 63
same family. Since the demonstration of this a priori assumption is a something good, a
goal in itself, the objectives, methods, and criteria of comparative-historical linguistics
are not sufficient to cause such “discoverers” to desist.
2) In cases in which a minimum attempt has been made to prove an argument, analyz-
ing alleged cognates in B and one or more other language, the Basque part of the
argument (as well as often the other part, of course) is strewn with errors: erroneous,
dialectal, or later meanings and forms; nonexistent loanwords, words, or variants;
flawed and arbitrary morphemic analyses; and so on (cf. Campbell 1988 and 2013).12
3) No attention is paid to the body of work on Basque historical linguistics or the exist-
ing literature on the language(s) that are compared with it.
4) As a consequence, after many decades of such comparative efforts, no light has been
shed on any aspect of the historical phonology or grammar of B (nor of the other
languages).
5) Frequently, the false illusions or statements derived from those essays are perpetu-
ated in later works: thus, for example, a claim was still recently made about the sup-
posed abundance of initial vowel in B (cf. Odriozola 2016) that, explained arbitrarily
as old articles, Schuchardt used to underpin his B-Hamitic-Semitic edifice, which
had already been demolished 90 years before.13
6) In the search for “explanations” for languages compared to Basque, numerous things
are overlooked such as significant variants, underused archaic testimonies, irregular-
ities, generalizations and clarifications about phonetics or grammar, and, in general,
the most important data for reconstructing B: if the relationship of hiri1 ‘city’ and hiri2
‘close, near’ (cf. *her, her(t)si ‘to close’ and Sp cerca1 ‘closure’ and cerca2 ‘near’; see
Corominas and Pascual 1980–1991, s.u.) is not examined, this can only be due to the
age-old and false belief that hiri1 derives from the family of Iberian ILTIR.14
Altogether, not only is there bad comparative practice in much of this work, but the true
nature of comparison is either misinterpreted or ignored. Comparison cannot be a goal in
itself and even less so when the universal standards its practice requires are not fulfilled
(see Section 1).
changes in the language that have taking place during the last thousand years based on
what is known about the evolution of the Romance languages (Mitxelena 1974, Ech-
enique 1984).16
The changes in Basque plosives had been addressed previously: according to Uhlen-
beck, word-initial voiceless consonants were voiced through dissimilation from other
intervocalic consonants. The problem with this hypothesis was that, in addition to the
uncommon character of this phenomenon, there is also voicing in words without an
intervocalic voiceless consonant (gerezi ‘cherry’, gela ‘room’ < Lat ceresia(m), cella(m),
etc.). Gavel (1920: 314ff ) suggested that all word-initial plosives voiced regularly in
a period subsequent to the adoption of the oldest loanwords, because in B there are no
word-initial voiceless consonants, except in recent loanwords, phonosymbolism, after the
loss of vowels, regressive assimilations, and so on.
Martinet17 accepted the important part of Gavel’s explanation: Ancient B had only
voiced plosive phonemes in word-initial position, whereas word-finally only voiceless
plosives were allowed. On the other hand, both voiced and voiceless plosives were found
in intervocalic environment. Nevertheless, the treatment of loanwords demonstrates that
something more than voicing is necessary to characterize the old system. For Martinet,
the starting point would not be the voiceless/voiced opposition, but another very different
one, a fortis/lenis contrast as in Danish, which he had previously studied. There would,
thus, be two series of plosives, strong /P, T, K/ and weak /p, t, k/. The strong plosives
would be realized as aspirated [ph, th, kh] word-initially and as plain [p, t, k] intervocally.
The weak plosives would be produced as soft voiceless [po, to, ko] in word-initial position,
and as fricative [β, δ, γ] between vowels.
Thus, the initial plosives in Latin loanwords, both voiced and voiceless, were adapted
to PB as lenis phonemes, given that the aspirated allophones of the fortis ones were
very different from the Latin sounds. In intervocalic position, each Latin plosive would
have its corresponding PB phoneme: voiceless ones strong and voiced ones weak. Word-
initially the strong phonemes would not be used in the adaptation of any Latin loanwords,
and later they would disappear through the influence of surrounding languages.18
replaced by one or the other, not by both, for that reason we should assume that this oppo-
sition was not the basic one.19
Lastly, Mitxelena points out that the data on Aquitanian inscriptions coincide with this
analysis: the written symbols and <CC> represent strong plosives (vs. <TH>, aspirated).
Likewise, the distribution of the plosives shows that p was infrequent and a variant of b in
strong position (i.e. on initial in the second member of the compound (Aquit Seniponnis
vs. usual -bon(n)) and after the sibilant: cf. Aquit Andoxponni.
Mitxelena (1957) extended Martinet’s argument to the whole system, believing that
the strong/weak opposition affected all the consonants, except /h/. The complete recon-
struction would be as follows (where, as in modern Basque orthography, z represents
a voiceless pre-dorso-alveolar or dental fricative [s̪ ] and s is a voiceless apico-alveolar
fricative [ʂ]):
Strong ― T K TZ TS N L R
h
Weak b d g z s n l r
The fact that more phonemes do not appear does not mean that there were no more
sounds but rather that it is impossible or unnecessary to reconstruct them with the avail-
able data for the nuclear phonological inventory. Mitxelena dispensed with (subsequently
phonologized) sounds such as the strong bilabial, palatals, historically expressive or auto-
matically generated (secondary) sounds, the /m/, nonexistent outside loanwords and pho-
nosymbolisms, except as an allophone in /b/ in nasal contexts. Vowels are not mentioned
either in the original argumentation (1957) or in the summary of the Fonética histórica
vasca (FHV, since they had hardly changed since Aquitanian: cf. Nescato, Cison, Sembe,
Ummesahar. . . .). Thus the five vowels of modern B are reconstructed as such, with three
levels of height and with no distinction based on quantity; the additional vowel [y] of
Zuberoan comes from *u. The approximants [j] and [w] are later, emerging in different
places: [j] from e- in old verbs (joan ‘to go’ < *e-oan; cf. ebili ‘to walk’ < *e-bil-i) and
[w] in loanwords when it is not a case of a final diphthong.
The plosive and sibilant subsystems seem to coincide in neutralization points (in initial
and final position, after a sonorant and before a consonant), reflecting the same opposi-
tion: we would have two series of sibilants opposed to one another by point and means of
articulation: apical and laminal, fricative and affricate.
The weak/strong distinction persists historically in rhotics (trill and tap), but in con-
trast to what happens with all other consonants, these do not appear word-initially. The
existence of /L/ and /N/ is guaranteed by their behavior in the intervocalic context and
by Aquitanian and medieval written symbols. Intervocalically not all the n’s and l’s
behave the same way as one can see clearly in loanwords, and traces remain in inherited
words; thus, if from Lat angelu, gula and caelu, we get aingeru ‘angel’, gura ‘desire’,
and zeru ‘sky’ (Z zelü); we could have a weak -l- in the causative (-ra-) but not in ilhe/ule
‘hair’ (> **irhe/**ure) from *enon-le > *e.ole > *eule, etc. With ahate ‘duck’ and lehoi
(< *leohe) ‘lion’, the result of Lat anate(m) and leone(m), a strong /N/ would be needed
by both arrano ‘eagle’ and baino ‘than . . . (comparative)’, but not *ardano ‘wine’
(> ardao, arno, ardan-) or *bini ‘grain’ (> bihi), or *seni ‘child’ (> sehi, sein) or bai(n)
a ‘but’.20 There is prosthesis before initial r in late Lat ropam > arropa, late Lat ratonem
> arratoi, etc. (as in Aragonese and Gascon via Basque substract); the muta cum liquida
consonant clusters are broken up, with a vowel before /r/ – garau ‘grain’, daraturu ‘drill’,
66 Joseba A. Lakarra
boronte ‘front’ < Lat granu(m), taratru(m), fronte(m), etc., and with the plosive deleted
before the lateral: cf. laket ‘pleasant’, loria, lau ‘flat’ < Lat placet, gloria(m), planu(m),
etc.21
Since PB, there has been a tendency to neutralize the fortes/lenes opposition in sibi-
lants and sonorants except intervocalically, in favor of lenis in initial position and fortis in
final position:22 cf. gorputz, bake < Lat corpus, pacem, and zeru < Rom tselu; in inherited
words cf. gazi ‘salty, savory’/gatz ‘salt’.
Mitxelena recognized a phonemic character to aspiration in PB, /h/ being the only con-
sonant outside the strong/weak opposition. Gavel (1920) had a different opinion, arguing
in favor of an adventitious late character of /h/, which would not have existed in southern
dialects. Analyzing Aquitanian and other medieval peninsular testimonies – and the then
recently discovered inscription in Lerga (Navarre) – Mitxelena observed that the old
/h/ (a clear difference with respect to its neighbor languages) appeared throughout the
historical territory and that, although already lost in Navarre by the 10th century through
Romance influence,23 it was attested until the 13th–14th centuries in Alava and Rioja, and
even in Bizkaia and Gipuzkoa.
Therefore, zahar ‘old’, ahuntz ‘goat’, zuhur ‘wise, prudent’, etc. are more archaic than
za(a)r, a.untz (awntz, ajntz), zur, etc., since they retain the structure and number of syl-
lables of a previous stage, in the same way as ahate ‘duck’, ohore ‘honor’, etc. are older
than the corresponding contracted forms (a(a)te, ôre, etc.).24 Mitxelena established four
etymological origins for the historical /h/: (1) PB plosives in absolute initial position,
(2) Latin-Romance f-, (3) intervocalic lenis *n, and (4) PB *h.25 Many etymological h’s
disappear in historical periods of the language (cf. FHV 525 and 219–220), for example,
those situated after the accent, or the first of two h’s within a root. Historically, there is no
h beyond the second syllable, but there is in Aquitanian and even a thousand years later
in Medieval Basque.
The accent is a difficult point in the reconstruction of LPB. The most common pattern
in modern B is phrasal, rather than word-level and only weakly contrastive. This sys-
tem cannot be very old, given that the evolution of consonants depends on whether they
appear in initial or in medial position. There have been two main proposals regarding the
old accentual system:
1 Martinet: demarcative accent on the initial syllable, which would explain the distri-
bution of plosives.
2 Mitxelena: accent on the 2nd syllable, to explain the modern distribution of h (never
after the 2nd, nor two h’s in the same word).26
Later (1995 and ff.) Hualde has placed the old accent on the last syllable of the phrase
(lexically contrastive accent arising later in borrowings and morphologically complex
words), but his argument seems to correspond more to OCB than to LPB.27
Mitxelena (1979, FHV 2) proposed for PB the syllabic structure (C)V(W)(R)(S)(T),
similar to that observed in Iberian. He established two restrictions: (1) in word-initial
position only one of the following consonants could appear: b-, g-, s-, z-, n-, l-; (2) most
likely, not all the segmental slots in a syllable were ever filled (geurtz ‘next year’ would
be the closest).28
As regards morphemes, Mitxelena assumed that the Canonical Root Structure (CRS)
of Iberian onomastic compounds and derivatives was “[2 + 2]” and “[2 + 1]” and that the
maximum size of roots was two syllables and proposed a similar structure for Old Basque.
Basque and the reconstruction of isolated languages 67
root was monosyllabic in ancient times. There is barely anything else until Uhlenbeck
(1947 [1942]) and Lafon (1950). One can scarcely say anything positive about the latter,
since he ignores internal reconstruction and proclaims that comparison with the Cauca-
sian languages – supposed relatives – is the only existing recourse for analyzing B roots.
Mitxelena, in FHV, does not use this term, nor does he derive any important or clear con-
sequences derived from it, though he notices, for example, the restriction against homor-
ganic consonants in rhotics and sibilants within the root: erur ‘snow’ > elur/edur, berar
‘grass’ > belar/bedar, and sasoi < Sp sazón, frantses < francés as sinetsi ‘to believe’
< zinhetsi (1545), but, across morpheme-boundaries, erro-aren ‘of the root’. Uhlenbeck
deserves praise for calling for an analysis of root models, although he did so in order to
buttress his theory of the polygenesis of the B language, where Biz and the other dialects
would stem from different languages. Perhaps for this reason, he did not have any follow-
ers in the study of the structure of the root.
In Lakarra (1995), we point out various restrictions on historical B roots – **VC and
**CV in autonomous monosyllabic lexemes, **TVTV in disyllables were not permitted –
and we suggest that they could be explained as originating from the root CVC; thus import-
ant lexical and morphological results were achieved:
contribute to a safer and deeper reconstruction than the atomism that underlies the slogan
“every word has its own history”.
For the study of root models, we classify (Lakarra 2008d) the words documented
historically – not reconstructions, even those clearly and universally accepted like *e-thor
‘to come’, *e-dan ‘to drink’, *e-khar ‘to bring’, etc. – in five groups: (1) loanwords,
(2) later variants, (3) compounds and derived forms, (4) forms due to onomatopoeia and
phonosymbolism, and (5) of unknown etymology. Later, productivity, phonotactic, and
geographical filters are applied to those included in (5), the only ones potentially belong-
ing to the oldest stages of the language.
In the last 15 years, disyllabic forms have been reclassified from (5) to (1) or (3) –
rarely to (2) or (4) – and progress in research is moving in this direction. Adding the
non-controversial reconstructions to the list of roots, we would obtain the result that
monosyllables of unknown etymology would increase by almost 100% – with hardly any
loanwords or derived words – in contrast to disyllabic cases. Thus, the clear difference
between CVC and the other models is increased even more, ruling out any disyllabic
forms CRS for Old Proto-Basque (OPB). In reality, given that the geographical filter is
established based on distribution in modern dialects, the results correspond at the earli-
est to OCB or to later stages, since we have been very lenient when it comes to filtering
innovations (cf. Lakarra 2008d and here Section 9).31
A formal etymology does not provide the exact origins of specific words, but its value
as an initial diagnosis seems clear: if, for example, fede belongs to the CVCV type, which
is a root type with multiple loanwords and very few inherited words (none with f-), it is
difficult for this word to be included in (3) or (5); if otso ‘wolf’ is VCV and -so ‘*big, older’
is repeated in amaso ‘grandmother’, alabaso ‘granddaughter’, and atso ‘old woman’, it is
highly likely that it derives from *hortz-so [‘fang’ + __], with loss of r in the consonant
cluster: cf. *hertz-bu(n) > esku ‘hand’ or *intzaurtzedi > intsausti ‘walnut grove’, etc.
5.4 CVC canonical root structure and the phonology of old Proto-Basque
Phonological analysis of the monosyllabic root leads to the proposal of a new consonant
system for OPB (cf. Martínez Areta 2006, Lakarra 2011b, 2017). Mitxelena (FHV) sug-
gested that strong consonants in PB could come from old groups, but only the research
related to CRS has provided sufficient proof (= etymologies) based on sonorants and
sibilants to think that this is the correct direction in which to go.
On the one hand, in a CVC root, there is no internal position for consonants, precisely
that in which the Mitxelenian system maintained the fortes/lenes opposition; namely, in
OPB the sibilants would have had four allophones and two phonemes (one dorsal and the
other apical), not four phonemes as in the later system, since there is no contrast but rather
complementary distribution between fricatives and affricates. In reality (cf. Section 4.2),
alternations like gatz ‘salt’/gazi ‘salty’ show that previously there were fricatives also in
final position, and Latin loanwords like gorputz (< corpus) are witness to the fact that, at
the start of the Common Era (i.e., after LPB), affrication applied in final position. Insofar
as we know, word-medial affricates come from consonant clusters – see otso earlier –
or from affrications in final position of the first element: atzo ‘yesterday’ < hatz ‘trace,
behind’ (cf. haz-i ‘to grow, seed’) + -o ‘COMPLETIVE’.
The argument is similar for liquids and nasals, with the exception that there are no
rhotics in word-initial position: cf. baiNo ‘but’/baina ‘except, save’ < *ba(da)(d)in ‘if it
70 Joseba A. Lakarra
were’ + -no ‘until’/+ -a to beLe ‘crow, raven’ < *bel-le (cf. bel-tz ‘black’); erro ‘root, teat’
[< *to ‘hang up’] < *e-ra-don [with nr > R] (*eradon > *edaron > *enaron > *eanron
> erro(n); cf. errun [to ‘lay eggs’], arrau(n/l)tza ‘egg’); see Lakarra 2017, in progress-b
and Begiristain in progress).
The consonant system of OPB would then be as follows: th, kh, b, d,32 g, l, n, r, s̪ , ʂ, h,
i.e., five plosives, two sibilants, tree sonorants, and h.33 Insofar as vocalism is concerned,
we find no reason to modify Mitxelena’s reconstruction (a, e, i, o, u); with respect to
diphthongs, it is possible that in OPB there were none since in LPB much fewer would be
reconstructed than those documented historically, with it being plausible to see previous
hiatuses emerging as a result of the deletion of old consonants for almost all of them
(cf. Lakarra 2010). There could be many more diphthongs arising from the loss of inter-
vocalic consonant than those Mitxelena reconstructed, such as in -do.i/lohi. See the end
of Section 6.3 on the development of consonants from LPB onward.
any evidence that the remote ancestor of B was ever spoken outside the historical territory
of the B language described in Section 2.35 Meanwhile, if haitz ‘rock, crag’ appears to be
in the word family aizto ‘knife’, aitzur ‘hoe’, etc., this does not sustain the old-fashioned
idea that Basque is a Neolithic language: as Gorrochategui (1998, 2002) explained, there
would be no more reason for this than for any other languages, like German, in which
the same sort of thing is found. What is more, it is very likely that ‘rock, crag, etc.’ were
not the old meanings of haitz, which appears to be a compound of *han ‘big’, cf. han-di
‘big’ [< ‘*to become big’] and animal names like ahari ‘ram’, akher ‘male goat’, ahuntz
‘goat’ (see details earlier), etc.
Many plant and tree names have been taken from Latin or Romance (porru ‘leek’, kip-
ula ‘onion’, piper ‘pepper’, baba ‘bean’, leka green bean, pod’, olho ‘oat’, gerezi ‘cherry’,
mertxika ‘peach’, etc.) but not other plants or trees such as haritz ‘oak’, arte ‘holm oak’,
gari ‘wheat’, garagar ‘barley’, ardantze ‘vineyard’, and animals such as behi ‘cow’,
ahuntz ‘goat’, asto ‘donkey’, zezen ‘bull’, behor ‘mare’, zaldi ‘horse’, idi ‘ox’, ardi ‘sheep,
(dialectal) flea’, ahardi ‘sow’, or txerri ‘pig’, and not just wild ones like hartz ‘bear’, otso
‘wolf’, orein ‘deer’, orkatz ‘roe deer’, etc. It is worth emphasizing that in the names of
colors – the peak of Dixon’s (1977) adjective hierarchy – we find participles (gorr-i ‘red’,
zur-i ‘white’, hor-i ‘yellow’ [cf. e-thorr-i ‘come’, har-i-tu ‘take’ (1545) with a pleonastic
participial suffix], derivatives (bel-tz ‘black’), compounds (ur-din ‘grayish’ [<‘*to become
water’]) and loanwords (berde ‘green’, marroi ‘brown’, azul ‘blue’, gris ‘gray’), i.e. ways
of substituting for adjectives in languages with closed-class types; see Dixon (1977).
The same critique of IE languages by Benveniste (1935) can be applied to Mitxele-
na’s etymologies: i.e., we find monosyllables, disyllables, and polysyllables; protoforms
with initial and final vowels and consonants, with and without consonant clusters; etc.
It is difficult to find a system in them or to believe that many of his etymologies are
contemporary.
New analyses have explored the phonology and morphology (phonetic changes,
restrictions, relative chronologies, etc.) of the words reconstructed by Mitxelena more
deeply. Thus, his reconstructed *ardano ‘wine’ and *enazur ‘bone’, trisyllabic forms –
recall that he thought that the OB roots were disyllabic – may lead even further: to *e-da-
ra-dan-o and *berna-zur, respectively. In the former, we get *dan from the verb e-dan
‘to drink’ and the same prefixal amalgam (ar-) in arbin, arran-, arrats, etc., old verbs
with applicative/directional + causative. In the latter, insufficiently considered phonetic
changes such as R – R > Ø – R (cf. ezker ‘left (hand)’/eskuin ‘right’ in *herz-bu(n)-
ger/*herz-bu(n)-on(e) (*erskuin after assimilation and simplification) or *b- > Ø – not
just in front of o/u as Mitxelena contended – would lead us to identify a first disyllabic
element taken in a loanword (berna ‘leg’ < Lat perna ‘ham’; cf. Eng bone = Ger Bein
‘leg’), as well as the inherited root zur ‘lumber’, both autonomously known.
Comparative etymology – parallels in the formation and development of B words –36
may lead us to discover the longed for “motivation” of Benveniste (cf. de Lamberterie
2000). Thus, for example, bi could not always have been ‘two’ but, rather, something like
‘above, over’ or, even better, ‘franchissement’ as Benveniste (1954) saw it in the family
of Lat pons-pontis, etc.;37 cf. zubi ‘bridge’ [< *‘lumber-over’], azpi ‘beneath’ [< *hatz-bi
‘traces/fingers-over’], ibi ‘ford’ [< *hur-bi *‘water-over’], etc. (cf. Lakarra 2015c). It is
strange that different domestic animal names – zaldi ‘horse’, idi ‘ox’, ardi ‘sheep’ (dialec-
tically ‘flea’), ahardi ‘sow’ – are formed based on di-/-di (<*din) ‘to come’, as in certain
African languages (cf. Dimmendaal 2011).38
The development of Monosyllabic Root Theory (Lakarra 1995) facilitates the dis-
covery of very old morphological processes of word formation, like reduplication and
72 Joseba A. Lakarra
prefixation, and allows one to broaden old lexical families – a crucial instrument in recon-
struction – or the establishment of other unknown ones:
*das : lats : adats : aldats : jatsi : arrats
‘stream’ ‘mane’ ‘incline’ ‘to descend’ ‘dusk’
*den : lehen : eden : ezten : eten : arren
‘before’ ‘poison’ ‘sting’ ‘to break’ ‘please’
*dur : lur : ____ : (h)andur : urri : (tx)inaurri
‘land’ ‘ ‘cruel’ ‘scarce’ ‘ant’
Naturally, the products of these processes belong to strata prior to, for example, the
transparent goibel ‘sad’ or ikertzaile ‘investigator’ (composition and suffixation); see
Section 8 on the a quo of prefixation and reduplication. New loanwords have been
discovered, sometimes such apparently pure words as alu ‘vulva’, eskatu ‘to ask for’,
zemai ‘threat’, alhatu ‘to graze’, alhaba ‘daughter’, ilhoba ‘grandchild; niece/nephew’
(both of the last two with the suffix -ba of family words), oihu ‘shout’, aupa ‘go!’,
etc. (the latter five from Gascon; cf. Lakarra 2015a), the families of other previously
known ones (such as the cited berna) have been extended, and other ones assumed by
Mitxelena have been confirmed (abagada-une ‘occasion’ < Sp vegada, dollor ‘bad,
poor’ < Rom trollo ‘bad fish,’ etc.), obtaining the exact and previously unknown origin
(cf. Lakarra 2008d).39 There is no doubt that more loanwords are yet to be discovered,
although the main line of research should not be in that direction, as Mitxelena already
intuited 50 years ago.
Irregularities for an SOV language have been pointed out (cf. Lakarra 2005, 2006a)
such as the Noun-Adjective order (which, however, Greenberg 1963 and follower typol-
ogists considered irrelevant).41 We may add that we do not have single CVC roots until
late, quite distinct from the CV(C)CV pattern in Uralic (see Bakrò-Nagy 1992), there is
no vowel harmony as in Uralic and other agglutinative families, and there are indications
of VO order, in contrast to the SOV of Uralic, Turkic, Mongol, etc. Nor does the first-
syllable accents of these language families appear to be old in B, which explains the scar-
city and late character of suffixes and postpositions (cf. Lakarra 1997a, Sarasola 1997).
In the oldest part of the case system (the “indeterminate declension”)42 a biunivocal
relationship between form and function does not appear to have existed, unlike in Dra-
vidian and other agglutinative languages, but rather, as in Tibeto-Burman (cf. Bhat 2000)
there was a kind of general locative (modern inessive marker -n) that can be found in
archaisms like barru-a-n-goak (15th c.) Be-n-goa (onomastic); and deictics like ha-n-dik
‘from there’, heme-n-dik, ‘from now on’ etc.
The final-syllable accent, i.e. on the monosyllabic root of old disyllables, implies the
existence of prefixes, too, in the noun phrase, besides those on the verb pointed out by
Trask (1977), together with a few others like the already mentioned *da-. It is not just
that, at a certain point, the previously prefixal language became suffixal, but rather that
some of these prepositions and prefixes – za-, le-, da-, de-, etc.: cf. basa-tza ‘muddy
place’, saltzai-le ‘sell-er’, etxe-r-a(t) ‘(to go) home’, elur-te ‘snowfall’, have cognates
in just as many other words where they are suffixes (cf. Lakarra 2006b), so that we must
conclude that they emigrated to the right before becoming fossilized.
Perhaps more attention than that received in Basque grammars (it does not appear in any
of them) should be given to the structure of the sociative coordination X-COMITATIVE
(-gaz/-kin) Y-case = “X-case1 and/with Y-case1”, as in the following examples:
This fits perfectly what, by other means, we reconstruct for OPB (and perhaps later):
monosyllabic words, without a case system, or verb inflection, with prepositions and
prefixes and without postpositions or suffixes; VO order, not SOV, a closed adjective
category, impersonal verb and without TAM, etc., i.e., much more close to the isolate
74 Joseba A. Lakarra
than to the agglutinative type. This is all very different from what we find in present and
historical Basque.
Since 2005 we have argued that diachronic holistic typology can give us some indica-
tion about the existence in the history of Basque – as in Munda (Donegan 1993; Donegan
and Stampe 1983, 2004) or Tani (Post 2006, 2009) – of a drift from a structure similar to
the Mon-Khmer toward one approximating more that of the modern Munda languages:
MUNDA Phrase Accent: Falling (initial); Word Order : Variable – SOV, AN, Postposi-
tional; Syntax : Case, Verb Agreement; Word Canon : Trochaic, Dactylic; Mor-
phology : Agglutinative, Suffixing, Polysynthetic; Timing : Isosyllabic, Isomoric;
Syllable Canon : (C)V(C); Consonant : Stable, Geminate Clusters; Tone/Register :
Level Tone (Korku only); Vocalism : Stable, Monophthongal, Harmonic.
MON-KHMER Phrase Accent: Rising (Final); Word Order : Rigid – SVO, NA, Prepo-
sitional; Syntax : Analytic; Word Canon : Iambic, Monosyllabic; Morphology :
Fusional, Prefixing or Isolating; Timing : Isoaccentual; Syllable Canon : (C)V- or
(C)(C)´V(C)(C); Consonantism : Shifting, Tonogenetic, Non-Geminate Clus-
ters; Tone/Register : Contour Tones/Register; Vocalism : Shifting, Diphthongal,
Reductive
(Donegan and Stampe 2004: 3, 16)
As in other instances of drift, the phonological evolution is consistent with the morphol-
ogy and basic syntax of the language (see Lakarra 2005): development of nasal vowels,
voiceless plosives in initial position, word-initial vowels and open syllables, and so on.
Finally, in the same way as Dravidian – which doubles its consonantal inventory in the
drift from the protolanguage to modern languages – we go from 11 consonants in Old
Proto-Basque to 16 in Late Proto-Basque, and to some 20–22 in the modern dialects.
in old Latin loanwords (although there are in inherited words of lVC shape [sarats ‘sauce’
< *sa-latz, [Lat salicem > B zarika]; cf. Lakarra 2015b) – phonological changes seem to
be more recent or even later than OCB. That is consistent with what we have seen about
the later character of disyllabic roots and the survival of multiple CVCs after OPB.44
7.1 Introduction
Meillet, Kurylowicz, and other leading diachronic linguists pointed out the importance
for linguistic reconstruction of the development from lexical to grammatical morphemes
and from grammatical morphemes to even more grammaticalized forms. Whether it is
to be understood as a primary process or as following from more basic phenomena, and
whether or not it is always unidirectional (cf. Campbell ed. 2001, Fischer et al. eds. 2004,
etc.), grammaticalization is takes place in languages everywhere, and its study may con-
tribute greatly to Basque historical linguistics, as it has contributed to knowledge of the
history of many other languages.
Certain grammaticalizations are known to have taken place in Basque. Among rela-
tively well-understood phenomena, we find the development of articles from demonstra-
tives (see Manterola 2015), the more recent development of the Gipuzkoan interrogative
particle al and the quite diverse and interesting evolution of the auxiliary verbs, etc.
The holism of the phenomenon (phonetic and semantic erosion) and multiple parallels
in different geographical and genetic languages (see Heine and Kuteva 2002) leads us to
acknowledge its effects on markers like those of the dative (-i < *nin ‘GIVE’); unfinished
aspect (da- < dar ‘SIT’), the plural -de < *den ‘FINISH’); the prosecutive/ablative and
adjectival suffix (-ti < *din45 ‘COME’); superiority comparison (-ago < *ha ‘demostr. of
3rd level’ + -go ‘TO PASS’); and the adverbial suffixes of mood (completive) -to and -ro
as well as the causative ra- (< *lo-), from *don ‘PUT’; the old comparative and distant
familiarity suffix (-so < *san ‘TO SAY’); the conjunction (da); the modal (-la); and the
coordinations (e-ta ‘and’, e-do ‘or’); etc. (cf. Lakarra 2013b, 2017).
It is obvious that without the help offered by analyzing grammaticalization, many of
these reconstructions would still be unknown.
languages. We may add that da- (cf. Section 7.5) can be taken to be the grammaticaliza-
tion of *dar ‘SIT’ (*e-darr-i > jarri ‘to sit’), the best-known source of this type of marker
in verb and the locative on the NP (B -a/-t: Zarautz-a ‘to Zarautz’, hibaira-t ‘to the river’).
Therefore, da-go [< *dar-*gon ‘SIT’-‘STAY’] came to be a kind of asymmetric serial
verb (see Aikhenvald 2006). We have testimonies of grammaticalization of typical serial
verbs (cf. Lakarra 2008a, 2017) in many nominal cases: dative -i, ergative -k < *ga < *gon
‘to stay’, locative -a <*dar ‘SIT’, raino ‘until’ < *r-a-(d)in-no [epenthesis-SIT-COME-
GO], etc.
The prefixes on these verbs are fossils and have not extended to new roots since the
prehistoric era (the end of LPB?) – there are none in verbs taken from Latin – and
in the case of the causative it was substituted before the first texts by the suffix
-erazo/-arazi. There are those who have seen in the destinative -ra (mendi-ra ‘to the
mountain’, egite-ra ‘to do’, etc.) the origin of the causative (e-ra-bil-i ‘to use’ < ‘*to
make walk’), but this is unlikely: the two categories do not correspond to the same
network of grammaticalization (cf. Heine and Kuteva 2002) and, what is more, we
would have a case of suffix (active) > prefix (fossil) in a language with drift toward
agglutination and SOV order. It is, moreover, unnecessary because the old auxiliary
PUT *lon (> i-ro-) is enough to explain causative -ra after -o > -a in composition and
derivation, and regular VlV > VrV in loanwords and inherited words until the Early
Middle Age.48
As we have seen (in Sections 7.1 and 7.2), da- (< *dar) had locative and (indeter-
minate) aspectual value, typical in the grammaticalization of SIT. Trask (1977) found
this morpheme in conjugated forms (dago, dakhar, etc.), but it is also present in some
non-conjugated ones, which explains the enormous abundance of -a- after yod in those
pointed out by Mitxelena (FHV) and also in certain adjective root and nouns like la-bur
‘short’, la-bar ‘edge of cliff’, etc.
The reconstruction of elements preceding the old verb root shows the value of com-
bining the notions of CRS and grammaticalization: we are able to reduce the polyform-
ism assumed by Lafon and others and we are able to get a more precise idea of the old
morphology and syntax, which functioned to the left and not to the right49 as in historical
times (see Trask 1977, Lakarra 2008c, 2017).
Basque and the reconstruction of isolated languages 77
1 There can be no disyllabic or polysyllabic PG: -tate and -tasun are loanwords (Latin
-tatem) or secondary amalgams (-tasun < -t-ar-zu-n). Likewise, -heta (archaic vari-
ant of the toponymic suffix -eta) and -zaha (the same for -za) are not PG but fusions
of other morphemes: *he + -ta (< da), *-za + -ha, etc.
2 There are no simple PGs with codas: *-gan was not the old inessive morph, but an
amalgam -ga + -n (pace Jakobsen, Trask, and de Rijk); sociative -kin < *-k-i-de +
-n, etc.
3 There are no CV lexical morphemes but rather this is the Canonical Morpheme
Structure of the PGs: lo (< *don) ‘to sleep’, su ‘fire’ (< sun-/sur-), etc. have lost -C2
in composition or for other reasons (e.g. reanalysis).
4 All CV morphemes come from CVC, i.e. they are PGs. E.g. da-/-da < *dar ‘to sit’,
-di/-ti < *din ‘to come’, etc.
5 V-, -V < CV and C-, -C < CV. The suffix and prefix in the dative -i/i- come from
*ni- (< *nin ‘to give’) and -t ‘1st per.sing.’, -k ‘2nd per.sing.’ in -da, -ga. The e- in
old verbs comes from *Ce- (*he-), cf. (h)eta, *her ‘close, closed’. The instrumental
marker -z is reconstructed as *-zV (cf. za ‘pl.’ and *zan ‘to be’, plus the pleonastic
-zaz < *-za + za) and the agreements n- ‘1st per.sing’, z- ‘2nd per.pl.’ in the pronouns
ni, zu, etc.
6 -VC < *(C)V#C(V). This is a subcase of (2) and (5); thus -ak (nom.pl., ERG.sg. and
ERG.pl50 is -(h)a + g(a). Manterola (2015) reconstructs as *ha (without coda, not
*har, as it has been to date) as the demonstrative and third-level article. The finals
on -VC that Uhlenbeck (1942) took to be suffixes (-ats, -ar, etc.) are roots that have
lost C- in composition (cf. adats ‘mane’, aldats ‘incline’, ordots ‘male’, etc.), not
suffixes or amalgams.
7 In -rV (DAT. -ri, adlative -ra, GEN. -ren, etc.), the -r is epenthetic between a stem
in -V and a case marker V-. It would apparently enter into (4), but r- is impossible in
roots and words, and it is unlikely that all the -rVs in declension come from -lV(C)
by *VlV > VrV. -V (< *CV < *CVC) is the true suffix with later epenthetic r. Note
that the stems in -V necessary for the change mentioned could only be developed
later in a language with CVC Canonical Root Structure.
8 Postpositions and other morphemes may be differentiated: morpheme < CVC
root/postposition < words (and constructions greater than CVC: buru ‘head’, begi
‘eye’, aurre ‘front’, atze ‘back’, ondoren ‘after’). They could also be based on
disyllables, and their internal structure (composition, derivation, as well as loan-
words) is much more obvious compared to other morphemes; the degree of gram-
maticalization and the antiquity of postpositions is very little (cf. Hualde & Ortiz
de Urbina 2003).
The combined study of CRS and CMS and grammaticalization may offer yet more
advances in the reconstruction of PB (see Lakarra 2013b, 2016, 2017, 2018).
began with Mitxelena), both because the necessary philological work has only been
undertaken recently (publication of important texts after 1975, DGV in 1987–2005, for
example) and because important developments in historical linguistics have had very
little impact in the field of Basque studies until quite recently.
Still, we do have important prehistorical and protohistorical phenomena (reduplica-
tion, prefixes) and changes (*d- > l-, T- > D-, -n- > -h-, *h2/3 > h1, prefix → suffix, etc.
(see supra) that we can attempt to connect in a relative chronology, thereby making some
progress in establishing periods and strata in the development of Basque (cf. Lakarra
2015b). Thus, the existence of reduplication for patterns such as *dVC, *nVC, *zVC and
*gVC (see Section 5.4) but not for *lVC is probably related to the prehistorical change
(already noted by Mitxelena 1957) *d- > l-: i.e., reduplication had ceased to be produc-
tive before that rule came into force. If sarats ‘willow’ comes from *sa-latz,51 with the
prefix sa- and root latz ‘rough, coarse, harsh’ (< *datz), then clearly that type of prefixa-
tion survived until a more recent time than reduplication and subsequent to *d- > l-, but
neither of these phenomena was in force when Basque-Latin contact began, since they are
not present in any loanword, however old it may be.
Bearing in mind that there are only aspirated voiceless plosives in initial position in
old verb roots – not plain voiceless or pure aspiration: ekharri ‘to bring’, ekhusi ‘to see’,
ethorri ‘to come’, not **ekarri, or **eharri, etc. –52 it is possible that such consonants
are archaic, maintained without suffering *Th- > h- by the addition of the prefix and not
later aspirations like those of many loanwords (khoroa ‘crown’, phike ‘tar, pitch’ < Latin
coronam, picem, etc.) or inherited words (khal-te ‘loss’ < gal-du ‘to lose’). As verbs with
the prefix *e- (> e-, i-, j- according to known contexts) can only be conjugated, we can
infer that the prefix ceased to be added to the CVC bases before Th- > h-, so that the verbs
that developed thereafter, whatever the C- and both in loanwords and inherited words,
lack any synthetic form and are conjugated periphrastically.
Given the characteristics of the corpus of the language (Lakarra 1997a, Ulibarri 2013),
absolute internal chronologies are scarce and uncertain. Thanks to the DGV, we have
many interesting lexical and some morphosyntactic clues: e.g. the first appearance of the
interrogative particle al, the antequem of the Aresti-Linschmann Law on neutral or inten-
sive possessives – widespread up to the 18th century, which then disappeared in diverse
ways among the different dialects – and the placing of oso ‘very’ (after the adjective or at
the end of the phrase up to the same century).53
In phonology, when it comes to studying loanwords, the use of the chronological
sequences elaborated by Straka (1954–57) and others in the evolution of Latin-Romance
offer a good number of absolute and relative chronologies; cf. Guiter (1989):
Research regarding chronology in the inherited lexicon has been more limited although
it may progress with works such as that of Hualde (in press). For instance, based on the
phonetic changes established in FHV and other work by Mitxelena and others, we can
try to establish that -n- > -h- must be prior to h3/2 > h1 (Lat arena > OCB *areha > harea
‘sand’, OCB *enuskara > *ehuskara > heuskara > euskara ‘Basque language’)54 and
prior to those nasal vowels reconstructable for the OCB (ardâô ‘wine’, gaztââ ‘cheese’,
80 Joseba A. Lakarra
etc.). In turn, arrâî ‘fish’ (still attested at the end of the 16th century) predates arrai(n), as
does *lukâîka ‘sausage’ (< Lat lucanica) to lukai(n)ka. And *hVh > ØV(h) is prior to h3/2
> h1 (*hur-bar-bi > *huhbah(b)i > *uhbahi > ibahi > hibai ‘river’).
The old strata and variants are easier to recognize in loanwords (see Mitxelena 1974):
gela ‘room’ < Lat cellam is older than zeru ‘sky’ (< tselu < Lat caelum) due to the fact that
the palatalization of k- is a Romance phenomenon, so that gela must have already existed
in B when caelum ‘sky’ was first palatalized and went on to become an affricate sibilant
later. Zeru has -l- > -r- in contrast to its later variant zelü, but in gela the -l- comes from
a fortis lateral (= Lat geminate) like PB *beLe ‘crow, raven’, not from a singleton Lat
-l- (gula > B gura ‘to want’) or B lenis (toponym Araba, cf. Rom Álava). In baradizu/
paradisu ‘paradise’, there is a voiced/voiceless C in line with the antiquity of the dorsal/
apical sibilant (this in recent loanwords); if we add paraiso we would have the modern -o
with p- and s- but older -u with -z- and with b-.
The words sabel ‘belly’ and zezen ‘bull’ (prefixation and reduplication; cf. Sec-
tion 5.2.2) are much less transparent in their structure than ogibide ‘job’ < *‘bread’-
‘way’55 (later composition) and Hirutasun [‘Trinity’ < hiru ‘*three’ + -tasun ‘-ity’]
(modern suffixation). Prefixation and reduplication are not just fossilized by the time of
the first texts written in B, but they do not even apply to the oldest known loanwords.
Derivative suffixes, on other hand, are scarce outside the highest literary language still in
the 18th century (cf. Lakarra 1997a, Sarasola 1997).
In verbal inflection, the older nature of prefixes with respect to suffixes is clear. Regard-
ing nominal inflection, the number and size of case markers in the noun phrase has
grown substantially, due in part to the phenomenon known as surdeclinaison (zaldi-ar-
en-tzat = horse-the-GEN-for; cf. Lafitte 1944). In the transitive auxiliaries of periphrastic
forms in irrealis moods, iron ‘can’ (from the same root as the causative prefix -ra-) appears
to have been more widespread at some time, although in modern times it has only survived
in eastern areas; *ezan (today central-eastern) is documented in the far western and south-
ern areas in the 16th and 17th centuries; egin ‘to do, make’ is of general use as a main verb,
but only in western varieties (Biz, A, G) is it used as an auxiliary. Among the transitive aux-
iliaries used with dative agreement (*nin, *edutsi, *eradun) the first appears to be the oldest
and most grammaticalized (widespread but barely attested in Biz), the second is an innova-
tion from Biz and A, and the third is documented not just in eastern areas, as in the present,
but also in G and A, even if only in plural in these latter dialects (cf. Ariztimuño 2013).56
As for the article – on which the singular and plural case systems are based – not only
was it missing in LPB but also in Aquitanian and in Pyrenean Basque (see Manterola
2015). Its grammaticalization is more recent than in Hispanic Romance languages (after
the 8th century).
diachronic linguistics, have still not put an end to opinions based mainly on the modern
situation of the language.
The dialects established by Bonaparte (1866), Mitxelena (1964), and Zuazo (1998)
are recent, and there is little difference among them, without offering the possibility of
taking us back to the speech of the old Vasconic or Vascoid tribes (at the beginning of
the Common Era or the end of the previous age) or to LPB. Mitxelena (1981) postulated
the notion that convergence of the most differentiated forms of speech beginning in PB –
which Aquitanian and the Pyrenean B did not undergo – could have taken place in centu-
ries after the weakening and fall of the Roman Empire, when the B-speaking populations
resisted Visigoths and Franks.
It is clear that all B dialects have shared many innovations after PB in phonology
(voicing of initial consonants, lenition of sonorants, metathesis of /h/, development of
nasal vowels, neutralizations and deletion of vowels in final position in the first element
of compound and derived words, etc.) and grammar (the article – initially including three
degrees of deixis –, intensive personal pronouns, the plural, most definite declension and
almost all indefinite declension, the tense system, the distribution of synthetic and peri-
phrastic forms, most of the auxiliaries, the allocutive, verb periphrasis, the development of
-zu ‘you’ from plural to singular, etc.). We must assume that if all B dialects came directly
from PB, their differences would be much greater, almost certainly having become differ-
ent languages.58 We would have to assume therefore that the historical dialects known to
us are the result of fragmentation of a common language dating from approximately the
5th to 6th centuries in the Common Era, which would have been the product of a conver-
gence process among distinct and previously more disperse B forms of speech.
Yet Mitxelena based his ideas essentially on non-linguistic arguments – i.e., that con-
ditions after the fall of the Roman Empire would be those most suited to reducing the
dependency on foreign powers and reinforcing cohesion and internal organization (fol-
lowing the historical model of Barbero and Vigil (1965), which most historians currently
reject) – in order to demonstrate the need to assume an OCB, without attempting a precise
definition of such a protolanguage, and in particular its differences (relevant innovations)
with regard to LPB, an indispensable task for its justification from a linguistic point of
view and something that has only recently received some attention.
We tentatively present here a series of phonological innovations that could have
occurred between LPB and OCB or, at the very least, prior to the fragmentation of the
latter and that perhaps may serve to differentiate both protolanguages:
On the other hand, without refuting the existence of horizontal innovations among the B
varieties – which have never ceased to be in contact – it is obvious that the flat dialectal
tree commonly used the field of B linguistics is implausible and antihistorical (cf. Aus-
tronesian or IE). Furthermore, it impedes the establishment of historical and geographical
timelines59 and hierarchies in the evolution of features and varieties.
82 Joseba A. Lakarra
Dialectal classification must be based on the oldest innovations and the bipartite
branchings that they produce. It cannot depend on the number of traits that could be used
to separate one dialect from another. Unquestionably, the dating of all the historical dia-
lects of B – Biz, NHNa, and Z, for example – cannot be the same insofar as the particular
innovations that each one of them shows are very different in their age (old in the case
of Z, more recent in the case of NHNa). Thus, for example, considering phenomena like
(a) voicing (or devoicing) of plosives after l/n, (b) palatalizations after (V)i, (c) dissimila-
tion a + a > ea, and (d) grammaticalization of egin as AUX, it would appear that:
The tree derived from these and other innovations would be closer to (2) (below) than
to (1), despite the fact that this is the partition with the greatest number of followers
(cf. Bonaparte, Lacombe, Uhlenbeck, etc.), especially among enthusiasts who do
not make excessive use of the existing philological documentation on old Biz and A
(cf. Lakarra 1996, Mounole 2015, Mounole and Lakarra 2017;61 Mitxelena rejected this
classification explicitly on many occasions (1958, 1964, 1981) and there are still no argu-
ments to change this opinion.
Following a suggestion by Mitxelena (FHV), in Lakarra (2014) we defended the idea
that voicing after l and n (alde ‘side’, handi ‘big’) is an archaism and that the innovation
is the devoicing found in R and Z. For this, in addition to Mitxelena’s observations, we
are supported by the B substrate in Gascon (cf. Rohlfs 1977) and the parallel of sibilants
in an identical context, which are realized as fricative (= lenis) and not as affricate (=
OCB OCB
fortes) unlike in the modern Western dialects. For that reason, R and Z are the inno-
vators – this is perhaps the earliest dialectal innovation (= right tree in figure 3.1) that
we know of – bearing in mind that affricates were opposed to fricatives and voiceless
plosives to voiced plosives as fortes and lenis, respectively, in the previous system; cf.
Section 4.2).62
As for the question of the OCB homeland, we believe we must locate innovations
chronologically and geographically rather than considering the quantity of existing mod-
ern dialects in this or that territory (cf. Janhunen 2009 for Uralic, for example). In partic-
ular, in the territory of Bizkaia, Gipuzkoa, and Araba, the dialectal division seems much
simpler than in Navarre, in the geographical area that runs from Pamplona toward the
north, up to the modern French border. Perhaps that would be the area where OCB devel-
oped and from which it spread. It is not advisable to interpret the most modern forms
of speech, situated at the lowest level of the tree, as decisive. We should look instead to
the root of the tree when searching for the homeland of OCB, given that the first frag-
mentations are found in the highest branches: see Janhunen’s (2009) conclusions on the
homeland of Proto-Uralic.
Returning to our case, the place of Samoyedic, Finno-Ugric, and Proto-Uralic would
be occupied, respectively, by Zuberoan-Roncalese (= Old Eastern Basque), Eastern Low
Navarrese-Salazarese (= Easternmost Navarrese), and OCB, with the oldest isogloss sit-
uated between Z-R and NoELNa-Sal – during an era that, for the moment, we cannot
specify – so that we should locate in that specific place63 the proto-homeland of OCB, as
shown in Figure 3.1 above.
10 CONCLUSIONS
The main proof for genetic relationship among languages lies in the help it offers for
the reconstruction of a common protolanguage and for studying the history of the lan-
guages in the family. The strength of the demonstration cannot be based on the quantity of
alleged superficial analogies without regular phonetic connections or the reconstruction
of homologies.
We consider it essential that specialists in Basque, distancing themselves from unjus-
tified allegiance to remote agendas, analyze the facts of B diachrony according to the
best philology and the most productive theories and methods of linguistic change and
reconstruction, as Meillet and Mitxelena asserted.64 Any advances via the expansion of
materials – languages or protolanguages related to ours, pre-Latin loanword strata – do
not appear any nearer, than they were some decades ago, so it is reasonable and neces-
sary to opt for the application of more efficient theories and methods (cf. Haas 1969),
in order to arrive at a more complete and deeper reconstruction of PB and the pre-
history of the language. We defend the notion that – besides the usual internal recon-
struction methods masterfully used by Mitxelena a half century ago – research on the
Canonical Form of roots and morphemes; Diachronic Holistic Typology (subordinate
to the search for homologies, not dedicated to pure analogies); and Grammaticalization
processes may continue to contribute important advances in reconstruction. Finally,
the elaboration of chronologies and periodizations – including the establishment of
a minimum number of necessary (intermediate) protolanguages for the reconstruc-
tion of the prehistory of the language (as is the case with Old Common Basque, Late
Proto-Basque, and Old Proto-Basque) – are unavoidable topics, as we find in any other
language or family.
84 Joseba A. Lakarra
NOTES
1 Work connected to the projects “Monumenta Linguae Vasconum (IV): Textos Arcai-
cos Vascos y Euskera Antiguo” (FFI2012–37696) and “Monumenta Linguae Vasco-
num (V)” [FFI2016-76032-P], Consolidated Research Group “Historia de la Lengua
Vasca y Lingüística Histórico-Comparada” by the Basque Government (GIC.IT698–
13) and UFI (Training and Research Unit) UFI 11/14 at the UPV/EHU.
Acknowledgements: I thank Borja Ariztimuño, Joaquín Gorrochategui, José
Ignacio Hualde, Julen Manterola, and Blanca Urgell for numerous and interesting
observations and corrections of form and content, although I have not necessarily
accepted all of their suggestions; all remaining errors are my own. The map and the
dialectal genealogies were produced by Adur Larrea, with the collaboration of Céline
Mounole, with important observations from Gidor Bilbao and Ricardo Gomez. I have
found Lyle Campbell as rigorous and generous an editor as anyone could want.
Abbreviations: C = consonant, R = sonorant, S = sibilant, T = plosive, V = vowel,
p = person, sing = singular, pl = plural, ERG = ergative, DAT = dative, ABS = absolutive,
GEN = genitive, part = participle, TAM = Tense-Aspect-Mode, PREP = preposition(al),
PG = primary grammaticalization, SG = secondary grammaticalization. Languages: B =
Basque (language), OPB = Old Proto-Basque, LPB = Late Proto-Basque; OCB = Old
Common B; IE = Indo-European, Rom = Romance; Lat = Latin, Sp = Spanish, Eng =
English, Ger = German; Basque dialects: A = Alavese, Aez = Aezkoan, Biz = Bizkaian,
EA = Eastern Alavese, ELNa = Eastern Low Navarrese, G = Gipuzkoan, FEB = Far
Eastern B, FWB = Far Western B, L = Lapurdian, LNa = Low Navarrese, NaB = Navar-
rese B, NeEB = Near Eastern B, NeWB = Near Western B, NoELNa = North-Eastern
Low Navarrese, NoUNa = North-Upper Navarrese, NoWLNa = North-Western Low
Navarrese, OCeEB = Old Central-Eastern B, ONaORB = Old Navarrese-Oriental B,
OCeB = Old Central B, OCeWB = Old Central-Western B, OEB = Old Eastern B,
OWB = Old Western B, R = Roncalese, Sal = Salazarese, SUNa = South Upper Navar-
rese, UNa = Upper Navarrese, WA = Western Alavese, WLNa = Western Low Navar-
rese, Z = Zuberoan. Others: DGV = Diccionario general vasco (Mitxelena & Sarasola
1987–2005), FHV = Fonética Histórica Vasca (Mitxelena 1961).
Symbols: * = reconstructed form; ** = Undocumented and impossible.
2 Following Ringe (2003), we understand that the comparison between dialects is
not internal but rather comparative reconstruction. In our case, its result would be
OCB (see Section 9), but it is far from being approached systematically, most likely
because internal reconstruction is much more compelling on almost all fronts.
3 It is impossible to demonstrate that languages are not related, that there is no genetic
relationships, and therefore, it is the “believers” – as Mitxelena used to say – who are
obliged to offer proofs (standard ones, and not just any old thing).
4 All of them defended the polygenesis of all languages, like Boas and Trubetzkoy as
well (Lakarra 2008d).
5 That is, not non-professionals in linguistics but people who are unfamiliar with the
methods and aims of historical linguistics and philology and those who have no expe-
rience as regards the real history of any language or family, whether they are linguists
by profession or not: this is not uncommon.
6 For Vovin (1994), Japanese had the dubious honor of having been the language on
whose origins the most ridiculous things had been said; we do not know if he had
thought about B when he argued this. In Campbell’s (2013) extensive list of non-
proven genetic relationship hypotheses, the extremely high proportion of combina-
tions in which B appears is striking.
Basque and the reconstruction of isolated languages 85
7 For the current sociolinguistic situation see Barreña et al. (2013); Hualde and Ortiz
de Urbina (2003) is the most complete grammar in English on contemporary Basque;
Mitxelena-Sarasola (1987–2015) the obligatory lexicographical, historical, and
dialectal source; and Hualde, Lakarra, and Trask (1995), Trask (1997), and Martínez
Areta (2013) are the most up-to-date monographs on the history of the language (par-
ticularly prehistory and internal developments). Gorrochategui, Igartua and Lakarra
(eds. 2017), besides being an examination of prehistory and protohistory, is the most
complete available treatment of strictly speaking historical eras.
8 In the school system, the “D” teaching model (in Basque with Spanish as a subject) is
the most common choice now whereas the “A” model (instruction in Spanish with B as a
subject) is fairly marginal. The University of the Basque Country awards degrees in both
B and Spanish, with B being used at the university level for the first time in 1978; there
are both television and radio stations (in the BAC) wholly in B as well as local television
and radio stations in the language, which moreover is used increasingly on the Internet.
9 Studied in the last third of the 19th century by Luchaire and later by Mitxelena (1954b)
and Gorrochategui (1984, etc.). In the southern part, one should add an important
inscription found in Lerga (Navarre) and a few others discovered in the historical
Vascons’ territory, as well in La Rioja and Soria (cf. Gorrochategui 2011a).
10 Rico (1982) points out the curious syntactic order of many Romance glosses, more
befitting B than Romance.
11 The two best-known dialectal classifications are that of Bonaparte (B, G, SHNa, NHNa,
L, WLNa, EaLNa, and Z dialects) and that of Zuazo (1998); see Martínez Areta (2013).
These two classifications refer to situations around 1860 and 1990 and were not done from
a diachronic point of view. The observations of Mitxelena (1958, 1961–1977, and 1964)
are interesting insofar as he distances himself from Bonaparte by differentiating (as Azkue
had done previously) Z and R; he also separates, for phonetic reasons, Aez and Sal from
LNa and adds the southern dialect (documented in Landucci 1562 and Lazarraga [~1600]).
See Section 9 and Lakarra (2014), Mounole (2015), Mounole and Lakarra (2017).
12 See Blust (2014) on the Proto-Ongan-Austronesian hypothesis of Blevins (2007); it
is difficult not to see similarities among diverse errors revealed there and those com-
mitted in her B-IE hypothesis (Blevins 2013).
13 Nor does the supposed existence of two datives -o/-a in 3rd p., which would come
from those old articles, have any real foundations – (pace Rijk 1981); actually, o
a/__C; cf. deutso ‘3p.ERG-3p.DAT’ : deutsala ‘3p.ERG-3p.DAT + -la’, jako ‘3p.
ABS-3p.DAT’ : jakan ‘3p.ABS-3p.DAT + -la’, etc. (Mitxelena 1954a).
14 Attempts have been made to justify other details but not the h-, despite the fact that
this is etymologically present (cf. Lakarra 2009b, 2015a, etc.). Moreover, an internal
explanation exists (*her ‘to close’ + -i ‘part.’, cf. Lakarra 2010, 2013c), and ILTIR
could mean ‘river’, not ‘city’ (cf. De Hoz 2010–2011).
15 The influence of a lack of clear and persistent fragmentation in the process of the
dialectalization of Basque remains to be studied (cf. IE and Austronesian languages,
for example); see Section 9.
16 For example, the -n- > -h- change that Mitxelena understood as simply prior to the
first medieval onomastical testimonies (9th–10th centuries) can be taken back at least
six centuries (cf. Lakarra 2014) from the dating by Chambon and Greub (2002) of
Proto-Gascon in the 5th century, which shares this and other features with its Aquita-
nian substrate, as has been acknowledged since Luchaire (cf. Gorrochategui 1984).
17 Martinet also addressed sibilants, although his argument, which was complex and
yielded few results, has not had any followers. On his ideas about old accentuation,
see Section 4.2.5.
86 Joseba A. Lakarra
18 Although Martinet and his successors attributed the disappearance of aspirated fortes
to external influence, this is unnecessary and unlikely, hypercharacterization being
sufficient to explain it (*th-, *kh- > h : *b°, *d°, *g° b-, d-, g-) after centuries of
the weak phoneme systematically adapting loanwords. The rare exceptions (cauea
> habia ‘nest’, *kar > harri ‘stone’) are very problematic; nor are the g- and k- in
deictics in Navarrese speech forms (gau, gori, gura and kau, kori, kura ‘this’, ‘that’
(close), and ‘that’ (far) versus common hau, hori, hura) obviously archaic but, rather,
much later innovations (cf. Lakarra 2014, 2017); See Egurtzegi (2018) for more argu-
ments on behalf of [aspirated = fortes] vs [non-aspirated = lenes] plosives and Lakarra
(2017) for some new cases for *Th- > h-.
19 This argument should now be revised: Hualde (1997a) proposed f < *wh (see some
of the etymologies in Lakarra 2009a) and, therefore, the chronology of these forms
would be later, even subsequent to the change au > ai in Z and R: cf. Z aihairi ‘din-
ner’, gaiherdi ‘midnight’ versus general afari, gauerdi.
20 *Ba(d)in + -a / *ba(d)in + -no > bai(n)a ‘but, yet’ / baino ‘only, but’ and *arran + -i /
*arran + -no > arrain ‘fish’ / arrano ‘eagle’. -no < *non ‘to move’ as in comparisons
in other languages (cf. Heine and Kuteva 2002).
21 Mitxelena mentioned it occasionally, but little importance has been given to another
change (CrV- > CVr): cf. Rom. trollo (a little regarded fish, see Corominas-Pascual,
s.u.) > *torllo > *dorllo > B dollor ‘bad, poor’.
22 The same thing occurs in older testimonies with occlusives. Such a tendency is much
less complete in final position than in initial position, and in sibilants and vibrants
than in laterals and nasals. See FHV and now Begiristain (2015) and Lakarra (2017).
23 See the new southern (peninsular) data on /h/ in demonstratives with a later chronol-
ogy in Manterola (2015).
24 Not all intervocalic h’s are, however, etymologically prior. Lur ‘ground, soil, earth’
has a late and very minor variant luur in Biz (cf. zoor ‘debt’ in this same dialect). In
the DGV, luhur appears as attested in the modern LNa dialect of Baigorri, but not in
any other modern or old dialect with /h/. It is thus extremely unlikely that these are
old forms (pace Blevins 2013).
25 Only this latter one is h < *h. As Janhunen (2007) demonstrates, it is typical for /h/
to originate in different sources (“secondary laryngeals”), both in Uralic and in other
languages. Lakarra (2015a) adds three other sources of /h/: (5) *-r > -h, (6) /h/ in Gas-
conisms, and (7) hVR- > VRh- in Gasconisms and inherited words. There are, also,
some -b-, -d-, -g- > -h- in Contemporary Low Navarrase (cf. Camino 2014) and -r- >
-h- in Modern and Contemporary Zuberoan.
26 For the OPB and LPB accent, it should be remembered that the monosyllabic root
(accentuated at the outset) was not initial but final with the important typological
consequences that this implied (prefixes, not suffixes, etc.); cf. Section 6.2 ff.
27 In Hualde (1997b), one finds a synthesis of many of Hualde’s works on accentology
of the modern dialects (particularly the western ones) that revolutionized studies on
the subject. Hualde (2012) himself later continued with studies on accents in varieties
such Goizueta (Navarre). Other recent essays on the history of the accent are Martínez
Areta (2009), Elordieta (2011a-b), and Egurtzegi and Elordieta (2013).
28 Artiagoitia’s (1990) model is much more restrictive (CVC plus a extrametrical -C).
Forms with CC-, -CC, and -VV- are common in phonosymbolisms (brast ‘abrupt
start’, dzaust ‘dive’, etc.). Given that the canonical form of these usually (cf. IE, etc.)
approaches the mirror image of lexemes, we have here additional proof of the CVC
structure of the oldest B lexicon.
29 The expansion of derivation is a later phenomenon and belongs in great measure to
literary B; see Lakarra (1997a), Sarasola (1997), etc.
Basque and the reconstruction of isolated languages 87
30 Of great interest – although it corresponds to a much later stage – is the type of com-
plex reduplication ikusi-makusi examined by Igartua (2013), whose distant origins
could be in Turkish and surrounding languages and which would have been transmit-
ted via Arabic and Romance.
31 See the results there corresponding to fossils and loanwords in the most common
patterns.
32 Mitxelena pointed out that *d- > l, prior to LPB, may explain the lack of d- in inher-
ited terms; now (cf. Lakarra 2006b) we reconstruct multiple *dVC roots: apart from
the already known, *dun ‘must, to have to’ and *din ‘to become’, *don ‘to put, to
hang’, *dar ‘to sit, to get’, *den ‘to finish’, *dats ‘to go down’, etc.
33 As Meillet and Mitxelena demonstrated on multiple occasions, the number of recon-
structed phonemes is the minimum necessary to account for the morphemes in the
protolanguage and its historical cognates. There could, of course, have been other
phonemes that have not left sufficient traces or evidence of their existence. Of more
interest are Igartua’s works on aspiration (/h/), beginning with Igartua (2002), in
which he brilliantly related the change of aspiration and the root, or that of Igartua
(2008, 2015) on rhinoglottophilia, completing the known etymological character of
/h/ – derived from intervocalic /n/ – with numerous typological parallels.
34 The work of Morvan (2009) is a parody of proper etymological method: he is unaware
of testimonies and previous philological efforts, as well as the potentialities of internal
reconstruction, which he attempts to replace by a recourse to supposed Siberian, Dravid-
ian, Uralo-Altaic, and Amerindian genetic relationships, on his whim (see Lakarra 2017).
35 Probably in a much-reduced area; see Janhunen (1982 and 2009) on Proto-Uralic.
36 To which one should add everything related to grammaticalization; see Section 7.
37 In Lakarra (2010), we arrive at similar conclusions in another way: analyzing the
formation of numerals in B, in the same way as in many other languages, we observe
that counting began on the index finger, ignoring the thumb and, therefore, the middle
finger, that which is “uppermost,” was the second one; cf. Epps (2006) for Amazonian
languages, de Lamberterie (2000) for IE languages, etc.
38 There are insufficiently researched or unknown others like hor-tz ‘fang’ (cf. Sp can-
ino), ipurdi ‘bottom’ < *ibi-erdi ‘central ford’, laur ‘four’ < labur ‘short’, zur ‘wood’
and its derivative zuhur ‘wise’ or aretx ‘oak/tree’ in Old Biz, with parallels in IE
(cf. de Lamberterie 2000).
39 Blanco (2014) marks the beginning of lexicological analysis of the archaic corpus
(up to 1600); this analysis of the lexicon in the old and classic eras (1600–1745) is
extended by Blanco in his thesis (in progress).
40 See Dixon (2002) for the Australian languages and Austerlitz (1976) for the definition
of “agglutinative” in the Uralic, Turkic, Mongolic, and other languages of Eurasia:
these languages share common features such as suffixation, SOV order, vowel har-
mony, etc., as well as disyllabism of roots
41 Yet, as in English (AdjN), a more harmonic previous order (SOV in Old English,
SVO/VSO in PB: Trask 1997 / Gómez 1994). Add to this the lack of an adjective
open class, as in Tibeto-Burman (as opposed to the Dravidian agglutinative, see Bhat
2000). As pointed out in Section 5.5, there are reasons to think that the adjective was
not an open class in Tibeto-Burman, but, on the contrary, in modern Tani – and in
historical B –, adjectives do belong to an open class (cf. Post 2006).
42 The two determinate declensions are based on the grammaticalization of the article
(after the 8th century for the singular, later for the plural); cf. Mitxelena (1971) and
Manterola (2015).
43 Stassen, by the way, contends that in B there are no “WITH-language” structures
and is therefore badly informed by his B sources: as a fase sparita it appears in Old
88 Joseba A. Lakarra
Biz, A, G, and, at least, in Na and L oral ballads (see Lakarra 2008a). See Lakarra (in
progress-b) for more consequences of the reconstruction of the COMITATIVE for PB
morphology and syntax (sociative, modal adverbs, ‘abstracts objects’, dative flags,
etc.); for parallels, see Lord (1973, 1993: West African languages) and Chapell et al.
(2011: Chinese).
44 Note that the existence of disyllables and polysyllables in Aquitanian inscriptions
does not demonstrate widespread root disyllabism; only later – after disyllabic inputs
(not outputs) became the majority in word formation – could we speak of disyllabism
as widespread CRS. Cf. Feng (1997) and Duanmu (1999) for monosyllabic to disyl-
labic change in the history of Chinese. See Lakarra (2018) for a preliminar analysis of
some –n / -r / -l / -h / -ø alternations (and derived etymologies) in archaic CVC-word
formation.
45 The oldest allomorph (present as fase sparita in adjectives like hordi ‘drunk’, geldi
‘still’, handi ‘big’, etc.) also concurs with the remote future/potential -di (daidi, leidi,
etc.) as in other languages; cf. Heine and Kuteva (2002 s.u. COME) for parallels of
all these grammaticalizations.
46 Combinations of applicative/directional prefixes + causative are polymorphous (ar-,
jar-, inar-, ihar- or ur-, as well as eroan/eraman ‘to take’) as in Bantu but with the
difference that in that family both suffixes and their combinations are not very old
fossils as in B but instead are completely functional (cf. Good 2005, etc.).
47 Considered of unknown origin; it could come from a preposition similar to the English
to and similar forms in other languages. Of identical origin seem to be the V- in the
negation *eze, the supposed epenthetic -e- in local cases of consonant declension or
the conjunction e-ta, mentioned above; see Lakarra (in progress-b).
48 There are more than enough reasons for the destinative -ra to be an epenthetic -r- + -a
‘case marker’, in the same way as -ri (< *-r-i) in DAT or -ren in GEN (<*-r-e-n), i.e.,
such allomorphs are later reanalyses; see Section 7.5.
49 That is, it had prepositions and prefixes (both in verbs and in nouns, but not postposi-
tions and suffixes) as in the historical period.
50 The split ergativity in plural of demonstratives shows (cf. Manterola 2015) that the
distinction erg.pl./nom.pl. is secondary (Late Medieval), given that the article – the
base of both the singular and the plural in declension – is also a later development
among them (almost certainly after the 9th century).
51 Although this looks similar to Latin salix ‘willow’, it is not a loanword in Basque:
1) the sibilant in old Latin loans is z- (not the -tz of this case); 2) Latin-Romance
words borrowed are nearly alwasy borrowed in the accusative form, which would be
salice(m) in this case; 3) -ice or -icV does not palatalize (nor change to a sibilant nor
change the sound of the voiceless stop in words that pass into Basque); and 4) there
already exists a completely regular loan in Basque from that Latin or Indo-European
form, zarika ‘willow’. There are other arguments as well, but in short, Latin cannot
be the source of this Basque word.
52 Although e-ho(n) ‘to grind, crush’ does exist, though from *e-non, with *-n- > -h-,
subsequent to *Th- > h-; this verb does not possess any synthetic forms in contrast to
what happens with ekharri ‘to carry’, ekhusi ‘to see’, ethorri ‘to come’.
53 These last two chronological results clash directly with the “testimony” of the
Iruña-Veleia inscriptions (dating from the 3rd–4th centuries according to their sup-
posed “discoverers”. They are not, of course, the only (linguistic, epigraphic, or any
other type) oddities present in what constitutes one of the greatest modern European
hoaxes; see Gorrochategui (2011a and 2011b). Gorrochategui (2002) offers some
interesting approaches to dating Basque.
Basque and the reconstruction of isolated languages 89
54 The name of the language – from *enausi ‘to speak’ (Irigoyen 1977), perhaps better
*enotsi-hara – has nothing to do with the old ethnonym Auscii as has been suggested.
The nasal vowels in [êûskera] (< enusquera >, Esteban de Garibay, 16th century)
confirm the etymological character of the h- in heuskara ‘B language’: cf. Z harea, R
âria ‘sand’, ainzto ‘knife’, etc.
55 The connection between ‘job’ and ‘bread’ has to do with ‘the way to get bread’
extending in meaning to ‘to earn money’.
56 As is argued in Lakarra (2014), converting supposed “elections” like B egin ‘make’ :
Central *ezan ‘can’ : Eastern *iron ‘can’ or B eutsi ‘to hold up’: Central *nin ‘give’:
Eastern *eradun ‘to have for’, etc. into a series of innovations contributes to dating
and to giving history to features that have usually been addressed as belonging to the
timeless essences of the dialects. Thus in the first case, Biz shares the first three pro-
cesses with the language as a whole, in the fourth with most of the forms of speech
(all of them except Eastern ones), in the fifth with the Western ones as a whole, and
in the sixth with some parts of A and G: it does not appear that, diachronically, such
a choice confers it with a distinct personality, neither within the Western forms of
speech nor within all of them as a whole.
57 See a fuller treatment of the methodological questions explored here in Lakarra
(2014); we are far from having achieved the treatment that the numerous aspects and
implications of the issue demand.
58 We can suposse a family of two elements (PB and Aquitanian; cf. Campbell 2011)
through a series of similar considerations, including parallels in the languages like the
Germanic languages (see now Stiles 2013); however, in reconstructive practice such
an option has not been especially important; it seems preferable to assume that Aqui-
tanian is the brother of OCB and Pyrenean B and not LPB, the source of all of them.
59 The lack of attention to the chronology of innovations is noticeable in the case of G,
whose existence prior to Larramendi (18th c.) is debatable; nevertheless, it is typical
to find it counterposed to Biz, as if their modern differences were ab initio and not
much greater in recent centuries than in earlier ones.
60 There has been no research on dating the origin of G as a distinct dialect, but clearly
it is one of the most recent ones, based on the defining features of Zuazo (1998: 217):
1 Instability of the organic -a; 5 Change f > p;
2 Root -e- in pres. of *edun ‘(to) have’; 6 nor ‘who’/zein ‘which’ > zein ‘who,
which’;
3 Root -e- in present of izan ‘to be’; 7 Conjugated forms as nijoa ‘I’m going’
from joan ‘to go’;
4 Change d > r [intervocalic]; 8 Interrogative particle al.
G shares with A (2), (3), (4), and (5), and (8) does not appear until 1785; (1) also
appears to have spread in the 18th–19th centuries, the point of becoming a differenti-
ating and marking characteristic with respect to the other Southern dialects. However,
the innovations shared by G and A and with B are much older.
61 The bipartite classifications of Bonaparte and Lacombe (synchronic) and Uhlenbeck
(linked to supposed polygenesis) – both similar to our left tree – have little to do with
the conduct of diachronic dialectology and linguistics; see Mitxelena (1964), Lakarra
(1996 and 2014), among others.
62 Lately, Camino (2011, 2014, etc.) has also maintained that the first split involves
eastern forms of speech and has offered some indications of proof in this regard.
63 This hypothetical model of fragmentation of OCB could be compatible with ongoing
historical work (cf. Pozo 2016) that contends that, in the 5th century, an important
90 Joseba A. Lakarra
political entity emerged between Pamplona and the Pyrenees, directly related to the
later Kingdom of Pamplona.
64 Although for reasons of space I have only referred tangentially here to the philolog-
ical part, the importance of its development for advances in B diachrony is essen-
tial. Besides the monumental Mitxelena and Sarasola (1987–2005) and Lakarra,
Manterola and Segurola (2017), see among others Mitxelena (1958), Gorrochategui
(1984), Lakarra (1997a), Mounole and Lakarra (2017), Ulibarri (2013), and Urgell
(2013).
REFERENCES
Agud, Manuel and Antonio Tovar. 1988–1995. Materiales para un diccionario eti-
mológico de la lengua vasca (A-Orloi). Donostia-San Sebastián: Supplements of the
Anuario del Seminario de Filología Vasca Julio de Urquijo – International Journal of
Basque Linguistics and Philology, 7 vols.
Aikhenvald, Alexandra Y. 2006. Serial Verb Constructions in Typological Perspective.
Serial Verb Constructions. A Cross-Linguistic Typology, ed. by Alexandra Y. Aikhen-
vald and Robert M. W. Dixon, 1–68. Oxford: Oxford University Press.
Arbelaiz, Juán José 1978. Las etimologías vascas en la obra de Luis Michelena. Tolosa:
Kardaberatz.
Ariztimuño, Borja. 2013. Finite verbal morphology. In Martínez Areta, 359–427.
Artiagoitia, Xabier. 1990. Sobre la estructura de la sílaba en (proto)vasco y algunos
fenómenos conexos. Anuario del Seminario de Filología Vasca Julio de Urquijo -
International Journal of Basque Linguistics and Philology 24, no. 2: 327–349.
Austerlitz, Robert. 1976 [1970]. L’aglutination dans les langues de l’Eurasie septentrio-
nale. Études Finno-ougriennes 13: 7–12.
Azkue, Resurrección María de. 1923–1925. Morfología vasca. Reed. Bilbao: La Gran
Enciclopedia Vasca.
Bakrò-Nagy, Marianne Sz., 1992, Proto-Phonotactics. Phonotactic investigation of the
Proto-Uralic and Proto-Finno-Ugric consonant system. Wiesbaden: Studia Uralica 5,
Harrassowitz Verlag.
Barbero, Abilio and Marcelo Vigil. 1965. Sobre los orígenes sociales de la Reconquista:
cántabros y vascones desde fines del Imperio Romano hasta la invasión musulmana.
Sobre los orígenes sociales de la Reconquista, ed. by Abilio Barbero and Marcelo
Vigil, 13–103. Ariel: Barcelona 1974.
Barreña, Andoni, Ane Ortega, and Estibalitz Amorrortu. 2013. The Basque Language
Today: Achievements and Challenges. In Martínez Areta, 11–29.
Begiristain, Alazne. 2015. Gogoetak Mitxelenaren erronbo sistemaz eta honen historiaur-
reaz. Ikuspegi berri baterantz. BA Thesis, UPV/EHU.
Begiristain, Alazne. In progress. FHV-ko kontsonante-taldeekiko atalaz. MA Thesis,
UPV/EHU.
Benveniste, Émile. 1935. Origines de la formation des noms en indo-européen. Paris:
Maisonneuve.
Benveniste, Émile. 1954. Problémes sémantiques de la réconstruction. Reed. Problèmes
de linguistique générale, ed. by Émile Benveniste, 289–307. Paris: Gallimard, 1966.
Bhat, D. N. Shankara. 2000. Dravidian and Tibeto-Burman: A Typological Comparison.
International Journal of Dravidian Linguistics 29: 9–40.
Blanco, Endika. 2014. Euskara Arkaikoaren Lexikoiaz. Master Thesis, UPV/EHU.
Basque and the reconstruction of isolated languages 91
Donegan, Patricia and David Stampe. 2004. Rhythm and the Synthetic Drift of Munda.
The Yearbook of South Asian Languages and Linguistics 2004, ed. by Rajendra Singh,
3–36. Berlin and New York: de Gruyter.
Duanmu, Sam. 1999. Stress and the Development of Disyllabic Words in Chinese. Dia-
chronica 16: 1–36.
Echenique, Maria Teresa. 1983. Historia lingüística vasco-románica (2nd ed.). Madrid:
Paraninfo, 1987.
Egurtzegi, Ander. 2014. Towards a Phonetically Grounded Diachronic Phonology of
Basque. PhD, UPV/EHU.
Egurtzegi, Ander. 2018. Herskarien ustezko ahoskabetasun asimilazioa eta euskal her-
skari zaharren gauzatzea. Forthcoming in a festschrift edited by Joseba Lakarra &
Blanca Urgell.
Egurtzegi, Ander and Gorka Elordieta. 2013. Euskal azentueren historiaz. In Gómez et al.,
163–186.
Elmendorf, William. 1997. A Preliminary Analysis of Yukian Root Structure. Anthropo-
logical Linguistics 39: 74–91.
Elordieta, Gorka. 2011a. Euskal azentuaren bilakaera: hipotesiak eta proposamenak. In
Sagarna et al., 989–1014.
Elordieta, Gorka. 2011b. Euskal azentu eta intonazioari buruzko ikerketa: status quaes-
tionis. In Lakarra et al., 389–428.
Epelde, Irantzu. (ed.). 2014. Euskal dialektologia: lehena eta oraina. Bilbo: Supplements
of the Anuario del Seminario de Filología Vasca Julio de Urquijo – International Jour-
nal of Basque Linguistics and Philology, no. 69.
Etxepare, Bernat. 1545. Linguae Vasconum Primitiae. Edition by Patxi Altuna, Bilbao:
Euskaltzaindia.
Epps, Patience. 2006. Growing a Numeral System: The Historical Development of
Numerals in a Amazonian Language Family. Diachronica 23: 259–288.
Feng, Shengli. 1997. Prosodic Structure and Compound Words in Classical Chinese. New
Approaches to Chinese Word Formation: Morphology, Phonology and the Lexicon in
Modern and Ancient Chinese, ed. by Jerome L. Packard, 197–260. Berlin: Mouton de
Gruyter.
Fischer, Olga, Muriel Norde, and Harry Perridon (eds.). 2004. Up and Down the
Cline – The Nature of Grammaticalization. Amsterdam and Philadelphia: John
Benjamins.
Forni, Gianfranco. 2013. Evidence for Basque as a IE Language. Journal of Indo-Euro-
pean Studies 41, nos. 1–2: 39–180.
García Uriz, Eneko. 2016. Hitz hasierako herskari ahoskabeak euskararen historian. BA
Thesis, UPV/EHU.
Gavel, Henri. 1920. Éléments de phonétique basque. Paris: Revue Internationale des
Études Basques 12.
Gómez, Ricardo. 1994. Euskal aditz morfologia eta hitzordena: VSO-tik SOV-ra. La
langue basque parmi les autres, ed. by Jean-Baptiste Orpustan, 93–114. Baigorri:
Izpegi.
Gómez, Ricardo and Koldo Sainz. 1995. On the Origin of the Finite Forms of the Basque
Verb. In Hualde, Lakarra and Trask, 235–274.
Gómez, Ricardo, Joaquín Gorrochategui, Joseba Andoni Lakarra, and Céline Mounole
(eds.). 2013. Koldo Mitxelena Katedraren III. Biltzarra (Gasteiz, 8–11/X/2012).
Vitoria-Gasteiz: UPV/EHU.
Good, Jeffry. 2005. Reconstructing Morpheme Order in Bantu: The Case of Causativiza-
tion and Applicativization. Diachronica 22: 3–57.
Basque and the reconstruction of isolated languages 93
Hualde, José Ignacio. 2008. Acentuación y cronología relativa en la lengua vasca. Oihe-
nart 23: 199–217.
Hualde, José Ignacio. 2012. Two Basque Accentual Systems and the Notion of Pitch-
Accent Language. Lingua 122: 1335–1351.
Hualde, José Ignacio. (forthcoming). Dialektologia dinamikoa. In a volume of Lapurdum
edited by Irantzu Epelde.
Hualde, José Ignacio, Joseba Andoni Lakarra, and Larry Trask (eds.). 1995. Towards a
History of Basque Language. Amsterdam and Philadelphia: John Benjamins.
Hualde, José Ignacio and Jon Ortiz de Urbina (eds.). 2003. A Grammar of Basque. Berlin:
Mouton de Gruyter.
Igartua, Iván. 2002. Euskararen hasperena ikuspegi tipologiko eta diakronikotik. Erramu
Boneta: Festschrift for Rudolf P.G. de Rijk, Supplements of the Anuario del Seminario
de Filología Vasca Julio de Urquijo – International Journal of Basque Linguistics
and Philology no. 44, ed. by Xabier Artiagoitia, Patxi Goenaga, and Joseba Andoni
Lakarra, 366–389. Bilbao: UPV/EHU.
Igartua, Iván. 2008. La aspiración de origen nasal en la evolución fonológica del eus-
kera: un caso de rhinoglottophilia. Anuario del Seminario de Filología Vasca Julio
de Urquijo - International Journal of Basque Linguistics and Philology 42: 171–189.
Igartua, Iván. 2013. La reduplicación compleja en euskera: notas acerca de su formación y
sus paralelos en otras lenguas. Fontes Linguae Vasconum. Studia et Documenta 45: 5–30.
Igartua, Iván. 2015. Diachronic Effects of Rhinoglottophilia, Symmetries in Sound
Change, and the Curious Case of Basque. Studies in Language 39, no. 3: 635–663.
Irigoyen, Alfonso. 1977. Geure hizkuntzari euskaldunok deritzagun izenaz. Euskera 22,
no. 2: 513–538.
Janhunen, Juha. 1982. On the structure of Proto-Uralic. Finnisch-ugrische Forschungen
44: 23–42.
Janhunen, Juha. 1997. Problems of Primary Root Structure in Pre-Proto-Japonic. Inter-
national Journal of Central Asian Studies 2, (ed. in chief Choi Hab-Woo). The Interna-
tional Association of Central Asian Studies Institute of Asian Culture and Development.
Janhunen, Juha. 2007. The Primary Laryngal in Uralic and Beyond. Sámit, sánit, sát-
nehámit. Riepmočála Pekka Sammallahtii miessemánu 21. beaivve 2007, 253, 203–
227. Helsinki: Suomalais-Ugrilaisen Seuran Toimituksia = Mémoires de la Société
Finno-Ougrienne.
Janhunen, Juha. 2009. Proto-Uralic: What, Where, and When? The Quasquicentennial
of the Finno-Ugrian Society, ed. by Jussi Ylikoski, 57–78. Helsinki: Mémoires de la
Société Finno-Ougrienne 258.
Krajewska, Dorota. In progress. *Historical syntax of relatives in Basque. PhD, UPV/EHU.
Lafitte, Pierre. 1944. Grammaire basque (navarro-labourdin littéraire). Facsimil, Donostia-
San Sebastián: Elkar 1979.
Lafon, Réné. 1943. Le système du verbe basque au XVIème siècle. Donostia-San
Sebastián: Elkar 1980.
Lafon, Réné. 1948. Sur les suffixes casuels -ti/-tik en basque. In Lafon 1999, 199–207.
Lafon, Réné. 1999. Vasconiana. Bilbao: Iker 11, Euskaltzaindia.
Lakarra, Joseba Andoni. 1995. Reconstructing the Root in Pre-Proto-Basque. In Hualde
et al., 189–206.
Lakarra, Joseba Andoni. 1996. Refranes y Sentencias: ikerketak eta edizioa. Bilbao:
Euskaltzaindia.
Lakarra, Joseba Andoni. 1997a. Euskararen historia eta filologia: arazo zahar, bide berri.
Anuario del Seminario de Filología Vasca Julio de Urquijo – International Journal of
Basque Linguistics and Philology 31: 447–535.
Basque and the reconstruction of isolated languages 95
Lakarra, Joseba Andoni. 2015a. Hiru hasperen haboro. Eridenen du zerzaz kontenta.
Sailkideen omenaldia Henrike Knörr irakasleari, ed. by Ricardo Gómez and Maria
Jose Ezeizabarrena, 349–378. Bilbao: UPV/EHU.
Lakarra, Joseba Andoni. 2015b. Saratsola eta (aitzin)eusk(ar)en geruzak. In Beatriz
Fernández and Pello Salaburu (eds.), Ibon Sarasola. Gorazarre. Homenatge. Home-
naje, Donostia-San Sebastián: UPV/EHU, 419–439.
Lakarra, Joseba Andoni. 2015c. Bi eta bere askazia. Forthcoming in a festschrift.
Lakarra, Joseba Andoni. 2016. Gramatikalizazioa, morfemen forma kanonikoak eta ber-
reraiketa morfologikoaren bide berriak. Txipi Ormaetxea omenduz. Hire bordatxoan,
ed. by Gotzon Aurrekoetxea, Jesus Mari Makazaga, and Patxi Salaberri, 175–192.
Leioa UPV/EHU.
Lakarra, Joseba Andoni. 2017. Prehistoria de la lengua vasca. In Gorrochategui, Igartua
and Lakarra. Forthcoming.
Lakarra, Joseba Andoni. 2018. CVC berreraikiaz: aurretik eta atzetik. Forthcoming in a
festschrift ed. by Joseba A. Lakarra and Blanca Urgell.
Lakarra, Joseba Andoni. In progress-a. Prehistoria del comitativo e implicaciones para la
reconstrucción de la morfología y la sintaxis protovasca. Ms. UPV/EHU.
Lakarra, Joseba Andoni. In progress-b. Protogenitivo, protolocativo y sintaxis del proto-
vasco. Ms. UPV/EHU.
Lakarra, Joseba Andoni, Joaquín Gorrochategui, and Blanca Urgell (eds.). 2011. II Con-
greso de la Cátedra Luis Michelena. Bilbao: UPV/EHU.
Lakarra, Joseba Andoni and José Ignacio Hualde (eds.). 2006. Studies in Basque and
Historical Linguistics in Memory of Robert Larry Trask (=Anuario del Seminario de
Filología Vasca Julio de Urquijo – International Journal of Basque Linguistics and
Philology XL, 1–2). Bilbao: UPV/EHU.
Lakarra, Joseba Andoni, Julen Manterola, and Iñaki Segurola. 2016. Los estudios eti-
mológicos vascos: historia y perspectivas. Etimología e historia en el léxico del
español. Estudios ofrecidos a José Antonio Pascual (Magister bonus et sapiens), ed.
by Mariano Quirós, 843–869. Madrid/Frankfurt: Iberoamericana/Vervuert.
Lakarra, Joseba Andoni, Julen Manterola, and Iñaki Segurola. 2017. Euskal hiztegi his-
toriko etimologikoa. Bilbao: Euskaltzaindia.
Lamberterie, Charles de. 2000. Problèmes sémantiques de la reconstruction en indo-
européen. Théories contemporaines du changement sémantique, ed. by Jacques
François, 109–134. Leuven: Peeters.
Larramendi, Manuel de. 1729. El imposible vencido. Arte de la lengua bascongada. Sal-
amanca. Facsimil Hordago, Donostia-San Sebastián, 1979.
Larramendi, Manuel de. 1745. Diccionario Trilingüe del castellano, bascuence y latín,
Donostia-San Sebastián. Facsimil Txertoa, Donostia-San Sebastián, 1984.
Lord, Carol. 1973. Serial verbs in transitions. Studies in African Linguistics 4, no. 3, 269–296.
Lord, Carol. 1993. Historical Change in Serial Verb Constructions. Amsterdam: John
Benjamins.
Madariaga, Juán. 2008. Apologistas y detractores de la lengua vasca. Donostia-San
Sebastián: Fundación para el estudio del Derecho Histórico y Autonómico de Vasconia.
Manterola, Julen. 2015. Euskararen morfologia historikorako. Artikuluak eta erakusleak.
Towards a history of Basque morphology: articles and demonstratives. PhD, UPV/EHU.
Martinet, André. 1950. De la sonorisation des occlusives initiales en basque. Word 6:
224–233.
Martinez Areta, Mikel. 2006. El consonantismo protovasco. PhD, UPV/EHU.
Basque and the reconstruction of isolated languages 97
Martinez Areta, Mikel. 2009. El acento protovasco. Anuario del Seminario de Filología
Vasca Julio de Urquijo - International Journal of Basque Linguistics and Philology
38: 135–206.
Martinez Areta, Mikel. (ed.). 2013. Basque and Proto-Basque. Frankfurt: Peter Lang.
Meillet, Antoine. 1925, La méthode comparative en linguistique historique, Paris. Reed.
Klincksieck 1970.
Michelena, Luis = Koldo Mitxelena.
Mitxelena, Koldo. 1950. De etimología vasca. In Mitxelena 1988, 439–444.
Mitxelena, Koldo. 1951. La sonorización de las oclusivas iniciales. In Mitxelena 1988,
203–211.
Mitxelena, Koldo. 1954a. Nota sobre algunos pasajes de los Refranes y Sentencias de
1596. In Mitxelena 1988, 792–798.
Mitxelena, Koldo. 1954b. De onomastica aquitana. In Mitxelena 1987, 409–445.
Mitxelena, Koldo, 1956, La lengua vasca como medio de conocimiento histórico, Zumar-
raga 6, 49–70.
Mitxelena, Koldo. 1957a. Las antiguas consonantes vascas. Reed. in Mitxelena 1988,
166–189.
Mitxelena, Koldo, 1957b, Basque et roman. Reed. in Mitxelena 1988, 106–115.
Mitxelena, Koldo. 1958. Introducción [a Landucci (1562)]. In Mitxelena 1988, II,
762–782.
Mitxelena, Koldo. 1961/77. Fonética histórica vasca (2nd revised ed.). Donostia-San
Sebastián: Supplement of the Anuario del Seminario de Filología Vasca Julio de
Urquijo – International Journal of Basque Linguistics and Philology no. 4.
Mitxelena, Koldo. 1963. Lenguas y protolenguas. In Mitxelena, Donostia-San Sebastián:
Supplement of the Anuario del Seminario de Filología Vasca Julio de Urquijo –
International Journal of Basque Linguistics and Philology no. 20, 1990.
Mitxelena, Koldo. 1964. Sobre el pasado de la lengua vasca. In Mitxelena 1988, 1–73.
Mitxelena, Koldo. 1971. Toponimia, léxico y gramática. In Mitxelena 1987, 141–167.
Mitxelena, Koldo. 1973. Apellidos vascos (3rd ed.). Donostia-San Sebastián: Txertoa.
Mitxelena, Koldo. 1974. El elemento latino-románico en la lengua vasca. In Mitxelena
1987, 195–219.
Mitxelena, Koldo. 1977. Notas sobre compuestos verbales vascos. In Mitxelena 1987,
311–335.
Mitxelena, Koldo. 1979. La langue ibère. Reed. in Mitxelena 1985, 341–356.
Mitxelena, Koldo. 1981. Lengua común y dialectos vascos. In Mitxelena 1987, 35–55.
Mitxelena, Koldo. 1987. Palabras y Textos, ed. Joaquín Gorrochategui. Bilbao: UPV/
EHU.
Mitxelena, Koldo. 1988, Sobre historia de la lengua vasca, ed. Joseba Andoni Lakarra.
Donostia-San Sebastián: Supplements of the Anuario del Seminario de Filología Vasca
Julio de Urquijo – International Journal of Basque Linguistics and Philology 10, 2 vols.
Mitxelena, Koldo. 2011–2012. Luis Michelena. Obras Completas, ed. Joseba Andoni
Lakarra and Iñigo Ruiz Arzalluz, Bilbao/Donostia-San Sebastián: Supplements of the
Anuario del Seminario de Filología Vasca Julio de Urquijo – International Journal of
Basque Linguistics and Philology, 15 vols.
Mitxelena, Koldo and Ibon Sarasola. 1987–2005. Diccionario general vasco. Orotariko
Euskal Hiztegia. Bilbao: Euskaltzaindia.
Morvan, Michel. 2009. Dictionnaire étymologique basque-français-espagnol. www.
lexilogos.com/basque_dictionnaire.htm.
98 Joseba A. Lakarra
M. Ahmad, and H.M. Yakasai, 579–598. Port Hardcourt: M. and J. Grand Orbit
Communications.
Stassen, Leon. 2000. AND-languages and WITH-languages. Linguistic Typology 4: 1–54.
Stiles, Patrick V. 2013. The Pan-West Germanic Isoglosses and the Sub-Relationships of
West Germanic to Other Branches. NOWELE 66: 1, 5–38.
Straka, Georges. 1951–1954. Observations sur la chronologie et les dates de quelques
modifications phonétiques en roman et en français prélittéraires. Revue des Langues
Romanes 71: 247–307.
Thomason, Sandra G. 1993. Copying with Partial Information in Historical Linguistics.
Historical Linguistics 1989: Papers from the Ninth International Conference on His-
torical Linguistics, ed. by Henk Aertsen and Robert J. Jeffers, 485–496. Amsterdam:
John Benjamins.
Thomason, Sandra G. 2001. Language Contact: An Introduction. Edinburgh: Edinburgh
University Press.
Tovar, Antonio. 1981. Mitología e ideología sobre la lengua vasca. Madrid: Alianza
Editorial.
Trask, Robert Larry. 1977. Historical Syntax and Basque Verbal Morphology.
Anglo-American Contributions to Basque Studies: In Honor of Jon Bilbao, ed. by Wil-
liam A. Douglass, Richard W. Etulain, and William H. Jacobsen Jr., 203–217. Reno:
University of Nevada.
Trask, Robert Larry. 1996. Historical Linguistics. London and New York: Arnold.
Trask, Robert Larry. 1997. The History of Basque. London: Routledge.
Trask, Robert Larry. 1998. The Typological Position of Basque: Then and Now. Lan-
guage Sciences 20: 313–324.
Trask, Robert Larry. 2008. Etymological Dictionary of Basque, ed. by Max W. Wheeler.
Sussex: University of Sussex.
Uhlenbeck, Cornelius. 1942. Les couches anciennes du vocabulaire basque. Eusko-
Jakintza 1 (1947): 543–581. French translation of Dutch original by Georges Lacombe.
Ulibarri, Koldo. 2013. External History. Sources for Historical Research. In Martínez
Areta, 89–117.
Urgell, Blanca. 2013. Euskal filologia: zer (ez) dakigu 25 urte beranduago? In Gómez
et al. (eds.), 533-570.
Vovin, Anatoly. 1994, (Art.-Review) Long Distance Relationships, Reconstruction Meth-
odology and the Origins of Japanese, Diachronica 11: 1, 95–114.
Watkins, Calvert. 1984. L’apport d’Émile Benveniste à la grammaire comparée. Émile
Benveniste aujourd’hui. Actes du Colloque international du CNRS, ed. by Guy Serbat,
I, 3–11. Louvain: Peeters.
Watkins, Calvert. 1990. Etymologies, Equations, and Comparanda: Types and Values,
and Criteria for Judgment. Patterns of Change, Change of Patterns: Linguistic Change
and Reconstruction Methodology, ed. by Philip Baldi, 289–304. Berlin and New York:
Mouton de Gruyter.
Zuazo, Koldo. 1998. Euskalkiak gaur. Fontes Lingua Vasconum. Studia et Documenta
30: 191–233.
CHAPTER 4
AINU
Thomas Dougherty
1 INTRODUCTION
Ainu (アイヌイタㇰ Aynu Itak) is a dormant language isolate previously spoken on the
northernmost Japanese island of Hokkaidō, as well as southern half of the Russian island of
Sakhalin, and in the disputed Kuril Islands. Map 4.1 depicts the historically attested range of
the Ainu language in Northeast Asia. See Vovin (2009) for discussion of its suspected range
in to the south on the Japanese main island of Honshū based on toponymical evidence.
This chapter primarily provides a typological overview of the Hokkaidō varieties of
Ainu, highlighting some salient morphosyntactic phenomena found in Ainu. It focuses
mainly on data from Ainu oral literature, as it is the most well-preserved genre of Ainu
usage. In addition, proposed relationships and language contact are touched upon, as well
as a brief discussion of the extant scholarly materials and primary sources of Ainu.
None of these claims has been accepted by the scholarly community at large. Vovin (1993)
specifically argues against several and proposes very initial (but likely possibly spurious)
comparisons for the two affiliations he suggests, with Austroasiatic and/or Hmong-Mien
(Vovin 1993: 190–209).
While most of the claims with regards to a genetic relationship are questionable, lan-
guage contact between Ainu and Japanese on the one hand and Ainu and Nivkh on the
other are well-substantiated, albeit understudied.
Ainu has had a relatively close contact relationship with Japanese throughout the
recorded history of both languages. Vovin’s (2012 and 2013) translations of Old
Japanese poetry dating to the 700s ce detail some proposed Ainu loans in Eastern Old
Ainu 101
Japanese poetry (Vovin 2012: 11–12 et passim; Vovin 2013: 13–15 et passim). Among
these are place names, for instance, Eastern Old Japanese (henceforth EOJ) Töya [təja] <
Ainu to ya ‘lake shore’ – but some involve even function words, including the relativizer
EOJ siNda [sinda] ‘when’ < Ainu hi-ta ‘time-loc’ (Vovin 2012: 12).
Other, more well-known examples of loanwords from Ainu into Japanese are the kind
of lexical items we might expect to enter a language of higher prestige from a language of
lower status: animals found in areas traditionally inhabited by the Ainu and not colonized
by the Japanese until quite late in the history of Japan, such as the ‘sea otter’ (Enhydra
lutris): Ainu rakko > Japanese ラッコ rakko. By and in large, though, the direction of con-
tact seems to be from Japanese into Ainu. For example, material items not found tradition-
ally among the Ainu but important for Ainu-Japanese trade, such as ‘tobacco’ (Ainu tabako
< Japanese たばこ tabako) are common and expected loanwords from Japanese into Ainu.
102 Thomas Dougherty
Similar, but less well-studied contact exists between Ainu and Nivkh. For instance, the
reindeer (Rangifer tarandus) is not found in areas traditionally inhabited by the Ainu. The
Nivkh, however, living at the mouth of the Amur River on the Asian mainland and in
the north of Sakhalin, encountered various Tungusic peoples (such as the Ul’ta [or Orok])
who herd domesticated reindeer. Thus, the Sakhalin Ainu word tunakay ‘reindeer’ is a
loan from Nivkh. Compare the Amur Nivkh form чолӈи [cʰolŋi] ‘reindeer’ and Sakhalin
Nivkh form тлаӈи [tlaŋi] ‘reindeer’. Note also that this word has been subsequently
borrowed from Ainu into Japanese, as トナカイ tonakai.
3.1 Orthographies
Ainu is written in two orthographies: one based on the Latin alphabet, the other based
on a modified version of the Japanese katakana syllabary. Some materials, such as Nak-
agawa and Nakamatsu (2004 and 2007) – textbooks for basic instruction in Ainu – are
written in parallel in both the katakana script and the Latin script. Scholarly work tends
to be only in the Latin script (e.g., Nakagawa 2000).
Although extended katakana for Ainu are a part of the Unicode standard, many fonts
do not support them, and workarounds (such as using a smaller font for the half-size
katakana used to write final stop consonants) are often needed.
3.2 Dictionaries
The earliest dictionary of Ainu is the Moshiogusa, compiled in 1792 by Chōshaburō
Abe and his interpreter, Uehara Kumajirō. Several manuscripts of this dictionary survive,
though little modern scholarship has examined it (see, for instance, Sato 2007). Of spe-
cial interest is an appendix of Ainu texts, which, to the best of my knowledge, remains
untranslated and unanalyzed.
The Rev. John Batchelor’s dictionary, first compiled in 1889, is well known but has
many mistaken translations and transcriptions, as well as “ghost lexemes” which do not
appear in any other source and should mostly be avoided in favour of the other dictionar-
ies mentioned here. The most widely available version is the second edition (Batchelor
1905), which has been digitized, though there were two later editions, for four total.
Mashiho Chiri produced what is likely the most important dictionary of Ainu, Bun-
rui Ainugo Jiten (Categorical Ainu Dictionary). Chiri’s intention was to publish this
in several volumes, but unfortunately, only two volumes (Chiri 1953 and 1954) were
published before he passed away. An additional volume, Chiri (1962) was published
posthumously. Shirō Hattori and Mashiho Chiri published an Ainu dialect dictionary,
Ainugo Hōgen Jiten, in 1964. This work builds on some of the last fieldwork done in
consultation with speakers of Ainu and forms the basis for the two reconstructions of
Proto-Ainu (see below). Additionally, this work has not only Japanese glosses of the
Ainu vocabulary, but English glosses as well, and will be of interest to non-Japanese-
speaking researchers.
Ainu 103
3.3 Grammars
There are several grammars of Ainu. The early Japanese-language grammars (Kindaichi 1931,
Chiri 1942, and Chiri 1974) are short – what today might be considered sketch grammars –
but are still valuable due to the fact that the authors were in consultation with native speakers.
Chiri and Kindaichi (1973) provide a brief grammatical sketch of Sakhalin Ainu, while Mura-
saki (1977 and 1978) describe the Raychiska variety of Sakhalin in more detail.
Refsing (1986) is a much longer grammar of Ainu, focusing primarily on the Shizunai
dialect of southern Hokkaidō, as spoken by consultants Refsing worked with in 1980
and 1981 (Refsing 1986: 65). The grammar itself is idiosyncratic, lacking a phonological
and phonetic analysis and instead focusing on a peculiar compromise of grammatical
approaches based on the early work by the likes of Chiri and Kindaichi, as well as on
later work on Japanese (not Ainu) by Bruno Lewin (1959) and Samuel E. Martin (1975).
In addition to these, the first half of Shibatani (1990) is a description of the Ainu lan-
guage. However, it is mostly compiled from secondary sources, and while it does serve
as a valuable English-language resource on Ainu, there are some issues especially with
regards to Ainu morphology which perhaps more recent works approach differently and
more comprehensively.
Tamura (2000) is the most recent English-language grammar of Ainu. While not quite
as long as Refsing (1986), it is perhaps the compromise first choice for scholars who do
not speak Japanese but are interested in starting to work with Ainu.
4 INTERNAL DIVERSITY
Ainu is primarily divided into three groups: varieties found on Hokkaidō, varieties found
in the Kuril Islands, and varieties found on Sakhalin Island. From the sources, Vovin
(1993) identifies and deals with data from the following dialects:
It is unclear if the varieties were mutually unintelligible with one another, thus perhaps
more properly rendering them into two or more separate Ainu languages (Hokkaidō Ainu
and Sakhalin Ainu), rather than a single Ainu language isolate. Claims have been made
in both directions. For example, Refsing (1986: 53) remarks that Sakhalin and Hokkaidō
varieties of Ainu are not generally mutually intelligible. But by and large, this is mostly
an academic issue, as sometimes the decisions about whether we are dealing with dialects
of a single language or with distinct but closely related languages is a very difficult and
uncertain one. It is therefore also the case that often the difference between a language
isolate (with its varieties) and a language family is not clear (see Campbell, this volume),
especially in the case of those that lack a standardized variety. This is true even more so
given that Ainu is a dormant language, making the testing of mutual intelligibility no
longer possible.
There are two reconstructions of Proto-Ainu, based on applying the comparative
method to forms from the various dialects. The earlier of the two works is Vovin
(1993), which, as mentioned above, also entertains a number of hypotheses about
long-distance relationships between Ainu and other languages but does not accept
any of them as conclusive (Vovin 1993: 210–211). Additionally, it includes a partial
English translation of I. G. Voznesenskii’s Ainu-Russian glossary (Vovin 1993: 215).
Alonso de la Fuente (2012) is the more recent reconstruction, and builds in part on
Vovin’s reconstruction.
In addition to Vovin’s (1993) and Alonso de la Fuente’s (2012) reconstructions of Ainu,
there are also a handful works applying computational methods to the problem of sub-
grouping Ainu. These include a lexicostatistical study by Hattori and Chiri (1960), a
unique clustering algorithm developed by Asai (1974), and a more recent article using
Bayesian maximum likelihood methods of phylogenetic inference borrowed from biol-
ogy by Lee and Hasegawa (2013).
5 TYPOLOGICAL OVERVIEW
This section gives a typological overview of Ainu, beginning with its phonology, then
moving on to several typologically salient morphosyntactic phenomena found in Ainu.
5.1 Phonology
The phonology of Ainu is in and of itself relatively unremarkable, with a typically sized
inventory of consonants and vowels, and no cross-linguistically uncommon sounds, but
in comparison to the two proposed language areas which it borders (the Siberian language
Ainu 105
area and the Altaic language area),1 this “normality” of Hokkaidō Ainu (as well as other
varieties of Ainu and the reconstructions of Proto-Ainu) is typologically unusual.
5.1.1 Vowels
Ainu has five vowels, /i/, /e/, /a/, /o/, /u/, detailed in Table 4.1.
5.1.2 Consonants
Ainu has 11 consonants, detailed in Table 4.2. Its consonantal and phonotactic sys-
tem is unusual in comparison to the Altaic and Siberian language areas. For instance,
Siberian-type languages tend to have a four-way nasal contrast, between /m/, /n/, /ɲ/, and
/ŋ/ (Anderson 2006: 268–272). Ainu only has a two-way nasal contrast, between /m/ and
/n/. Similarly, in Altaic-type languages, rhotics cannot appear in absolute word-initial
position. They may do so freely in Ainu, for example, the word rakko ‘sea otter’.
Close i u
Mid e o
Open a
Ainu pitch accent is based on a relative rise in pitch, unlike Japanese, for instance,
where pitch accent is characterized by a fall in pitch (McCawley 1968). In the Chitose
dialect of Hokkaidō, for example, there is always a rise in pitch from a relatively low
pitch in the preceding unaccented syllable to a relatively high pitch in the accented syl-
lable (Nakagawa and Nakamoto 2004: 15). Further rules, such as closed syllables and
stems attracting accent, have been proposed (see, for instance, Nakagawa and Nakamoto
2004: 15–17), but significant counter-examples exist, meaning that there is no consensus
on the phonological reality of these rules (see discussion in Vovin (1993) on the recon-
struction of the Ainu prosodic system).
5.2 Morphosyntax
Ainu has several salient morphosyntactic features which merit some extended discus-
sion, including its system of verbal agreement, noun incorporation, three applicative mor-
phemes, and its system of marking evidentiality.
Relative clauses, and other diagnostic tools, all point towards Ainu being a typical
nominative-accusative language with a relatively fixed SV/AOV word order.
Note that Ainu verbal agreement markers – described in more detail below – show
several different alignment patterns. First person singular agreement markers are in a
nominative-accusative pattern, with a prefix shared by first person singular subjects of
intransitive and transitive verbs, and a separate prefix for first person singular direct
objects of transitive verbs. Second person singular and plural core arguments (subjects
of intransitive and transitive verbs, as well as direct objects of transitive verbs) display a
neutral pattern, distinguishing only plurality. Third person singular and plural core argu-
ments are null-marked and thus can be analyzed as neutral. First person plural and inclu-
sive person agreement markers are in a tripartite pattern, with unique markers for subjects
of intransitive verbs, subjects of transitive verbs, and direct objects of transitive verbs.
5.2.2 Possession
Ainu has three strategies for expressing possession. The first of these is a possessive suf-
fix, which denotes not only possession but also definiteness. The second, a morphological
Ainu 107
strategy, uses the equivalent to verbal agreement markers directly on a noun to express
inalienable possession. Third, Ainu uses a periphrastic construction with the verb kor ‘to
have’ to express alienable possession.
(2) ku-tek-ehe
1sg.poss-hand-poss
‘my hand’ (Refsing 1986: 81; glossing altered to match my own)
order alone, Ainu is an accusative type language, and these different alignments are
apparently secondary phenomena.
First person singular markers have an accusative morphosyntactic alignment, with the
markers for the single argument of intransitive verbs and for the Agent-like argument of
transitive verbs being identical, and with a separate marker for Patient-like argument of
transitive verbs, as seen in examples (4) and (5).
1sg.a k(u)-Ø-
- eci-
1pl.a c(i)-Ø-
2sg.a en- un- - e-Ø-
2pl.a eci-en- eci-un- eci-Ø-
3sg.a
3pl.a en- un- Ø-e- Ø-eci- Ø-Ø- or -
5.2.4 Pluriactionality
Ainu has a closed class of verbs that express pluriactionality, also called verbal number.
At first blush, this system appears to have an absolutive alignment, as it tends to coincide
with the single argument of intransitive verbs and the Patient-like argument of transitive
verbs in terms of number, but it can instead represent the number of times an action is
done, either one (singular) or many (plural).
Verbs that display pluriactionality fall into two classes: suppletive verbs (where sup-
pletion distinguishes the singular and plural form of the verb, as shown in (12a) and (b))
and regular verbs (where suffixes distinguish the singular and plural form of the verb, as
shown in (12c) and (d)). Note that the majority of Ainu verbs are not marked for pluriac-
tionality (shown in (12e) and (f)), and it is one of the grammatical features of Ainu which,
at the time Ainu became dormant, appears to have been undergoing attrition.
Both suffixes display significant allomorphy. For the singular suffix, vowel-final verb
stems take the allomorph -n. Consonant-final verb stems, except those ending in y- or w-,
take the allomorph -i. Finally, glide-final verb stems take the allomorph -e. For the plural
suffix, vowel-final verb stems take the allomorph -p, while consonant-final verb stems
take the allomorph -pa.
5.2.6 Applicatives
Ainu has three applicative morphemes, e-, ko-, and o-. Applicative constructions are sig-
nalled by an overt morphological marker on the verb, which increases its valency by one,
allowing for the coding of a syntactically and semantically peripheral argument as a new
core argument (Bugaeva 2010: 752). The most widely attested kind of applicative con-
struction cross-linguistically promotes a beneficiary argument to direct object, and these
do occur in Ainu, as seen in (17):
Special note should be given to the use of e- as an applicative marker to promote new
Themes to a verb. This does not resemble a prototypical applicative, as Theme is not a
peripheral thematic role, which is what applicatives usually promote into core thematic
roles (Bugaeva 2010: 769).
The applicative ko- most frequently co-occurs with the semantic roles of Addressee,
Goal, Recipient/Beneficiary, Comitiative (Co-Patient), and Malefactive Source (Bugaeva
2010: 759). Examples of applied Addressee objects and Comitative (Co-Patient) appli-
cative objects are given in (20) and (21), respectively. Again, these examples have been
modified to match my own glossing style and interpretation.
Applicative addressee objects with ko- are similar to Content applied objects with e-
but contrast in the fact that Addressee applicative objects tend to be animate, while Con-
tent applicative objects tend to be inanimate (Bugaeva 2010: 782). However, this is only
a tendency (Bugaeva 2010: 782).
The applicative o- co-occurs with the semantic roles of Location and Goal (Bugaeva
2010: 759). Again, these examples have been modified to match my own glossing style
and interpretation.
5.2.7 Evidentiality
Ainu has several evidential markers that come from grammaticalized nouns. These
include hawe, ‘voice; reportative evidential’; humi ‘sound; nonvisual sensory evidential’;
ruwe ‘inferential/factual evidential’ (perhaps from ru ‘path, tracks’ Chiri 1974: 155); and
siri ‘appearance; visual sensory evidential’.
Ainu 113
Abbreviations
1 first person imp imperative
2 second person indf indefinite person
3 third person loc locative
a Agent-like argument neg negative
of transitive verbs
advz adverbializer nmlz nominalizer
agnt agentifier nvis nonvisual sensory evidential
apl applicative o Patient-like argument of transitive
verbs
asoc associative plural pfv perfective aspect
aux auxiliary pl plural
caus causative pol polite
clf classifier poss possessive
comp comparative prog progressive aspect
cop copula q question particle
dat dative rep reportative evidential
dir directive s single argument of intransitive
verbs
emph emphatic sg singular
fact factual evidential vis visual sensory evidential
NOTES
1 We note that “Altaic language area” refers to the languages of this zone that
show areal similarities. This in no way implies acceptance of the disputed and
mostly discarded “Altaic hypothesis” of a genetic relationship shared by these
languages.
2 Note that Bugaeva (2010: 776) also cites the example in this passage, but her and my
interpretation of the second word differ enough that I have cited Nakagawa (2002) here
directly.
114 Thomas Dougherty
REFERENCES
Alonso de la Fuente, José Andrés. 2012. The Ainu Languages: Traditional Reconstruc-
tion, Eurasian Areal Linguistics, and Diachronic (Holistic) Typology. PhD disserta-
tion, Universidad de País Vasco-Euskal Herriko Unibersitatea.
Anderson, Gregory. 2006. Towards a Typology of the Siberian Linguistic Area. Linguistic
Areas: Convergence in Historical and Typological Perspective, ed. by Yaron Matras,
April McMahon, and Nigel Vincent, 266–300. Houndmills, UK and New York: Pal-
grave Macmillan.
Asai, Tōru. 1974. Classification of Dialects: Cluster Analysis of Ainu Dialects. Hoppō
Bunka Kenkyū [Northern Cultures Research] 8: 66–135.
Batchelor, John. 1905. An Ainu-English-Japanese Dictionary: Including a Grammar of
the Ainu Language (2nd ed.). Tōkyō: Methodist Publishing House.
Bugaeva, Anna. 2006. Applicatives in Ainu. Chiba Daigaku Yūrashia Gengo Bunka Ron
Kōza [Chiba University Journal of Eurasian Language and Culture] 9: 185–196.
Bugaeva, Anna. 2008. Reported Discourse and Logophoricity in Southern Hokkaido Dia-
lects of Ainu. Gengo Kenkyū 133: 31–75.
Bugaeva, Anna. 2010. Ainu Applicatives in Typological Perspective. Studies in Lan-
guage 34, no. 4: 749–801.
Chiri, Mashiho. 1942. Ainu Gohō Kenkyū [A Study of Ainu Grammar]. Karafuto-chō
Hakubutsukan Hokōku (Karafuto Museum Report), 4.
Chiri, Mashiho. 1953–1962. Bunrui Ainugo Jiten [Categorical Ainu Dictionary]. Tōkyō:
Nihon Jōmin Bunka Kenkyū-jo.
Chiri, Mashiho. 1974. Ainu Gohō Gaisetsu [An Outline of Ainu Grammar]. Chiri
Mashiho Chosakushū [Mashiho Chiri Collection], vol. 4, 3–197. Tōkyō: Heibonsha.
Chiri, Mashiho and Kyōsuke Kindaichi. 1973. Ainu Gohō Kenkyū – Karafuto Hōgen o
Chūshin ni Shite [Ainu Grammar Research – With a Focus on the Sakhalin Dialect]. Chiri
Mashiho Chosakushū [Mashiho Chiri Collection], vol. 3, 457–586. Tōkyō: Heibonsha.
Dahl, Otto C. 1977. Proto-Austronesian (Scandinavian Institute of Asian Studies Mono-
graph Series 15). Lund: Studentlitteratur.
Greenberg, Joseph. 2000. Indo-European and Its Closest Relatives: The Eurasiatic Lan-
guage Family 1: Grammar. Stanford: Stanford University Press.
Greenberg, Joseph. 2002. Indo-European and Its Closest Relatives: The Eurasiatic Lan-
guage Family 2: Grammar. Stanford: Stanford University Press.
Hattori, Shirō and Chiri Mashiho. 1964. Ainugo Hōgen Jiten [Ainu Dialect Dictionary].
Tōkyō: Iwanami Shoten.
Hattori, Shirō and Mashiho Chiri. 1960. Ainugo shohōgen no kisogoi-tōkeigaku-teki
kenkyū [A lexicostatistical study of Ainu dialects]. Minzokugaku Kenkyū [Journal of
Ethnology] 24, no. 4: 307–342.
Izutsu, Katsunobu. 2004–2005. Ainugo Asahikawa Hōgen Shiryō Shūsei, vols. 1–2. Asa-
hikawa: Hokkaidō Kyōiku Daigaku Asahikawa-kō.
Izutsu, Katsunobu. 2006. I/Yay-Pakasnu: Ainugo no Gakushū to Kyōiku no Tame ni. Asa-
hikawa: Hokkaidō Kyōiku Daigaku Asahikawa-kō.
Izutsu, Katsunobu and Tezuka Yoritaka. 2006. Kiso Ainugo [Basic Ainu]. Sapporo: Sap-
poro Dōshoten.
Kindaichi, Kyōsuke. 1931. Ainu Jojishi Yūkara no Kenkyū [Research on Ainu Epic
Yukar]. Tōkyō: Tōkyō Bunko.
Kindaichi, Kyōsuke and Matsu Kannari. 1959–1975. Ainu Jojishi Yūkara Shū, vols. 1–9.
Tōkyō: Sanseidō.
Ainu 115
Piłsudski, Bronisław. 1912. Materials for the Study of the Ainu Language and Folklore.
Cracow: Imperial Academy of Sciences.
Refsing, Kirsten. 1986. The Ainu Language: The Morphology and Syntax of the Shizunai
Dialect. Aarhaus: Aarhaus University Press.
Refsing, Kirsten (ed.). 1996. The Ainu Library Collection 1: Early European Writings on
the Ainu Language. 10 vols. Richmond: Curzon Press.
Sato, Tomomi. 2007. Moshiogusa no issatsubon ni tsuite [On the One Volume Edition of
the Moshiogusa], 157–170. Hokkaidō Daigaku Bungaku Kenkyū-ka Kiyō [Hokkaidō
University Graduate School of Letters Bulletin] 121.
Shibatani, Masayoshi. 1990. The Languages of Japan. Cambridge: Cambridge Univer-
sity Press.
Simeon, George. 1968. The Phonemics and Morphology of Hokkaido Ainu. PhD disser-
tation, University of Southern California.
Stenberg, Leo. 1929. The Ainu Problem. Anthropos 24: 755–799.
Street, John. 1962. Review of Vergleichende Grammatik der Altaischen sprachen, by
Nikolaus Poppe. Language 38, no. 1: 92–99.
Tamura, Suzuko. 1984. Ainugo Onsei Shiryō [Sound Materials for the Ainu Language].
Tōkyō: Waseda Daigaku Gogaku Kyōiku Kenkyū-jo.
Tamura, Suzuko. 2000. The Ainu Language. Tōkyō: Sanseido.
Uehara, Kumajirō and Chōsaburō Abe. 1792. Moshiogusa [Seaweeds]. Manuscript:
Waseda Library edition: http://archive.wul.waseda.ac.jp/kosho/ho02/ho02_05038/
Vovin, Alexander. 1993. A Reconstruction of Proto-Ainu. Leiden and New York: E. J.
Brill.
Vovin, Alexander. 2009. Man'yōshū to Fudoki ni Mirareru Fushigi na Kotoba to Jōdai
Nihon Rettō ni Okeru Ainugo no Bunpu [Strange Words in the Man'yōshū and Fudoki
and the Distribution of the Ainu Language in the Japanese Islands in Prehistory].
Nichibunken Fooranu Hōkokusho [Nichibunken Forum Reports], 215. URI: http://
publications.nichibun.ac.jp/ja/item/foru/2009-03-12/pub
Vovin, Alexander. 2012. Man’yōshū: Book 14. Folkestone: Global Oriental.
Vovin, Alexander. 2013. Man’yōshū: Book 20. Folkestone: Global Oriental.
CHAPTER 5
BURUSHASKI
Alexander D. Smith
1 INTRODUCTION
The aim of this chapter is to outline key typological aspects of Burushaski, including
phonology, verb and noun morphology, and syntax. Section 1 (below) describes the geo-
graphical context and earlier work on the language and reviews claims of possible genetic
affiliations. Section 2 describes Burushaski’s rich phonological inventory, with 36 conso-
nants, a large number of distinctive fricatives and affricates, and five vowels plus a length
distinction. Section 3 focuses on sentences including constituent order and some aspects
of syntax. Section 4 focuses on Burushaski’s complex verbal morphology, and Section 5,
on case marking. Burushaski marks case with both verb and noun morphology. Verb
suffixes agree with the nominative subject, while verb prefixes show differential object
marking, which agree with nouns under certain semantic restrictions. This sometimes
causes double agreement where verb prefixes and suffixes both agree with the same noun.
Section 6 focuses on the noun, which has a split ergative alignment that typically appears
on the agent in non-future tenses (sentences with a future tense have no ergative marker),
but which may also be used pragmatically to show volition in an intransitive sentence.1
Burushaski is a language isolate spoken in four areas: three in the northern Gilgit-
Baltistan area of Pakistan and one in the Indian state of Jammu and in Indian Kashmir.
The three dialects of Pakistan are Yasin, Hunza, and Nagar, each named after the valleys
in which they are spoken. The Hunza and Nagar dialects are closely related, while the
Yasin dialect is more divergent and isolated from the others. Jammu and Kashmir Buru-
shaski is also isolated and divergent. Most of the documentation for Burushaski focuses
on the Hunza dialect, but several works on the Yasin and Nagar dialects are also avail-
able. Jammu and Kashmir Burushaski is the least documented dialect of Burushaski. In
all there are approximately 100,000 speakers of Burushaski, and of those speakers, only
300 speak the Jammu and Kashmir dialect, which is highly endangered. The Catalogue
of Endangered Languages (http://endangeredlanguages.com) lists Burushaski as “threat-
ened”, meaning that the language is under pressure from Urdu and surrounding languages
but is not yet at risk of being lost in the very near future.
without supporting evidence, that Burushaski was creolized early in its history possi-
bly due to prolonged contact with an ancient Indo-European dialect. Čašule, in several
publications (1998, 2003a, 2003b, 2009a, 2009b, 2010, 2012, 2016), proposed an Indo-
European (non-Indo-Iranian) connection with Burushaski, with connections to the Bal-
kan languages, especially to Phrygian. However, Čašule makes several conflicting claims
(see especially Čašule 2012), including the following:
1 There is either a genetic or contact history between Burushaski and the Aegean
branch of IE.
2 Burushaski is connected to Northern and Western IE.
3 There is a Balkan substratum of shepherd terms.
4 Burushaski subgroups with Macedonian and Balkan-Slavic.
5 Burushaski has been transformed through language contact.
6 Burushaski might form part of a larger Indo-European-Anatolian-Burushaski family.
between Burushaski and any other language family. If any connection does exist, it is
likely so ancient that time has obscured the evidence to the point that a genetic relation-
ship becomes impossible to prove.
1.2 Documentation
In recent years, efforts to document and describe Burushaski have produced a rather
large number of resources, the most widely available of which is arguably the Buru-
shaski Language Documentation Project (Munshi 2015), an online database containing
a grammatical sketch of the Hunza dialect, wordlists in ten semantic categories with
audio recordings for each word in the singular and plural, stories in the four dialects
with audio recordings, video recordings, and transcriptions, and a Burushaski orthogra-
phy using the Latin alphabet. Larger works documenting Burushaski include Lorimer
(1935–1938), a large three-volume work documenting the Yasin dialect, and Berger
(1998), a massive grammar with texts and dictionary of the Hunza and Nagar dialects,
and Munshi (2006) on the Jammu and Kashmir dialect of Burushaski. Other works on
Burushaski are more focused and less documentary in nature; they include Anderson
(1997) on Burushaski phonology; Anderson (2007) on its morphology; Bashir (1985,
2000a) on the semantics of Burushaski verbs and the d̪ -prefix; Morin and Tiffou (1988)
on the Burushaski passive; Smith (2012) on case marking on the verb and noun; Tiffou
and Morin (1982) on split ergativity; and Karim (2013) on middle voice. Holst (2014) is
an important and recent publication on Burushaski, addressing dialect variation, inter-
nal reconstruction, person marking on the verb, and the Burushaski lexicon and seman-
tics. Numerous other works which include Burushaski stories and proverbs are widely
available.
2 PHONOLOGY
Burushaski has a rich inventory of consonants, including contrasts between dental and
retroflex stops, eight affricates, seven fricatives, thirteen plosives, and a retroflex rho-
tacized fricative. Table 5.1 lists the consonant phonemes of Burushaski. The phonemes
tʰ, t, and d are between alveolar and retroflex (Munshi 2006: 59), but I have adopted the
standard of marking the dental series of contrasting stops with a diacritic but with no
diacritic for the near retroflex alveolar stops, similar to the system used in Munshi (2006).
The vowels of Burushaski are more straightforward. It has five vowels, including the
vowel “triangle” i, a, and u, plus o and e. Burushaski also distinguishes long and short
vowels.
Nasal m n ŋ
Aspirated pʰ t̪ ʰ tʰ kʰ
voiceless
plosive
Voiceless p t̪ t k q
plosive
Voiced b d̪ d g
plosive
Aspirated cʰ [ʦʰ] čʰ [ʧʰ] ċʰ [tʂʰ]
voiceless
affricate
Voiceless c [ʦ] č [ʧ] ċ [tʂ]
affricate
Voiced ǰ [ʤ] ż [dʐ]
affricate
Voiceless s š [ʃ] s̩ [ʂ] x h
fricative
Voiced z ɣ
fricative
Trill r
Approximate l y [j] ɻ w
(1) d̪ aɣa-umo
hide-3s.f.n.pst
‘She hid.’
moo-s-t̪ aq-am
3s.f.caus-trn-hide-1s.pst
‘I made her hide.’
(2) d̪ uuɻ-imi
melt-3s.pst
‘it melted.’
d̪ e-s-t̪ uɻ-am
d̪ -trn-melt-1s.pst
‘I melted it.’
(3) a-ar lel a-pim
1s.gen-dat know neg-aux.3s.pst
‘I did not have knowledge.’ (Baadil Jamal 70)
Burushaski 121
3 SENTENCES
Burushaski is a typical head-final language, with agglutinative verb morphology, subject
agreement, and pronoun drop. The basic word order is SOV, although the verb alone
can often convey meaning without an overt subject or object noun phrase. Example
(5) shows typical SOV word order, and (6) shows the same statement expressed without
overt subject and object NPs. Example (7) shows the position of a subject in an intransi-
tive sentence, while (8) shows a transitive sentence exhibiting SOV order without using
pronoun.
3.1 Questions
Burushaski is a language where content question words (so-called wh-question words)
remain in situ and are not fronted, as in many languages. Question words appear in the
same slot as the real nouns that they replace or represent. Yes/no questions are formed by
adding the suffix -a to the end of the verb, after the subject agreement suffix. Examples
of both kinds of questions follow.
(11) un-e besan s̩ i-uma
2s-erg what eat-2s.n.pst
‘What did you eat?’
(12) in-e amul-ar ni-umo
3s-erg where-dat go-3s.f.n.pst
‘Where did she go?
(12) mi-e s̩ i-om-a
1s-erg eat-1p.n.imp-Q
‘Shall we eat?’
4 THE VERB
The verb in Burushaski is morphologically rich, with numerous prefixes and suffixes
that can be attached to it. In this section I will go over some of the basics of Burushaski
verbal morphology, including special verbs, verbs with a Ø stem (where morphemes
are attached to an invisible stem), causative constructions, a transitivising prefix, and
negation.
4.1 Causatives
Causative constructions in Burushaski are marked with a transitivising prefix -s (dis-
cussed more below) and by lengthening the vowel of a verb prefix which agrees with
the cause. Thus, a verb like o-t̪ -imi ‘he chose them’ is made causative by lengthening the
vowel in the prefix o-, creating the causative sentence oo-t̪ -imi ‘he made them choose’.
In these cases, vowel lengthening is the sole indicator of causation as shown in examples
(17)–(19):
(20) d̪ aɣa-umo
hide-3s.f.pst
‘She hid.’
(21) moo-s-t̪ aq-am
3s.f.caus-trn-hide-1.pst
‘I made her hide.’
If the verb is made transitive with the s- transitivizer, the construction can no lon-
ger be classified as unergative (applicable only to intransitive verbs). The pronominal
prefix appears, and the vowel is lengthened to show causation. The s- prefix may also
be attached to unaccusative verbs – intransitive verbs whose subject is patient-like (as
in ‘the door closed’), and the result is consistent with other transitive forms where the
animate causee is marked on the prefix of the verb through vowel lengthening. Compare
124 Alexander D. Smith
example (22) with (23) and example (24) with (25) for contrasts between unaccusative
verbs without the s- transitivizer and unaccusative verbs with the s- transitivizer:
(22) d̪ -i-man-am
d̪ -3s-become-3.prf
‘He was born.’ (Limpi Kiser 114)
(23) in-e hiles-an d̪ -e-s-man-umo
3s.f-erg boy-indf d̪ -3s-trn-become-3s.f.pst
‘She gave birth to a boy.’
(24) mu-waal-umo
3s.f-lose-3s.f.pst
‘She was lost.’
(25) mo-s-pal-am
3s.f-trn-hid-1s.pst
‘I lost her.’
Note that, in this example, the transitivizing s- is combined with the verb prefix mo- to
derive a transitive sentence. If the vowel were to be lengthened, as in moospalam, the
reading would be something along the lines of ‘I made her become lost’. So a derivation
of causative from an intransitive verb must first add the s- transitivizer and later lengthen
the vowel of the obligatory agreement prefix to form the causative.
4.2 Negation
Negative clauses in Burushaski take the verbal prefix a-, which appears before the d̪ -
prefix (see Sections 4.3) and any pronominal agreement prefixes. Alternatively, the neg-
ative can be attached to the phrase-final auxiliary (-bim), although it does not appear that
this is a strict requirement, as negatives can also appear prefixed to the verb despite the
presence of an auxiliary. Some examples are given here:
4.3 The d̪ -prefix
Certain verbs in Burushaski take the prefix d̪ - which has a varying degree of semantic
impact on the verbs that it is attached to. Phonologically, it is followed by an ambiguous
vowel which is subject to vowel harmony (Anderson 2007: 1248–1250, Anderson 1997,
Munshi 2006: 194–197). The exact function of the d̪ -prefix remains somewhat unclear.
On both transitive and intransitive verbs, Bashir (1985: 21, 2000a: 4–11) claims that it
Burushaski 125
brings a focus to the result and defocuses the actors. In my data the d̪ -prefix has two main
functions. The first is associated with a slight change in meaning, such as in adding focus
or transitivity, and the second is associated with the derivation of a completely different
word. The contrast exemplified in (29) and (30) shows the first type, and that of (31) and
(32) shows the second type. Also note that the d̪ -prefix does not create an intransitive verb
in (33), despite some claims in the literature that it is an intransitivizing prefix.
(29) d̪ uuɻ-imi
melt-3s.pst
‘it melted.’
(30) d̪ e-s-t̪ uɻ-am
d̪ V-trn-melt-1.pst
‘I melted it.’
(31) i-man-imi
3-become-3s.pst
‘He became (something).’
(32) d̪ -i-man-imi
d̪ -3s.m-become-3s.pst
‘He was born.’
(33) un-e limpi kisar d̪ -i-c-aa
2-erg Limpi Kiser d̪ -3s.m-bring-2s
‘You brought Limpi Kiser.’ (Limpi Kiser 108)
A good list of verbs that occur in two forms, one with the d̪ -prefix and one without,
along with their corresponding glosses, can be found in Bashir (1985: 21–23). Other
comparative lists of d̪ -prefix verbs and analyses of the d̪ -prefix paradigm can be found in
Munshi (2006:194–197), Anderson (2007: 1248–1250), and Bashir (2000a). It is import-
ant to remember that the d̪ -prefix appears semantically to be a bleached affix, and it is
not clear if further investigation will help to specify its meaning more clearly, other than
revealing further bleaching and loss of former function.
The verb ‘to come’, a verb formed without a stem where the d̪ -prefix and suffix are
affixed to /Ø/, is one of the stranger oddities in Burushaski. It would seem that the d̪ -
prefix alone is responsible for conveying the meaning of the Ø-stem verb ‘to come’ while
the suffix attaches itself to a stem that is not phonologically analyzable.
126 Alexander D. Smith
5 CASE MARKING
Morphologically, case marking differentiates between subject and object in transitive
sentences in two different ways in Burushaski. The first is through use of the ergative
marker which appears on the agent noun phrase. The second way that Burushaski differ-
entiates agent from object is through its verb agreement patterns which mark the subject
of the sentence. Nouns are marked as ergative or absolutive, while verbs have agreement
suffixes that agree with the nominative subject. This type of difference has been described
in Mallinson and Blake (1981), Kroeger (2004: 283), and Simpson (1983; 1991: 155–
161). Specifically, Mallinson and Blake identify twelve theoretical types of case marking
in languages. Their type eight, where noun marking is ergative and verbal agreement is
nominative, is similar to the type seen in Burushaski. Suffix agreement on the verb is
described in Anderson (2007:1258 and Wilson 1996: 4) as being in agreement with the
grammatical subject of the sentence. Verbal suffixes agree with the intransitive subject
and with the transitive agent. I discuss first the different agreement affixes on the verb and
how they align with participants. After, I discuss the case marking of nouns.
TABLE 5.3
VERBAL AGREEMENT SUFFIXES IN PERFECTIVE, FUTURE, PAST, AND
PERFECT
Perfective Future Past Perfect
TABLE 5.4
VERBAL AGREEMENT SUFFIXES IN PRESENT, PLUPERFECT, AND
IMPERFECT
Present Pluperfect Imperfect
(Munshi 2006: 185, Munshi 2015, Lorimer 1935, Berger 1998: 144–158). Tables are
followed by several examples which show the suffixes and how they agree with the gram-
matical subject of the clause with nominative-accusative alignment.
These examples show that the agreement suffixes in Burushaski agree with the subject,
that is, with the agent of a transitive sentence and with the subject of an intransitive sen-
tence. Noun phrases, on the other hand, have an ergative system of case marking and are
described in more detail in Section 6.1.
1st a- mi-/me-
2nd gu-/go- ma-
3rd M/bi/bila i-/e- u-/o-
3rd F mu-/mo- u-/o-
Bashir (1985: 9–12) claims that the prefixes follow active language patterns of agree-
ment. Smith (2012) analyzes the verbal prefixes as differential object markers. Table 5.5
lists the agreement prefixes followed by several examples that show the prefixes in use.
These examples all show intransitive verbs, where the prefix agrees with the intran-
sitive subject. Other examples show that the prefix also often agrees with the patient in
transitive constructions. This is why the prefix has sometimes been analyzed as absolu-
tive agreement. Examples (49) through (51) show transitive sentences with patient agree-
ment on the verb.
There are also constructions where the prefix agrees with the agent itself. Take for
example the instance of experiencer marking in example (52). The prefix agrees with the
subject of the sentence which is also marked directly as ergative. When asked to change
the prefix to agree with the absolutive, the sentence becomes ungrammatical, as shown in
example (53). This would not be expected if the prefix showed true absolutive alignment,
since it is agreeing with an agent.
Verb prefixes typically only agree with animate arguments. Verbs such as ‘eat’ are
almost always associated with inanimate patients and thus do not trigger verb agreement.
However, when the patient is changed from inanimate to animate with the same verb, the
prefix appears, agreeing with the animate patient. Munshi uses the following examples
to show how the animacy constraint on patients works to deny prefix realizations with
inanimate patients but allows animate prefix agreement on the same verb:
In recipient constructions (where the dative is used to mark the recipient), the verbal
agreement prefix agrees with the dative and not with the absolutive. In these constructions,
the absolutive has no agreement marking on the verb (Munshi 2006: 138–140). Example
(56) shows this type of recipient agreement. In the second example (57), attempting to
remove the prefix (which is expected if the prefix must agree with the inanimate absolu-
tive ‘book’ as Ø) creates an ungrammatical sentence.
All intransitive constructions from the texts which have prefix agreement with the
subject are either sate of being constructions or changes in state of being constructions.
130 Alexander D. Smith
However, when the verb involves an event; something happening; describing an action
that the animate subject of an intransitive sentence is actively doing; something along the
lines of walking, running, going, flying; etc., there is no prefix on the verb and thus no
double marking. Consider the following examples (58) through (62), where the subject is
an active participant in the action and cannot be labeled as being affected by the verb in
question. There is no prefix in these cases.
(58) i-gurč-imi
3-submerge-3s.pst
‘He drowned.’ (Bashir 1985: 16) (YB)
(59) gurč-imi
Submerge-3s.pst
‘He dived (to clean his body).’ (Bashir 1985: 16) (YB)
(60) in d̪ aɣa-mo
3s.f hide-3s.f.pst
‘She hid.’
(61) ǰe muarar akʰole huruš-a baa
1 forever here stay-1 AUX.1s.pres
‘I stay here forever.’ (Baadil Jamal 38)
(62) um-e šugulo gaarc-imi
2-gen friend run-3s.pst
‘Your friend ran.’ (Munshi 2006: 130) (J&K)
In these examples, the presence or absence of the verb prefix is determined by whether
the sentences are unergative or unaccusative. The prefix in intransitive sentences appears
on the verb and agrees with the subject of unaccusative verbs, but it does not appear on
the verb or agree with the subject of unergative verbs. Unaccusative verbs are intransitive
verbs where the subject is not an agent, and unergative verbs are intransitive verbs where
the subject acts as an agent or has some sort of intent or initiation with the action being
performed.3 There are four rules that determine whether these prefixes appear on the verb,
which are shown below:
From these examples, it is clear that the classic analysis where the pronominal prefixes
are analyzed as agreeing with the absolutive is not satisfactory. It fails to explain why the
prefix does not agree with the absolutive if an experiencer or recipient argument is pres-
ent, and it does not explain why the prefix fails to appear if the subject of an intransitive
sentence in unergative. Differential object marking (Escandell-Vidal 2009, Haspelmath
2005, Hoop and Malchukov 2007, McGregor 1998, Aissen 2003) seems to be a better
description of the Burushaski pronominal prefixes. Differential object marking is a means
by which languages mark unexpected noun phrases. Speakers expect that most often
objects will be inanimate, that subjects will not be the affected party of a construction,
Burushaski 131
and that the subject will be the agent (Haspelmath 2005: 9). If these expectations are
not met (if the object is animate, or if the subject is the experiencer, recipient, or if the
subject is unaccusative), then the verb will have a prefix that agrees with the unexpected
participant.
6 THE NOUN
This section addresses the rich morphology of Burushaski nouns. Nouns can take a num-
ber of affixes, including pronominal prefixes and case suffixes. They are separated into
four classes, and are inflected for number (singular and plural). I begin with about case
marking on the noun, then continue the discussion from the previous section on case and
agreement affixes on the verb.
According to Tiffou and Morin (1982) and Lorimer (1935: 65), Burushaski is said to
have split ergativity between non-future versus future tenses. The split system itself can
be seen in noun morphology, where the ergative marker /-e/ appears only after agents in
past and present tenses and is absent in future tense. Past tense ‘I saw the girl’ displays
the ergative marker on the first person singular pronoun /je/, /je-e/, while future tense ‘I
will see the girl’ does not have this marker. Examples (68) and (69) come from Lorimer
(1935) and show the same pattern on the second person singular pronoun /un/.
The ergative marker also appears in contexts where its usage is pragmatic, e.g. to dis-
tinguish between volitional and non-volitional actions. The ergative marker can both
appear unexpectedly in intransitive sentences and can be absent in transitive sentences.
In examples (70) and (71), the ergative marker alone is used to denote a volitional action,
even though the sentence is intransitive in both examples. In example (72), the ergative
marker is absent on the agent, though this may be because the agent is not animate. But
in at least one example, shown in example (73), in the Yasin dialect, the ergative appears
on an intransitive agent.
Apart from ergative case, the noun can also take dative, genitive, locative, instrumen-
tal, and ablative case marking suffixes. As stated above, the genitive is homophonous
with the ergative marker, except for genitive feminine pronouns which take the special
-mo suffix. Table 5.6 below lists all of the Burushaski nominal case markers. Note that
the absolutive is listed in some works as a Ø suffix, but I have chosen to simply exclude
the absolutive from this chart, since it has no phonetic form.
TABLE 5.6 NOMINAL CASE MARKING
Ergative -e, -Vː
Genitive -e, -Vː, -mo
Locative -ulo, -ate
Instrumental -ate
Ablative -um, cum
Dative -ar
Burushaski 133
The behavior of the dative case morpheme is very interesting. It can be used to mark the
subject noun phrases of so-called experiencer constructions (see Munshi 2006: 133–134, Bha-
tia 1990, Masica 1976, Mishra 1990, Verma and Mohanan 1990 for more on dative subjects
in South Asia). Some examples of this phenomenon follow (from Munshi 2006: 133–134).
(76) mu-riin-um
3s.f-hand-ptcp
‘her hand’(Baadil Jamal 90)
(77) i-riin-um
3-hand-ptcp
‘his hand.’ (Baadil Jamal 71)
(78) a-kayuwa
1-children
‘my children’ (Daddo Puno 47)
1st a- mi-/me-
2nd gu-/go- ma-
3rd M/bi/bila i-/e- u-/o-
3rd F mu-/mo- u-/o-
134 Alexander D. Smith
6.3 Plural
Burushaski nouns can also take plural suffixes, which vary depending on the class and
phonetics of the noun. Tiffou (1993) claims that there are close to 50 suffixes for marking
plural in Burushaski. As stated earlier, there are four classes: male human; female human;
bi class (for animals and tangible, countable items); and bila class (for nouns which are
typically non-concrete items, sometimes referred to as non-countable) (Munshi 2006:
185, Munshi 2015, Lorimer 1935, Berger 1998: 144–158). Munshi (2015) contains a list
of the most common suffixes, which is reproduced in Table 5.8, and the class of each
noun is marked. Note that the plural suffixes come in three types: (1) male and female; (2)
male, female, and bi class; and (3) bila class. Thus, there is no male-only or female-only
class of nouns, since classes (1) and (2) can include both male and female nouns.
7 CONCLUSION
Although this chapter is focused mainly on the morphology of Burushaski, a number
of other topics were also considered, though many others remain. Burushaski has been
the topic of a large and growing number of works dealing mostly with its interesting
morphosyntax. Projects such as the Burushaski Language Documentation Project show
a strong interest in long-term documentation. The language is still in need of a compre-
hensive grammar (Lorimer 1935 is outdated, and the Burushaski Documentation Project
as well as Berger 1998 have a Hunza/Nager focus). Despite some claims to the contrary,
NOTES
1 Data used in this paper comes from two main sources. First, many examples come
from texts available online at the Burushaski Language Documentation Project. Those
examples come from several texts, “Baadil Jamal”, “Limpi Kiser”, “Daddo Puno”,
“Chine Maghuyo”, “Mattum Ke Burum”, and “Hamale Khattun.” In this paper, exam-
ples from these texts are cited by story and line number. Other examples have no
citation and were elicited directly from Piar Karim, a native Burushaski speaker and
friend, who was working and studying at the University of North Texas while I was
writing my master’s thesis. Data from other sources is cited accordingly. The majority
of this material is from the Hunza dialect, but some examples are from Yasin, Nagar,
and Jammu and Kashmir Burushaski. Examples from a dialect other than the Hunza
dialect are marked accordingly.
2 Sagart’s response to Sino-Caucasian was to propose Sino-Austronesian, which itself
was criticized in Blust 1995. Overall, neither the Sino-Caucasian, Dene-Caucasian, nor
Sino-Austronesian hypotheses are widely accepted by historical linguists specializing
in any of these specific language families. Burushaski seems to have been taken along
for the ride in an argument between distant genetic relationships involving Sinitic.
3 See Wilson (1996) for another discussion of unaccusative and unergative influence in
Burushaski.
4 It has been brought to my attention that dative subjects in Germanic languages have
good support for being treated as subjects, but don’t necessarily satisfy agreement.
This also casts doubts on the conclusions of Smith (2012), which does not treat dative
subjects as true subjects in Burushaski.
REFERENCES
Aissen, Judith. 2003. Differential Object Marking: Iconicity vs Economy. Natural Lan-
guage and Linguistic Theory 21: 435–483.
Anderson, Gregory. 1997. Burushaski Phonology. Phonologies of Asia and Africa
(Including the Caucasus), ed. by A. Kay and P. Daniels, 1021–1041. Winona Lake:
Eisenbrauns.
Anderson, Gregory. 2007. Burushaski Morphology. Morphologies of Asia and Africa, ed.
by A. Kaye, 1233–1275. Winona Lake: Eisenbrauns.
136 Alexander D. Smith
Bashir, Elena. 1985. Toward a Semantics of the Burushaski Verb. Conference on Partici-
pant Roles: South Asia and Adjacent Areas, ed. by A. Zide, D. Magier, and E. Schiller,
1–32. Bloomington: Indiana University Linguistics Club.
Bashir, Elena. 2000a. The d-prefix in Burushaski: Viewpoint and Evidentuality. Paper
presented at the 36th international conference of Asian and North African studies,
Montreal. August 27.
Bashir, Elena. 2000b. A Thematic Survey of Burushaski Research. History of Language
6: 1–15.
Bengtson, John D. 2008. Materials for a Comparative Grammar of the Dene-Caucasian
(Sino-Caucasian) Languages. Aspects of Comparative Linguistics 3: 45–118.
Bengtson, John D. and V. Blažek. 1995. Lexica Dene – Caucasica. Central Asiatic Jour-
nal 39: 11–50, 161–164.
Berger, Hermann. 1956. Mittelmeerische Kulturpflanzennamen aus dem Burushaski.
Indo-Iranian Journal 3: 17–43.
Berger, Hermann. 1998. Die Burushaski-Sprache von Hunza und Nager. Wiesbaden:
Harrassowitz.
Bhatia, Tej K. 1990. The Notion of ‘Subject’ in Punjabi and Lahanda. Experiencer Sub-
jects in South Asian Languages, ed. by M. Verma and K.P. Mohanan, 181–194. Stan-
ford: Center for the Study of Language and Information.
Bleichsteiner, Robert. 1930. Die werschikisch-burischkische Sprache im Pamirgebiet und
ihre Stellung zu den Japhetitensprachen des Kaukasus [The Werchikwar-Burushaski
language in the Pamir region and its position relative to the Japhetic languages of the
Caucasus]. Wiener Beiträge zur Kunde des Morgenlandes 1: 289–331.
Blust, Robert 1995. An Austronesianist Looks at Sino-Austronesian. The Ancestry of the
Chinese Language. Journal of Chinese Linguistics Monograph Series Number 8, ed.
by William S-Y. Wang, 283–298. Berkeley.
Campbell, Lyle. 1997. American Indian Languages: The Historical Linguistics of Native
America. Oxford: Oxford University Press.
Čašule, Ilija. 1998. Basic Burushaski Etymologies. (The Indo-European and Paleobal-
kanic Affinities of Burushaski). Munich-Newcastle: Lincom Europa.
Čašule, Ilija. 2003a. Burushaski Names of Body Parts of Indo-European Origin. Central
Asiatic Journal 47: 15–74.
Čašule, Ilija. 2003b. Evidence for the Indo-European laryngeals in Burushaski and
Its Genetic Affiliation with Indo-European. The Journal of Indo-European Studies
31(1–2): 21–86.
Čašule, Ilija. 2009a. Burushaski Shepherd Vocabulary of Indo-European Origin. Acta
Orientalia 70: 147–195.
Čašule, Ilija. 2009b. Burushaski Numerals of Indo-European Origin. Central Asiatic
Journal 53, no. 2: 163–183.
Čašule, Ilija. 2010. Burushaski as an Indo-European “Kentum” Language: Reflexes of
the Indo-European Gutturals in Burushaski. Munich: Lincom.
Čašule, Ilija. 2012. Correlation of the Burushaski Pronominal System with Indo-European
and Phonological and Grammatical Evidence for a Genetic Relationship. The Journal
of Indo-European Studies 40(1–2): 59–153.
Čašule, Ilija. 2016. Evidence for the Indo-European and Balkan Origin of Burushaski.
(LINCOM Etymological Studies, 5.) Munich: LINCOM GmbH.
Escandell-Vidal, Victoria. 2009. Differential Object Marking and Topicality: The Case of
Balearic Catalan. Studies in Language 33, no. 4: 832–884.
Burushaski 137
Goddard, Ives 1996. The Classification of the Native Languages of North America. Lan-
guages. Vol. 17 of William Sturtevant, ed., Handbook of North American Indians, ed.
by I. Goddard, 290–324. Washington, DC: Smithsonian Institution.
Hamp, Eric P. 2013. The Expansion of the Indo-European Languages: An Indo-
Europeanist’s Evolving View. Sino-Platonic Papers 239: 1–14.
Haspelmath, Martin. 2005. Universals of Differential Case Marking. Handout presented
at the 2005 LSA Institute: Dialogues in Grammatical Theory, Experiment, and Change.
http://wwwstaff.eva.mpg.de/~haspelmt/2.DiffCaseMarking.pdf.
Holst, Jan Henrik. 2014. Advances in Burushaski linguistics. Tübingen: Narr.
Holst, Jan Henrik. 2015. The Origin of Burushaski – An Extremely Brief Report. Unpub-
lished paper, University of Hamburg.
Holst, Jan Henrik. 2016. Typological Features of Burushaski and Their Context. Abstract
of paper to be presented at the “Workshop on typological profiles of language families
of South Asia”, Uppsala, Sweden, September 15–16, 2016.
Hoop, Helen and Andrej Malchukov. 2007. On Fluid Differential Case Marking: A Bidi-
rectional Approach. Lingua, 117: 1636–1656.
Karim, Piar. 2013. Middle Voice Construction in Burushaski: From the Perspective of a
Native Speaker of the Hunza Dialect. Unpublished M.A. thesis, Department of Lin-
guistics, University of North Texas, Denton, TX.
Kroeger, Paul. 2004. Analyzing Syntax. Cambridge: Cambridge University Press.
Lorimer, David. 1935. The Burushaski Language. Oslo: H. Aschehoug.
Mallinson, Graham and Barry Blake. 1981. Cross-Linguistic Studies in Syntax. Amster-
dam: North-Holland.
Masica, Colin P. 1976. Defining a Linguistic Area. Chicago: University of Chicago Press.
McGregor, William. 1998. “Optional” Ergative Marking in Gooniyandi Revisited: Impli-
cations to the Theory or Marking. Leuven Contributions in Linguistics and Philology
87: 491–571.
Mishra, Mithilesh K. 1990. Dative/Experiencer Subjects in Maithili. Experiencer Sub-
jects in South Asian Languages, ed. by M. Verma and K. Mohanan, 105–118. Stanford,
CA: Center for the Study of Language and Information.
Morin, Y.-Ch. and Etienne Tiffou. 1988. Passives in Burushaski. Passives and Related
Constructions, ed. by M. Shibatani, 493–524. Amsterdam: John Benjamins.
Munshi, Sadaf. 2006. Jammu and Kashmir Burushaski: Language Contact and Change
Unpublished doctoral dissertation. University of Texas, Department of Linguistics,
Austin.
Munshi, S. 2015. A Grammatical Sketch of Burushaski. Burushaski language documenta-
tion project: www.ltc.unt.edu/~sadafmunshi/Burushaski/language/grammaticalsketch.
html.
Ruhlen, Merritt. 1994. On the Origin of Languages: Studies in Linguistic Taxonomy.
Stanford: Stanford University Press.
Sagart, Laurent. 1993. Chinese and Austronesian: Evidence for a Genetic Relationship.
Journal of Chinese Linguistics 21: 1–62.
Sagart, L. 1994. Proto-Austronesian and Old Chinese Evidence for Sino-Austronesian.
Oceanic Linguistics 33, no. 2: 271–308.
Simpson, Jane. 1983. Aspects of Warlpiri Morphology and Syntax. Cambridge: MIT Uni-
versity Press.
Simpson, J. 1991. Warlpiri Morpho-Syntax: A Lexicalist Approach. Dordrecht: Kluwer
Academic.
138 Alexander D. Smith
OTHER ISOLATED
LANGUAGES OF ASIA
Stefan Georg
(Arctic Ocean)
Chukchi
Yukaghir
Itelmen
Moscow
(Pacific Ocean)
Ket
Nivk
Krasnoyarsk
Kazakhstan
China
Mongolia
1000 km
600 mi
MAP 6.1 ISOLATED LANGUAGES AND MEMBERS OF SMALL FAMILIES OF NORTH ASIA
Source: Base map adapted from http://d-maps.com/carte.php?num_car=24964&lang=de.
Smith in this volume), Kusunda in Nepal, and the still enigmatic Nahali/Nihali language
of West Central India have defied all attempts to subsume them in one of the bigger fam-
ilies of the world, although many such attempts have been undertaken and continue to be
published. To these we might, with due caution, add the Great Andamanese language on
the archipelago of this name, briefly discussed in Section 3.3.
Proto-Yeniseic
words and paradigms, no texts) could still be collected by M.A. Castrén in the 1840s.
Assan may have simply been identical with this language. Note that, in spite of the super-
ficial resemblance, the name Kott is etymologically not connected with the name Ket;
these two languages do not belong to the same subgroup of the family. Yugh is often
referred to in the literature as Sym-Ket, as opposed to Imbatsk-Ket, after the names of
two prominent rivers. It disappeared in the late 1980s. The number of persons with some
degree of fluency in one of the still extant three dialects of Ket (Southern, Central, and
Northern) can, with some optimism, be estimated at ca. 500 out of an ethnic population
of ca. 1000. The name Ket is a Soviet neologism, based on the noun keʔd ‘human’. Curi-
ously, when using Russian, speakers often refer to their ethnic group as ketó, which is
actually the vocative form of this noun. The name most often used for the language in
Ket itself is òstɨk(an) qà ‘Ostyak language’, which gives some justification to the name
‘Yenisei-Ostyak’ found in the earlier literature; the name Ostyak has a variegated history
as autonym and exonym in the region, and, without ‘Yenisei-’, Ostyak is mostly used
for the Ob’-Ugric (Finno-Ugric, Uralic) ethnic group and language nowadays usually
referred to as Khanty. Yugh differs from Ket mainly in terms of a handful of regular
sound changes, whereas its grammatical system corresponds largely to that of Ket. Kott,
the only other Yeniseian language for which more than very sketchy morphological data
are available, shows a rather different typological makeup in many subsystems (e.g. a
largely suffixing verbal morphology, as opposed to almost exclusively prefixing Yenisei-
Ostyakic (Ket and Yugh)).
The Yeniseic family and its sole survivor stick out from their neighbouring languages,
in fact from all languages and families of native Siberia, by a great number of typolog-
ical peculiarities and areal singularities. These make present-day Ket a true island in
this vast territory, which is mostly occupied by typically “Ural-Altaic” languages of the
exclusively suffixing, agglutinative SOV type, which is so characteristic of Northern and
Central Asia at large.
On the phonological level, a relatively unspectacular system of vocalic and conso-
nantal segmental phonemes, with no “exotic” elements (but with some asymmetries in
the distribution of the feature [+voice] among stops, cf. Tables 6.1 and 6.2), is accom-
panied by a system of syllabic tones, which is without parallel in all of Northern and
Central Asia.
142 Stefan Georg
i ɨ u
e ǝ o
a
Stop b td k q
Fricative s h
Continuant 1 j
Nasal m n ŋ
Note the absence of /p/, which is quite typical for Siberian (especially “Altaic”) lan-
guages. The language lacks a phonemic /r/ (Southern Ket shows phonetic, flapped, [r]
as the realization of intervocalic /d/). Ket phonotactics is further characterized by the
avoidance of initial liquids (i.e. /l/) and the near-absence of initial nasals (/n/ and /m/),
which closely resembles the similar constraint found in (the native vocabulary of) Turkic
languages.
Ket syllabic tones manifest themselves only on monosyllabic words, most of which
are nominative forms of nouns; the system consists of four tonal units, which can be
symbolized as V̅ (half-long, level or slightly rising); Vʔ (short rising-falling, with glot-
tal constriction or “creakiness”); VV (long rising-falling, lower than Vʔ); and V̀ (short,
sharply falling), respectively. When suffixes are added, the distinctive prosodic quality
of the root vowel is lost and gives way to a stress-like system involving the first two syl-
lables of the phonological word (noted as V́ -V or V̀ -V, for details cf. Georg 2007:47–61,
and Vajda 2000).
Ket nouns show a typically Eurasian case system, with the exception that personal pos-
session is expressed by prefixes. A representative case paradigm is that of the feminine
noun qīm ‘woman’ (Table 6.3, cf. Georg 2007:104). Note that the dative, benefactive,
ablative, and adessive are based on the genitive – this is found in all genders and num-
bers. This kind of ‘two-story’ case system resembles similar phenomena in some Uralic
languages or, then, in Indo-European Tokharian. Animate nouns do not take a locative;
the suffix found on inanimate nouns is -ka.
Grammatical gender is covert in nominative/citation forms and manifests itself in the
genitive and the cases based on it: -da in masc. sg.; -di in fem. sg., neuter sg., and neuter
pl.; and -na in masc. and fem. (= animate) pl.
Ket case suffixes are often described as clitics, rather than true suffixes, because they
may “float”, i.e. they can occur without being attached to a noun or pronoun, when this
“head” is present in the immediately preceding discourse context, cf.:
Noun plurals are mostly formed by means of the phonotactically altering suffixes -n
and -ŋ, but many idiosyncratic and, for the macro-region, quite unusual techniques of
Other isolated languages of Asia 143
plural-formation exist, including the change of root vowels (diʔ, pl. daʔn ‘log’); tones
(ēj, pl. èj ‘tongue’, áluk, pl. àluk ‘yoke’); both (tēd, pl. tátn ‘husband’); or complete sup-
pletion (ōks, pl. aʔq ‘tree’); all of these techniques are highly exceptional for languages
of the Siberian and Central Asian linguistic area. This is also true of the use of prefixes
for possessors (Table 6.4), which are historically and etymologically connected with the
personal pronouns (Table 6.5).
The verb is the most salient, and complicated, part of speech in Ket, and it has taken
scholarship a long time to disentangle most of its intricacies. Only a rough sketch of Ket
verbal morphology can be given here. Any Ket verb (cf. Table 6.6) will contain a lexical
root (R), found at the end of the morpheme chain (and only followed by a subject plural
marker, PL). The lexical core of the verb may be compounded, in which case one of the
elements is found further left in the morpheme chain (at position P7; this position also
hosts incorporated elements, which may be patients/(effected) objects/targets of change
or, most frequently, instruments; the highly productive causative marker /q/ also occu-
pies this position). Also part of the lexical makeup of verbs are determiners (also known
as preverbs or adpositions, in position P5), morphemes consisting of a single consonant
(k, n, h etc.), with sometimes vaguely determinable functions, but often simply lexicalized.
144 Stefan Georg
The positions P6, P4, P3, and P1 host actant markers, indicating at least the person (and,
sometimes, the gender) of the main actant (subject, agent), or the object/patient of transi-
tives. However, a verb with two actant markers may cross-reference the sentential subject
twice in its morpheme chain. The choice of actant markers and their distribution over
the morpheme chain is largely lexically determined and leads to the classification of
Ket verbs into five different conjugations. P4 may also contain the “thematic” vowel /a/,
whose function is unclear, but which is, when present, routinely labialized to /o/ in past
tense forms. The past tense morpheme itself, in most cases /il/ or /in/, occupies position
P2, which, since P1 is only filled in a small subset of verbs, more often than not occurs
immediately before the root (R).
No Ket verb form can be correctly parsed (or formed) without recourse to one of the
numerous morpho-phonotactic (sometimes referred to as “morphotactic” in the specialist
literature) rules of the language. Such rules determine the deletion or insertion of vowels
and/or consonants, governed by the actual presence of morphological material in a certain
(and lexically defined) subset of the ten morpheme slots (and not by any purely phono-
logical feature of the verb form). The detection of morphotactical rules by Vajda (2001a)
allowed for the first time a full understanding of the surface realizations of the absolute
majority of Ket verb forms, while earlier descriptions often had to go as far as to call
practically every Ket verb “irregular”. Georg (2007, 203–215) enumerates 11 truncation
rules, 5 vowel insertion (anaptyxis) rules, and 12 separator rules. Only examples for such
rules can be given here, cf. Truncation Rule 1 (Georg 2007: 205):
Any P8 subject marker (≠ 3SGf. /da-/)) loses its vowel when immediately standing
before the preterite marker /il/, /in/, i.e. in the configuration P8-P2 – truncated mate-
rial marked by []:
This notation includes, for transparency’s sake, phonetic material elided by morphotac-
tic rules in square brackets and material added by such rules in round brackets; thus, the
surface realization of the 3PL form in the present tense column of Table 6.8 is /dajaŋtan/
and that of the 3SGf form in the past tense is /dolatan/.
A comprehensive guide to the literature on Yeniseian is Vajda (2001b), and descriptive
grammars of Ket include Werner (1997a), Vajda (2004), and Georg (2007). The lexical
stock of all known Yeniseic languages, past and present, is collected in Werner (2002),
with an English index). Data on Yugh are practically only found in Russian and German
publications, an exception is Werner (2012), which also contains verbal paradigms. Cas-
trén’s Kott data from the 1840s are analyzed and discussed in Werner (1997b).
2.2 Nivkh
Nivkh – ethnic autonym ñivxgu (Amur), ñiɣvŋun (Eastern Sakhalin) – or Gilyak, as it
used to be called in the earlier literature, is a small cluster of two fairly divergent variants,
which may or may not be viewed as separate languages; for those who see these varieties
as separate languages, the name Amuric has been proposed as that of the small family.
The two main variants, Amur Nivkh (in the vicinity of the estuary of the Amur River in
the Russian Far East) and (Eastern, Northern, and Southern) Sakhalin Nivkh (on Sakhalin
Island in the Sea of Okhotsk), are reported as mutually unintelligible; differences pervade
all linguistic subsystems; Table 6.13 below shows some cognates and non-cognates of
these two main variants.
Out of a population of ca. 5,000 ethnic Nivkhs, at most 700 individuals still use the
language (ca. 200 on the mainland/Amur and ca. 500 on Sakhalin), which translates into
a language retention rate of only ca. 14% (data from 1926 report a language retention rate
until then of approximately 100%).
Phonologically, Nivkh sticks out from all its neighbours (Ainu, Tungusic languages,
Japanese) by a particularly rich consonant system with an areally unusual contrast of
voiced, voiceless, and voiceless aspirated stops and an unusually large inventory of
voiced and voiceless continuants, cf. Table 6.10.
The (Eastern) Sakhalin dialect adds to this the bilabial approximant /w/.
Another areally salient feature of Nivkh phonology is its tolerance for word-initial
and, especially, word-final consonant clusters (cf. Eastern Sakhalin pʿriγŋ azmt’ ‘lover’,
mχokr muγf ‘holiday’, řəřkř ‘key’).
Nivkh is widely known for its system of initial consonant mutations, which require
the systematic alternation of initial consonants in reaction to the final consonant of a pre-
ceding morpheme. The consonants (all consonant phonemes, except the nasals, sonants
and the laryngeal /h/) are grouped into alternation series like /t ~ r ~ d/ (there are about
20 of such series). A noun or verb which (outside of any syntactic context, as “citation
form”) begins with the first member of such a set changes this anlaut in syntactic con-
structions (not in accidental juxtapositions) with preceding morphemes (attributes in the
case of nouns, direct objects in the case of verbs) to one of the other members of the
set. The operating rules are originally assimilatory in nature, but the loss of final con-
sonants and other processes have partly obscured the original contexts. Examples for
such alternations are, from the Amur variant (Gruzdeva 1998, 14; cf. also Mattissen and
Drossard 1998, 9f., Comrie 1981, 267, Beffa 1982, 60–64 for more in-depth treatments
and attempts to rationalize the system):
/p ~ v ~ b/
pəñx ‘soup’ –
t’us pəñx ‘beef soup’, čo vəñx ‘fish soup’, ova bəñx ‘flour soup’ (cf. East Sakha-
lin ofaŋ)
/z ~ t’ ~ d’/
zosq- ‘to break’ –
laq zosq- ‘break a ski’, luvr t’osq- ‘break a spoon’, ŋir d’osq- ‘break a cup’ (East
Sakhalin ŋirŋ)
A peculiar consequence of this system is that all transitive verbs begin with a fricative
phoneme, obviously due to the generalization of the form with preceding object, cf. pəkz-
‘to get lost’ versus vəkz- ‘to lose’ (Gruzdeva 1998, 12).
The personal pronouns of Nivkh are given in Table 6.11.
TABLE 6.13 LEXICAL COGNATES AND NON-COGNATES IN THE MAIN NIVKH VARIANTS
Amur Nivkh Eastern Sakhalin Nivkh
‘nest’ ŋəvi ŋavi
‘reindeer’ čʿolŋi tlaŋi
‘wolf’ liγs liγř
‘child’ oγla eγlŋ
‘search’ ŋəŋd’ ŋanγd
‘man’ utku azmət’
‘woman’ umgu řaŋγ
‘snow’ ŋaqr qʿavi
‘who’ aŋ nař
‘what’ sid’ nud
Nivkh has an unusually complex numeral system, which distinguishes at least 24 series
of numerals according to the type of the counted objects. Thus, the numeral ‘three’ has
the surface forms (Amur):
t’em (boats) t’eř (sledges)
t’ar (dried fish) t’ma (a measure, span)
t’a (fathoms) t’əγvi (bundles of dog food)
t’ŋaq (crops) t’χos (crops, diff. kind)
t’for (nets) t’fat (ropes for hunting)
t’eu (eyes of the fishing net) t’eo (fishing nets)
t’eřqe (strips) t’laj (a. k. o. ropes)
t’e(γ)it (fingers when measuring the thickness of fat) t’ezču (families)
t’la (poles) t’avr (places)
t’fasq (paired objects) t’et’ (wooden boards)
t’rax (thin, flat objects) t’ex (long objects)
t’aqr (people) t’or (animals)
(after Panfilov 1968, 414–415)
In some, but by no means in all, of these numerals, the noun for the counted object can
still be discerned in the rhyming portion of the word.
Other isolated languages of Asia 149
TABLE 6.15
LEXICAL DIFFERENCES IN KOLYMA AND TUNDRA YUKAGHIR BASIC
VOCABULARY
TABLE 6.16 NOMINAL CASE SUFFIXES IN THE TWO MAIN VARIANTS OF YUKAGHIR
Kolyma Yukaghir Tundra Yukaghir
The rather complicated system of Yukaghir focus has been repeatedly, and extensively,
treated in the linguistic literature, cf. Maslova (1997), Schmalz (2012).
Another Palaeoasiatic group, this time certainly a family, are the Chukchi-Kamchatkan
languages, comprised of Chukchi, Koryak, Alyutor, and (extinct) Kerek in the far North-
East of Russia (with Chukchi as the northeasternmost language of Eurasia, facing Alaska
Other isolated languages of Asia 151
over Bering Strait, and the other languages found further south, centering on Kamchatka).
In Georg and Volodin (1999), it was argued that its fifth member, Itel’men (or Kamchadal,
in the southwest of Kamchatka), should, perhaps, be taken out of this family and given the
status of a language isolate. Though the differences between Itel’men and the Chukchi-
Koryak (Chukchi, Koryak, Kerek, Alyutor) languages remain significant and deserve fur-
ther study, it is possibly safe to say that our skepticism was not justified, and that Itel’men
is best regarded as a – highly aberrant – member of this family after all.
The status of Korean and Japanese, both of course major languages of the world (with
upwards of 78 and 127 million speakers, respectively) and well studied from all angles,
needs to be briefly discussed here.
First of all, it should be mentioned that both are actually members of (small) language
families – Japanese forms, together with the languages of the Ryukyuan Islands (with
Okinawan being the best-known variant, cf. Michinori and Pellard 2010), the Japonic
family, and the highly aberrant characteristics of the Korean “dialects” of the island of
Jeju in the south and the Hamkyeng Mountains in the nort-east justify their treatment as
languages in their own right and thus speaking of a “Koreanic” language family. Japanese
may have a linguistic relative on the Asian continent, but the language of the Koguryo
empire (37 bce – 668 ce) is only known from small wordlists, and the controversial
discussion about their linguistic interpretation is still going on (for the position that the
Koguryo language is an early form of Japonic cf. Beckwith 2004, and for the opinion that
it is Koreanic instead, cf. Vovin 2005a).
Apart from this, it is true that both languages or small families have often been
regarded as genealogically related to each other or, more frequently, as members of a
larger grouping, “Altaic”, which is said to consist of the Turkic, the Mongolic, and the
Manchu-Tungusic languages. If validated, this would be one of the major language fam-
ilies of Eurasia, ranging from Eastern Europe (with the westernmost fringes of Turkic)
to the Pacific Ocean. However, for many specialists, it is by no means validated, and
for a sizable number of informed observers, the Altaic hypothesis is outright wrong and
the numerous commonalities between the languages involved are best explained in an
areal framework. The debate, often hotly conducted, continues with unabated vigour.
General-interest (and also some linguistic) encyclopediae and handbooks very often treat
“Altaic” (with or without Korean and Japanese) as an unquestioned given, but the reality
is certainly much more complex.
For a first approximation to this highly complicated and involved scholarly discussion
of the classificatory issues surrounding Korean and Japanese (and, of course, the other
“Altaic” languages), readers may turn to Martin (1966, pro Korean-Japanese); Miller
(1996, pro Korean and Japanese as Altaic); Starostin (1991, idem); Robbeets (2005,
idem); Georg (2008, a review of the latter, contra Altaic); Vovin (2005b, contra Altaic);
Vovin (2010, contra Korean-Japanese); and Georg (2011, contra Altaic). If anything,
the case of Japanese and Korean shows that isolated languages are by no means always
obscure, hardly known, and receding languages in difficult-to-access regions of the world.
300 km
200 mi
Burushaski
Kusunda
Nahali
Andamanese
3.1 Kusunda
The elusive Kusunda (or Ban Raja ‘Forest King’) language is spoken in Nepal’s
mid-Western Gorkha district by, today, at best a handful of persons (certainly less than
ten). It has been known at least since the mid-19th century (Hodgson 1848, 1857–1858),
but the first usable study of it became only available in the late 20th century (Reinhard
and Toba 1970).
No linguist was able to establish any contact with Kusunda speakers for some decades to
follow this, and the language and its speakers were widely regarded as extinct. In 2004, a
team led by David E. Watters managed to locate some still competent speakers of Kusunda,
Other isolated languages of Asia 153
which led to the publication of a book-length treatment of this hitherto virtually unknown
language (Watters 2006), which is the source of all information and examples given here.
No attempt to find linguistic relatives for Kusunda has been successful (and those
which were published did not go beyond typically macro-comparativist claims that what
is – somehow – similar must be related, mostly involving some vague notion of a connec-
tion with putative “Indo-Pacific” or the like). The sudden increase of available linguistic
data in the early years of this century did not change this situation – Kusunda is a linguis-
tic isolate and also a language with an areally highly unusual typological makeup.
Tables 6.17 and 6.18 illustrate the phonological system of the language.
Kusunda nouns have no gender/class system and no marker of plurality. Table 6.19
illustrates the suffixal noun case system of the language (Watters 2006: 50–56, cf. also
Kausen 2013: 605).
TABLE 6.17 KUSUNDA VOWEL PHONEMES
i u
ə
e o
a
Voiceless stop p t k q ʔ
Voiceless pʰ tʰ kʰ qʰ
aspir. stop
Voiced stop b d g ɢ (ʕ)
Voiced aspir. bʰ dʰ gʰ
stop
Fricative s (χ)
Voiceless ts (tʃ )
affricate
Voiceless tsʰ
aspir. affr.
Voiced dz
affricate
Voiced aspir. dzʰ
affricate
Nasal m n (ɲ) ŋ ɴ
Lateral l
Rhotic r
Semi-vowel w y
Source: Watters 2006: 32, elements in brackets are of somewhat doubtful phonemic status.
Nominative (zero)
Genitive -yi/-ye; -i/-e
Accusative-dative -da
Locative I -da
Locative II -ga/-gə
Ablative -əna
Allative -a
Comitative -ma
154 Stefan Georg
The “locative I” suffix is formally identical with the accusative-dative marker but rou-
tinely used as a locative with some nouns, such as ‘road’ (un-da ‘on the road’) or ‘forest’.
The “locative II” seems to be the general locative marker. The -a/-ə contrast follows an
as-yet only imperfectly understood pattern of (high-low) vowel harmony.
Although surrounded by (Indo-Aryan and Tibeto-Burman) languages with at least
some degree of ergativity, there seems to be no trace of this alignment pattern in Kusunda.
Kusunda verbs are inflected according to two fundamentally different patterns, of
which one (“class I”) marks the category of (subject) person by prefixes, and the other
(“class II”) by suffixes – all other categories (number, realis/irrealis, tense) are exclu-
sively suffixal. Class I verbs have no tense marking but have a realis form which covers
present and past time reference, as opposed to an irrealis form, which covers future
time and possibility. Class II verbs have a third form for past tense only. The realis-ir-
realis distinction is perceived by Watters as the older and more fundamental dichotomy,
whereas class II “past” gives the impression of having been secondarily added to the
system. Past forms have an “unequivocal past-completive reading, while realis is used
more frequently in utterances with a kind of ‘neutral’ or ‘timeless’ sense” (Watters
2006, 66–67), cf. the following example (n̩ marks a ‘syllabic’ articulation of the apical
nasal, which is not given phonemic status but routinely written as such by Watters and
his colleagues):
1SG dza-a-t-n̩
1PL dza-a-d-ən
2SG dza-a-n-n̩
2PL dza-a-n-ən
3SG dza-əg-ən
3PL dza-əg-ən
Apart from Watters (2006), Pokharel (2005) contains useful information on Kusunda.
3.2 Nahali
The Nahali (or Nihali) language is severely underdescribed. While works (of varying
quality) on its possible external relations are available, a thorough description of its pho-
nology and morphology is still lacking. It is not even possible to present a full account of
the affixes and processes that make up the Nahali verb. The language is spoken by, per-
haps, ca. 2,000 individuals of the ca. 5,000 members of the ethnic group (which refers to
itself as Kalto) in the Gawilgarh Hills in Madhya Pradesh and Maharashtra, Central India.
Most remaining speakers are bilingual in Korku (Munda) and/or Marathi (Indo-Aryan) or
local variants of Hindi. The Nahali language has so far resisted all attempts to classify it
with other languages of the Indian subcontinent and beyond (Munda/Austroasiatic being
the grouping mentioned most often), and the very complicated (and at times possibly
traumatic) sociolinguistic history of its speakers led to a linguistic system which gives
a patchwork-like impression of Munda, Dravidian, and also Indo-Aryan elements. This
goes so far that it has been claimed that Nahali is no “natural” language at all, but nothing
more than an “argot,” composed of heterogeneous elements, and not transmitted as a first
language. The recent summary by Zide (2008: 772ff) concludes, not without due caution,
that (my rephrasing, SG) (a) Nahali goes back to an autochthonous language in Western
Central India, which was not related to any known family of the continent and which
first (b) underwent some influence from early (pre-Korku or non-Korku) Munda, when
the ancestors of the present Kalto were still a vigorous and belligerent community. In
156 Stefan Georg
phase (c), which may have begun around 1800 ce, their society underwent a catastrophic
breakdown due to intentional violent acts by the Moghul state and/or other regional pow-
ers against the Kalto but also due to diseases and famines, which forced the survivors
of this ethnic group into an asymmetric symbiosis with the Munda speaking Korku, on
whom they are almost completely dependent with respect to their economy. This process
may have led to the breakup of the original morphosyntax of Nahali and opened the
gates for a now massive influx of foreign lexical elements, mostly from Southern Munda.
Phase (d) would, then, be characterized by a “creole-like” restructuring of the language,
with foreign morphological elements and a continuing heavy relexification from Korku/
Munda. The resulting Nahali language, then, may have been used as a secret speech by
some of its speakers, and it may also have undergone “argot-style speech deformation
in some of the lexicon” – note that this assumption does not make the Nahali language
“simply an argot, and not a ‘natural’ language”.
At any rate, the lexicon (and the little that is known of the morphology) of Nahali
shows an unusually extensive influence of practically all historically and contemporarily
adjacent language families. Kuiper (1962) estimates that the Nahali lexicon is composed
of approximately 40% Munda (mostly Korku) elements, 9% come from Dravidian, 2%
from Tibeto-Burman (this number can probably be dismissed, the comparisons mentioned
being most likely spurious), 20% Indo-Aryan (Hindi and Marathi), and 25% “unknown.”
This last figure would then represent the truly autochthonous Nahali lexical stock.
Table 6.22 lists a few of these lexical items (from Kuiper 1962, cf. Kausen 2013: 621).
Nahali has a system of suffixal noun cases, which may be summarized as in Table 6.23
(Kuiper 1962: 20).
Nominative (zero)
Genitive-accusative -n(a)
Dative -ke/-ki/-ge
Ablative -kon
Some of these suffixes may or may not have comparanda in Munda and possibly in
Dravidian. The personal pronouns clearly show the influence of Dravidian languages
(Table 6.24, Kuiper 1962: 27ff), which otherwise seem to have contributed relatively
little to the Nahali lexicon.
Other isolated languages of Asia 157
Nominative j(u)ō nē
Genitive eṅge nē, nēne
Accusative (h)eṅgen nēne-n
Dative eṅg-ke nē-ke
Ablative eṅge-kon -
All forms in this paradigm, with the exception of the Nominative 1SG, very closely
resemble (and 1SG oblique forms are practically identical with) the respective pronomi-
nal forms in the North Dravidian languages Kurukh and Malto. By way of an analogy, the
case of Samoyedic Enets in Northern Siberia may be mentioned, which clearly borrowed
its second and third person pronouns from Ket (cf. Siegl 2008) but shows little influence
from that language in other parts of its lexicon.
The morphology of the Nahali verb is hardly known, and the available data allow only
tentative identifications; thus, the language seems to have (Kuiper 1962: 31–35) a caus-
ative suffix -en-, an imperative marker -ki (also found in Dravidian Kurukh), a habitual
present in -ka/-ke, a future tense marker -ken, as yet poorly understood past tense markers
(-ya, -(y)i, -ka (?)), and an absolutive (converb) suffix in -ḍo.
For any progress on the understanding of, first, the synchrony and then, possibly, the
history of Nahali, the availability of fresh and coherent data is absolutely indispensable.
Kuiper’s (1962) comparative study had to base itself on the problematic, and partly cer-
tainly incorrect, material of the Linguistic Survey of India (with a single biblical text,
cf. Grierson 1906) and the short paper by Bhattacharya (1957). It may be safe to say
that Nahali has possibly received more work which tried to pigeonhole it into one of the
established or hypothetical language families of the world than descriptive work or field
studies, which alone would allow us to speak about these questions in an informed way.
1SG mɨ mi ~ ma ~ m ʈʰu
1PL eʈɨ - ma
1PL incl. eʈa-koʈoʈ - meŋ
2SG ɲi ɲi ~ ɲa ~ ni ~ na ŋu
2PL ni - ŋilie
3SG gi li ~ hi ~ h ɖu (distal), kʰudi (prox.)
3PL ekʷi - ɖuniyo (distal), diya (prox.)
Abbreviations:
1 first person
2 second person
3 third person
affr. affricate
aspir. aspirated
BEN benefactive
f/fem. feminine
DU dual
incl. inclusive
m/masc. masculine
n neuter
NEG negation
P position class
PL/pl. plural
Other isolated languages of Asia 159
prox. proximal
Pst past
R root
SG/sg. singular
voicel. voiceless
REFERENCES
Abbi, Anvita. 2006. Endangered Languages of the Andaman Islands. München/
Newcastle: LINCOM EUROPA
Abbi, Anvita. 2009. Is Great Andamanese Genealogically and Typologically Distinct
from Onge and Jarawa? Language Sciences 31: 798–812.
Aikio, Ante. 2014. The Uralo-Yukaghir Lexical Correspondences: Genetic Inheritance,
Language Contact or Chance Resemblance? Finnisch-Ugrische Forschungen 62: 7–76
Beckwith, Christopher. 2004. Koguryo, the Language of Japan’s Continental Relatives:
An Introduction to the Historical-Comparative Study of the Japanese-Koguryoic
Languages with a Preliminary Description of Archaic Northeastern Middle Chinese.
Leiden/Boston: Brill.
Beffa, Marie-Lise. 1982. Présentation de la langue nivx. Études mongoles et sibériennes
13: 40–98.
Bhattacharya, Sudhibhushan. 1957. Field-Notes on Nahali. Indian Linguistics 17:
245–258.
Comrie, Bernard. 1981. The Languages of the Soviet Union. Cambridge: Cambridge Uni-
versity Press.
Georg, Stefan. 2007. A Descriptive Grammar of Ket (Yenisei Ostyak). Folkestone: Global
Oriental.
Georg, Stefan. 2008. Review of Robbeets 2005. Bochumer Jahrbuch für Ostasienfor-
schung 32: 247–278.
Georg, Stefan. 2011. The Poverty of Ataicism, online publication, www.academia.
edu/1638942/The_Poverty_of_Altaicism
Georg, Stefan and Aleksandr P. Volodin. 1999. Die itelmenische Sprache. Grammatik und
Texte. Wiesbaden: Harrassowitz.
Grierson, G.A. (ed.). 1906. Linguistic Survey of India, Vol. IV: Muṇḍā and Dravidian
Languages. Calcutta: Government of India Printing Office.
Gruzdeva, Ekaterina. 1998. Nivkh. München/Newcastle: LINCOM EUROPA.
Gruzdeva, Ekaterina. 2001. Nivkhskij jazyk. Jazyki rossijskoj federacii i sosednykh gosu-
darstv. Enciklopedija, Vol. II (K – R), ed. by V.A. Vinogradov et al., 357–366. Moskva:
Nauka.
Hodgson, Brian Houghton. 1848. On the Chépáng and Kúsúnda Tribes of Nepal. JASB
17: 650–658.
Hodgson, Brian Houghton. 1857–1858. Comparative Vocabulary of the Languages of the
Broken Tribes of Népal. JASB 26: 317–522, 27, 393–442.
Kausen, Ernst. 2013. Die Sprachfamilien der Welt. Bd. 1: Europa und Asien. Hamburg:
Buske.
Krejnovich, E.A. 1958. Jukagirskij jazyk. Moskva/Leningrad: Nauka.
Kuiper, F.B.J. 1962. Nahali. A Comparative Study. Amsterdam: Noord-Holland.
Martin, Samuel E. 1966. Lexical Evidence Relating Korean to Japanese. Lg 42: 185–251.
Maslova, Elena. 1997. Yukagir Focus in a Typological Perspective. Journal of Pragmat-
ics 27: 457–475.
160 Stefan Georg
Mattissen, Johanna and Werner Drossard. 1998. Lexical and Syntactic Categories in
Nivkh (Gilyak). Düsseldorf: Heinrich-Heine-Universität (Theorie des Lexikons. Arbei
ten des SFB, 282).
Michinori, Shimoji and Thomas Pellard. 2010. An Introduction to Ryukyuan Languages.
Tokyo: ILCAA.
Miller, Roy Andrew. 1996. Languages and History. Japanese, Korean, and Altaic. Bang-
kok: White Orchid Press.
Nedjalkov, Vladimir P. and Galina A. Otaina. 2013. A Syntax of the Nivkh Language. The
Amur Dialect. Amsterdam: John Benjamins.
Nikolaeva, Irina A. and Evgenij A. Khelimskij. 1997. Jukagirskij jazyk. Jazyki Mira.
Paleoaziatskie jazyki, ed. by V.N. Jarceva et al., 155–168. Moskva: Indrik.
Panfilov, Vladimir Z. 1962–1965. Grammatika nivkhskogo jazyka, Vol. 1–2. Moskva/
Leningrad: Nauka.
Panfilov, Vladimir Z. 1968. Nivkhskij jazyk. Jazyki Narodov SSSR. Vol. 5: Mongol’skie,
tunguso-man’chzhurskie i paleoaziatskie jazyki, ed. by Skorik P. Ja. et al., 408–434.
Leningrad: Nauka.
Pokharel, P. 2005. Strategies of Pronominalization in Kusunda. Contemporary Issues in
Nepalese Linguistics, ed. by Yadava, Y. et al., 189–192. Kathmandu: Linguistic Soci-
ety of Nepal.
Reinhard, Johan and Sueyoshi Toba. 1970. A Preliminary Linguistic Analysis and Vocab-
ulary of the Kusunda Language. Kathmandu: Summer Institute of Linguistics.
Robbeets, Martine. 2005. Is Japanese Related to Korean, Tungusic, Mongolic and Tur-
kic? Wiesbaden: Harrassowitz.
Savel’eva, V.N. and Ch.M. Taksami. 1970. Nivkhsko-russkij slovar’. Moskva: Sovetskaja
enciklopedija.
Schmalz, Mark. 2012. Towards a Full Description of the Focus System in Tundra Yuk-
aghir. Linguistic Discovery 10, no. 2: 53–108.
Shrenk, L.I. 1883. Ob inorodcakh Amurskago kraja, Vol. I. Sankt Peterburg: Impera-
torskaja Akademija Nauk.
Siegl, Florian. 2008. A Note on Personal Pronouns in Enets and Northern Samoyedic.
Linguistica Uralica 44/2: 119–130.
Starostin, Sergej A. 1991. Altajskaja problema i proiskhozhdenie japonskogo jazyka.
Moskva: Nauka.
Vajda, Edward. 2000. Ket prosodic Phonology. München: LINCOM EUROPA.
Vajda, Edward. 2001a. The Role of Position Class in Ket Verb Morphophonology. Word
52, no. 3, 369–436.
Vajda, Edward. 2001b. Yeniseian Peoples and Languages. Surrey: Curzon.
Vajda, Edward. 2004. Ket. München: LINCOM EUROPA.
Vovin, Alexander. 2005a. Koguryǒ and Paekche: Different Languages or Dialects of Old
Korean? Journal of Inner and East Asian Studies 2, no. 2, 34–64.
Vovin, Alexander. 2005b. The End of the Altaic Controversy. Central Asiatic Journal 49,
no. 1: 71–132.
Vovin, Alexander. 2010. Koreo-Japonica: A Re-Evaluation of a Common Genetic Origin.
Honolulu: University of Hawai’i Press.
Watters, David E. 2006. Notes on Kusunda Grammar: A Language Isolate of Nepal.
Himalayan Linguistics Archive 3: 1–182.
Werner, Heinrich. 1997a. Die ketische Sprache. Wiesbaden: Harrassowitz.
Werner, Heinrich. 1997b. Abriß der kottischen Grammatik. Wiesbaden: Harrassowitz.
Other isolated languages of Asia 161
AFRICAN LANGUAGE
ISOLATES
Roger Blench
1 INTRODUCTION
One of the notable differences between Africa and most other linguistic areas is its
relative uniformity. With few exceptions, all of Africa’s languages have been gathered
into four major phyla, and most recent progress in classification has been in resolving
details (Greenberg 1963; Blench 2016). The number of undisputed language isolates is
very small. By contrast, Australia, Papua and the New World are extremely diverse at
the phylic level, and all have substantial numbers of isolates or very small phyla. Eur-
asia is hard to classify since Europe is universe and is characterised by a small number
of geographically extensive languages, but Siberia and Northeast Asia are diverse on a
level with the Amazon. Southeast Asia, on the other hand, is somewhat similar to Africa,
in having a relatively small number of phyla, each with many languages and almost no
isolates. Given the time-depth of human settlement in Africa, this is somewhat surpris-
ing. If the ex Africa hypothesis for the origin of modern humans is accepted, then we
have to assume that Homo sapiens sapiens originated some 150–200 kya and spread to
Eurasia from Northeast Africa, largely displacing, but perhaps also interbreeding with,
the hominids already in situ. Looking at the worldwide pattern of isolates, they are
apparently very unevenly distributed, assuming standard references such as the Ethno-
logue reflect true diversity and not just differences in research traditions across regions.
There is almost a gradient from west to east, with few in Europe and the greatest number
in the New World.
The explanation for this is unclear, and indeed for some authors, this is based on a
mistaken analysis of the genetic affiliation of individual families or specific subgroups.
The identification of isolates in Africa has not been without controversy. Joseph Green-
berg, whose classification of African languages remains the principal framework in use
today, was a committed ‘lumper’ and was inclined to ensure every language found a
classificatory home, sometimes on the basis of extremely tenuous evidence. Recent years
have seen a sceptical counter-trend, to consider that some of the languages or branches
classified by Greenberg and formerly accepted, are isolates. If this is so, then Africa
may be the home of many more isolates than are usually listed. This chapter1 describes
the controversies over the identification of African isolates, covers in more detail those
generally accepted, and deals more briefly with controversial cases. For some languages,
fragmentary data makes an uncontroversial resolution impossible. The chapter also con-
siders briefly the identification of substrates and claims about residual foragers which
may well point to a prior, more diverse Africa.
African language isolates 163
2 METHODOLOGICAL ISSUES
LUMPERS
SPLITTERS
then its genetic affiliation will inevitably be questioned. Nilo-Saharan and Khoesan in
particular include languages whose inclusion in the phylum remains debated. Several
of the languages of the Ethio-Sudan borderland, such as Shabo and Gumuz within Nilo-
Saharan and the ‘Mao’ languages, particularly Ganza, within Omotic, not only have very
low lexical cognate count with their relatives but lack tidy correspondences. Three expla-
nations are possible:
a) The putative branches have been diverging away from the rest of the phylum for suf-
ficiently long for natural vocabulary erosion to be responsible for low lexical counts.
b) Apparent similarities with the other branches of the phylum are due to borrowing.
c) Apparent similarities are due entirely to chance.
proportion to the speculation about their classification. For example, neither of the two
languages for which there are only written sources, Meroitic and Guanche, have enough
core vocabulary to establish their relationships following the usual canons of historical
linguistics. They could be treated as unclassifiable, following a distinction made else-
where in this book. At least one language, Oropom, is almost certainly spurious, and two
others, Kwadi and Mpra, died out in the twentieth century before an adequate amount
of data could be collected. I have not reviewed all the fragmentary reports of unknown
languages, for which there are sometimes ten words or less. Table 7.1 lists the languages
about which few doubts exist.
Jalaa may well be extinct; although individuals claiming Jalaa ethnicity are still present
in the Cham-speaking area, none now remember any words of the language.
There are further languages which have been reported initially as isolates but which
seem to be affiliated to known phyla or can otherwise be excluded. A list of these is given
in Table 7.2.
Bēosi Madagascar Birkeli (1936), Blench and Walsh Austronesian with unknown ?
(n.d.) Southern Cushitic substrate
Dompo Ghana Painter (1967), Blench (n.d. a) Guang language with unknown
substrate
Guanche Canaries Wölfel (1965) Extinct. Absence of basic
vocabulary makes classification
impossible to resolve
Gumuz Ethiopia Bender (2005), Ahland (2004, 2012) Nilo-Saharan isolate branch
Kujarge Sudan Doornbos and Bender (1983); Probably Chadic
Lovestrand (2012), Blench (2013)
Kwadi Angola Westphal (1963), Güldemann (2004) Perhaps Khoesan
Meroitic Ancient Sudan Rilly and De Voogt (2012) Probably a close relative of Nubian
Mpra Ghana Cardinall (1931), Blench (n.d. c) Extinct. Kwa language
Ongota Ethiopia Fleming et al. (1992), Sava and Probably Afroasiatic
Tosco (2000)
Oropom Uganda Wilson (1970) Probably spurious (Heine pers.
comm.)
Sandawe Tanzania Sands (1998), Güldemann and Probably Khoesan
Elderkin (2010)
Shabo Ethiopia Bender (1977), Fleming (1991), Nilo-Saharan isolate branch
Teferra (1991, 1995), Tsehay (2015)
168 Roger Blench
Mediterranean Syria
Tunisia Lebanon
Sea
Morocco Iraq
Israel
Jordan
Guanche (†)
Algeria
Libya Egypt Persian
Western
Saudi
Sahara
Arabia
Mauritania Red
Meroitic (†)
Nemadi Sea
Imraguen Mali Niger
Bangi Me Sudan Yemen
Chad Eritrea
Senegal
The Gambia Burkina Kujarge Gulf of
Faso
Jalaa Djiborti
Guinea Benin Laal Gumuz Aden
Guinea-Bissau
Mbre Mbra Nigeria Shabo
Sierra Cote Central South Ethiopia
Leone d’Ivoire Ghana African Sudan Ongota
Cameroon Republic
Liberia Gulf of
Togo
Guinea
Somalia
Equatorial Uganda
Guinea Congo Kenya
Gabon Rwanda
DR Congo Burundi Hadza
Tanzania
Sandawe
Angola
KEY Kwisi Zambia Malawi
Swaziland
South
Atlantic Lesotho
Ocean South
Africa
Meroitic and Guanche became extinct long ago, while for Bēosi, Kwadi and Mpra, it is
unlikely that further data can be collected, so the question cannot be resolved. The status
of Kujarge is unknown, but no speakers have been encountered since Doornbos’ original
record, and the civil war that has passed over their homeland may well have finalised their
demise. Map 7.1 shows a composite map locating the language isolates, the controversial
cases given in Table 7.2, and the location of residual foragers who might represent former
isolates.
3.2.1 Bangi Me
The Bangi Me (Bangi-me) language is spoken in Mali, in seven villages east of Karge,
reached by turning off the Sevare-Douentza road 38 km north of Sevare. The population
African language isolates 169
/y/ is used for the palatal approximant, corresponding to IPA /j/. /ʋ/ is an allophone of /b/.
Vowels
Bangi Me vowels are given in Table 7.4.
[ie iɛ ɛe eɛ aɛ ɔo oɔ]
Plosives pb td kg
Prenalised m
p mb t d
n n ŋ
k ŋg
Nasals m n ɲ ŋ
Fricatives s ʃʒ ɕ ɣ h
Affricates ʧ
Approximants [ʋ] r y ɥ w
Nasal r̃ ỹ w̃
approximant
Lateral l
Close i u
Close-mid e o
Open-mid ε ɔ
Open a
3.2.1.2 Tone
Bangi Me is tonal language, with two tones, high and low, with the mora the tone-bearing
unit. On monosyllabic words with two morae, level tones can combine to create rising or
falling melodies. Rising tones may appear on monomoraic syllables in word-initial posi-
tion. Rising tones on monomoraic words usually appear after a velar consonant. A pho-
netic mid tone which is the result of a non-automatic downstep.
3.2.1.3 Morphology
One of the main attributes of Bangi Me that differentiates it from the Dogon languages is
its lack of segmental, bound morphology. Like many Niger-Congo languages, Dogon lan-
guages are agglutinating, whereas Bangi Me is isolating. Bangi Me has no evidence for
noun class markers or even remnants, although there is a diminutive suffix and an opaque
frozen [–r] suffix. Bangi Me also differs from the Dogon languages in that tense, aspect
and mood markers are unbound morphemes. Verbs in Bangi Me are divided into different
classes based on transitivity, phonological shape and semantic category, whereas most
verbs in Dogon take the same inflection, with the exception of change-of-state verbs.
3.2.1.4 Syntax
Although at the phrase level Bangi Me is head initial, with noun-postposition and
noun-modifier word order (except DEF N and POSS N), at the clause level, the basic con-
stituent order is either SVO, SOV or OSV. The ordering of constituents in the sentence
depends on the tense/aspect/mood of the clause. A feature not shared by any surrounding
African language isolates 171
language is the use of tonal marking on the verb and object if present. Subject and TAM
are marked by a combination of segmental and autosegmental features.
3.2.2 Hadza
The Hadza (Tindiga) language is spoken by about 800 individuals close to Lake Eyasi
in Northern Tanzania. It has been the subject of intensive anthropological research with
more than a thousand references, mostly focusing on the persistence of hunting and gath-
ering.4 The presence of clicks in Hadza encouraged earlier researchers to classify Hadza
together with Sandawe (also in Tanzania) and the Khoesan languages of Southern Africa.
This idea may first have been argued by Bleek (1956) in her ‘Comparative Bushman Dic-
tionary’ and was then picked up in Greenberg (1963). Since Sands (1998) it is generally
accepted that Hadza is an isolate, despite the presence of clicks and that the connections
with Khoesan were based on unreliable transcriptions. Kirk Miller (personal communi-
cation) has been working on a grammar and dictionary of Hadza, but these are not yet in
the public domain.
Hadza phonology is complex and the history of descriptions is marked by consid-
erable variation between different accounts. The earliest modern description is Tucker
et al. (1977), but the most complete overview of Hadza phonology is Sands et al. (1993),
reformulated in Sands (2013a). Table 7.5 is adapted from Sands et al. (1993).
Hadza has the five cardinal vowels: /i/, /e/, /a/, /o/ and /u/ (Table 7.6). A few words
show contrastive /ĩ/ and ũ/. Vowel length, pharyngealisation, glottalisation and breathi-
ness are not contrastive.
Whether Hadza is tonal is the subject of some uncertainty. Tucker et al. (1977) tran-
scribe both stress and three level tones. However, subsequent investigations have not
Plosive pʰ p b tʰ t d ʤ kʰ k g kʰʷ kʷ gʷ ʔ
Ejective (p’) k’ k’ʷ
Central oral click kǀ k!
Lateral oral click kǁ
Nasal m n ɲ ŋ ŋʷ
Nasal central click ŋǀ’ ŋǀ ŋ!’ ŋ!
Nasal lateral click ŋǁ’ ŋǁ
Prenasal plosive mpʰ mb ntʰ nd ŋkʰ ŋg
Prenasal affricate nts ndz nʤ
Central affricate ts dz ʧʤ
Lateral affricate tλ̥
Ejective ts’ ʧ’
central affricate
Ejective tλ̥’
lateral affricate
Fricative f s ʃ
Lateral fricative ɬ
Approximant y w ɦ
Lateral l
172 Roger Blench
confirmed this. Sands et al. (1993) and Sands (2013a) conclude that Hadza shows a sim-
ple two-way contrast and might well be considered a pitch-accent language.
Morphology
Hadza divides nouns into masculine and feminine and marks both gender and number
with suffixes. Table 7.7 shows number and gender marking for n!e ‘leopard’. Hadza verbs
are inflected with suffixes, although initial reduplication can mark emphasis. Hadza also
has plural verbs, or distributives, which are marked with infixes.
Syntax
The basic constituent order of Hadza is VSO. For example:
However, Sands (2013c: 265) provides examples of the great variability in word order
and concludes, ‘Hadza is best described as a pronominal argument language’.
3.2.3 Jalaa
The Jalaa (also Jalabe, Jaabe) live in a single settlement, Loojaa, in Balanga Local Gov-
ernment Area, southern Bauchi State, Nigeria. One person is níí jàlàà, and the people
are jàlààbɛ̀. They are also known locally as Cèntûm or Cùntûm, from a name for their
former settlement. The only information available on the Jalaa language is the wordlist
in Kleinwillinghöfer (2001: 243) which gives lexical comparisons with neighbouring
languages but states that ‘old people were able to remember and provide additional words
and phrases from their former language’. Kleinwillinghöfer (2001: fn. 8) mentions that it
is possible Jalaa is still used in ritual performances but that his informants were unwill-
ing to disclose information about this. The Jalaa are surrounded by Adamawa-speaking
peoples, such as the Cham and Dadiya, and they have been almost completely absorbed
linguistically by the Cham.
Not much can be said of the phonological features, but the basic sound-system can be
inferred from the data in Kleinwillinghöfer (2001). Table 7.8 shows the consonants of
Jalaa.
African language isolates 173
Plosives pb td kg kp
Nasals m n ɲ ŋ
Fricatives f s h
Affricates ʧʤ
Approximants r y w
Lateral l
Jalaa permits labialised consonants /sʷ/, kʷ/, /bʷ/ as well as a palatal /dʸ/.
Vowels
Kleinwillinghöfer does not give the vowel system explicitly and uses the Nigerian con-
vention of subdots to represent – ATR vowels. On this basis there is a ten-vowel system,
in Table 7.9.
Three level tones are transcribed, as well as a falling tone. Some long vowels are tran-
scribed, for example yúú ‘sesame’, but whether length is systematic remains difficult to
discern.
Jalaa has a number-marking system with alternating suffixes for nouns, like the sur-
rounding Adamawa languages. Whether this is original or borrowed is unclear. For exam-
ple, Jalaa often has an identical suffix alternation for similar meanings to Cham, the
language with which it has a strong borrowing relationship, despite quite different seg-
mental material. Table 7.10 illustrates this.
As Kleinwillinghöfer (2001) points out, the similarities of nominal affix alternation
with its Adamawa neighbours combined with the striking rarity of shared lexemes lead to
the speculation that these number marking strategies were borrowed.
3.2.4 Laal
The Laal (Gori, Laabe) language is spoken in Central Chad in the Moyen-Chari Region,
Barh Kôh department, between Korbol and Dik, Gori (centre), Damtar and Mailao
174 Roger Blench
Close i u
ɩ ʊ
Close-mid e ə o
Open-mid ε ɔ
Open a
villages. There were 750 speakers in the year 2000. Damtar village was said to have
its own dialect called Laabe with three speakers left in 1977. The Laal do not have an
autonym but refer to themselves as:
The language name, yəw láàl, is ‘language’ + Gori.nominal suffix. The Laal are not
hunter-gatherers, but today have an economy based on fishing and farming and may have
formerly been pastoralists.
Preliminary work on Laal was conducted by Boyeldieu (1977, 1982a, 1982b, 1987, n.d.)
who first drew attention to the difficulties of classifying it. Faris (1994) confirmed that the
Laal had survived the civil war in Chad and Lionnet (2010, 2013) has begun a description
of Laal. Boyeldieu shows that, although Laal incorporates elements of the neighbouring
Chadic and Adamawa languages, it has a large corpus of unetymologisable lexemes.
Phonological features
The consonant inventory of Laal is characteristic of the Southern Chad area, except
perhaps for the palatal implosive [ʄ] (Table 7.11). The vowels are represented by Lion-
net (2010) as in Table 7.12. The type of system this is intended to represent is slightly
opaque, at least to me. Boyeldieu (1977) transcribes three tone heights and a rising and
falling tone.
Morphology
Number marking on nouns is extremely diverse. Table 7.13 shows examples of the differ-
ent number-marking strategies.
African language isolates 175
Plosives pb td cj kg ʔ
[ɟ]
Prenalised ͫb n
d n
g
Nasals m n ɲ ŋ
Implosives ɓ ɗ ì [ʄ ]
Fricatives s h
Flap r
Lateral l
Approximants y [ j] w
Close i ü [y] ɨ u
Mid e üo [ɥo] ə o
Open i̭ a (~ɛ) üa [ɥa] a ṷa (~ɔ)
The striking feature of Laal which marks it out from all neighbouring languages is its
threefold gender system marked, not on nouns, but on pronouns, and the ‘connective’
particle. The three classes are masculine (human male), feminine (human female) and
neuter (non-human).
The subject pronouns are as presented in Table 7.14.
Although the usual comparisons for Laal are with Chadic and Adamawa languages,
this gender system is strongly reminiscent of Nilo-Saharan languages, such as Krongo
(Reh 1985), although there is no other evidence for a Nilo-Saharan affiliation.
176 Roger Blench
Singular 1 já jí –
2 ʔò –
3 ʔà ʔɨǹ ʔàn
Plural 1 ex ʔùrú –
1 inc ʔǎŋ –
2 ʔùn –
3 ʔì ʔuàn
Syntax
Basic constituent order of Laal is SVO. For example:
ʔà sɨr̀ sū
he drinks water
3.3.1 Bēosi
The island of Madagascar is today entirely the province of an Austronesian language,
Malagasy, divided into a large number of dialects. However, there is strong archaeologi-
cal and palaeo-environmental evidence for hunter-gatherer settlement prior to the coming
of the Austronesians (Blench 2007c). Today there are number of forager groups scattered
across Madagascar, bearing the names Mikea, Vazimba and Bēosi or their variants. All
these people speak Malagasy today, and genetic studies of the Mikea have not indicated
any unusual profile (Pierron et al. 2014). Nonetheless, some Mikea groups, particularly
the Bēosi, have non-standard lexical items in their lect of Malagasy and also retain songs
which cannot be interpreted. This suggests that, although this speech would be classi-
fied as a variant of Malagasy, today it may retain a substrate of an isolate or unknown
language. The only record of these is Birkeli (1936) although more recent reports show
that some of these terms are still in use (Stiles 1994). Unfortunately the lexical data
reflects such items as useful plants or poetic terms rather than core vocabulary. Blench
and Walsh (n.d.) have analysed this idiosyncratic material and suggest there is a possible
African language isolates 177
Southern Cushitic substrate. This would not be unreasonable since the nearest forager
group today on the adjacent mainland are the Southern Cushitic Aasax, whose language
has unfortunately now been lost, but for which a reasonable record remains (Fleming
1969). However, many other lexical items are of unknown origin, and since it seems
unlikely further data can be collected, the question of the original affiliation of Bēosi will
never be resolved with certainty.
3.3.2 Gumuz
The Gumuz language is situated on the Ethio-Sudan borderland and has 179,000 speak-
ers in Ethiopia according to the 2007 census. It is dialectally heavily divided (Ahland
2004). Reports in 2014 show that there is a previously unreported language apparently
related to Gumuz, Dasin (Ahland, personal communication). Bender (1979, 1997) is the
first published record of Gumuz although his work is a recension of earlier Italian and
other sources and he treated Gumuz as a branch of Nilo-Saharan. However, Gumuz lacks
many characteristic Nilo-Saharan features such as ‘moveable k-’ and three-term number
marking (Ahland 2010, 2012). In his final statement on the subject, Bender (2005) sug-
gested Gumuz was an isolate. Ahland (personal communication) has prepared a compar-
ative wordlist illustrating cognate items shared between the two families, and the present
author considers that Gumuz is Nilo-Saharan and indeed related to the Koman languages.
3.3.3 Shabo
The Shabo language is spoken by the Sabu (Shabo, Chabu) people of southwestern Ethi-
opia. The name found in earlier sources, Mekeyer, is used by the Majang people (Jordan
et al. 2007). The Shabo live in what used to be the Kafa Region, between Godere and
Masha, among the Majang and Shekkacho. According to the current administrative divi-
sions, most Shabo people now live in the Sheka Zone of the Southern Nations, Nationali-
ties and Peoples Region (SNNPR) and the Majangir Zone of Gambela Region.
Under the name Mikeyir, Harvey Hoekstra seems to have been the first to report this
language, and using his data, Bender (1977) classified it as possibly Surmic. Shabo is
still spoken by some 400–500 individuals, although it is losing ground to Majang and
latterly Amharic. The forms then identified as cognates are now seen to be the result of
extensive loans from the Majang language rather than an indication of true genetic affil-
iation. Since that date there have been a variety of attempts to classify Shabo, including
Teferra and Unseth (1989); Fleming (1991, 2002); Ehret (1995); Bender (1983, 1997);
and Schnoebelen (2009). None of these is conclusive, in part because of the small amount
of available data. Bender’s treatment of Shabo as an isolate branch of Nilo-Saharan is a
reasonable inference from the existing data. Teferra (1991, 1995) was for a long time
178 Roger Blench
almost the only descriptive work on the phonology and grammar of Shabo, but more
detailed treatment of Shabo lexicon and grammar has recently become available (Tsehay
2015). Like Gumuz, Shabo lacks ‘classic’ features of Nilo-Saharan such as three-term
number marking or moveable k-. Nonetheless it seems most likely that Shabo is related
to its close neighbours Koman and Gumuz. Although these are close to one another geo-
graphically, they are surprisingly dissimilar; nevertheless, they have enough common
aspects to tentatively propose that they form a subgroup of Nilo-Saharan. Some typical
items shared are given in Table 7.16.
Shabo is undoubtedly a language of considerable significance in the larger picture of
African languages.
3.3.4 Ongota
The Ongota (Birale) people live in a single village in southwestern Ethiopia, in the South
Omo zone, on the west bank of Weyt’o River. Ethnologue (Simons and Fennig 2017)
reports ten speakers, but recent visitors suggest there may be as few as six who are com-
petent (Mikeš, personal communication). Nearly all adults have switched to the Cushitic
Tsamay or other regional languages such as Konso and Hamer. The first report of this
language is in Fleming et al. (1992), and since then, it has had considerable publicity,
although in terms of actual data there is only an extended wordlist and sketches of aspects
of the grammar. Key references are Fleming et al. (1992), Fleming (2006), Sava and
Tosco (2000), Yilma5 (ined.) and Blažek (2007). These authors come to very different
conclusions on the affiliation of Ongota. These views can be summarised as in Table 7.17.
None of these support the notion that Ongota is a true isolate, although the differ-
ent conclusions concerning its affiliation make any definitive assignment problematic.
It could indeed be an isolate with differing levels of influence from different languages.
The present author considers Fleming’s proposal for an Afroasiatic affiliation the most
reasonable.
3.3.5 Meroitic
Meroitic was the language of a substantial urban polity that existed on the Nile between
eighth century bc until about 350 ad, when it was destroyed by Axumite armies. The
inhabitants of Meroe used hieroglyphs and initially wrote in the Egyptian language. By
the first century bc, hieroglyphs gave way to a Meroitic script that adapted the Egyptian
writing system to an indigenous language. Meroitic is an alphabetic script with 23 signs
used in a hieroglyphic form (mainly on monumental art) and in a cursive. The cursive
version was widely used; so far some 1,278 texts are known. The most up-to-date review
of what is known about Meroitic script is in Rilly and De Voogt (2012). The new alpha-
bet was phonetic, assigning syllabic values to hieroglyphs and occasionally using hiero-
glyphs in their original sense to explicate the texts, rather as Chinese ideograms are still
printed alongside Japanese today.
Meroitic inscriptions, which have proven problematic to decipher, have fuelled a
string of poorly supported and indeed fringe hypotheses as to its genetic affiliation. Some
of these are very bizarre, such as the proposal that Meroitic was Tocharian, the extinct
Indo-European language of north-west China. The Web has created a new forum for
individuals to publish their attempts at decipherment without the usual constraints of
scholarship. Meroitic was previously considered to be degraded Egyptian, but it was then
unclear why it could not easily be read. Most serious attempts at decipherment assumed
that the original language is Afroasiatic, although there was no particular reason to think
this was the case. The proposal that Meroitic was Nilo-Saharan was first made in the
1960s, and Greenberg (1971b) and Bender (1981) both supported this. However, since
2000, considerable progress has been made, and there are now more than 40 Meroitic
terms transcribed with some certainty. Rilly and De Voogt (2012) argue that it was a close
relative of Nubian, and this has gained general acceptance among Nilo-Saharan scholars.
3.3.6 Oropom
The Oropom language, said to be spoken among the Karamojong in northeast Uganda,
is recorded in a single source, Wilson (1970). Wilson claimed that the Oropom were
a subset of the Karamojong who used stone tools until the recent past. He recorded a
97-word list of the language, transcribed orthographically. Some ten years after Wilson’s
report, Bernd Heine (personal communication) went to seek rememberers of Oropom and
could find no individuals who would even admit to this ethnic identity. For this reason
he regarded the language as spurious, perhaps constructed on the spot by an informant.
Souag (2004) re-analysed the vocabulary and found much of it borrowed from neigh-
bouring languages, although with a core of unexplained lexical items. With no further
reports, the safest conclusion is that Heine was correct in regarding Oropom as bogus.
3.3.7 Sandawe
The Sandawe are a people in the Kondoa district of Dodoma region in central Tan-
zania, notable for their non-Bantu click language. They were predominantly foragers
and pastoralists before Europeans colonised Africa. In 2000, the Sandawe population
was estimated to be 40,000. Sandawe ethnography and language was first described in
Dempwolff (1916) and later in Ten Raa (1986). Sandawe grammar has been relatively
well described (Van de Kimmenade 1954; Eaton 2010; Eaton et al. 2007; Steeman
2012), and there are two lexicons (Kagaya 1993; Ehret et al. 2012). The presence of
180 Roger Blench
clicks in Sandawe led Bleek (1956) and Greenberg (1963) to assume a relationship with
Southern African Khoesan. More recent analyses have also reached the same conclu-
sion although most of the earlier proposed cognates were compromised by poor tran-
scription (Elderkin 1983; Sands 1998; Güldemann and Elderkin 2010). Nonetheless,
if this is correct, the relationship is not close. Surprisingly, given that both are click
languages in the same region of East Africa, Sandawe and Hadza seem to show no
common lexicon.
3.3.8 Kwadi
The Kwadi (Bakoroka, Cuanhoca, Cuepe, Curoca, Koroka, Makoroko, Mucoroca) are
a group of former pastoralists who live in the remote area in the extreme southwest
of Angola. Strikingly, despite speaking a click language, they do not have the typical
phenotype of Khoesan speakers. They were first reported by Capello-Ivens (1886) and
described in more detail by the ethnographer Estermann (1956, translation Gibson 1976).
Tape recordings of spoken Kwadi were made by the ethnographer Almeida, but these
have never been released. Westphal (1963) made a field trip to the area and made exten-
sive notes on Kwadi, which remain in the archive of the University of Cape Town. How-
ever, for some reason he never published an analysis of this data although he considered
Kwadi an isolate (Westphal 1963, 1971). Güldemann (2013a, 2013b, 2013c) has writ-
ten up the linguistic element of Westphal’s notes. Güldemann (2004, 2008b) argues that
Kwadi is part of Khoe, i.e. Central Khoesan, although the argument for this is complex,
as the pronominal system and person marking seem to be very different from Khwe. The
lexical cognates, however, seem to be at a level of near identity (Table 7.18).
Kwadi also shares a common Khwe root for ‘cattle’.
Khwe góɛ́
Naro gòè
//Ana gúè
Kwadi goe-
Since the morphosyntax of Kwadi is very different from the Khwe languages, the near
identity of the forms where the words are cognate suggests to the present author the
possibility that the Kwadi language is an isolate displaying a borrowing relationship with
Khwe. Since Güldemann (2008b) notes that the Kwadi were former pastoralists it might
be that at least the livestock vocabulary was an early borrowing from Kwadi into Khwe.
Unfortunately, even in the 1950s, there were few speakers of Kwadi, and it seems the lan-
guage has now vanished completely, so this question can probably no longer be resolved.
3.3.9 Kujarge
The Kujarge language is, or was, spoken on the Chad-Sudan border by a small and scat-
tered group of hunter-gatherers. The fate of these people, whose homeland is exactly in
the centre of the recent civil conflicts, is unknown, but prognostications cannot be good.6
The only published information on this language is Doornbos and Bender (1983). On
the basis of 100 words, they concluded that the language was East Chadic, although its
cognacy rate with other East Chadic languages is very low. In the 2000s an unpublished
manuscript containing additional words collected by Paul Doornbos has been circulated,
together with some etymological commentary. Nonetheless, the sample remains small,
and the transcription and reliability of some forms can be questioned. Kujarge is clearly
an important language, however, and the exiguous nature of the dataset is to be regretted.
The present author has listed Kujarge as an isolate in various publications (e.g. Blench
2006) based on its low cognacy counts with its neighbours. Lovestrand (2012) has estab-
lished additional lexical resemblances to East Chadic languages, and Blench (2013) now
considers it to belong with East Chadic, although a highly divergent branch. Lovestrand
classifies it as B1.3, a parallel branch to the Bidiya and Kajakse groups. The unlikeli-
hood that more data will become available may mean that the definitive classification of
Kujarge will remain unresolved.
3.3.10 Dompo
The Dompo language is spoken in West-Central Ghana in a settlement adjacent to Banda,
the main town of the Nafaanra people. Painter (1967) gives a map reference as 8° 09´
N 2° 22´ W. Banda is reached from Wenchi by going northwards from the main road to
Bondoukou in Côte d’Ivoire, south of the Black Volta. A visit by the present author in
April 1998 established a longer wordlist. Dompo has a striking lexicon for wild fauna
which is of unknown origin (cf. Table 7.19), but the main lexicon is undoubtedly Guan,
and its closest relative is probably Gonja (Blench n.d. a). Either the names for animals
constitute some sort of lexical avoidance or honorific system (Blench 2007b) or Dompo
is a relic hunting group almost completely assimilated by the Guan.
3.3.11 Mpra
Cardinall (1931) reported the existence of a language, Mpre (correctly Mpra), spoken
in Central Ghana, which had nearly disappeared in his time. Goody (1963) revisited the
settlement in 1956 and was able to add a few more lexical items. Mpra has been listed
in some sources as an isolate (e.g. Dimmendaal 2011). To see whether any speakers still
existed, the present author visited the village of Butei (Bute in Goody) on February 28th,
2007. Butei is some 20 km from the main Tamale-Kintampo road, branching east towards
Mpaha shortly after the Fulfulso junction leading to Damongo, and between the two
branches of the Volta. By 2007, although former speakers still acknowledged their ethnic
identity, only personal names and a few songs in Mpra remained (Blench n.d. c).
Blench (n.d. c) tabulates possible external sources of the lexicon. Overall, a large pro-
portion of the vocabulary of Mpra has no evident source. The most notable parallels are
with Avikam, a language spoken along the coastal lagoons of Cote d’Ivoire west of Abi-
djan (Hérault 1983b). Some lexical similarities are only shared with Avikam, to judge by
Hérault (1983a); others are also found in other coastal languages such as Eotile, Adyukru
and Nzema. The similarities to Lagoon languages might be ancient loans rather than true
genetic cognates, particularly as many are extremely close in form and there are no obvi-
ous regular sound changes. There are also a few very specific parallels with the names
of animals in the Dompo language (Table 7.19), a Guan language spoken near the Cote
d’Ivoire border (Section 3.3.10). This is particularly surprising, as Mpra otherwise shows
no Guan influence and is quite remote from Banda, where the Dompo live.
This may be evidence of the sharing of technical vocabulary between roaming hunters
in their long-distance sweeps of the bush in the dry season. In the absence of further data,
Mpra can probably be accepted as Niger-Congo, but whether it was an isolate branch or
affiliated to a larger grouping can no longer be resolved.
3.3.12 Guanche
The Guanche were the ancient people of the Canary Islands, which were apparently set-
tled around 3000 BP. The name originally applied to the inhabitants of Tenerife but has
come to refer to what were probably at least four distinct languages. Modern European
contact probably dates from the fourteenth century, and the first record of the Guanche
language appears in the work of the Genoese mariner Nicoloso da Recco in 1341. The
Castilian conquest of the Canaries began in 1402, and Guanche disappeared as a spo-
ken language in seventeenth century, though rememberers may have persisted somewhat
later. Virtually all the existing language materials are collected in Wölfel (1965). Rock
inscriptions in the Canaries include short sentences in both Libyco-Berber and Punic lan-
guages. Unfortunately these include hardly any basic lexicon, except numbers, and many
items of unknown origin. It is generally considered that Guanche is related to Berber,
mostly on the basis of numbers (Pietschmann 1879). However, it is equally likely that it
was an old North African language of unknown genetic affiliation and that similarities to
Berber are later borrowings.
4 CONCLUSIONS
This chapter has covered the complex methodological issues concerning the identifica-
tion of language isolates in Africa and established a reference list of the most likely
candidates, which are briefly described from a linguistic point of view. A longer list
covers languages which have sometimes been considered isolates, but which are either
undecidable for lack of adequate data or now have a fairly certain genetic affiliation. It
should be underlined that a spectrum of views exists, from a position where languages
are considered isolates until their affiliation is proven to a very high standard of evidence,
to a position linking almost all known languages to larger phyla. The author has tried to
tread a middle road and give a flavour of the debate. It is certain, however, that almost
all candidates have only very small numbers of speakers, and living languages such as
Laal, Bangi Me and Hadza deserve more description and analysis. Language isolates can
African language isolates 185
provide clues to the language situation of Africa in the Pleistocene and enriching this
sparse but valuable evidence must surely be a high priority.
NOTES
1 This chapter draws on the presentations and discussions at a workshop held
in Lyon December 3 and 4, 2010, and a presentation circulated for that meet-
ing. I am grateful to Harald Hammarström for helping me to get access to a vari-
ety of scarce documents and for insightful comments on the first version. I was
subsequently invited to review the classificatory work of Joseph Greenberg for a
special session of the Linguistic Society of America, held in Washington, Janu-
ary 2016. The text from that session is available at www.academia.edu/20110452/
Greenberg_s_Universal_Project_the_classification_of_the_world_s_languages.
2 Larry Hyman (2011) has also presented a detailed critique of Güldemann’s methods
and results, although using very different examples from those given here.
3 http://dogonlanguages.org/bangime.cfm
4 According to Woodburn (personal communication, May 2014), there are still Hadza
who live almost entirely from foraging, despite the encroachment on their lands by
herders and national parks.
5 This document is referred to in Fleming (2006), but it seems never to have been pub-
lished, nor is a full bibliographic reference available.
6 Lovestrand (personal communication) conducted a search for Kujarge speakers in
2015 but without success.
REFERENCES
Ahland, Colleen A. 2004. Linguistic Variation within Gumuz: A Study of the Relationship
between Historical Change and Intelligibility. MA Linguistics. Arlington: UTA.
Ahland, Colleen A. 2010. Noun Incorporation and Predicate Classifiers in Gumuz. Jour-
nal of African Languages and Linguistics 31, no. 2: 159–203.
Ahland, Colleen A. 2012. A Grammar of Northern and Southern Gumuz. Ph.D. Univer-
sity of Oregon.
Amha, Azeb. 2012. Omotic. The Afroasiatic Languages, ed. by Z. Fajzyngier and E. Shay,
423–504. Cambridge: Cambridge University Press.
Bahuchet, Serge. 1992. Dans la forêt d’Afrique Centrale: les pygmées Aka et Baka. His-
toire d’une civilisation forestière, I. Paris: Peeters-SELAF.
Bahuchet, Serge. 1993. La rencontre des agriculteurs: les pygmées parmi le peuples
d’Afrique centrale. Histoire d’une civilisation forestière, II. Paris: Peeters-SELAF.
Bender, Marvin Lionel. 1969. Chance CVC Correspondences in Unrelated Languages.
Language, 45, no. 3: 519–531.
Bender, Marvin Lionel. 1975. Omotic: A New Afroasiatic Language Family. Carbondale,
IL: University Museum Studies 3.
Bender, Marvin Lionel. 1977. The Surma Language Group: A Preliminary Report. Stud-
ies in African Linguistics 7:11–21.
Bender, Marvin Lionel. 1979. Gumuz: A Sketch of Grammar and Lexicon. Afrika und
Übersee, 62: 38–69.
Bender, Marvin Lionel. 1981. The Meroitic Problem. Peoples and Cultures of the
Ethio-Sudan Borderlands, ed. by M.L. Bender. Northeast African Studies 10, 5–32.
East Lansing, MI: Michigan State University.
186 Roger Blench
Bender, Marvin Lionel. 1983. Remnant Languages of Ethiopia and Sudan. Nilo-Saharan
Language Studies, ed. M.L. Bender. Michigan State University Press.
Bender, Marvin Lionel. 1988. Proto-Omotic: Phonology and Lexicon. Cushitic-Omotic:
Papers from the International Symposium on Cushitic and Omotic Languages, ed. by
M. Bechhaus-Gerst and F. Serzisko, 121–162. Hamburg: Buske Verlag.
Bender, Marvin Lionel. 1997. The Nilo-Saharan Languages: A Comparative Essay (2nd
ed.). Munich: Lincom Europa.
Bender, Marvin Lionel. 2000. Comparative Morphology of Omotic Languages. München:
Lincom Europa.
Bender, Marvin Lionel. 2003. Omotic Lexicon and Phonology. Carbondale, IL: Southern
Illinois University.
Bender, Marvin Lionel. 2005. Gumuz. Encyclopaedia Aethiopica, ed. by Siegbert Uhlig,
3: 914–916.. Wiesbaden: Harrassowitz Verlag.
Bendor-Samuel, John (ed.). 1989. The Niger-Congo Languages. Lanham: University
Press of America.
Bertho, J. 1953. La place des dialectes dogon de la falaise de Bandiagara parmi les autres
groupes linguistiques de la zone soudanaise. Bulletin de l’IFAN 15: 405–441.
Birkeli, Emil 1936. Les Vazimba de la cote ouest de Madagascar: notes d’ethnologie.
Mémoires de l’Académie malgache. Tananarive: Imprimerie moderne de l’Emyrne,
Pitot de la Beaujardière.
Blažek, Václav. 2007. Nilo-Saharan Straturn of Ongota. Advances in Nilo-Saharan Lin-
guistics. Proceedings of the 8th Nilo-Saharan Linguistics Colloquium, University of
Hamburg, August 22–25, 2001, ed. by Mechthild Reh and Doris L. Payne, 1–10. Köln:
Rüdiger Köppe Verlag.
Bleek, Dorothea 1956. A Bushman Dictionary. New Haven: American Oriental Society.
Bleek, Wilhelm H.I. 1862, 1869. A Comparative Grammar of South African Languages.
(1862: Part I; 1869: Part II). London: Trübner & Co.
Blench, Roger M. 1999a. The Languages of Africa: Macrophyla Proposals and Implica-
tions for Archaeological Interpretation. Archaeology and Language, IV, ed. by R.M.
Blench and M. Spriggs, 29–47. London: Routledge.
Blench, Roger M. 1999b. Are the African Pygmies an Ethnographic Fiction? Hunter-
Gatherers of Equatorial Africa, ed. by K. Biesbrouyck, G. Rossel, and S. Elders,
41–60. Leiden: Centre for Non-Western Studies.
Blench, Roger M. 2002. Besprechungsartikel. The Classification of Nilo-Saharan. Afrika
und Übersee 83: 293–307.
Blench, Roger M. 2006. Archaeology, Language and the African Past. Lanham: Altamira
Press.
Blench, Roger M. 2007a. Bangi Me: A Language of Unknown Affiliation in Northern
Mali. Mother Tongue XII: 147–178.
Blench, Roger M. 2007b. Lexical Avoidance Taboos and the Reconstruction of Names
for Large Animals in Niger-Congo, an African Language Phylum. Le symbolisme
des animaux – l’animal “clef de voûte” dans la tradition orale et les interactions
homme-nature, ed. by Edmond Dounias, Elisabeth Motte-Florac, and Margaret Dun-
ham, 545–569. unpaginated appendices. Paris: Editions IRD.
Blench, Roger M. 2007c. New Palaezoogeographical Evidence for the Settlement of
Madagascar. Azania XLII: 69–82.
Blench, Roger M. 2013. Links between Cushitic, Omotic, Chadic and the position of
Kujarge. Proceedings of the 5th International Conference of Cushitic and Omotic lan-
guages, ed. by M. van Hove, 67–80. Köln: Rüdiger Köppe.
African language isolates 187
Blench, Roger M. 2016. Greenberg’s Universal Project: The Classification of the World’s
Languages. Paper presented at the Symposium of Joseph Greenberg, Linguistic Soci-
ety of America, Washington, 2016.
Blench, Roger M. n.d. a. Wordlist and Etymological Analysis of Dompo. Electronic ms.
Blench, Roger M. n.d. c. Recovering Data on Mpra [=Mpre] a Possible Language Isolate
in North-Central Ghana. Electronic ms.
Blench, Roger, M. and M. Walsh n.d. The Vocabularies of Vazimba and Beosi: Do They
Represent the Languages of the Pre-Austronesian Populations of Madagascar? Elec-
tronic ms.
Boyeldieu, Pascal. 1977. Eléments pour une phonologie du laal de Gori (Moyen-Chari).
Etudes phonologiques tchadiennes, 186–198. Paris: SELAF (Bibliothèque, 63–64).
Boyeldieu, Pascal. 1982a. Deux Etudes laal (Moyen-Chari, Tchad). Berlin: Dietrich
Reimer Verlag.
Boyeldieu, Pascal. 1982b. Quelques questions portant sur la classification du laal (Tchad).
The Chad Languages in the Hamitosemitic-Nigritic Border Area (Papers of the Mar-
burg Symposium, 1979), ed. by H. Jungraithmayr, 80–93. Berlin: Dietrich Reimer.
Boyeldieu, Pascal. 1987. Détermination directe/indirecte en laal. La maison du chef et
la tête du cabri: des degrés de la détermination nominale dans les langues d’Afrique
centrale, ed. by P. Boyeldieu, 77–87. Paris: Geuthner.
Boyeldieu, Pascal. n.d. Presentation du láà:l ou “Gori” (Moyen-Chari, Tchad). ms.
CNRS, Paris.
Cardinall, A.W. 1931. A Survival. Gold Coast Review 5, no. 1: 193–197.
Cohen, M. 1947. Essai comparatif sur le vocabulaire et la phonétique du Chamito-
Sémitique. Paris: Honoré Champion.
Dalby, David. 1970. Reflections on the Classification of African Languages: With Special
Reference to the Work of Sigismund Wilhelm Koelle and Malcolm Guthrie. African
Language Studies 11 147–171.
Dempwolff, Otto. 1916. Die Sandawe: linguistisches und ethnographisches Mate-
rial aus Deutsch-Ostafrika. (Abhandlungen des Hamburger Kolonial-Institutes,
Bd 34. Reihe B: Völkerkunde, Kulturgeschichte und Sprachen, Bd 19.) Hamburg:
Friederichsen.
Dimmendaal, Gerrit J. 2011. Historical Linguistics and the Comparative Study of African
Languages. Amsterdam: John Benjamins Publishing.
Dixon, R.M.W. 1997. The Rise and Fall of Languages. Cambridge: Cambridge Univer-
sity Press.
Doke, C.M. 1961. The Earliest Records of Bantu. Contributions to the History of Bantu
Linguistics, ed. by C.M. Doke and D.T. Cole, 1–26. Johannesburg: Witwatersrand Uni-
versity Press.
Doornbos, P. and M.L. Bender. 1983. Languages of Wadai-Darfur. Nilo-Saharan Lan-
guage Studies, ed. by M.L. Bender, 43–79. East Lansing: Michigan State University
Press.
Eaton, Helen. 2010. A Sandawe Grammar. (SIL e-Books, 20.) Dallas, TX: SIL Interna-
tional. www.sil.org/silepubs/index.asp?series=941
Eaton, Helen, Daniel Hunziker, and Elisabeth Hunziker. 2007. A Sandawe Dialect Sur-
vey. SIL Electronic Survey Reports, 2007–2014. Dallas, TX: SIL International. www.
sil.org/silesr/2007/silesr2007-014.pdf.
Ehret, Christopher. 1995. Do Krongo and Shabo Belong in Nilo-Saharan? Proceedings of
the Fifth Nilo-Saharan Linguistics Colloquium, Nice, 1992, R. Nicolai and F. Rottland,
169–193. Köln: Rudiger Köppe.
188 Roger Blench
Kagaya, Ryohei. 1993. A Classified Vocabulary of the Sandawe Language (Asian and
African lexicon, 26.) Tokyo: Institute for the Study of Languages and Cultures of Asia
and Africa, Tokyo University of Foreign Studies (ILCAA).
Kleinewillinghöfer, U. 2001. The Language of the Jalaa: A Disappearing Language Iso-
late. Sprache und Geschichte in Afrika 16/17: 239–271.
Letouzey, R. 1976. Contribution de la Botanique au problème d’une éventuelle langue
Pygmée. Bibliothèque de la SELAF, 57–58. Paris : SELAF.
Lionnet, Florian. 2010. Laal: An Isolate Language? Handout for the Workshop Isolates
in Africa Lyon, 3 December 2010.
Lionnet, Florian. 2013. Doubly Conditioned Rounding in Laal: Conditional Licensing
and Correspondence Chains. Abstract for Berkeley Phonetics and Phonology Forum
(Vol. 8).
Lovestrand, J. 2012. Classification and Description of the Chadic Languages of the
Guéra (East Chadic B). SIL Electronic Working Papers 2012–2004. Dallas, TX: SIL
International.
Matthey, Piero. 1966. Brief Notes on the Nooy, a Former Tribe of Hunters and Fishers
in Southern Chad. Bulletin of the International Committee on Urgent Anthropological
Ethnological Research 8: 37–38.
Meinhof, Carl. 1910. Grundriss einer Lautlehre der Bantusprachen. Berlin: Dietrich
Reimer.
Nicolaisen, Ida. 2011. Elusive Hunters: The Haddad of Kanem and the Bahr el Ghazal.
Aarhus: Aarhus Universitetsforlag.
Nicolaisen, Johannes. 1968. The Haddad – a Hunting People in Tchad: Preliminary
Report of an Ethnographical Reconnaissance. Folk 10: 91–109.
Painter, C. 1967. The Distribution of Guang in Ghana and a Statistical Pre-Testing on
Twenty-Five Idiolects. Journal of West African Languages, 4, no. 1: 25–78.
Pierron, Denis, Harilanto Razafindrazaka, Luca Pagani, François-Xavier Ricaut, Tiago
Antao, Mélanie Capredon, Clément Sambo, Chantal Radimilahy, Jean-Aimé Rako-
toarisoa, Roger M. Blench, Thierry Letellier, and Toomas Kivisild. 2014. Genome-
Wide Evidence of Austronesian – Bantu Admixture and Cultural Reversion in a
Hunter-Gatherer Group of Madagascar. Proceedings of the National Academy of Sci-
ences 111, no. 3: 936–941.
Pietschmann, Richard. 1879. Über die kanarischen Zahlwörte. Zeitschrift für Ethnologie
11: 377–391.
Reh, M. 1985. Die Krongo-Sprache (nìino mó-dì) : Beschreibung, Texte, Wörterverzeich-
nis. Kölner Beiträge zur Afrikanistik, 12. Berlin: Dietrich Reimer.
Rilly, Claude and Alex de Voogt. 2012. The Meroitic Language and Writing System.
Cambridge: Cambridge University Press.
Ringe, Don A. 1992. On Calculating the Factor of Chance in Language Comparison.
Transactions of the American Philosophical Society 82, no. 1: 1–110.
Ringe, Don A. 1999. Language Classification: Scientific and Unscientific Methods. The
Human Inheritance: Genes, Language, and Evolution, ed. by Brian D. Sykes, 45–74
Oxford: Oxford University Press.
Sands, B. 1998. Eastern and Southern African Khoisan: Evaluating Claims of a Distant
Linguistic Relationships. Quellen zur Khoisan-Forschung 14. Köln: Rüdiger Köppe.
Sands, B. 2013a. Phonetics and Phonology: Hadza. The Khoisan Languages, ed. by
Rainer Voßen, 38–42. Routledge Language Family Series. London: Routledge.
Sands, B. 2013b. Morphology: Hadza. The Khoisan Languages, ed. by Rainer Voßen,
107–124. Routledge Language Family Series. London: Routledge.
African language isolates 191
Sands, B. 2013c. Syntax: Hadza. The Khoisan Languages, ed. by Rainer Voßen, 265–274.
Routledge Language Family Series. London: Routledge.
Sands, B., I. Maddieson, and P. Ladefoged. 1993. The Phonetic Structures of Hadza.
UCLA Working Papers in Phonetics 84: 67–87.
Savà, Graziano and Mauro Tosco. 2000. A Sketch of Ongota: A Dying Language of
Southwest Ethiopia. Studies in African Linguistics 29, no. 2: 59–135.
Schnoebelen, T. 2009. Classifying Shabo: Phylogenetic Methods and Results. Confer-
ence on Language Documentation and Theory, Vol. 2, ed. by Peter K. Austin, Oliver
Bond, Monik Charette, David Nathan, and Peter Sells, 275–284. London: SOAS.
Simons, Gary F. and Charles D. Fennig (eds.). 2017. Ethnologue: Languages of the
World, Twentieth edition. Dallas, Texas: SIL International. Online version: http://www.
ethnologue.com.
Souag, Lameen M. 2004. Oropom Etymological Lexicon: Exploring an Extinct, Unclas-
sified Ugandan Language. ms. SOAS.
Starostin, S.A., Anna Dybo, and Oleg Mudrak. 2003. Etymological Dictionary of the
Altaic Languages. Leiden/ Boston: Brill.
Steeman, S. 2012. A Grammar of Sandawe: A Khoisan Language of Tanzania. LOT-
Netherlands Graduate School of Linguistics, Utrecht.
Stiles, D. 1994. The Mikea, Hunter-Gatherers of Madagascar. Kenya Past and Present
26: 27–33.
Taine-Cheikh, Catherine. 2013. Des ethnies chimériques aux langues fantômes: l’ex-
emple des Imraguen et Nemâdi de Mauritanie. In and Out of Africa: Languages in
Question. In Honour of Robert Nicolaï. Vol. 1. Language Contact and Epistemological
Issues, ed. by Carole de Féral, 137–164. Louvain: Peeters.
Teferra, A. 1991. A Sketch of Shabo Grammar. Studies in African Linguistics, 7: 11–21.
Teferra, A. 1995. Brief Phonology of Shabo (Mekeyir). Fifth Nilo-Saharan Linguistics
Colloquium. Nice, 24–29 août 1992, ed. by R. Nicolaï and F. Rottland, 169–193. Ham-
burg: Helmut Buske.
Teferra, A. and P. Unseth. 1989. Toward the Classification of Shabo (Mikeyir). Topics in
Nilo-Saharan Linguistics. Nilo-Saharan 3, ed. by M.L. Bender, 405–418. Hamburg:
Helmut Buske.
Ten Raa, Eric. 1986. The Acquisition of Cattle by Hunter-Gatherers: A Traumatic Experi-
ence in Cultural Change. Sprache und Geschichte in Afrika 7, no. 2: 361–374.
Theil, Rolf. 2012. Omotic. Semitic and Afroasiatic: Challenges and Opportunities, ed. by
Lutz Edzard, 369–384. Wiesbaden: Otto Harrassowitz.
Tosco, Mauro. 1991. A Grammatical Sketch of Dahalo Including Texts and a Glossary
(Kuschitische Sprachstudien/Cushitic Language Studies, Band 8). Hamburg: Helmut
Buske Verlag.
Traill, A. 1973. “N4 or S7”: Another Bushman Language. African Studies 32, no. 1:
25–32.
Tsehay, Kibebe. 2015. Documentation and Grammatical Description of Chabu. Ph.D.,
University of Addis Ababa.
Tucker, Archibald N. and M.A. Bryan. 1956. The Non-Bantu Languages of North-
Eastern Africa. Published for the International African Institute by Oxford University
Press.
Tucker, Archibald N., Margaret Bryan, and James Woodburn. 1977. The East African
Click Languages: A Phonetic Comparison. Zur Sprachgeschichte und Ethnohistorie
in Afrika (Festschrift Oswin R.A. Köhler), ed. by W.J.G. Möhlig, F. Rottland, and B.
Heine, 301–323. Berlin: Dietrich Reimer.
192 Roger Blench
LANGUAGE ISOLATES OF
NORTH AMERICA
Marianne Mithun
Approximately half of the language families indigenous to North America North of Mex-
ico have been identified as isolates, families consisting of a single language. The figure
is necessarily approximate for several reasons. The first is the quality of data. Some lan-
guages are represented by so little documentation, often of poor quality, that demonstrat-
ing genetic relationships to any other languages with confidence is difficult. A second is
the continuum between dialects and languages. A third is the potentially deep impact of
language contact. Ultimately, isolate status does not entail particular structural character-
istics inherent in the language itself: it is more about their potential relatives.
In view of its source it is perhaps not surprising that there is more similarity between
Comecrudo and Cotoname in Gatschet’s materials than in Berlandier’s. Two signif-
icant examples suggest that Gatschet’s informants used original Cotoname words in
both languages.
(Goddard 1979: 370)
After close inspection of the sources, Goddard concludes that “Coahuilteco, Comecrudo,
and Cotoname must all be considered independent isolated languages whose genetic rela-
tionships are at present unknown, and the fragments of Solano and Aranama cannot be
put in any language grouping with any confidence” (1979: 379).
Similar challenges are presented by Karankawa, once spoken on the Texas coast.
The last Karankawa speakers died in 1858 on Padre Island at the hands of Texas Rangers
and Mexican soldiers. All of the attested material, over 400 vocabulary items and some
phrases, sentences, and translations of English nursery rhymes, is assembled in Grant
(1994). It comes from the following sources: French brothers Jean-Baptiste and Pierre
Talon, who had been part of La Salle’s expedition and were captured by Karankawas
around 1686, dictated to M. de Boissieu in Brittany in 1689; the French sea captain Jean
Béranger, who collected vocabulary near Matagorda Bay in 1720–1721; the Mexican
geologist Rafael Chowell, who collected vocabulary in south Texas or Coahuila, Mex-
ico in 1828–1829; the Tonkawa man Old Simon (mentioned earlier as the source for
Aranama), dictated to Albert Gatschet at Fort Griffin, Texas, in 1884; a blind Tonkawa
woman, Sallie Washington, who had once lived with a Karankawa man, dictated to
Gatschet in 1884; and Alice Williams Oliver, a White woman who had lived near the
Karankawa in Texas as a child and made a list of around 600 forms, though that list was
subsequently lost. Gatschet discovered Mrs. Oliver living in Lynn, Massachusetts, and
worked with her in 1888–1889. Not surprisingly, Grant found considerable variation in
the material, since the language was spoken by a number of bands and so much came
from non-native speakers. Nevertheless Grant, like others, concludes that it was a single
language.
In some cases classification of a language depends not only on the quantity and quality
of the data available on that language, but also on our knowledge about potential rel-
atives. A German philologist comparing German Hund and English dog might not see
Language isolates of North America 195
a relationship immediately, but with more extensive knowledge of English, the link to
English hound would be clear, and further comparisons could confirm the connection.
When Jacques Cartier first sailed into the Bay of Gaspé in 1534, he encountered fish-
ermen from a settlement up the Saint Lawrence River at the site of present-day Quebec
City. He took two captives back with him to France but returned with them the following
summer, staying longer and going further upriver to a settlement at the site of present-day
Montreal. When Champlain arrived in the region in 1603, these people had vanished from
the area. But appended to the accounts of the first two Cartier voyages were vocabulary
lists which together comprise a little over 200 words. From these lists it is possible to
identify the language as Northern Iroquoian with certainty, in good part because the mod-
ern Northern Iroquoian languages are well known (Mithun 1982). It can even be seen that
the words come from several different dialects or languages.
A group known as the Adai (also known as Adaizan, Adaizi, Adaise, Adahi, Adaes,
Adees, or Atayos) was first encountered in what is now eastern Louisiana around 1530.
Adai was the language of the Spanish Mission of Adayes founded in 1715 west of Natchi-
toches, but when that mission closed in 1792, the converts joined Caddoan groups in
Texas. Their language is known only from a list of 275 words recorded around 1802 by
John Sibley, the first Indian Agent of the United States (Sibley 1832: 722). The original
manuscript has been lost, but a copy is in the American Philosophical Society Library,
and vocabulary from it has been published in Adelung and Vater (1816: 3.2.278), Gallatin
(1836, with errors), and Taylor (1963: 114). Sibley noted that the language “differs from
all others, and is so difficult to speak or understand that no nation can speak ten words
of it” (Powell 1891: 122). Gatschet saw some similarities to words of the neighboring
Caddo. With better knowledge of Caddo, however, Wallace Chafe could determine with
certainty that it is not Caddo (personal communication). This leaves us with a language
which we can see is not related to the most likely neighbor and without demonstrable
relationships to other well-known languages in the area, suggesting it is an isolate.
Knowledge about earlier stages of potential relatives can sometimes be useful. Slightly
more documentation is available for Beothuk, once spoken in present-day Newfound-
land. The Beothuk were first mentioned in the account of Cartier’s 1534 voyage. Records
of their language consist of 4 vocabularies together comprising over 400 items: one
recorded by John Cline during the eighteenth century; one by John Leigh from a woman
named Demasduit (Mary March) captured in 1819; one by a man named King; and one
by W. E. Cormack from a voluntary captive named Shanawdithit (Nancy April), who died
in 1829. All are published in Hewson 1978. Investigation into possible genetic relation-
ships has focused on the surrounding Algonquian languages. A few similarities have been
noted, but Voegelin and Voegelin (1946) concluded that the similarities were too close to
be cognates and should rather be identified as loans. Hewson (1968, 1971, 1978, 1982)
did see some similarities between the Beothuk vocabulary and certain Proto-Algonquian
reconstructions. At present, however, the general consensus is that a relationship is
unlikely though not impossible.
isolate. (It should be noted that into the early twentieth century, the term dialect was
often used to refer to all speech varieties known to be related at any level.) Particularly
for languages no longer spoken, it can be difficult to assess precise degrees of mutual
intelligibility. There are a number of such situations in North America.
Alsea, spoken along the central coast of Oregon, comprises either two dialects or
closely related languages: Alsea and Yaquina. Frachtenberg, who worked with both in
1910, left a major manuscript grammar. There he reported on the recent history of the
two, which he termed Yakonan:
Until 1876 most of the Yakwina and Alsea Indians lived at the Yahatc reservation.
Even at that time the number of Yakwinas still extant was limited. When, on the 26th
day of April, 1876, this reservation was abolished, the remaining members of the
Yakonan family were transferred to the newly established Siletz Agency (in 1913)
but only one adult Yakwina and eight full-grown Alsea Indians were left, of whom
only five may be said to have retained some knowledge of the language, traditions,
and customs of their forefathers. With the demise of this handful the Yakonan group
will have become a thing of the past.
(Frachtenberg 1918 ms: 14)
Such proximity could have led to mutual intelligibility even if the two were not closely
related, but further comments suggest that they were in fact close dialects:
The differences between the Alsea and Yakwina dialects are very slight. As far as has
been ascertained, there is only one phonetic deviation, and a limited number of stems
are particular to each dialect. The Alsea combination of al appears in most instances
in Yakwina as a u diphthong.
(Frachtenberg 1918 ms: 15)
Frachtenberg provides four examples of pairs of words showing the phonological corre-
spondence (among them qalp-/qaup- ‘to roll’), and nine stems that differ (among them
pəlú:pəlu:/k’ins ‘beard’). His descriptions suggest that the two were indeed dialects, so
together they would constitute a language isolate. (Distant relationships to neighboring
Siuslaw and Coos have been proposed but are not considered established.)
Mutual intelligibility is of course a matter of degree and experience. Yana, spoken
in central Northern California, was well documented by Sapir early in the twentieth
century, but it disappeared not long afterward. Four dialects are generally recognised:
Northern Yana, Central Yana, Southern Yana, and Yahi. Sapir characterised their rela-
tionships as follows:
The probability is strong that Southern Yana was a link between the Central and Yahi
dialects, with a leaning, I surmise, to Yahi rather than to Central Yana. The Central
and Northern dialects, though neatly distinct on a number of phonetic points, are
mutually intelligible without difficulty. Yahi is very close in all essential respects
to the two Northern forms of Yana, but there are enough differences in phonetics,
vocabulary, and morphology to put it in a class by itself as contrasted with the other
two. It is doubtful if a Northern or Central Yana Indian could understand Yahi per-
fectly, but it is certain that he could make out practically all of it after a brief contact.
(Sapir and Spier 1943, cited in Sapir and Swadesh 1960: 14)
Language isolates of North America 197
Atakapa, once spoken by a number of small groups along the Gulf of Mexico from
Vermillion Bay and Bayou Teche in Louisiana to Galveston Bay and up the Trinity River
in Texas, presents a cloudier picture. It is sometimes said to comprise three languages,
sometimes two, and sometimes one. Three varieties were recognised by Swanton. A list
of 45 words, recorded in 1721 by the sea captain Jean Béranger from a captive taken
at Galveston Bay, Swanton identified as Akokisa (published in Villiers du Terrage and
Rivet 1919). Another list of 287 words collected by Martin Duralde in 1802 at modern
Martinville, Louisiana, he identified as Eastern Atakapa (published in Vater 1820–1821
and Gallatin 1836). The most extensive documentation, consisting of around 2,100 words
and sentences, as well as 9 texts, was obtained by Gatschet in 1885 at Lake Charles,
Louisiana, from 2 of the last speakers to know the language well. This material Swanton
identified as Western Atakapa. A grammatical sketch based on all sources is in Swanton
1929a, and a dictionary with texts in Gatschet and Swanton 1932. Swanton’s classifica-
tion may not actually have been meant as a linguistic one, however. Martin notes that “the
relatively small and unsystematic variation seen in the data provide little support for these
groupings” (2004: 79). If the three varieties are indeed very closely related, as it appears
they are, together they comprise a language isolate. (Hypotheses of more remote relations
once grouped it with neighboring languages in a Gulf superstock, described below.)
Ultimately the status of a language as an isolate depends more on possible relatives
than on the language itself. The Timucua were first encountered by Ponce de León in
1513 near present-day St. Augustine, Florida. Their language was documented early by
two priests who were in the area between 1603 and 1627, Francisco Pareja and Gregorio
de Movilla. They left a grammar, three catechisms, a confessional, a doctrina, and other
materials, with a total of about 2,000 pages of bilingual Timucua-Spanish text. The gram-
matical sketch, by Pareja, was published in Mexico between 1612 and 1627 and reprinted
in 1886. Two catechisms by Movilla were published in Mexico in 1635. There are also
two letters with Spanish translations, one written to the governor of Florida in 1636 by
a Timucua chief, and another by six Timucua chiefs addressed to the king of Spain in
1688 (Crawford 1979: 326–327). By the early nineteenth century, few Timucua people
remained. Granberry published a grammatical sketch (1990) and grammar (1993) based
on these resources.
In 1707 a man named Lamhatty appeared at the estate of Colonel John Walker in Vir-
ginia, saying that he was from the village of Tawasa near the Gulf of Mexico but had been
captured by the Tuscarora. Walker recorded 60 words of his language, but soon afterward
Lamhatty disappeared. Swanton (1929b) noted resemblances in the wordlist to Timucua,
as well as to some Muskogean languages. Since then, the status of Tawasa has been under
discussion. The wordlist is reprinted in Granberry (1993: 10). A number of the Tawasa
forms are so similar to Timucua as to suggest they are the same language, represented
with different spellings: Tawasa hĕmè, Timucua hime ‘come’; Tawasa néăh, Timucua nia
‘woman’; and so on. Some differences could be attempts at representing sounds not in the
native languages of the transcribers: Tawasa soua, Timucua soba ‘meat’ for something
like [soβa], or Tawasa héwah, Timucua hiba ‘sit down’, for [hiβa]. There are also forms
that match those in neighboring Muskogean languages, likely loans in one direction or
the other: Tawasa hássey, Alabama haši, Timucua ela ‘sun’. Martin (2004: 78) notes that
most of these are probably loans from Creek. If Tawasa was actually Timucua itself or a
dialect of that language, Timucua would remain an isolate, as it was identified by Pow-
ell (1891: 123) and Sapir (1921, 1929). If Tawasa was an unrelated language, Timucua
would still be an isolate. (There is also a long history of attempts to link Timucua to other
198 Marianne Mithun
languages of the area as well as Middle and South America, discussed in Crawford 1979,
1988.)
The possibility of clear-cut distinctions between dialects and languages is also chal-
lenged by dialect chains, situations in which speakers of different varieties may be able
to understand their immediate neighbors, but those at the edges may not understand
each other. An example is Keres, with seven varieties spoken in pueblos in New Mex-
ico. Two major groups can be distinguished: Eastern Keres, consisting of Cochiti, Santo
Domingo, San Felipe, Santa Ana, and Zia, and Western Keres, consisting of Laguna
and Acoma. Each dialect is mutually intelligible with its neighbors, but differences
are greater between those of pueblos located at larger distances, such as Cochiti and
Acoma. Davis (1959) estimates the time depth of the group at not more than 500 years,
a period over which change can be slight or substantial. Keres is generally considered
an isolate.
Apparent intelligibility can of course also come from exposure or bilingualism. Cayuse
was spoken in the early nineteenth century in the plateau region of northeastern Oregon
and southeastern Washington. Documentation consists primarily of nineteenth-century
wordlists published in Rigsby 1969. For some time, Cayuse was linked to the neighboring
Molala. Rigsby (1966: 369) traces the original idea of a relationship to an 1846 publica-
tion by Horatio Hale containing Cayuse and Molala vocabularies. Hale did not explic-
itly declare that the two were related, but he listed them together in a Waiilatpu family.
Rigsby suspects that this grouping may have been stimulated by a remark from Marcus
Whitman, who established a Presbyterian mission in the area in 1836, and assumed the
languages were mutually intelligible when he heard Cayuse and Molala people speaking
together. The grouping was retained in the Powell classification of 1891 and carried into
the Sapir classifications of 1921 and 1929. After careful examination of the material from
both languages, however, Rigsby concluded that the two are in fact not related:
Cayuse and Molala do not appear to be genetically relatable, though there are
obvious areal relations involved since they share a small number of identical or
near-identical lexical items.
(Rigsby 1966: 370)
In the situations observed by Whitman, the Cayuse speakers may have learned some
Molala, the Molala speakers may have learned some Cayuse, or both may have simply
acquired enough passive bilingualism to be able to understand the other, while still speak-
ing their own languages. Cayuse is now considered an isolate.
The main purpose of the paper is to point out that California languages may be
classified into several groups. It must be clearly understood, however, that the clas-
sification that has been attempted deals only with structural resemblances, not with
definite genetic relationships; that we are establishing not families, but types of
families.
(Dixon and Kroeber 1903: 3)
In 1913, however, they proposed some remote genetic relationships among California
languages (1913a, 1913b). One they termed “Penutian” included Costanoan, Miwok,
Maiduan, Yokuts, and Wintuan (1913b). As samples of evidence of relationship, they
provided five lexical sets (‘bow’, ‘three’, ‘fire’, ‘liver’, ‘forehead’); some case suffixes;
and structural similarities, including elaborate vowel mutations, lack of prefixes, seven
cases on nouns, absence of instrumental or locative affixes on verbs, and intransitive,
inceptive, voice, mode, tense, and person suffixes. They concluded:
There is available enough information on the structure of the five Penutian lan-
guages to prove their genetic affinity beyond a doubt even without recourse to lex-
ical similarities.
(Dixon and Kroeber 1913b: 649)
A second group they termed “Hokan” included Karuk, Chimariko, Shastan, Pomoan,
Yana, Esselen, and Yuman (1913b). They illustrated their proposed relationship with
another five sets of stems (‘tongue’, ‘eye’, ‘water’, ‘stone’, ‘sleep’), as well as some
structural similarities, including no plural form for most nouns, frequent pairs of distinct
verb stems differing in number, verb prefixes denoting instruments and often pronomi-
nals, and verb suffixes marking plurality and location.
A flurry of work followed, aimed at uncovering deeper relationships among the Pow-
ell families and refining subgrouping. In 1919 Dixon and Kroeber presented lexical
sets obtained from translations of 225 English terms into 67 languages, as well as some
additional forms. For Penutian, they found 171 stem resemblances between two or more
languages, proposed some sound correspondences, discussed phonological changes,
and listed some structural similarities. To Hokan, they added Washo, Salinan, and Chu-
mash in California and Seri and Chontal (Tequistlatecan) in Mexico, drawing in part on
work by Sapir (1917a) which showed similarities among Yana and other possible Hokan
languages.
Sapir continued the search for remote relations, ultimately proposing classifications
of all of the languages into just six superstocks or phyla (1921, 1929): Eskimo-Aleut,
Algonkin-Wakashan, Nadene, Penutian, Hokan-Siouan, and Aztec-Tanoan. With this
scheme there were no isolates.
Some of Sapir’s subgroups within the phyla are now generally accepted as families.
One is the Tlingit-Athabaskan family, which was Sapir’s Continental Nadene subgroup
of his Nadene superstock (with the later addition of Eyak). Another is the Algic family,
his Algonkin-Ritwan subgroup of his Algonkin-Wakashan superstock. (Wiyot and Yurok
of California, linked as Ritwan, are no longer seen as a subgroup.) A third is the Utian
200 Marianne Mithun
family, his Miwok-Costanoan branch of his California Penutian subgroup of his broader
Penutian hypothesis. A fourth is the Uto-Aztecan family, a subgroup of his Aztec-Tanoan.
A fifth is the Kiowa-Tanoan family, another subgroup of his Aztec-Tanoan. Some other
proposals appear promising to varying degrees but are not yet generally considered fully
established, such as his Plateau Penutian subgroup (Sahaptian, Molala, Klamath-Modoc);
Takelman (Takelma and Kalapuya, which he linked as two of three members of his Ore-
gon Penutian subgroup); his Iroquoian-Caddoan; and his Siouan-Yuchi. Other proposals
of his have been abandoned. In many cases the discovery of more remote relationships
may never be possible because the languages are no longer spoken, and time depths
would be so deep that few common inheritances remain.
In what follows, languages identified as isolates are surveyed by geographical area.
There is insufficient space to list all references to unpublished and published documen-
tation and comparative work on the languages, or to describe each language in full, but
additional detail can be found in Goddard 1996a, 1996b, 1996c; Campbell 1997; Mithun
1999; and Golla 2011.
series of stops and affricates: plain (written as voiced) d, λ, ʒ [dz], ǯ [dž], g, gʷ, g, gʷ, ʔ,
ʔʷ; aspirated t, ƛ, c, č, k, kʷ, q, qʷ; and ejective t’, ƛ’, c’, č’ k’, k’ʷ, q’, q’ʷ; plain fricatives
ł, s, š, x, xʷ, x̣, x̣ʷ, h, hʷ; glottalised fricatives ł’, s’, x’, x’ʷ, x̣’, x̣’ʷ; and sonorants n, y, ÿ
(rounded), and w. It also has a simple vowel system: i, e, a, u (Leer 1991a: 10). Tlingit
also shows agent/patient patterning in pronominals, Tlingit verbs, like those in Athabas-
kan languages, contain incorporated nouns (or remnants of them) near the beginning of
the template, which can indicate unspecified involvement of a kind of entity.
There are, however, conspicuously few lexical similarities apart from loans. The agent/
patient patterning of Tlingit pronominals is likely an effect of contact from Haida, as the
Eyak and Athabaskan cognates show nominative/accusative patterning (Mithun 2008).
According to native tradition, the Kaigani Haida (Masset) moved from the Queen Char-
lotte Islands north into southeastern Alaska, formerly Tlingit territory, around 1700.
Tlingit village names remain there (De Laguna 1990: 203). Story (1966: 10) notes that
Tlingit people have intermarried extensively with other groups and moved to live with
them. Pinnow (1964), Levine (1979), Leer (1990, 1991b), and Jacobsen (1993) have
shown that the similarities that originally served as the basis for this grouping, mainly
structural, are actually the result of misanalysis and contact. Haida is once again generally
considered an isolate.
Kutenai (=Kootenai=Kootenay) is spoken in communities in British Columbia, Mon-
tana, and Idaho. It was the dominant community language until about the mid-twentieth
century, when parents stopped passing it on to their children. During the nineteenth cen-
tury, various visitors collected vocabulary. In 1891 Chamberlain worked with all of the
dialects, collecting vocabulary, grammatical material, and texts (1893, 1894a, 1894b,
1894c, 1895a, 1895b, 1902, 1906, 1910). In 1894 the missionary Canestrelli published
a grammar in Latin (republished in 1927 by Boas). Boas visited the Kutenai in 1888 and
1914, when he worked through the Chamberlain texts and collected additional material,
which he published in 1918. Garvin published lexical material (1947, 1948a); a gram-
matical sketch (1948b, c, d, 1951a); a narrative text (1953) and conversation (1954); and
articles on further topics (1951b). A major grammar with texts is by Morgan (1991).
Kutenai was listed as an isolate (Kitunahan) in Powell 1891. Sapir 1921, 1929 included
it as one of three branches of his Algonkin-Wakashan superstock, beside Algonkin-Ritwan
and Salish-Wakashan. An Algonquian language, Blackfoot, is spoken immediately to the
east, and Salishan languages are spoken to the west and south.
The possibility of a remote relationship to Algonquian was further investigated by
Haas (1965). Kutenai does show certain structural similarities to Algonquian languages,
in particular obviation, a kind of ranking of third persons. In both Kutenai and Algon-
quian languages, if there is only one third person in a clause, this will be the unmarked
proximate. If there is more than one, the more topical one is proximate, and all other sub-
sidiary third persons are marked as obviative. In the sentence below, the man is proximate
and the bee obviative.
Language isolates of North America 203
Comparable verb suffixes in Algonquian languages are termed inverses. The Kutenai
system differs slightly from its Algonquian counterparts, however. In Kutenai, first and
second persons are equivalent on the hierarchy (1, 2 > 3prox > 3obv), while in Algon-
quian, second persons are ranked over first (2 > 1 > 3prox > 3obv). Morgan proposes
that, though the Kutenai suffix -ap may have originally been an inverse marker, it now
functions more like a first and third person singular object marker. The second person
object suffix is -is, and the first person plural object suffix is ‑awas. (The actual forms of
the morphemes show no similarities between Kutenai and Algonquian.)
Morgan (1991: 497) reexamined evidence of a distant relationship to Algonquian with
lexical comparisons and identified only 14 Kutenai morphemes or, at the outside 24,
that might be similar enough to Algonquian forms to suggest either cognate relations
or borrowing. He concluded that the resemblances are likely due to chance. A common
origin for Kutenai and Algonquian is no longer considered likely, though similarities like
the proximate/obviative distinction and inverse system indicate that even very abstract
structures can be transferred through language contact.
Both Algonquian and Salishan languages contain verbal suffixes with meanings much
like lexical nominals, likely descendants of incorporated nouns. Kutenai shows such con-
structions as well, like -q’anku- ‘firewood’ in the verb ‘pack-firewood’ below.
strikingly similar to that reconstructed for Proto-Salishan: p, t, c, k, kʷ, q, qʷ, ; p’, t’, ƛ’,
c’, k’, k’ʷ, q’, q’ʷ; ł, s, x, xʷ, x̆ , x̆ ʷ, (h); m, n, l, y, (ɣ), w, ʢ, ʢʷ; m’, n’, l’, y’, (ɣ’), w’, ʢ’,
ʢ’ʷ (Kroeber 1999: 7, after Thompson 1979). Reconstructed Salishan vowels are i, a, u,
ə. Morgan (1980, 1991: 494–498) investigated possible relationships between Kutenai
and the Salishan family and found 144 possible cognate sets, which he groups by recur-
ring sound correspondences. He reports that of 99 Kutenai grammatical morphemes, 42
appear to be cognate with Salishan morphemes (1991: 494). Another 23 sets he identi-
fies as the results of lexical borrowing, most from Salishan languages into Kutenai. He
predicts that additional sets are unlikely to be found, and that if there is a genetic link, it
is very remote:
The more one chooses to see the linguistic connection between Kutenai and Salishan
as diffusional, rather than genetic, the stronger the case can be for the idea that the
Kutenai language was in sustained contact with a variety of Salishan languages,
including not only the presently neighboring Interior Salishan languages, but also
probably Proto-Interior Salish, and quite possibly also Proto-Salish itself. Both
Proto-Interior Salish, and Proto-Salish were evidently spoken directly to the west of
what is now Kutenai territory.
(Morgan 1991: 497)
Her memory of old traditions was almost entirely gone, and she had lost the faculty
of relating facts coherently and in consecutive order. Besides, her narratives, such as
could be obtained, were too much interspersed with Chinook Jargon.
(Frachtenberg 1914: 1)
William Smith was an Alsea man who knew Lower Umpqua as a second language. Nev-
ertheless, Frachtenberg produced remarkable documentation, publishing a text collection
with vocabulary (1914) and a detailed grammatical sketch (1922). The language was
gone by the 1970s.
In the mid-nineteenth century, Latham (1848) and Gatschet (1884) recognised Siuslaw
and Lower Umpqua as dialects of an isolate. On the basis of a vocabulary he collected
in 1884, Dorsey linked the two with Alsea and Yaquina in a Yakonan stock, a grouping
adopted by Powell in his 1891 classification. After more intensive work, however, Fracht-
enberg disagreed:
After a superficial investigation, lasting less than a month, Dorsey came to the con-
clusion that Siuslaw and Lower Umpqua were dialects belonging to the Yakonan
stock. This assertion was repeated by J.W. Powell in his “Indian Linguistic Families”
(Seventh Annual Report of the Bureau of American Ethnology, p. 134), and was held
to be correct by all subsequent students of American Indian languages. This view,
however, is not in harmony with my own investigations. A closer study of Alsea
Language isolates of North America 205
(one of the Yakonan dialects) on the one hand, and of Lower Umpqua on the other,
proves conclusively that Siuslaw and Lower Umpqua form a distinct family, which
I propose to call the Siuslawan Linguistic stock.
(Frachtenberg 1922: 437)
Frachtenberg did not entirely dismiss the possibility of more remote relations:
It is not at all impossible that this stock, the Yakonan, Kusan, and perhaps the
Kalapuyan, may eventually prove to be genetically related. Their affinities are so
remote, however, that I prefer to take a conservative position, and to treat them for
the time being as independent stocks.
(Frachtenberg 1922: 437)
Sapir combined those languages in his Oregon Penutian subgroup, but at present, this
grouping is not generally considered established.
Takelma was spoken in southern Oregon in the Rogue River Valley. There were at
least four dialects (Kendall 1982). It was no longer spoken by the mid-twentieth century.
Sapir published a collection of texts (1909) and impressive grammatical sketch (1922) on
the basis of one and a half months of work with a single speaker, Frances Johnson, at the
Siletz reservation. Further references are in Kendall (1977 and 1990).
The language was listed as an isolate in Powell (1891). In 1918 Frachtenberg sug-
gested a relation to the Kalapuyan family to the north in what he termed Takelman:
While carding and indexing my Kalapuya field material (collected three years ago),
preparatory to the writing of a grammatical sketch of these languages, I was forci-
bly struck by some marked correspondences in the lexicography of Kalapuya and
Takelma, and of Kalapuya and Chinook. . . . The resemblances between Kalapuya
and Takelma are much greater and far more numerous, although, as has been stated
before, only part of the Kalapuya data have thus far been tabulated.
(Frachtenberg 1918: 178)
Based on these works and my own investigation, it seems certain that Takelma and
Kalapuyan share a common, though remote, origin. That the two languages belong
to the same subgroup within Penutian must remain a working hypothesis.
(Kendall 1997: 1)
206 Marianne Mithun
In 1998, however, Tarpent and Kendall reexamined the evidence and concluded that the
similarities noted previously were erroneous and that Takelma is an isolate after all.
Klamath-Modoc (= Lutuamian) consists of two dialects, Klamath proper and Modoc.
They were spoken in south central Oregon along the eastern slope of the Cascade Moun-
tains, Klamath to the north and Modoc to the south. The last speaker died in 2003. Good
documentation exists. Most important are a two-volume work containing a dictionary,
grammatical sketch, and texts in Modoc by Gatschet (1890) and especially a dictionary,
grammar, and texts by Barker (1963a, b, 1964). Additional work on particular aspects of
the language has been published as well.
The language was listed as an isolate by Gallatin (1848) and Powell (1891). Others saw
possible relationships to neighboring languages. Gatschet (1880) assembled 18 pairs of
words in Klamath and Northern Sahaptin, noting:
The Sahaptin and Wayíletpu [Klamath-Modoc] families are the only ones with
whom a distant kinship is not altogether out of the question.
(Gatschet 1880: lvi, cited in Aoki 1963: 107).
The Karok language is not closely or obviously related to any other. It has, however,
been classified as a member of the northern group of Hokan languages, in a subgroup
which includes Chimariko and the Shastan languages, spoken in the same general
part of California as Karok itself. Considerable work remains to be done before the
historical position of Karok can be properly clarified.
(Bright 1957: 1)
died in the 1940s. An excellent resource is in Jany (2009), based primarily on 3,500
pages of handwritten field notes collected by John Peabody Harrington in the 1920s. Jany
(2009) contains both a grammatical description and typological comparison with other
languages in the area.
Chimariko was listed as an isolate in the Powell (1891) classification. In 1910, on the
basis of 57 lexical comparisons, Dixon suggested that it was related to Shasta, Achomawi,
and Atsugewi (1910: 306). Jany points out that the Chimariko spent years in exile with
the Shasta before becoming Dixon’s consultants, so at least some similarities he noted
may have been the result of contact (2009: 2). Sapir classified those languages as a sub-
branch of his Northern Hokan.
Further south and east, in an area centering around Lake Tahoe into Nevada, is Washo.
Major documentation includes a grammatical sketch in Kroeber (1907) and grammar
and grammatical sketch in Jacobsen (1964, 1986); and texts in Dangberg (1927) and
Lowie (1963). More recently Alan Yu and colleagues have created a website with online
audio dictionary and bibliography of works on the language at https://lucian.uchicago.
edu/blogs/washo/?page_id=90.
Examining early vocabularies, Gatschet (1882: 254–255) identified Washo as an iso-
late. On the basis of his own work with speakers in 1883, Henshaw echoed this judgment:
From the fragmentary vocabularies of this tongue before accessible the Washo had
been supposed to be the sole representative of a linguistic stock, a supposition which
the present vocabulary sustains.
(Henshaw 1887: xxx, cited in Jacobsen 1964: 11)
The language was accordingly listed as an isolate in Powell (1891). In 1917 Harrington
announced similarities between Washo and the Chumashan languages to the south (1917:
154), which led Sapir to classify Washo as Hokan, since the Chumash family was already
in that group (1917b: 449–450). (The proposed Hokan affiliation of Chumash has since
been abandoned.) Dixon and Kroeber noted that originally Washo “was credited with
remarkably few parallels to any other. Superficial examination indeed reveals very few
similarities between it and Hokan tongues” (1919: 104). Stimulated by a comment from
Sapir on a possible relationship, however, they assembled a list of lexical similarities
between Washo and other languages hypothesised to be Hokan:
The outcome is about sixty parallels of greater or less validity. This is not a wholly
convincing showing. But the general plan of Washo structure is so similar to that of
Hokan that material resemblances weigh more heavily.
(Dixon and Kroeber 1919: 105)
For instance, the noun without cases but with numerous local suffixes and with
possessive prefixes; the verb with pronominal and instrumental prefixes, local and
modo-temporal suffixes; verb stems frequently different for singular and plural;
composition in abundance.
(Dixon and Kroeber 1919: 105)
In the same work they included a longer list of Washo-Hokan similarities assembled inde-
pendently by Sapir, with lists of stems, pronouns, prefixes, noun suffixes, postpositions,
Language isolates of North America 209
local suffixes on verbs, and other verb suffixes. The subsequent history of classification
of Washo is described in detail in Jacobsen (1964: 10–21). At present, Washo is still gen-
erally considered an isolate.
Further south on the California coast was Esselen (= Huelel), spoken by early converts
at Mission Carmel. Shaul (1995a: 191) surmises that it was probably the first known
California language to have become extinct. Because of its early demise, documenta-
tion is sparser than for some other California isolates. Sources are described in Beeler
(1977, 1978); Turner and Shaul (1981); and Shaul (1995a). There is a list of 20 words
(including 10 numerals), collected during the La Pérouse expedition of 1786, and a list
of 107 words and a trilingual (Esselen/Spanish/English) catechism by Father Lasuén,
collected during the Galiano-Malaspina expedition of 1792. The numerals were collected
again in 1840–1842 by Duflot. The Franciscan missionary Felipe Arroyo de la Cuesta
recorded 58 words and 14 phrases and sentences at the Soledad Mission in 1832 (Kroeber
1904). Later documentation comes from non-native speakers and rememberers. In 1878
Pinart collected about 140 items from a woman whose husband had been Esselen (Heizer
1952: 73–82), and in 1888, Henshaw collected 110 words and 50 phrases and sentences
from a Rumsen speaker whose mother had been Esselen (Kroeber 1904: 49–57, Heizer
1955). After that time a few words were collected from people who were not speakers
but remembered having heard the language (Shaul 1995b). Descriptions of Esselen pho-
nology, morphology, and syntax are in Shaul (1995a), along with a transcription of the
complete Galiano-Malaspina catechism.
Esselen was classified as an isolate in Powell (1891). In 1913 Dixon and Kroeber
added it to their Hokan stock on the basis of three words plus an ending -nax, which
they related to an ending -na in Yana: Esselen a sa-nax, Yana ha-na ‘water’; Esselen
šie fe, Yana k’ai-na ‘stone’; Esselen a tsi n, Yana sa m ‘sleep’ (1913b: 651). Sapir
adopted the proposal and linked Esselen with Yuman as a subgroup of Hokan. At pres-
ent it is generally agreed that there is too little evidence to link Esselen to any other
languages.
Immediately to the south of Esselen on the central coast was Salinan, with two docu-
mented closely related dialects, Antoniano (from Mission San Antonio, founded in 1771)
and Migueleño (from Mission San Miguel, founded in 1797). The dialect situation may
have been more complex before the mission period. A third dialect was mentioned in
early accounts, Playaño (on the beach), but it was never documented. The language was
last spoken around 1960. Sources are detailed in Turner (1987 and 1988). Franciscan
priests at both missions apparently spoke the language and left important documenta-
tion (Golla 2011: 116). Fathers Buenaventura Sitjar and Miguel Pieras, who founded the
San Antonio mission, assembled a bilingual dictionary (Sitjar 1861) and a confesionario.
Later Father Pedro Cabot, who served there from 1804 to 1835, and Father Juan Sancho,
there from 1804 to 1830, added to the dictionary and translated a number of religious
texts. Additional material was collected through the nineteenth century and the first half
of the twentieth. Kroeber collected some Migueleño vocabulary and grammatical mate-
rial (1904: 43–49). J. Alden Mason worked with speakers of both dialects in 1910 and
1916 and published a grammatical sketch and texts in 1918. J. P. Harrington worked with
speakers of both dialects in 1922 and 1931–1933, collecting over 5,000 words, much of
it re-elicitation of earlier material. His nephew Arthur Harrington made audio recordings
of hours of monolingual Miegueleño texts in the mid-1930s. Jacobsen worked with the
last speakers of both dialects in 1954, 1955, and 1958. His field notes and recordings are
archived in the Survey of California and Other Indian Languages at the University of Cal-
ifornia, Berkeley: http://linguistics.berkeley.edu/Survey/. The work by J. P. Harrington
210 Marianne Mithun
and Jacobsen is particularly valuable for its phonetic precision not always present in the
earlier records. A longer grammar is in Turner (1987).
Salinan was identified as an isolate in the Powell classification of 1891. Dixon and
Kroeber had noted some structural similarities to the Chumashan languages to the south
and proposed a grouping of the two they termed Iskoman, but they recognised that lexi-
cal similarities were not ‘conspicuous’ (1913b: 652–653). When Harrington saw resem-
blances between Chumash languages and Yuman, already part of the Hokan hypothesis,
Salinan was ‘folded into’ Hokan and remained part of this stock in the 1929 Sapir classi-
fication. As reported by Turner, further work has not unearthed compelling similarities to
Chumashan languages (apart from loans) or other languages classified as Hokan:
immediate neighbors Wintu and Hupa but not Shasta (which lacks an aspirated series). Its
uvular series (q, q’, qʰ) matches those of Wintu and Hupa but not Shasta. Other languages
classified as Penutian show inventories similar to Chimariko, such as Takelma with p, t,
k, kʷ,; pʰ, tʰ, kʰ, kʰʷ; p’, t’, c’, k’, k’ʷ, ʔ; w, s, y, x, h; m, n; a, e, i, o, u, ü, and Molala with
p, t, c, ƛ, k, q; pʰ, tʰ, kʰ, qʰ; p’, t’, c’, k’, q’, ʔ; f, s, ł, x, h; m, n, l, ; w, y; i, a, u, and i:, a:,
u: (Berman 1996: 3).
Chimariko arguments are identified on the predicate by pronominal prefixes, which
show agent/patient patterning with a hierarchical overlay. Just one argument is repre-
sented on the verb, according to a hierarchy (1, 2, > 3; Agent > Patient). In the sentences
below, only first persons are identified in the verb, a grammatical patient č- in the first
clause, and a grammatical agent i- in the last.
Hierarchical systems can be seen to have developed through contact in the area, but
from different resources in the different languages (Mithun 2012). They can be seen as
well not only in Karuk and Yana (both isolates classified as Hokan by Sapir), but also in
Hupa (Athabaskan) and Yurok (Algic). Possession in Chimariko is marked on the posses-
sor (the head of possessive construction) as in Hupa but not Shasta (or Wintu).
One of the structural features originally cited as characteristic of Hokan languages is
means/manner/instrumental prefixes on verbs. Chimariko contains such prefixes, among
them mitei- ‘with the foot’, wa- ‘by sitting on’, e- ‘with the end of long object’, a- ‘with
a long object’, me- ‘with the head’, tsu- ‘with a round object’, and tu- ‘with the hand’
(Jany 1009: 133).
(The initial ni- prefix is likely an imperative.) Such prefixes appear not only in Chi-
mariko, the isolate Karuk, and Achumawi (all classified by Sapir as Hokan), but also in
the isolates Yana and Washo, and in the Pomoan, Yuman, and Chumashan languages, all
in California and all classified at least at some point as Hokan. They do not, however,
appear in other languages classified as Hokan, namely Shastan or the isolates Esselen, or
Salinan, also in California. Furthermore they occur in languages not classified as Hokan.
In California they appear in the Maidun languages (Maidu, Konkow, Nisenan) spoken at
212 Marianne Mithun
contact immediately to the south of Yana and Atsugewi and to the north of Washo, clas-
sified as Penutian. In Oregon they appear in the isolates Klamath and Takelma, and the
Sahaptian languages, also classified as Penutian. But they do not occur in other languages
classified as Penutian; there is no mention of them in descriptions of the Wintun, Utian,
or Yokuts languages in California, nor the Oregon languages Coosan, Siuslaw, or Alsea
(Mithun 2007). They also occur in neighboring languages classified as neither Hokan nor
Penutian: Yuki and Wappo (Sapir’s Yukian) and Uto-Aztecan (though at certain points
there were attempts to add Uto-Aztecan to Penutian).
Another structural feature originally cited as characteristic of Hokan languages is an
inventory of locative/directional suffixes in the verb. Chimariko shows just such an inven-
tory: -ktam/-tam ‘down’, -ema/-enak ‘itno’, -ha ‘up’, -hot ‘down’, -lo ‘apart’, -ro ‘up’,
-sku ‘towards’, -smu ‘acrsoss’, -tap ‘out’, -tku/-ku cislocative (‘towards here’), -tmu/-mu
translocative (towards there), -kh ‘motion towards here’, -m ‘motion towards there’, -tpi
‘out of’, -xun/-xunok ‘in, into’, -aʰa ‘along’, -pa ‘off, away’, -qʰutu ‘itno water’, -č’ana
‘to, toward’, -čama ‘in, into’.
h-eṭahe-sku-t č’utamdač.
3-flee-towards-pft Burnt.Ranch
they ran away to Burnt Ranch.’
As Jany notes, neighboring Shasta, Karuk, Yana, and Achumawi, all classified as
Hokan, also have such suffixes, but the neighboring Maidu, classified as Penutian, does
as well. Prefixes with similar functions occur in other languages of the area classified
as Hokan: Yana, Washo, and Pomoan. But they also occur in other languages classi-
fied as Penutian: not only Maidu, but also the isolate Klamath and the Sahaptian lan-
guages. They do not appear in all members of either group, however. There is no mention
of them in the Yuman family (classified as Hokan), or in the isolates Takelma or Siuslaw,
nor in the Wintun, Utian, Yokuts, Coosan, or Alsea families (all classified as Penutian).
They are not mentioned for Chumashan Yukian. Like the means/manner/instrumental
prefixes, the locative/directional suffixes appear to be very old structures that were
spread through contact.
The possibility of deeper relations among languages in the West remains to some
extent an open question. Much of the structural evidence put forth for earlier groupings
has turned out to be a likely result of longstanding contact. Identification of lexical cog-
nates is hampered by the extensive phonological processes that many of the languages
have undergone, such as ablaut and phonological reduction. At least some of the pro-
posed deeper relations remain intriguing, however. Kaufman (1988ms, 1989) has pursued
both lexical and structural similarities among hypothesised Hokan languages; has recon-
structed consonants, phonotactics, and word structure; and has presented preliminary
ideas on the reconstruction of vowels. He, like others, sees some of the original members
more likely than others, including isolates Karuk, Chimariko, Yana, Washo, Esselen, Sali-
nan, and Coahuiltecan. Others remain unconvinced.
Language isolates of North America 213
Zuni was identified as an isolate in the Powell (1891) classification, but Sapir included
it in his Aztec-Tanoan superstock. Similarities to Uto-Aztecan are generally attributed
to contact, however. Suggestions of numerous other links have been proposed as well,
including Penutian, Hokan, and Keres, but none are considered convincing. Zuni is again
generally identified as an isolate.
3.4 Texas
A number of languages were spoken at contact in what is now Texas and adjacent Mexico.
Troike (1996: 644) reports that so far as is known, all had become extinct by the mid-
twentieth century. Some of those mentioned in historical sources are unclassifiable because
there is little or no documentation. As seen earlier, Aranama, Cotoname, and Karankawa
present challenges because of the limited quantity and quality of data from them. There are
still others, however, which have been identified as isolates with more confidence.
Coahuilteco was spoken in south Texas and northeastern Mexico. It is described in
detail in Troike (1996). The primary documentation is a confessor’s manual apparently
written by a Spanish Franciscan missionary, Bartolomé García, at Mission San Francisco
de la Espada in San Antonio. A version published in 1760 in Mexico City includes 88
numbered pages of text with parallel columns of Spanish and Coahuilteco. Troike reports:
The text consists primarily of short sentences or paragraphs. These include ques-
tions, statements, and commands, with questions predominating. . . . The total num-
ber morphs in the text is estimated at about 20,000. The short and often repetitious
nature of the sentences in many cases facilitates analysis, but at the same time the
limitations of the text sometimes make it impossible to determine the composition
of certain unique constructions or to discern the semantic significance or contextual
conditioning of particular functional elements.
(Troike 1996: 646)
214 Marianne Mithun
García, who spent 12 years at the mission and knew the language well, probably com-
pleted this polished version in 1738 (Troike 1978, 1996) An earlier, undated draft enti-
tled Confesonario de Indios was discovered in 1962, accompanied by another document
entitled Cuadernillo de lengua de. . . Pajalates. The Cuadernillo contains vocabulary and
verb paradigms from the same dialect and was attributed to Father Gabriel de Vergara,
president of the San Antonio missions from 1725 to about 1737.
The Tonkawa were first encountered in the eighteenth century in central Texas. Over
the nineteenth century, travelers recorded vocabulary. In 1884 Gatschet recorded around
1,000 words and 50 pages of texts, still unpublished. From 1928 to 1931 Hoijer car-
ried out major documentation, working primarily with speaker John Rush Buffalo. At
that time only six elderly speakers remained. Hoijer published a grammar (1931–1933),
grammatical sketch (1946), dictionary (1949), and texts (1972), in addition to work on
specific topics.
Tonkawa was listed as an isolate in Powell (1891). In 1915 Swanton proposed a
family consisting of two branches: Cotoname-Tonkawa and Coahuilteco-Comecrudo-
Karankawa. In 1920 Sapir linked this group to Hokan and then included it in his Hokan-
Siouan superstock. In 1940, however, after further examination and work by various
other scholars, Swanton reconsidered. He wrote:
Manaster Ramer (1996) proposed that Coahuilteco, Cotoname, Comecrudo, Garza, and
Mamulique are all related in a family he termed Pakawan, related in turn to Karankawa
and perhaps Atakapa to the east. After careful examination of the evidence presented
by Manaster Ramer, Campbell (1996) concluded that at present, there are insufficient
grounds for positing these relationships. Haas briefly explored the possibility of a rela-
tionship between Tonkawa and Algonquian (1959, 1967). Coahuilteco and Tonkawa are
now considered isolates.
The Natchez were near present-day Natchez, Mississippi, when La Salle’s expedition
came through in 1682. Over the next 50 years, their numbers were seriously diminished
by wars with the French. Survivors took refuge among other Southeastern groups, par-
ticularly the Cherokee, Creek, and Chickasaw. When these groups were force to move to
Oklahoma during the 1836 Removal, the Natchez accompanied their hosts. Some vocab-
ulary was recorded during the eighteenth and nineteenth centuries, all reprinted in Van
Tuyl 1979. In 1907, 1908, and 1915 Swanton collected lexical, grammatical, and textual
material in Oklahoma from five of the last speakers. In 1934 and 1936 Haas collected
much more from the last two, Watt Sam and Nancy Raven. The Swanton and Haas mate-
rials remain unpublished, but Haas produced a 1975 manuscript grammar sketch, which
served as the basis for a shorter published sketch by Kimball (2005).
The Tunica were near present-day Vicksburg, Mississippi, when they first encountered
the French in the late seventeenth century. A century later they were dispersed, and many
moved to an area near present Marksville, Louisiana. In 1886 Gatschet collected vocab-
ulary and texts. Twenty years later Swanton visited the group and ultimately produced
a grammatical sketch (1921). Between 1933 and 1939 Haas worked with the last semi-
fluent speaker, Sesostrie Youchigant, which resulted in a full grammar (1941), grammat-
ical sketch (1946), text collection (1950), and dictionary (1953),as well as work on more
specific topics. There is an active revitalisation program underway on the Tunica-Biloxi
reservation at Marksville.
Atakapa, Chitimacha, Natchez, and Tunica were all identified as isolates in Powell
(1891). During the first half of the twentieth century, various scholars noted similari-
ties among these languages and those of the neighboring Muskogean family. Swanton
linked Atakapa to Chitimacha and Tunica (1919), and Natchez to Muskogean (1907,
1924). Sapir classified these two larger groups, along with his Iroquoian-Caddoan and
Siouan-Yuchi, as part of his Hokan-Siouan superstock (1929). Further work was directed
at uncovering additional links among Atakapa, Chitimacha, Natchez, Tunica, and Musk-
ogean. Swadesh presented 258 lexical sets with sound correspondences between Atakapa
and Chitimacha (1946b, 1947). Haas assembled reconstructions for ‘water’ (1951) and
‘land’ (1952) for a stock she termed Gulf, consisting of Atakapa, Chitimacha, Natchez,
Tunica, and Muskogean. She noted additional similarities between Natchez and Musk-
ogean languages (1956). In 1969 Gursky published some lexical comparisons among
Atakapa, Chitimacha, and Tunica, revising some earlier proposals by Swanton. In 1979,
Haas reconsidered some of the relationships, and in 1994, Kimball detailed further diffi-
culties with the Gulf hypothesis. Attempts have also been made to connect Gulf to other
groups. In 1958 Haas sought evidence of links between the hypothesised Gulf stock and
Algic, but evidence was weak and the hypothesis has since been abandoned. In 1994
Munro looked into possible connections to Yuki in California, but that proposal has not
been pursued. Proposals involving they hypothesised Gulf and relations beyond the
Southeast are assessed by Campbell (1997: 305–309).
The Euchee (= Yuchi) were encountered by Europeans in present-day Georgia at
the beginning of the eighteenth century, but they were forced to move, along with the
Shawnee and Lower Creeks, to eastern Oklahoma during the 1832–1834 Removal. There
are now a very few speakers near Sapulpa, in northeastern Oklahoma. Several vocabu-
lary lists were collected during the nineteenth century. Between 1904 and 1908 Speck
recorded additional vocabulary as well as texts. During 1928 and 1929 Wagner worked
with Yuchi speakers in Oklahoma and published a text collection (1931) and grammatical
sketch (1933). A comprehensive grammar is in Linn (2000).
216 Marianne Mithun
The language is not demonstrably related to any other. It was listed as an isolate
(Uchean) in Powell (1891). Sapir suggested a relation to the Siouan family and then
combined his Siouan-Yuchi group into his larger Hokan-Siouan superstock. Crawford,
who worked with Yuchi speakers intermittently between 1969 and 1973, pursued the
possibility of a link to Siouan, and concluded that evidence was insufficient to establish
a relationship. He concluded:
If Yuchi and Siouan are related, the time depth of separation is probably so great that
it will be exceedingly difficult, if not impossible, to prove the relationship. I say this
in spite of the fact that it is possible to find homophonous and nearly homophonous
morphemes and segments of morphemes in Yuchi and Siouan. One would expect
that languages, no matter how distantly related, might share a few identity correspon-
dences and a certain number of homophonous or nearly homophonous cognates. But
when all the evidence for a genetic relationship consists of nothing else and when
the identity correspondences are not regular and recurrent, one is inclined to suspect
that the similarities may be coincidental or due to borrowings. It is quite possible that
borrowing may be the explanation for many of the Yuchi-Siouan similarities.
(Crawford 1979: 342–343)
By 1979 Haas felt that Timucua, Atakapa, Chitimacha, Natchez, Tunica, and Yuchi are
isolates. She reported:
In the past, similarities among languages have often been considered explainable
only on a genetic basis. This is true in spite of the fact that many of the earlier group-
ings were originally suggested by typological similarities. Moreover Boas’ (1920,
1929) well-known objection to some of the genetic schemes of Kroeber, Radin, and
Sapir were resisted at the time as representing an anti-historical bias. But in recent
years an increasing amount of attention is being given to areal linguistics, i.e. the
tracing of traits across the basic genetic boundaries. This is fast becoming a very
promising field of investigation, especially since it is now generally recognised
that genetic linguistics and areal linguistics are not antithetical but complementary.
Consequently the proper delineation of linguistic prehistory requires us to take full
advantage of both lines of investigation.
(Haas 1979: 319)
It is now generally agreed that so far, resemblances among the hypothesised Gulf lan-
guages are due to contact, though a remote relationship between Euchee and Siouan is
possible. The Southeast, like the Northwest and West, is a well-known linguistic area.
Phonologically labial fricatives are rare across North America, but they occur in the
isolates Atakapa, Chitimacha, Tunica, Yuchi, and Timucua. They are reconstructed for
Proto-Muskogean, ancestor of the major language family of the Southeast (Booker 1980:
254). They also occur in Ofo and (marginally) Biloxi, languages of the Siouan family,
as a result of a regular sound change Proto-Siouan *s > Ofo fh (Robert Rankin, personal
communication). Cognates in Siouan languages outside of the area show s or š in their
place. In the isolates, the fricatives tend to occur in loanwords.
Another structural feature shared by many languages of the area is agent/patient pat-
terning of pronominal affixes. Such a pattern is reconstructed for Proto-Muskogean and
Proto-Siouan, and it also appears in Atakapa, Chitimacha (in first person only), Natchez,
and in variants in Tunica. There are several mechanisms by which such patterns can
Language isolates of North America 217
spread. In all of these languages, third persons are unmarked, intransitive and transitive
verbs are not distinguished formally, and basic word order is predicate-final. It would
thus be easy for speakers to reanalyze a nominative/accusative pattern as agent/patient
and vice versa:
Another mechanism for transfer can be the reanalysis of transitive impersonal construc-
tions as intransitives, a likely source for some Tunica constructions:
Still another pervasive feature across languages of the Southeast is the propensity of
speakers to specify the posture or position of entities, typically ‘sitting’, ‘standing’ (ver-
tical), or ‘lying’ (horizontal). Because of their frequency, in most of the languages they
have developed into aspectual auxiliaries or suffixes. Such constructions occur, not only in
languages of the Siouan and Muskogean families, but also in the isolates Atakapa, Chitim-
acha, Natchez, and Tunica. It is easy to see how such structures could spread through long-
term intensive contact. Bilinguals used to specifying position systematically in one of their
languages could easily transfer this propensity to the other, since the lexical means would
already be available. Rankin (1977, 1978, 2004, 2011) has demonstrated that the construc-
tion was apparently brought into the Southeast by speakers of Siouan languages (Quapaw,
Osage, Biloxi, Ofo). It can be reconstructed for Proto-Siouan. It occurs in all of the modern
Muskogean languages, but the forms are not cognate across the daughter languages. The
construction has developed to varying degrees in the isolates (Mithun 2010, 2017).
4 IMPLICATIONS
In the end, isolate status is not an either/or matter: there are degrees of quantity and qual-
ity of documentation; of mutual intelligibility between dialects and separate languages;
of likelihood of chance resemblance; and of identifiability of areal phenomena. Many
isolates are represented by small, closed corpora. This is of course not a necessary feature
of isolates: in North America, Zuni is known through a grammar, grammatical sketch,
dictionary, and texts (though much more would be desirable), and it is still used by sub-
stantial numbers of speakers of nearly all ages. Often, however, apparent isolates are
represented solely by data collected during brief encounters by scribes unfamiliar with
the language, perhaps working through an interpreter, trying to render unfamiliar sounds
in a writing system without equivalent categories. The speakers themselves may not have
had native command of the language. Small vocabulary lists do tend to be heavy in basic
vocabulary, of the kind most likely to remain in languages over long periods of time, but
more data of better quality might reveal relationships to other languages.
Isolate status depends entirely on the existence of related but distinct languages. Iden-
tification of relatives can depend on the quantity and quality of documentation of poten-
tial candidates. And assessing the nature of the relationships can present challenges. If
potential relatives are mutually intelligible, they can be considered dialects, and the group
as a whole an isolate. If they are not, the language is no longer an isolate but a member
of a larger family. Yet assessing degrees of mutual intelligibility is not always straight-
forward, particularly in the case of languages no longer spoken. Understanding can vary
along a continuum and be facilitated by experience.
218 Marianne Mithun
REFERENCES
Adelung, Johann Christoph and Johan Severin Vater. 1816. Mithridates, oder allgemeine
Sprachenkunde. Berlin: Vossischen.
Angulo, Jaime de and L.S. Freeland. 1931. Karok Texts. International Journal of Amer-
ican Linguistics 6: 194–226.
Aoki, Haruo. 1963. On Sahaptian-Klamath Linguistic Affiliations. International Journal
of American Linguistics 29, no. 2: 107–112.
Barker, M.A.R. 1963a. Klamath Texts. University of California Publications in Linguis-
tics 30. Berkeley, CA: University of California, Berkeley.
Barker, M.A.R. 1963b. Klamath Dictionary. University of California Publications in Lin-
guistics 31. Berkeley, CA: University of California, Berkeley.
Barker, M.A.R. 1964. Klamath Grammar. University of California Publications in Lin-
guistics 32. Berkeley, CA: University of California, Berkeley.
Beeler, Madison. 1977. The Sources for Esselen: A Critical Review. Berkeley Linguistic
Society 3: 37–45.
Beeler, Madison. 1978. Esselen. Journal of California Anthropology Papers in Linguis-
tics 1: 3–38.
Berlandier, Jean Louis and Rafael Chowell. 1828–1829. Vocabularies of Languages of
South Texas and the Lower Rio Grande. Additional Manuscripts no. 38720, in the
British Library, London.
Berman, Howard. 1996. The Position of Molala in Plateau Penutian. International Jour-
nal of American Linguistics 62: 1–30.
Boas, Franz. 1918. Kutenai Tales. Bureau of American Ethnology vol. 59, pp. 1–387.
Boas, Franz. 1920. The Classification of American Languages. American Anthropologist
22: 367–376.
Boas, Franz. 1929. Classification of American Indian Languages. Language 5: 107.
Booker, Karen. 1980. Comparative Muskogean: Aspects of Proto-Muskogean Verb Mor-
phology. Ph.D. dissertation, University of Kansas.
Bright, William. 1957. The Karok Language. University of California Publications in
Linguistics 13. Berkeley, CA: University of California Press.
Bunzel, Ruth. 1932. Zuñi Origin Myths. Bureau of American Ethnology Annual Report
47, 545–609. Washington: Government Printing Office.
Bunzel, Ruth. 1933. Zuni Texts. Publications of the American Ethnological Society 15.
New York: G E Steckert & Co.
Bunzel, Ruth. 1934. Zuni. Handbook of American Indian Languages, vol. 3, ed. by Franz
Boas, 383–515. Gluckstadt: J.J. Augustin.
Campbell, Lyle. 1996. Coahuiltecan: A Closer Look. Anthropological Linguistics 38,
no. 4: 620–634.
Campbell, Lyle. 1997. American Indian Languages: The Historical Linguistics of Native
America. Oxford: Oxford University Press.
Language isolates of North America 219
Dixon, Roland and Alfred Kroeber. 1913b. New Linguistic Families in California. Amer-
ican Anthropologist 15: 647–655.
Dixon, Roland and Alfred Kroeber. 1919. Linguistic Families of California. University of
California Publications in American Archaeology and Ethnology 16: 47–118.
Eastman, Carol and Elizabeth Edwards. 1983. Qaao Qaao: A Haida Traditional Narrative,
or Quoth the Raven “Nevermore”. International Conference on Salish and Neighbor-
ing Languages 18: 64–79.
Eastman, Carol and Elizabeth Edwards. 1991. Gyaehlingaay: Traditions, Tales, and
Images of the Kaigani Haida. Seattle: Burke Memorial Museum/University of
Washington.
Edwards, Elizabeth. 1995. ‘ “It’s an ill wind” ’. Language and Culture in Native North
America: Studies in Honor of Heinz-Jürgen Pinnow, ed. by Michael Dürr, Egon Ren-
ner, and Wolfgang Oleschinsy, 245–252. Munich: LINCOM.
Edwards, Elizabeth and Carol Eastman. 1995. Fried Bread: A Recipe for the Structure of
Haida Oral Narrative. Language and Culture in Native North America, ed. by Michael
Dürr, Egon Renner, and Wolfgang Oleschinsy, 253–264. Munich: LINCOM.
Enrico, John. 1995. Skidegate Haida Myths and Stories. Skidegate, BC: Queen Charlotte
Islands Museum.
Enrico, John. 2003. Haida Syntax. Lincoln: University of Nebraska Press.
Escalante Fontaneda, Hernando d’. 1944. Memoir of Do. d’Escalenta Fontaneda Respect-
ing Florida, Written in Spain about the Year 1575. Trans. from the Spanish with Notes
by Buckingham Smith (Washington, 1854), ed. David O. True, University of Miami
and the Historical Association of Southern Florida Miscellaneous Publications 1.
Miami.
Frachtenberg, Leo. 1914. Lower Upmqua Texts and Notes on the Kusan Dialect. Colum-
bia University Contributions to Anthropology 4: 141–150. Reprinted in 1969 in New
York: AMS.
Frachtenberg, Leo J. 1918. Comparative Studies in Takelman, Kalapuyan, and Chinookan
Lexicography, a Preliminary Paper. International Journal of American Linguistics 1:
175–182.
Frachtenberg, Leo J. 1918ms, Alsea Grammar. Washington: Smithsonian Institution.
Frachtenberg, Leo J. 1922. Siuslawan (Lower Umpqua). Handbook of American Indian
Languages, vol. 2, ed. by Franz Boas, pp. 431–629. Gluckstadt: J.J. Augustin.
Gallatin, Albert. 1836. A Synopsis of the Indian Tribes within the United States East of
the Rocky Mountains, and in the British and Russian Possessions in North America.
Transactions and Collections of the American Antiquarian Society 2. Worcester, MA:
American Antiquarian Society.
Gallatin, Albert. 1848. The Families of Languages as Far as Ascertained. Hale’s Indians
of North-West America, and Vocabularies of North America: With an Introduction,
xxiii–clxxxviii. Transactions of the American Ethnological Society 2. New York.
García, Bartolomé. 1760. Manual para administrar los santos sacramentos. . . a los
indios de las naciones: Pajalates, Orejones, Pacaos, Tilijayas, Alasapas, Pausanes,
y otras muchas diferentes que se hallan en las misiones del Rio San Antonio, y Rio
Grande. Mexico.
Garvin, Paul. 1947. Christian Names in Kutenai. International Journal of American Lin-
guistics 13: 69–77.
Garvin, Paul. 1948a. Kutenai Lexical Innovations. Word 4: 120–126.
Garvin, Paul. 1948b. Kutenai I: Phonemics. International Journal of American Linguis-
tics 14: 37–42.
Language isolates of North America 221
Garvin, Paul. 1948c. Kutenai II: Morpheme Variation. International Journal of American
Linguistics 14: 87–90.
Garvin, Paul. 1948d. Kutenai III: Morpheme Distributions (prefix, theme, suffix). Inter-
national Journal of American Linguistics 14: 171–178.
Garvin, Paul. 1951a. Kutenai IV: Word Classes. International Journal of American Lin-
guistics 17: 84–97.
Garvin, Paul. 1951b. L’obviation en Kutenai: échantillon d’une catégorie grammaticale
amérindienne. Bulletin de la Société de Linguistique de Paris 47: 166–212.
Garvin, Paul. 1953. Short Kutenai Texts. International Journal of American Linguistics
19: 305–311.
Garvin, Paul. 1954. Colloquial Kutenai Text: Conversation II. International Journal of
American Linguistics 20: 316–334.
Gatschet, Albert S. 1880. The Numeral Adjective in the Klamath Language of Southern
Oregon. American Antiquarian and Oriental Journal 2: 210–217.
Gatschet, Albert S. 1882. Indian Languages of the Pacific States and Territories and of
the Pueblos of New Mexico. Magazine of American History with Notes and Queries
8: 254–263.
Gatschet, Albert S. 1884. Field Notes on Karankawa and Aranama, Collected at Fort
Griffen, Texas. Smithsonian Institution National Anthropological Archive ms 506.
Gatschet, Albert S. 1890. The Klamath Indians of Southwestern Oregon. Contributions to
North American Ethnology 2.1.
Gatschet, Albert S. and John R. Swanton. 1932. A Dictionary of the Atakapa Language,
Accompanied by Text Material. Bureau of American Ethnology Bulletin 108, Smith-
sonian Institution, Washington. Reprinted by Scholarly Press, St. Claire Shores, MI,
1974.
Goddard, Ives. 1979. The Languages of South Texas and the Lower Rio Grande. The
Languages of Native America: Historical and Comparative Assessment, ed. by Lyle
Campbell and Marianne Mithun, 355–389. Austin, TX: University of Texas Press.
Goddard, Ives. 1996a. The Description of the Native Languages of North America before
Boas. Languages: Handbook of North American Indians, vol. 17, 17–42. Washington:
Smithsonian Institution.
Goddard, Ives. 1996b. The Classification of the Native Languages of North America.
Languages: Handbook of North American Indians, vol. 17, 290–323. Washington:
Smithsonian Institution.
Goddard, Ives. 1996c. Introduction. Languages: Handbook of North American Indians,
vol. 17, 1–16. Washington: Smithsonian Institution.
Golla, Victor. 2011. California Indian Languages. Berkeley: University of California
Press.
Granberry, Julian. 1990. A Grammatical Sketch of Timucua. International Journal of
American Linguistics 56: 60–101.
Granberry, Julian. 1993. A Grammar and Dictionary of the Timucua Language (3rd ed.).
Tuscaloosa: University of Alabama Press.
Grant, Anthony P. 1994. Karankawa Linguistic Materials. Papers in Linguistics 19: 1–56.
Gursky, Karl-Heinz. 1969. A Lexical Comparison of the Atakapa, Chitimacha, and Tunica
Languages. International Journal of American Linguistics 35, no. 2: 80–107.
Haas, Mary R. 1941. Tunica. Handbook of American Indian languages, vol. 4, ed. by
Franz Boas, 1–143. New York: J.J. Augustin.
Haas, Mary R. 1946. A Grammatical Sketch of Tunica. Linguistics Structures of Native
America, ed. Harry Hoijer, vol. 6, 337–366. New York: Viking Fund.
222 Marianne Mithun
Hewson, John. 1982. Beothuk and the Algonkian Northeast. Languages in Newfoundland
and Labrador, ed. by Harrold J. Paddock, 176–187. St. John’s, Newfoundland: Depart-
ment of Linguistics, Memorial University.
Hoijer, Harry. 1931–1933. Tonkawa: An Indian Language of Texas. Extract from the
Handbook of American Indian Languages vol. 3, pp. 1–148, Distributed privately by
the University of Chicago Libraries in 1931, Columbia University, New York.
Hoijer, Harry. 1946. Tonkawa. Linguistic Structures of Native America, ed. by Harry
Hoijer, 289–311. Viking Fund Publications, New York.
Hoijer, Harry. 1949. An Analytical Dictionary of the Tonkawa Language, University of
California Publications in Linguistics vol. 9, Berkeley.
Hoijer, Harry. 1972. Tonkawa Texts. University of California Publications in Linguistics
vol. 73, Berkeley: University of California Press.
Jacobsen, William H. 1964. A Grammar of the Washo Language, Ph.D. dissertation, Uni-
versity of California, Berkeley.
Jacobsen, William H. 1986. Washoe Language. Handbook of North American Indians
11: Great Basin, ed. by Warren L. D’Azevedo, 107–112. Washington: Smithsonian
Institution.
Jacobsen, William H. 1993. Another Look at Sapir’s Evidence for Inclusion of Haida in
Na-Dene. Paper presented at the Annual Meeting of the linguistic Society of America,
Los Angeles.
Jany, Carmen. 2009. Chimariko Grammar: Areal and Typological Perspective. Univer-
sity of California Publications in Linguistics 142. Berkeley: University of California
Press.
Kaufman, Terrence. 1988ms. A Research Program for Reconstructing Proto-Hokan: First
Groupings. ms.
Kaufman, Terrence. 1989. Some Hypotheses Regarding Proto-Hokan Grammar. Paper
presented at the Hokan-Penutian Workshop, University of Arizona, Tucson.
Kendall, Daythal. 1977. A Syntactic Analysis of Takelma Texts. Ph.D. dissertation, Uni-
versity of Pennsylvania.
Kendall, Daythal. 1982. Some Notes toward Using Takelma Data in Historical and Com-
parative Work. Occasional Papers on Linguistics 10: Proceedings of the 1981 Hokan
Languages Workshop and Penutian Languages Conference, 78–81. Carbondale, IL:
Southern Illinois University Department of Linguistics.
Kendall, Daythal. 1990. Takelma. Handbook of North American Indians 7: Northwest
Coast, ed. by Wayne Suttles, 589–592. Washington: Smithsonian Institution.
Kendall, Daythal. 1997. The Takelma Verb: Toward Proto-Takelma-Kalapuyan. Interna-
tional Journal of American Linguistics 63, no. 1: 1–17.
Kimball, Geoffrey. 1994. Comparative Difficulties of the “Gulf” Languages. Proceedings
of the Meeting of the Society for the Study of the Indigenous Languages of the Amer-
icas, July 2–4, 1993, and the Hokan-Penutian Workshop, July 3, 1993, ed. by Marga-
ret Langdon, 31–39. Berkeley: Survey of California and Other Indigenous Languages
Report 8, Department of Linguistics, University of California.
Kimball, Geoffrey. 2005. Natchez. Native Languages of the Southeastern United States,
ed. by Heather K. Hardy and Janine Scancarelli, 385–453. Lincoln: University of
Nebraska Press.
Krauss, Michael E. 1979. Na-Dene and Eskimo-Aleut. The Languages of Native Amer-
ica: Historical and Comparative Assessment, ed. by Lyle Campbell and Marianne
Mithun, 803–901. Austin, TX: University of Texas Press.
224 Marianne Mithun
Kroeber, Alfred S. 1904. Languages of the Coast of California South of San Francisco.
University of California Publications in American Archaeology and Ethnology 2, no. 2.
Kroeber, Alfred S. 1907. The Washo Language of East Central California and Nevada.
University of California Publications in American Archaeology and Ethnology 4, no. 5:
251–317.
Kroeber, Alfred S. 1936. Karok Towns. University of California Publications in Ameri-
can Archaeology and Ethnology 35, no. 4: 29–38.
Kroeber, Paul D. 1999. The Salish Language Family: Reconstructing Syntax. Lincoln:
University of Nebraska Press.
Landar, Herbert J. 1996. Sources. Languages: Handbook of North American Indians,
vol. 17, 721–761. Washington: Smithsonian Institution.
Latham, Robert G. 1848. On the Languages of the Oregon Territory. Journal of the Eth-
nological Society of London 1: 154–166.
Lawrence, Erma (ed.). 1977. Haida Dictionary, with grammatical sketch by Jeffry Leer.
Alaska Native Language Center, Fairbanks.
Leer, Jeffry. 1990. Tlingit: A Portmanteau Language Family. Linguistic Change and
Reconstruction Methodology, ed. by Philip Baldi, 73–98. Berlin: Mouton de Gruyter.
Leer, Jeffry. 1991a. The Schetic Categories of the Tlingit Verb, Ph.D. dissertation, Uni-
versity of Chicago.
Leer, Jeffry. 1991b. Evidence for a Northern Northwest Coast Language Area: Promiscu-
ous Number Marking and Periphrastic Possessive Constructions in Haida, Eyak, and
Aleut. International Journal of American Linguistics 57: 158–193.
Levine, Robert. 1977. The Skidegate Dialect of Haida, Ph.D. dissertation, Columbia Uni-
versity, New York.
Levine, Robert. 1979. Haida and Na-Dene: A New Look at the Evidence. International
Journal of American Linguistics 45: 157–170.
Linn, Mary. 2000. A Reference Grammar of Euchee (Yuchi), Ph.D. dissertation, Univer-
sity of Kansas.
Lowie, Robert. 1963. Washo Texts. Anthropological Linguistics 5, no. 7: 1–30.
Manaster Ramer, Alexis. 1996. Sapir’s Classifications: Coahuiltecan. Anthropological
Linguistics 38: 1–38.
Marquardt, William H. 2004. Calusa. Handbook of North American Indians: Southeast,
vol. 14, ed. by Raymond D. Vogelson, 204–212. Washington: Smithsonian Institution.
Martin, Jack. 2004. Languages. Handbook of North American Indians: Southeast, vol.
14, 68–86. Washington: Smithsonian Institution.
Mason, J. Alden. 1918. The Language of the Salinan Indians. University of California
Publications in American Archaeology and Ethnology 14, no. 1: 1–154.
Mithun, Marianne. 1982. The Mystery of the Vanished Laurentians. Papers from the Fifth
International Conference on Historical Linguistics. Current Issues in Linguistic The-
ory 21, ed. by Anders Ahlqvist, 230–242. Amsterdam: John Benjamins.
Mithun, Marianne. 1999. The Languages of Native North America. Cambridge: Cam-
bridge University Press.
Mithun, Marianne. 2007. Grammar, Contact, and Time. Journal of Language Contact
Thema 1: 144–167.
Mithun, Marianne. 2008. The Emergence of Agentive Systems. The Typology of Seman-
tic Alignment Systems, ed. by Mark Donohue and Søren Wichmann, 297–333. Oxford:
Oxford University Press.
Mithun, Marianne. 2010. Contact in North America. Handbook of Language Contact, ed.
by Raymond Hickey, 673–694. Oxford: Blackwell.
Language isolates of North America 225
Mithun, Marianne. 2012. Core Argument Patterns and Deep Genetic Relations: Hierar-
chical Systems in Northern California. Typology of Argument Structure and Grammati-
cal Relations, ed. by Pirkko Suihkonen, Bernard Comrie, and Valery Solovyev, Studies
in Language Companion Series, 257–294. Amsterdam: John Benjamins.
Mithun, Marianne. 2017. Native North American Languages. The Cambridge Handbook
of Areal Linguistics, ed. by Raymond Hickey, 878–933. Cambridge: Cambridge Uni-
versity Press.
Morgan, Lawrence. 1980. Kootenay-Salishan Linguistic Comparison: A Preliminary
Study. M.A. thesis, University of British Columbia, Vancouver.
Morgan, Lawrence. 1991. A Description of the Kutenai Language. Ph.D. dissertation,
University of California, Berkeley.
Movilla, Gregorio de. 1635. Forma Breve de administar los Sacramentos a los Indios, y
Españoles que viuen entre ellos. Approbado por Autoridad Apostolica, y sacado del Man-
ual Mexicano, que se usa en toda la nuena España y Pirù, mutatis mutandis, esto es, lo q̄
estua en légua Mexicana traducido en lengua Floridiana. Imprenta de Iuan Ruyz, Mexico.
Munro, Pamela. 1994. Gulf and Yuki-Gulf. Anthropological Linguistics 36, no. 2:
125–222.
Newman, Stanley. 1958. Zuni Dictionary. Indian University Research Center Publica-
tions vol. 6. Indiana University, Bloomington.
Newman, Stanley. 1965. Zuni Grammar. University of New Mexico Publications in
Anthropology 14. University of New Mexico, Albuquerque.
Newman, Stanley. 1996. Sketch of the Zuni Language. Handbook of North American
Indians: Language, vol. 17, ed. by Ives Goddard, 483–506. Washington: Smithsonian
Institution.
O’Neill, Sean. 2008. Cultural Contact and Linguistic Relativity Among the Indians of
Northwestern California. Norman: University of Oklahoma.
Pareja, Francisco [1612] 1886. Arte de la lengua timvqvana compvesto en 1614 por el Pe
Francisco Pareja y publicado comforme al ejemplar original único por Lucien Adam y
Julien Vinson. Bibliothèque Linguistique Américane 11. Paris.
Pareja, Francisco de 1612, ‘Catechismo y breve exposición de la doctrina Christiana.
Muy util y necessaria, asi para los Espanoles, como para los Naturales, en Lengua Cas-
tellana, y Timuquana, en modo de preguntas, y respuestas’. México: Viuda de Pedro
Balli. Copy in Buckingham Smith Collection, New York Historical Society, New York
City. Photostat, Manuscript No. 2401, “Codex A”, National Anthropological Archives,
Smithsonian Institution, Washington.
Pinnow, Hans-Jürgen. 1964. On the Historical Position of Tlingit. International Journal
of American Linguistics 30: 155–164.
Powell, John Wesley. 1891 [1892]. Indian Linguistic Families of America North of Mex-
ico. Annual Report of the Bureau of [American] Ethnology 7 for 1885–1886. Smith-
sonian Institution, Washington, vol. 7, 1–142. Reprinted 1966 University of Nebraska
Press, Lincoln.
Rankin, Robert L. 1977. From Verb to Auxiliary to Noun Classifier and Definite Article:
Grammaticalisation of the Siouan Verbs ‘sit’, ‘stand’, ‘lie’’. Proceedings of the 1976
Mid-America Linguistics Conference, 273–283, ed. by R.L. Brown, Jr., K. Houlihan,
L.G. Hutchinson, and A. MacLeish. St. Paul: Department of Linguistics, University of
Minnesota.
Rankin, Robert L. 1978. On the Origin of the Classificatory Verbs in Muskogean. Paper
presented at the Annual meeting of the American Anthropological Association, Los
Angeles.
226 Marianne Mithun
Rankin, Robert L. 2004. The History and Development of Siouan Positionals with Spe-
cial Attention to Polygrammaticalisation in Dhegiha. Sprachtypologie und Universa-
lien Forschung (STUF) 2/3: 202–227.
Rankin, Robert L. 2011. The Siouan Enclitics: A Beginning. Paper prepared for the Com-
parative Linguistics Workshop, University of Michigan, Ann Arbor.
Rigsby, Bruce. 1965. Linguistic Relations in the Southern Plateau, Ph.D. dissertation,
University of Oregon.
Rigsby, Bruce. 1966. On Cayuse-Molala Relatability. International Journal of American
Linguistics 32: 369–378.
Rigsby, Bruce. 1969. The Waiilatpuan Problem: More on Cayuse-Molala Relatability.
Northwest Anthropological Research Notes 3, no. 1: 68–146.
Rude, Noel. 1987. Some Klamath-Sahaptian Grammatical Correspondences. Kansas
Working Papers in Linguistics 12, no. 1: 189–190.
Sapir, Edward. 1909. Takelma Texts. University of Pennsylvania University Museum,
Anthropological Publications, vol. 2, 1–267. Berkeley: University of California Press.
Sapir, Edward. 1915. The Na-Dene Languages, a Preliminary Report. American Anthro-
pologist 17: 534–558.
Sapir, Edward. 1917a. The Position of Yana in the Hokan Stock. University of California
Publications in American Archaeology and Ethnology, vol. 13, 1–34. Berkeley: Uni-
versity of California Press.
Sapir, Edward. 1917b. The Status of Washo. American Anthropologist 19: 449–450.
Sapir, Edward. 1920. The Hokan and Coahuiltecan Languages. International Journal of
American Linguistics 1, no. 4: 280–290.
Sapir, Edward. 1921. A Bird’s Eye View of American Languages North of Mexico. Sci-
ence n.s. 54, 408. Reprinted 1990 in The Collected Works of Edward Sapir 5: American
Indian Languages, ed. by William Bright, Mouton de Gruyter, Berlin, pp. 93–94.
Sapir, Edward. 1922. The Takelma Language of Southwestern Oregon. Handbook of
American Indian Languages, ed. by Franz Boas. Bureau of American Ethnology Bul-
letin vol. 40, no. 2, pp. 1–296. Washington: Government Printing Office.
Sapir, Edward. 1929. Central and North American Indian Languages. Encyclopaedia Bri-
tannica (14th ed.), vol. 5, 138–141. Reprinted 1949, 1963 in Mandelbaum, ed. pp.
169–178, and 1990 in The collected works of Edward Sapir 5: American Indian Lan-
guages, ed. William Bright, Mouton de Gruyter, Berlin, pp. 95–104.
Sapir, Edward and Leslie Spier. 1943. Notes on the Culture of the Yana. Anthropological
Records 3, no. 3: 239–297.
Sapir, Edward and Morris Swadesh. 1960. Yana Dictionary, ed. Mary R. Haas. University
of California Publications in Linguistics vol. 22. Berkeley: University of California
Press
Shaul, David. 1995a. The Huelel (Esselen) Language. International Journal of American
Linguistics 61, no. 2: 191–239.
Shaul, David. 1995b. The Last Words of Esselen. International Journal of American Lin-
guistics 61: 245–249.
Shipley, William. 1969. Proto-Takelman. International Journal of American Linguistics
35: 226–230.
Sibley, John. 1832. Historical Sketches of the Several Indian Tribes in Louisiana, South
of the Arkansas River, and Between the Mississippi and River Grande. Communicated
to Congress by Thomas Jefferson, February 19, 1806. American State Papers, Docu-
ments, Legislative and Executive, of the Congress of the United States IV: 721–730.
Language isolates of North America 227
Sitjar, Bonaventure. 1861. Vocabulary of the Language of San Antonio Mission, Califor-
nia, Shea’s Library of American Linguistics vol. 6. New York: Cramoisy Press.
Story, Gillian. 1966. A Morphological Study of Tlingit, M.A. thesis, School of Oriental
and African Studies, University of London, London.
Swadesh, Morris. 1946a. Chitimacha. Linguistic Structures of Native America, ed. by
Harry Hoijer, 312–336. Viking Fund Publications in Anthropology 6. New York:
Viking Fund.
Swadesh, Morris. 1946b. Phonologic Formulas for Atakapa-Chitimacha. International
Journal of America Linguistics 12, no. 3: 113–132.
Swadesh, Morris. 1947. Atakapa-Chitimacha *kw. International Journal of American
Linguistics 13, no. 2: 120–121.
Swadesh, Morris. 1956. Problems of Long-Range Comparison in Penutian. Language
32: 17–41.
Swadesh, Morris. 1965. Kalapuya and Takelma. International Journal of American Lin-
guistics 31, no. 3: 237–240.
Swanton, John R. 1907. The Ethnological Position of the Natchez Indians. American
Anthropologist n.s. 9, no. 3: 513–528.
Swanton, John R. 1915. Linguistic Position of the Tribes of Southern Texas and North-
eastern Mexico. American Anthropologist 17: 17–40.
Swanton, John R. 1919. A Structural and Lexical Comparison of the Tunica, Chitima-
cha, and Atakapa Languages. Bureau of American Ethnology Bulletin, 68. Washington:
Smithsonian Institution.
Swanton, John R. 1921. The Tunica Language. International Journal of American Lin-
guistics 2: 1–39.
Swanton, John R. 1922. Early History of the Creek Indians and Their Neighbors. Bureau
of American Ethnology Bulletin, Government Printing Office, Washington, 73.
Reprinted 1970 by Johnson Reprint, New York.
Swanton, John R. 1924. The Muskhogean Connection of the Natchez Language. Interna-
tional Journal of American Linguistics 3, no. 1: 46–75.
Swanton, John R. 1929a. A Sketch of the Atakapa Language. International Journal of
American Linguistics 5, no. 2–4: 121–149.
Swanton, John R. 1929b. The Tawasa Language. American Anthropologist 31: 435–453.
Swanton, John R. 1940. Linguistic Material from the Tribes of Southern Texas and North-
eastern Mexico. Bureau of American Ethnology Bulletin, 127. Washington: Govern-
ment Printing Office.
Tarpent, Marie-Lucie and Daythal Kendall. 1998. On the Relationship Between Takelma
and Kalapuyan: Another Look at “Takelman”. Paper presented at the annual meeting
of the Society for the Study of the Indigenous Languages of the Americas, New York.
Taylor, Allan. 1963. Comparative Caddoan. International Journal of American Linguis-
tics 29: 113–121.
Thompson, Laurence. 1979. Salishan and the Northwest. The Languages of Native North
America, ed. by Lyle Campbell and Marianne Mithun, 692–765. Austin, TX: Univer-
sity of Texas Press.
Troike, Rudolph C. 1978. The Date and Authorship of the Pajalate (Coahuilteco) Cuard-
ernillo. International Journal of American Linguistics 44, no. 4: 168–171.
Troike, Rudolph C. 1996. Sketch of Coahuilteco, a Language Isolate of Texas. Handbook
of North American Indians: Languages, ed. by Ives Goddard, vol. 17, 644–665.Wash-
ington: Smithsonian Institution.
228 Marianne Mithun
Turner, Katherine. 1983. Areal and Genetic Affiliations of the Salinan. Kansas Working
Papers in Linguistics 8, no. 2: 215–246.
Turner, Katherine. 1987. Aspects of Salinan Grammar. Ph.D. dissertation, University of
California, Berkeley.
Turner, Katherine. 1988. Salinan Linguistic Materials. Journal of California and Great
Basin Anthropology 10: 265–270.
Turner, Katherine and David Shaul. 1981. John Peabody Harrington’s Esselen Data and
the Excelen Language. Journal of California Publications in Linguistics 3: 95–124.
Van Tuyl, Charles D. 1979. The Natchez: Annotated Translation from Antoine Simon le
Page du Pratz’s Histoire de la Louisiane and a Short English-Natchez Dictionary, with
Ethnographic Footnotes, Natchez Transcription, Sound System, Kinship Terminology,
and Kinship System by Willard Walker. Oklahoma Historical Society Series in Anthro-
pology vol. 4. Oklahoma City, OK: Oklahoma Historical Society.
Vater, Johann Severin. 1820–1821. Analekten der Sprachenkunde 2 vols. Leipzig:
Dyksche Buchhandlung.
Villiers du Terrage, Marc de and Paul Rivet. 1919. Les indiens du Texas et les expéditions
françaises de 1720 et 1721 à la ‘Baie Saint-Bernard.’ Journal de la Société des Améri-
canistes de Paris, n.s. 11, no. 2: 403–442.
Voegelin, Carl and Erminie Voegelin. 1946. Linguistic Considerations of Northeastern
North America. Man in Northeastern North America, 178–194. Papers of the Robert
S. Peabody Foundation for Archaeology 3. Andover, MA.
Wagner, Günther. 1931. Yuchi Tales. Papers of the American Ethnological Society 13.
New York: AMS Press.
Wagner, Günther. 1933. Yuchi. Handbook of American Indian Languages, vol. 3, 293–
384. Glückstadt/Hamburg/New York: JJ Augustin.
FURTHER READING
Campbell, Lyle. 1997. American Indian Languages: The Historical Linguistics of Native
America. Oxford: Oxford University Press.
Goddard, Ives. 1996a. The Classification of the Native Languages of North America.
Languages: Handbook of North American Indians, vol. 17, 290–323. Washington:
Smithsonian Institution.
Goddard, Ives. 1996b. Introduction. Languages: Handbook of North American Indians,
vol. 17, 1–16. Washington: Smithsonian Institution.
Mithun, Marianne. 1999. The Languages of Native North America. Cambridge: Cam-
bridge University Press.
CHAPTER 9
LANGUAGE ISOLATES
OF MESOAMERICA AND
NORTHERN MEXICO
Raina Heaton
1 INTRODUCTION
This chapter discusses the four linguistic isolates spoken (or formerly spoken) in Meso-
america and northern Mexico: Seri, Huave, Purépecha (Tarascan) and Cuitlatec. Although
Cuitlatec is the only one which is no longer spoken, the other three are either threatened
or endangered. This chapter provides basic information on each of these isolates, includ-
ing their location, vitality and brief history. In an effort to demonstrate that each of these
languages is indeed a linguistic isolate, also included is a discussion of the major pro-
posals for genetic relationships involving these languages, as well as an evaluation of the
evidence presented for each. The section on each language concludes with an overview of
the typological characteristics of the language and some of the features it possesses which
may be of general interest to linguists.1
Mesoamerica can be a particularly difficult region in which to determine genetic rela-
tionship since it is recognised as a linguistic area, where features have been shared across
language boundaries. This Mesoamerican linguistic area roughly aligns with the Meso-
american cultural area and is considered to have, in addition to a large number of less-
widely distributed linguistic features, the following diagnostic characteristics (Campbell
et al. 1986: 555):
MAP 9.1 LOCATIONS OF THE SERI, PUREPÉCHA, CUITLATEC AND HUAVE LINGUISTIC
COMMUNITIES
2 HUAVE
There is some disagreement as to the number and status of the varieties which compose
Huave. The general consensus appears to be that Huave is a single language with four
dialects which have varying degrees of mutual intelligibility: Santa María del Mar, San
Francisco del Mar, San Dionisio del Mar and San Mateo del Mar. Ethnologue lists each
variety as a separate language but reports 88% mutual intelligibility between speakers in
Santa María del Mar and San Mateo del Mar, but only 38% mutual intelligibility between
speakers of San Francisco del Mar and San Mateo del Mar (Lewis, Simons and Fennig
2014). This low level of mutual intelligibility is apparently due largely to vowel shifts and
prosodic differences (Kim 2008: 3), since these varieties share 91% of their basic vocab-
ulary (Suaréz 1975: 1). In contrast, INALI (Instituto Nacional de Lenguas Indígenas de
México) lists two Huave ‘variantes’, West Huave and East Huave, where East Huave is
composed of San Dionisio and San Francisco del Mar, and West Huave of San Mateo and
Santa María del Mar (INALI: www.inali.gob.mx/). This grouping is at odds with other
classifications, considering that Kim (2008: 3) reports that the San Dionisio dialect is
most similar to that of San Mateo, and that the Santa María dialect is most similar to that
of San Francisco, although this is complicated somewhat by a higher degree of contact
between Santa María and San Mateo given their geographical proximity. She also reports
that intelligibility between the San Francisco and San Mateo dialects requires significant
exposure, which supports the 38% intelligibility rating given by Ethnologue (Kim 2008:
3).
In pre-colonial times, the Gulf of Tehuantepec area was made sociopolitically import-
ant by trade routes that lay between the central highlands and the Soconusco region in
the state of Chiapas. At that time, Huaves occupied a large portion of the Chiapas coast,
and their population centers were points of support for the Mexica traders who traveled
the ancient salt route (Millán 2003: 7). There was therefore quite a bit of contact between
Huave, Mixtec, Zapotec and Mixe-Zoquean languages for the purposes of trade. Most
of the Huave way of life was also centered around fishing, and particularly shrimping,
which supported their communities economically. This is still the case today, although
the distribution of fishing activities between the towns is un equal, with San Dionisio and
San Francisco accounting for 90% of the harvest, and only 10% from San Mateo (Griffin
2001).
In the 16th century, following the conquest of the area by the Spanish, the popula-
tions in the Tehuantepec region were decimated by disease such that the remaining resi-
dents were relegated to small towns hardly exceeding 100 people (Millán 2003: 8). Kim
(2008: 3) speculates that this post-conquest separation of Huave populations into more
isolated populations marked the beginning of the separation of the Huave dialects known
today. This would give them no more than 500 years of divergence, during which time
they were still in relatively close contact with one another, as well as with Zapotec. Inter-
estingly, Huave speakers seem to have maintained a high degree of monolingualism into
the 19th century, although this was less true in San Francisco del Mar, where there was a
resident Zapotec population (Millán 2003: 10, Kim 2008:4).
These communities, like many other indigenous communities in Mesoamerica and
around the world, have more recently been receiving pressure from globalization and
modernization which threaten their language and their way of life (Kim 2008: 3). Reports
from researchers and INALI put the total speakers of Huave at around 15,000: 10,000–
15,000 from Kim (2008: 1), and 15,993 from the INALI 2008–2012 report. These
numbers include semi-speakers and bilinguals/multilinguals, as most of the population
has become bilingual or monolingual in Spanish, as it is the dominant language in the
area. Reports indicate that there are very few monolingual speakers of Huave, with no
232 Raina Heaton
remaining monolingual speakers of the San Francisco dialect as of 2008 (Kim 2008: 1),
and about 1,550 monolinguals in San Mateo del Mar (Lewis, Simons and Fennig 2014).
San Mateo is also the only dialect still being actively acquired by children, although most
speakers are bilingual in Huave and Spanish.
Reports from San Dionisio indicate that younger people in the community are
Spanish-dominant, and most adults are Spanish-Huave bilinguals. Ethnologue lists one
monolingual speaker of the San Dionisio dialect, data ca. 2005 (Lewis, Simons and
Fennig 2014). Kim (2008: 1–5) says of the San Francisco dialect that as of 2008 when
she conducted her fieldwork, there were no more than 100 residents with a high degree of
fluency in Huave. Most of these speakers were elderly and had not spoken Huave in some
time, since the family members with whom they were accustomed to speaking had passed
away. Adult individuals over the age of 40 were likely to understand and have passive
knowledge of Huave if they grew up in a Huave-speaking household. However, those
who grew up in Spanish-speaking households may only know a few Huave lexical items.
There are no native Huave speakers in San Francisco under the age of 40.
Santa María is the smallest dialect in terms of population and is listed by UNESCO
(Moseley 2010) as ‘severely endangered’. The Expanded Graded Intergenerational
Disruption Scale (EGIDS) (Lewis, Simons and Fennig 2014) lists all Huave dialects
as 7 (‘shifting’), except for San Mateo, which is 5 (‘developing’). The Catalogue
of Endangered Languages lists Huave as ‘vulnerable’, but only with 20% certainty
(“Huave”, Endangered Languages 2015). UNESCO (Moseley 2010) lists San Mateo as
‘vulnerable’, and all others as ‘definitely endangered’ to ‘critically endangered’. This
appears to be the most accurate characterization, since there is clearly a shift towards
Spanish dominance in all dialects except perhaps San Mateo. There does seem to be
some nascent interest in learning and/or preserving the language among the younger
generations, and there is a group actively involved in cultural preservation efforts (Kim
2008: 5).
Suaréz (1975) notably used internal reconstruction and the comparative method based
on the four modern Huave dialects to reconstruct Proto-Huave (which he further com-
pared with Algonquian and Gulf). He reconstructed Proto-Huave with a series of plain
stops and pre-nasalized stops, contrastive vowel length (only in penultimate syllables)
and tone. There were several uncertainties, such as whether lexical tone should be recon-
structed or could have arisen in San Mateo via other means. Also, it is possible that some
segments that Suaréz reconstructed but which only occur in a few cognate sets are not
true proto-phonemes (see Campbell 1997: 161). However, this attempt to apply internal
reconstruction and the comparative method to dialects in order to elucidate the history
of Huave is an important contribution to the historical study of isolates, as this method
is principal among the very few techniques that allows us to recover previous stages for
language isolates.
Of the four Huave dialects, the San Mateo del Mar variety is the most studied to date,
with a grammar (Stairs and Stairs 1981) and a dictionary (Stairs and Hollenbach 1981)
published through the work of the Summer Institute of Linguistics (SIL). There has also
been work on particular aspects of San Mateo del Mar Huave morphology, most nota-
bly Stairs and Hollenbach (1969), Matthews (1972), Noyer (1993; 1997) and Cuturi
and Gnerre (2005). There has also been some work on syntax, e.g. Pike and Warkentin
(1961), Stolz (1996) and Pak (2007; 2010; 2014). Huave phonology has been of partic-
ular interest in the literature, with contributions by Noyer (1991; 2003), Davidson and
Noyer (1997) and Evanini (2007).
The study of San Francisco del Mar Huave has been vastly improved in recent years
due primarily to the work of Yuni Kim. Her dissertation in 2008 discussed in detail
several aspects of Huave phonology and morphology, using primary data from the San
Francisco del Mar dialect. She has since continued to work on Huave morphology and
phonology, with articles on affix order, derivation, loan phonology and event structure
(Kim 2009; 2010; 2011; 2013; inter alia). There remains very little linguistic work on
the San Dionisio and Santa María dialects. The texts published in Radin (1929) are still
the primary documentation of the San Dionisio dialect, and the Santa María dialect is
represented mainly in Suaréz (1975). However, there is some recent work on vowels and
palatalization in the Santa María dialect by Alberto G. Montoya Pérez (2014a; 2014b).
There is also a conference, Jornada de Estudios Huaves (Conference on Huave Studies),
which brings researchers together to discuss anthropological and linguistic work on all
varieties of Huave. The first meeting was in 2010, the second in 2014.
phonologically separate from the predicate, but postverbal subjects are phonologically
grouped with the verb.
In addition to prefixes and suffixes, Huave also has “mobile affixes,” which have long
been a topic of interest in the literature (cf. Matthews 1972; Noyer 1993; Kim 2010).
These affixes can be either prefixes or suffixes depending on the context, i.e. the morph-
ophonological properties of the base to which they attach. They also occur at a fixed
distance from the root in the linear ordering of affixes. These mobile affixes are touted
as being one of the few existing cases of phonologically conditioned affix ordering, the
existence of which has theoretical ramifications for the study of the relationship between
phonology and morphology. Kim (2010) proposes a hierarchical structure for affixes in
Huave and accounts for the phonological conditioning of affix ordering in an Optimali-
ty-Theory framework where phonological well-formedness constraints outrank morpho-
logic alignment constraints. However, true phonologically conditioned affix ordering is
problematic for theories such as Lexical Phonology and Distributed Morphology (Halle
and Marantz 1993) where morphology precedes phonology in the derivation of forms.
Paster (2009: 34) suggests that these mobile affixes, since they consist only of single
consonants, could potentially also be analyzed autosegmentally as floating features which
then associate with the CV tier entirely within the realm of phonology.
Huave also has several interesting phonological features, including a contrast between
a plain and a palatalized series of stops, diphthongization and vowel harmony. The San
Mateo del Mar dialect also has contrastive vowel length, which corresponds to vowel
“aspiration” (vowel plus postvocalic glottal fricative) in San Francisco del Mar (Kim
2008: 11). One of the more extensively researched topics in Huave, however, is lexical
tone, which only exists in the San Mateo dialect. Tone in San Mateo del Mar Huave has
a low functional load, and tonal contrasts are often neutralized by phrasal intonation pat-
terns. Noyer (1991: 277) claims that low tone is assigned to all tone-bearing units which
are not assigned tone lexically or metrically, and the distribution of high tones is based
entirely on metrical constituency. High tone plateaus spread rightward across certain syn-
tactic domains, making tone a factor in determining syntactic boundaries (cf. Pike and
Warkentin 1961: 627; Pak 2007).
3 SERI
are increasing, and the land given to them as an ejido (communal lands) by the Mexican
Republic in the 1970s has allowed them more control over their contact with non-Seri.
Also, despite the 62 kilometers separating the two Seri villages, there is still regular con-
tact between the Seris in both locations (Marlett 1981; 2006).
The Seri traditionally subsisted on fish and other sea life, particularly turtles, as well
as berries, roots and whatever else they could gather from the desert landscape, which
receives less than 10 inches of rain a year (Sheridan 1999: 8). However, the Seri diet was
unique in that they did not have traditional grain staples, as the environment does not
sustain traditional agriculture. Instead, their grain staple came from the seeds of Zostera
marina L., or eelgrass. The Seri are the only people known to have used a grain from the
sea as a primary food source (Felger and Moser 1973; 1985; Sheridan and Felger 1977;
Felger et al. 1980). The use of eelgrass as a grain staple was part of their lifestyle for a
long time, which is evident in the amount of specialized vocabulary surrounding it. For
example, the ripe eelgrass fruit is called xnoois,2 whence comes the Seri term for April,
‘xnoois when there is moon’, in which eelgrass becomes ready to harvest. There are also
separate terms for those who harvest floating eelgrass (capoee) versus those that harvest
growing eelgrass (cotám) (Felger and Moser 1973: 355).
Felger and Moser (1973) also argue that eelgrass has potential as a general food source
for people globally, since it grows in abundance in the warm, shallow coastal areas of
the Gulf of California, close to modern metropolitan areas. It also does not require fer-
tilizers, pesticides or fresh water, and it has a nutritional profile similar to other grains.
Additionally, it lacks any disagreeable flavors which might lead people to reject it (Felger
and Moser 1973: 356). This discovery that eelgrass could be a major source of human
sustenance is not only important in terms of food security but also in terms of endan-
gered language advocacy. This discovery was made by studying the language and culture
of a non-Western group of people whose language is endangered and whose traditional
knowledge of their local ecosystem has expanded the reach of modern science. This is
therefore a perfect case study as to the importance of the study of lesser-known language
groups and how they can make significant, tangible contributions to human knowledge.
First, the language has a voiceless fricative which is labiodental for some speakers (/f/),
but bilabial for others (/ɸ/) (Marlett et al. 2005: 117). /j/ ([y] in Americanist phonetics) is
also sometimes pronounced as a voiced postalveolar affricate, which is a recent develop-
ment due to contact with local Spanish (2005: 118), which has variation between [ʒ] and
[dʒ] as a prestige pronunciation of /j/. In terms of the phonology, there is also variation in
the application of a sibilant assimilation rule, where /s/ before /ʃ/ becomes postalveolar.
However, some speakers systematically do not assimilate, e.g. szatx aha [ˈʃʃɑtχɑʔɑ] ~
[ˈsʃɑtχɑʔɑ] ‘s/he/it will have burs (alguates)’ (Marlett 2006: 4). In addition, there is an
asymmetry in the frequency of the two laterals in Seri, /l/ and the voiceless fricative /ɬ/,
where plain, voiced /l/ appears in very few native words and some loans (Marlett et al.
2005: 117), but /ɬ/ is exceedingly common. Although for some speakers /l/ appears in a
few lexical items where other speakers would use /ɬ/, apparently the fricative is gaining
use (Marlett 2006: 4). This is interesting both because of typological expectations for
voiced /l/ to be less marked and therefore more frequent and because of the phonetic
transfer from Spanish influencing other sounds, but apparently not these laterals. In terms
of inter-speaker variation in the lexicon, there are differences, not only in a few content
words, but also in at least one function word, the determiner cop ‘s/he (standing)’ which
some systematically pronounce as cap, not only in isolation, but also derived forms such
as ticop ~ ticap. There is also evidence of taboo avoidance, where the word otác ‘frog,
toad’ ceased to be used after a woman known as otác quiho ‘she who sees the frog’ passed
away (Marlett 2006: 5).
3.3 Status
Seri currently has between 800 and 900 speakers (Marlett et al. 2005; Marlett 2006),
almost all of whom are ethnic Seri. That is, the language is closely tied to the ethic pop-
ulation, so that non-Seri do not speak Seri, and almost all Seri speak Seri (Marlett 2006:
1). The language is listed by Ethnologue as being in ‘vigorous’ use (Lewis, Simons and
Fennig 2014), whereas the Catalogue of Endangered Languages lists it as ‘threatened’,
with 80% certainty (“Seri”, Endangered Languages 2015). Marlett (2006: 2) reports that
Seri is still the language of everyday communication in Seri communities. Spanish exists
in the community, but only in official domains such as in schools, with some use in
religious services and official, commercial and tourism-related interactions with outsid-
ers. Other adjacent language groups (primarily Uto-Aztecan) have not had more than
minor lexical influences on Seri. There are also some cases of exogamy, where the chil-
dren end up either Spanish-dominant bilinguals or Spanish monolinguals. However, this
does not appear to be having a significant impact on language use amongst the youth in
the community, since generally children of all ages are fluent in Seri (Marlett 2006: 2).
Seri speakers generally have a positive opinion of their language, unlike in many other
Mesoamerican and other Mexican indigenous communities. However, Seri is beginning
to come under some of the same pressures from technology and globalization. More
specifically, since both Seri villages now have consistent electricity, Spanish-language
television programming has started to introduce Spanish into daily life where it was not
formerly present. This has caused some concern relating to potential language loss in the
future (Marlett 2006: 3).
Seri is extremely well documented for a small language of Mexico. The current favor-
able status of the documentation on Seri is due primarily to a large and thorough body
of work published by Stephen Marlett over the past 35 years. Marlett’s dissertation was
Language isolates of Mesoamerica and Northern Mexico 237
“The Structure of Seri” in 1981, and he has since continued to work on topics related to
Seri linguistic structure and development (e.g. Marlett 1988; 1990; 2002; 2005; 2007;
2008a; 2008b; 2010b; 2012b; inter alia), including a full treatment of the grammar of the
language (Marlett forthcoming). In addition to his scholarly work, he has also made the
majority of his Seri information public, discussed in layman-friendly terms through his
personal website. One particularly useful piece of documentation he has made available
is a periodically updated comprehensive bibliography of sources on Seri for a wide vari-
ety of topics including ethnography, anthropology, archaeology, history, art, films, lan-
guage, historical linguistics, literature, literacy materials and other ‘popular’ publications
(Marlett 2013). He has also published two editions of a trilingual Seri-Spanish-English
dictionary with Mary Moser (Moser and Marlett 2005; 2010), with an associated gram-
matical sketch. A full grammar of Seri (in English) is in preparation.
In addition to the solo and collaborative work on Seri conducted by Stephen Marlett, a
large amount of information on Seri was also collected by Edward and Mary Moser under
the auspices of SIL beginning in the 1950s. The research they compiled has been the
foundation for much of the later work (cf. Marlett 1981: vii). They not only worked out
an orthography (Moser and Moser 1965); published a bilingual vocabulary (Moser and
Moser 1961); and compiled data for a grammar, they also published on specific aspects
of the language such as switch-reference (Moser 1978) and pluralization (Moser and
Moser 1976). In addition to strictly linguistic work, they also published on a number of
anthropological topics involving the Seri culture, e.g. Mary Moser (1970a; 1970b; 1988),
Edward Moser (1973) and the aforementioned work on Seri ethnobotany (e.g. Felger and
Moser 1973; 1985; Felger et al. 1980). Prior to the Mosers, the linguistic work which
had been done on Seri was minimal, consisting almost entirely of vocabulary lists (e.g.
McGee 1898; Kroeber 1931).
More recently there have also been students and researchers working in Mexico who
have contributed several theses and other publications to the linguistic literature on Seri.
There is a growing body of work by Carolyn O’Meara (UNAM) relating to issues involv-
ing language, landscape and spatial relationships (O’Meara 2008; 2010; 2011a; 2011b;
2014a; 2014b; O’Meara and Bohnemeyer 2008; Bohnemeyer and O’Meara 2012). Addi-
tional contributions to the understanding of Seri structure come from Munguía Duarte
(2005), a doctoral dissertation on morphophonology and applied linguistics, as well as
from Munguía Duarte (2004; 2006) and Munguía Duarte and López Cruz (2009). There
are also two master’s theses from the University of Sonora, Larios Santacruz (2009)
on ditransitive constructions and Martínez Soto (2003) on discourse and information
structure.
The Hokan hypothesis has had proponents for over a century but still remains con-
troversial. There are differing proposals for which languages may belong to ‘Hokan’,
but generally Shastan, Chamariko, Karuk, Palaihnihan, Yana, Washo, Pomo, Esselen,
Yuman, Chumash, Salinan, as well as Seri and Tequistlatecan, are included (Sapir
1917a; Dixon and Kroeber 1919; Voegelin and Voegelin 1965). The critiques of the
larger Hokan hypothesis apply also to the suggestion of genetic affiliation between Seri
and the other “Hokan” groups (cf. Olmsted 1964; Turner 1967; Klar 1977; Campbell
1997; inter alia). The primary criticisms revolve around a paucity of rigorous compar-
ative data, where either little lexical or grammatical data is presented, or that the data
that have been presented contain a number of flaws, including but not limited to incor-
rect forms, exceptional semantic latitude, inclusion of Pan-Americanisms, onomato-
poeia, monosyllables and nursery forms (see Campbell 1997: 290–195, Chapter 7).
As better data have become available on many of these putative ‘Hokan’ languages,
researchers have been re-evaluating possible relationships among some languages
within the larger proposal and have often rejected the association (e.g. Turner 1967
(Seri-Tequistlatic); Klar 1977 (Chumash)) or have come up with evidence that is at
best suggestive of, but certainly not proof of, genetic relationship (e.g. Haas 1964b;
McLendon 1964; Silver 1964).
There have been several less wide-reaching hypotheses which have examined a closer
relationship between Seri and other Hokan groups which merit discussion here. Brinton
(1891) grouped Seri, Tequistlatecan (Chontal of Oaxaca) and Yuman very early on. In
a similar vein, Lamb (1959) included Seri in a Hokan subgroup ‘Karok-Yuman’, and
Powell (1891) likewise grouped Seri with Yuman. Bright (1956) proposed a relationship
between Seri and Salinan, and Bright (1970) defended the relationship of Seri to Tequist-
latecan. However, Langdon (1974) supported a Southern Hokan group consisting of Seri,
Chumash languages and Tequistlatican, while grouping Yuman and Pomoan (Langdon
1979). These various proposals have since been evaluated further, and comments on each
proposed relationship are provided below.
3.4.1 Seri-Yuman
The putative Seri-Yuman connection, although a long-standing proposal, has suffered
from the same a lack of data and comparative rigor as discussed above for Hokan gener-
ally. It appears to be based on minimal evidence and was opposed early on in Gatschet
(1900b) and by Hewitt in McGee (1898). Years later, Crawford (1976) presented 227
points of comparison involving Seri and various Yuman languages, but this list unfor-
tunately contains serious problems with respect to the Seri data which make it uncon-
vincing (Marlett 2007: 3–4). Marlett (2007) also refers to an unpublished manuscript by
Margaret Langdon which provides 34 putative Seri-Yuman cognates (although with no
mention of regular sound correspondences) and other similarities which could be sugges-
tive of a relationship. The possessive/subject agreement paradigms are nearly identical in
Seri and Yuman, and although many Seri words are polysyllabic while the Yuman forms
are monosyllables, Marlett (2007: Table 2) notes that the second, non-accented syllable
in Seri appears to correspond to the Yuman form. He also notes that body part terms in
Seri often have an initial y-, while many Yuman body part terms begin with the phoneti-
cally similar segment i- (Marlett 2007: 4). So while the evidence appears to be scarce and
relationship has not been demonstrated, some of these types of similarities merit further
careful investigation.
Language isolates of Mesoamerica and Northern Mexico 239
3.4.2 Seri-Tequistlatec
A possible relationship between Tequistlatecan (Chontal of Oaxaca) and Seri was inves-
tigated and subsequently rejected by Turner (1967). He argues that the 6 proposed cog-
nates between Seri and Tequistlatec presented in Kroeber (1915) in a list of 27 pairs is a
misrepresentation of the frequency of potential cognates when one considers the full set
of 100 items from the Swadesh wordlist, or even a list of 500 words from each language.
Of the larger dataset, only 8% of the Swadesh list and 4% of the extended list appeared to
be matches, which, as Turner points out, is about the percentage of similarity one could
expect from chance given two languages with similar phonemic inventories (Turner 1967:
236–237). There is also an apparent lack of similar forms with respect to kinship terms
and numerals, lexical areas which are typically considered to be more conservative. What
is more, the kinship systems of the languages do not make the same distinctions, and the
Tequistlatec numeral system is vigesimal while Seri is decimal. Finally, Seri makes other
lexical distinctions not present in Tequistlatec, e.g. demonstratives which indicate body
position3 (Turner 1967: 238–239).
Bright (1970) responded to the evidence in Turner (1967) by referring to the well-
known maxim that it is impossible to prove non-relation, and he then questioned the
method and semantic categories which Turner used to make the comparison. He con-
tends that the meanings associated with kinship and numerical systems are “highly
culture-bound” and are also subject to borrowing, and therefore are not sufficient to sug-
gest non-relationship (Bright 1970: 289). He also cites a prior study (Bright 1956) in
which he found 40 cognates in a 200-word Swadesh list, which would yield similarity on
par with Kroeber’s 20% possible cognate ratio. He argues that these 40 putative cognates
are affirmed by the identification of nine regular phonological correspondences, each
attested in three to six items (Bright 1970: 288). He admits, however, that some of these
correspondences may be spurious and do not constitute sufficient evidence to convinc-
ingly demonstrate a genetic affiliation between Seri and Tequistlatec (1970: 289).
More recently, Marlett (2007) reports the preliminary results of a small study con-
ducted using WordSurv (Wimbish 1989) and finds an “interesting” number of possible
cognates (Marlett 2007: 12, Table 3.2), i.e. which were found to be to some extent pho-
netically similar. A very preliminary attempt to establish regular correspondence looked
at the lateral fricative in Seri and Tequistlatec and attempted to find a pattern in their
alternation. However, this likewise did not lead to any clear demonstration of regularity
and relationship.
3.4.3 Seri-Salinan
In the same paper which provided data investigating a potential relationship between Seri
and Tequistlatec (Bright 1956), Bright also provided similar data suggesting a possible
relationship between Salinan and Seri. Out of 132 lexical comparisons, 39 pairs showed
either what appeared to be sound correspondence or other phonetic similarity (only 26
of which were given in the publication) (Bright 1956: 46–47). He put forth nine putative
sound correspondences: [Seri:Salinan] k:k, X:k, ʔ:k, ɬ:l, m:n, n:n, p:p, ɬ:t, k:ʔ, with no
stated conditioning environments. Marlett (2008a) looked closely at the forms provided
in Bright (1956) and brings the Seri data up to the present state of knowledge, given his
own fieldwork on Seri and the forms in Moser and Marlett (2005). The result was that no
previous Seri form remained without issue. In some cases the issues were comparatively
minor (e.g. no indication of vowel length), but many are now known to be compositional
240 Raina Heaton
forms and therefore are not individual, uninflected roots which previous comparisons
had assumed them to be (true of at least 10 of the 26 forms; see the appendix to Marlett
2008a). This leaves the Seri-Salinan hypothesis even more tentative than it was previ-
ously, and a cursory look at the new correspondences does not reveal anything more
promising. However, the question may bear revisiting with a different lexical set now that
we have access to better information.
Inalienable:
(1) mi-naail ‘your skin’
i-pxasi ‘his/its flesh’ (Marlett 1981: 66)
Kinship:
(2) ma-paz ‘your paternal grandfather’
a-paac ‘his older sister’ (Marlett forthcoming: 331)
Notice also that glottal stop (represented in the community orthography by h) appears to
be phonemic in initial position (in addition to being phonemic elsewhere), which is less
typical of Mesoamerica, although common in some Austronesian languages. See Moser
and Marlett (1999) for more on Seri kinship.
Although Seri possesses several features which are present in other Mesoamerican
languages, it is not considered part of the Mesoamerican linguistic area (cf. Campbell
et al. 1986). While Seri has about 12 relational nouns, they are not based on body part
terms; it has a decimal, not a vigesimal numeral system; it does not appear to share some
of the common calques present in the Mesoamerican linguistic area; finally, it has SOV
dominant word order and switch-reference which are characteristically lacking from the
other languages of the linguistic area but are present in the neighboring languages (e.g.
Seri, Jicaquean, Coahuilteco and Yuman) (Campbell 1997: 334).
Switch-reference in Seri has been the subject of several investigations (Moser 1978;
Marlett 1984a; Farrell et al. 1991) as it differs somewhat from other switch-reference
systems. In Seri, SS (Same Subject) is unmarked, and DS (Different Subject) is marked
with two elements: ta in irrealis clauses and ma in realis clauses (Marlett 1981: 195).
Switch-reference marking only appears in dependent clauses. However, the grammatical
subject (“final subject”, Farrell et al. 1991) is not the relevant reference point for dictating
the use of switch-reference marking. For example, the following passive clause has the
same grammatical subject in the first clause as in the second clause, and yet DS marking
appears:
This suggests that the language is instead indexing the semantic subject, or the deep
structure subject argument (‘initial subject’). However, the reference point for switch-
reference marking does appear to be the ‘final subject’ in raising constructions, which
Marlett (1981; 1984a; Farrell et al. 1991: 438) argue is because raising clauses are ini-
tially unaccusative and therefore have no ‘initial subject’.
4 PURÉPECHA
4.2 Endangerment
Prior to the arrival of the Spanish, the Tarascan kingdom was one of the important impe-
rial states in Mesoamerica. The Purépecha were in competition for the control of land and
Language isolates of Mesoamerica and Northern Mexico 243
resources with the Aztecs, leading to many armed conflicts (cf. Pollard 1993). However,
after the arrival of the Spanish and the domination of the area in 1530, the region con-
trolled by Purépecha speakers significantly decreased, and the population was integrated
into Spanish colonial structure. Now Purépecha is under many of the same social and
economic pressures from Spanish as other languages in the region, which is slowly caus-
ing its obsolescence.
The speaker numbers reported for Purépecha vary, and even appear to be increasing.
Friedrich (1971a, 1971b, 1975) provides a figure of about 50,000 native speakers in the
early 1970s, but by 2000, the Instituto Nacional de Estadística y Geografía de México
(INEGI) reports 121,409 speakers over the age of 5. Based on his fieldwork experiences
in the mid-1900s, Friedrich (1971b: 168) reported that in about two-thirds of the 50
Tarascan-speaking towns, children had little exposure to Spanish before going to school
(post age 5). This led him to conclude that “the necessary sociolinguistic preconditions
do exist for transmitting the Tarascan language in a relatively stable and continuous man-
ner” (Friedrich 1971b:168). He also provides a very helpful list of 31 towns in relation
to the general competence of the 5–6 year old residents of that town in Purépecha (T) or
Spanish (S). Towns where children are exhibiting dominance in Spanish (9 of the 31) are
the sites of language loss both in terms of declining numbers of speakers and a fall in its
acquisition by children (see Friedrich 1971b: 168, Chart II).
However, by the 2000s when Alejandra Capistrán Garza was conducting her fieldwork
in Puácuaro, one of the towns listed in Friedrich (1971b) as “essentially monolingual
T [Tarascan]”, she reports that most children are no longer learning Purépecha as a first
language. In fact, only about 60% of the adult population of the town are speakers, and all
are bilingual in Spanish (Capistrán 2015: 6). If this type of situation is representative of
the other dialects, then the language is certainly endangered. Capistrán and Nava (1998:
144) also report that, although the Purépecha population has grown since 1940, when
27% of the population spoke the language, only 8.5% spoke it by 1996. Purépecha is
currently listed in the Catalogue of Engendered Languages as ‘threatened’ (with 60%
certainty) (“Purepecha”, Endangered Languages 2015), by Ethnologue as 5 (‘develop-
ing’) (for both languages) (Lewis, Simons and Fennig 2014), and by UNESCO (Moseley
2010) as ‘vulnerable’.
4.3 Documentation
Purépecha has been the topic of linguistic and cultural study since the early colonial era;
the grammars and dictionaries of Bautista de Lagunas (1983 [1574]) and Gilberti (1987
[1559]) are well known. However, most of the linguistic work on the language was from
the 1930s onward. Maxwell D. Lathrop of SIL published linguistic material on Purépe-
cha beginning in the 1930s, which included some initial information about the language
(Lathrop 1937) and short vocabularies (Lathrop 1973; 2007). Some descriptive work
on the grammar was done by Alan Wares (1956) and Mary Foster (1969) (although see
comments in Wares 1972; Friedrich 1973). Morris Swadesh (1969) also wrote a 190-page
grammar sketch and dictionary (published posthumously), based primarily on colonial
sources.
In the 1970s, Paul Friedrich made significant contributions to Purépecha phonology
and dialectology (Friedrich 1971a; 1971b; 1975) and drew attention to its theoretical
implications, e.g. in Friedrich (1969; 1970; 1971c). In the 1980s and 1990s, Paul De Wolf
likewise dedicated many years of his life and scholarship to Purépecha, and most notably
published a series of three books on the language (De Wolf 1989; 1991; 2013), the last
244 Raina Heaton
of which was published posthumously. De Wolf worked primarily in Tarecuato (in the
Eleven Pueblos area), which is a less well-studied dialect area, and dedicated a significant
portion of his final book to Purépecha discourse, making his contribution rather unique
(De Wolf 2013: v).
A considerable amount of work on Purépecha morphosyntax has been published in
the last 20 years. Claudine Chamoreau, in addition to writing a grammar of the language
(2000), has worked on a variety of grammatical topics, including transitivity and gram-
matical relations (1993; 1998; 1999b; 2008; 2012a) and language contact, variation and
change (1995; 2007; 2012b; 2014). The literature on Purépecha has also been significantly
expanded by recent doctoral dissertations and associated publications focusing on Purépe-
cha morphosyntax (Maldonado and Nava 2002; Monzón 2004; Nava 2004; Capistrán 2004;
2006; 2010; 2011; 2015; Villavicencio 2006; Vázquez Rojas 2012). There is also com-
munity interest in Purépecha language and culture, with the stated goal of getting back to
their pre-Hispanic roots (see www.purepecha.mx/). So while Purépecha is now rather well
described for a Mesoamerican language, there is still a lot of work to be done beyond pho-
nology and morphology, since that has been the nexus of most of the linguistic work to date.
Besides the semantic disparity between ‘concave’, ‘turtle’ and ‘leg’, only the Huastec,
Tseltal and Kaqchikel forms are actually cognate, from Proto-Maya *-aqan ‘leg’. The
Yucatec áak ‘turtle’ can only be reconstructed back to western Mayan *ahk, whereas
Proto-Mayan had *peets or *peety (cf. Kaufman 2003), which is hardly similar to aká-.
The issue of word order is also relevant: Purépecha has pragmatically determined word
order, and there has been debate as to whether either SOV or SVO should be considered
‘basic’, or pragmatically unmarked. Capistrán (2002) argues that, at least for the dialect
on which she works, SVO is pragmatically unmarked and SOV serves to emphasize the
object. However, Vázquez Rojas (2012: 6) points out that this is likely only a regional
effect and that the Meseta dialects have SOV order. Purépecha does appear to have several
of the secondary typological characteristics of an SOV language, such as postpositions,
genitive-noun (possessor-possessed) order as shown in (5) above, and overt case marking
on nominals. However, if SOV were to be unmarked, this would be another strike against
its inclusion in the Mesoamerican linguistic area, as the area is characterized by a lack
of SOV order (Campbell et al. 1986: 547–548). However, Purépecha does share many
of the non-diagnostic Mesoamerican linguistic traits, e.g. devoicing of final consonantal
sonorants, voicing of obstruents after nasals, numeral classifiers and verbal directional
morphemes (Campbell 1997: 345–346). This places the language in the interesting posi-
tion of being outside of the Mesoamerican linguistic area, but still sharing many (but not
all) of the relevant traits with the surrounding languages, which were presumably still
obtained from areal diffusion.
5 CUITLATEC
Pérez noted that evidence of some older beliefs were detectable in the stories he collected
from some of the elderly people in 1939 (cf. Hendrichs Pérez 1939: 351–355).
notes they provide specifically on the language is that, at that time, the few who spoke
any Cuitlatec used it for telling scatological jokes (Drucker et al. 1969: 575). Although
this is a highly restricted domain, this shows that there was still a group of people who
knew something of the language, and it had at least one use in the community as of the
late 1960s. In the following years, the language lost all domains of use and was presumed
extinct. However, a team of researchers who went to the Cuitlatec area in 1979 to study
Nahuatl found two elderly women who could still remember some Cuitlatec vocabulary –
Florentina Celso and Apolonia Robles. Interestingly, Apolonia Robles was the grand-
daughter of Hendrichs’ primary informant Constancia Lazaro de Robles. The researchers
gathered what they could, which included 50 Cuitlatec words and some ethnographic
information (Valiñas Coalla et al. 1984: 171). This appears to be the most recent source
which cites original Cuitlatec data.
5.3 Documentation
The linguistic documentation available on Cuitlatec is very limited and, as noted by vari-
ous researchers, appears to exhibit a great deal of inter-speaker variation. Both the limited
corpus and the variation can almost certainly be attributed to the fact that the language
was already moribund by the time anyone attempted to document it. This fact has also
contributed to the difficulties in its classification. Works primarily pertaining to proposals
for genetic affiliation are discussed in Section 4.4.
Although there were some short Cuitlatec wordlists and some ethnographic informa-
tion collected prior to 1930 (e.g. León 1903), the earliest serious linguistic documenta-
tion of Cuitlatec comes from a pair of 1939 articles by Robert Weitlaner and Hendrichs
Pérez, respectively. Both articles mention sounds and some aspects of Cuitlatec grammar.
Hendrichs Pérez primarily provides verb conjugations and attempts to divide them into
classes. He also lists possessive noun and pronoun paradigms and some information on
syllable structure (1939: 355). Weitlaner included a wordlist which served as the start-
ing point for the work of Norman McQuown shortly thereafter. In a later publication,
Hendrichs Pérez (1946) discusses at greater length his methods of data collection, possi-
ble remnants of Cuitlatec in local Spanish and in toponyms, some data from Xinka and
K’ichee’ for possible genetic comparison and some additional points on the grammar.
However, the most important contribution of Hendrichs Pérez (1946) is a vocabulary of
1,221 items, which includes some verb conjugations.
McQuown (1941) is a phonetic study of Cuitlatec and an attempt to make use of what
little the speakers at that time could provide. He worked with Hendrichs Pérez’s pri-
mary informant, Constancia Lazaro de Robles, who was 60–70 years old, and her sister
who was present for some of the sessions. His data came from three sessions of approx-
imately 2 hours, so only about 6 contact hours. He carefully transcribed 450 lexical
items but notes that there was no opportunity to check his transcriptions (McQuown
1941: 240). These lexical items and short phrases are included in the publication. In
addition to providing an inventory of 18 consonants, 8 vowels and an additional 3
sounds (/ɾ, r, s/) borrowed from Spanish, McQuown comments on accent/stress, the
frequency of various sounds, and some phonological rules (e.g. /n/ assimilates to fol-
lowing velar stops (1941: 245), and vowels devoice optionally and word-finally in
unaccented positions).
Most work for the next decade was primarily ethnographic (e.g. Hendrichs Pérez 1946),
or concerned with genetic classification of the language (e.g. Osnaya 1959), and did
Language isolates of Mesoamerica and Northern Mexico 249
not involve efforts to expand the linguistic documentation. Escalante Hernández’s 1962
work, simply titled El Cuitlateco, was the next and probably most comprehensive effort
to document the language before it disappeared. He revises some of the claims made by
McQuown and provides an analysis of Cuitlatec phonetics, phonology and morphology.
He gives little information on syntax, but does include two short texts and a vocabulary
of approximately 900 items. In 1969, Escalante Hernández and others also published an
ethnographic work on Cuitlatec (Drucker et al. 1969). Other work on Cuitlatec included
papers by Ruth Almstedt (1974), which were apparently mainly based on unpublished
data from H. V. Lemley. The last known effort to collect original data on Cuitlatec was
in 1979, and the results are published as part of a larger study by researchers at UNAM
to create a Cuitlatec database to facilitate archiving, to create alphabetized lists and to
do searches for phonetic sequences (Valiñas Coalla et al. 1984). Some of the results of
their analysis of Cuitlatec phonetic sequences led to a revision of previous claims about
Cuitlatec, some of which are discussed in Section 5.5.
6 CONCLUDING REMARKS
Although all of the language isolates spoken in Mesoamerica and northern Mexico are
in different stages of obsolescence, those that still have native speakers have been suffi-
ciently well documented that they are no longer in danger of disappearing without record.
While much more can and should be done to document and study these languages fur-
ther, it is reassuring that much valuable information about what these unique languages
contain has been made available to the world thanks to the dedicated efforts of scholars,
particularly in the past 50 years.
The language isolates Huave, Seri and Purépecha are sufficiently well described that
they can be compared with other languages to test for possible broader genetic relation-
ships, and the unsuccessful result of the various attempts to date supports their current
status as language isolates. While the documentation of Cuitlatec is more limited, it is
also sufficient to place in doubt all the proposals to date for broader affiliations that
would include it. However, some proposals may merit revisiting with more, updated data.
Hopefully the work that points out the weaknesses of previous proposals may also point
to directions for further research.
NOTES
1 I thank Lyle Campbell and Stephen Marlett for their very helpful comments on this
chapter. Any remaining mistakes are my own.
2 The orthography has been updated throughout from how it appeared in older sources to
be in line with the current orthographical standard used in Moser and Marlett (2010).
Many thanks to Stephen Marlett for providing the updated versions of these items.
Language isolates of Mesoamerica and Northern Mexico 251
3 Turner (1967: 238–239) also claims that Seri has separate terms for ‘blue’ and ‘green’,
separate terms for buying food versus buying non-food and separate terms for dying
animals versus humans. However, more recent research does not support the existence
of these lexical differences as such (Stephen Marlett, personal communication, 2015).
4 Glossing conventions: 2 = 2nd person, aux = auxiliary, an = action nominalizer,
dcl = declarative, ds = different subject, gen = genitive, ir = irrealis, pass = passive,
pos = possessive, ra = raising prefix, rlt = realis ‘t’ form, rlyo = realis ‘yo’ form,
sg.s = singular subject, unsp.sbj = unspecified subject, ut = unspecified time.
5 The orthography and glossing of this example have been modified by Stephen Marlett
from the original publication to reflect the current community orthographical system
used throughout this section.
REFERENCES
Almstedt, Ruth. 1974. Cuitlatec: An Example of Linguistic Salvage. Presentation at
the XIII Conferencia sobre las lenguas indigenas americanas, sección VII, America
Latina.
Bohnemeyer, Jürgen and Carolyn O’Meara. 2012. Vectors and Frames of Reference: Evi-
dence from Seri and Yucatec. Space and Time in Languages and Cultures: Language,
Culture and Cognition, ed. by Luna Filipović and Kasia Jaszczolt, 217–249. Amster-
dam: John Benjamins.
Bouda, Karl. 1964. Huavestudien I: Uralisches im Huave. Études Finno-ougriennes 1:
18–28.
Bouda, Karl. 1965. Huavestudien II. Études Finno-ougriennes 2:167–175.
Brand, Donald. 1943. A Historical Sketch of Geography and Anthropology in the Taras-
can Region: Part I. New Mexico Anthropologist 37–108.
Bright, William. 1956. Glottochronologic Counts of Hokaltecan Materials. Language 32:
42–48.
Bright, William. 1970. On Linguistic Unrelatedness. IJAL 36: 288–290.
Brinton, Daniel. 1891. The American Race. New York: D. C. Hodges.
Campbell, Lyle. 1997. American Indian Languages: The Historical Linguistics of Native
America. Oxford: Oxford University Press.
Campbell, Lyle, Terrence Kaufman, and Thomas Smith-Stark. 1986. Meso-America as a
Linguistic Area. Language vol. 62, no. 3: 530–558.
Capistrán, Alejandra. 2002. Variaciones de orden de constituyentes en p’orhépecha. Del
cora al maya yucateco: estudios lingüísticos sobre algunas lenguas indígenas mexica-
nas, ed. by Paulette Levy, 349–402. UNAM: Mexico City.
Capistrán, Alejandra. 2004. Construcciones de doble objeto con verbos trivalentes en
p’orhépecha. VII Encuentro Internacional de Lingüística en el Noroeste, t. I, ed. by
Isabel Barreras Aguilar y Mirna Castro Llamas, 445–460. Hermosillo, Mexico: Uni-
versidad de Sonora.
Capistrán, Alejandra. 2006. Sufijos de aumento de participantes de tipo dativo. ¿Existen
aplicativas en p’orhépecha? Southwest Journal of Linguistics 25, no. 1: 85–113.
Capistrán, Alejandra. 2010. Expresión de argumentos, funciones gramaticales y transitiv-
idad en p’orhépecha. Tesis de Doctorado. México, COLMEX.
Capistrán, Alejandra. 2011. Locative and Orientation Descriptions in Tarascan: Topo-
logical Relations and Frames of Reference. Language Sciences 33, no. 6: 1006–1024.
Capistrán, Alejandra. 2015. Multiple Argument Constructions in P’orhepecha: Argument
Realization and Valence-Affecting Morphology. Leiden, The Netherlands: Koninklijke
Brill.
252 Raina Heaton
Capistrán, Alejandra and E. Fernando Nava. 1998. Medio siglo de una lengua del Occi-
dente de México: del tarasco de 1946 al p’urhépecha de 1996, Antropología e Historia
del Occidente de México, México, SMA-UNAM, 143–163.
Chamoreau, Claudine. 1993. Quelques remarques à propos du sujet en phurhépecha.
Travaux du SELF 3: 103–115.
Chamoreau Claudine. 1995. La comparaison en pʰurhépecha. Un exemple d'évolution
syntaxique. Faits de Langues 5: 139–143.
Chamoreau, Claudine. 1998. Le système verbal du Phurhépecha. Systemes Verbaux, ed.
by Fernand Bentolila, 55–69. Belgium, Peeters : Louvain-la-Neuve.
Chamoreau, Claudine. 1999a. Évolution des indices catégoriels en Purépecha. Faits de
Langues 14: 143–152.
Chamoreau, Claudine, 1999b. Le marquage differentiel de l’objet en purépecha. La Lin-
guistique 35, no.2: 99–114.
Chamoreau, Claudine. 2000. Grammaire du purépecha parlé sur les iles du lac de Patzc-
uaro (Mexique). Lincoln Europa: München.
Chamoreau, Claudine. 2007. Grammatical Borrowing in Purepecha. Grammatical Bor-
rowing in Cross-Linguistic Perspective, ed. by Yaron Matras and Jeanette Sakel, 465–
480. Berlin: Mouton de Gruyter.
Chamoreau, Claudine. 2008. Voz antipasiva en lenguas nominativo-acusativas. El caso
del purépecha. Encuentro de Linguistica en el Noroeste 9: 105–124.
Chamoreau, Claudine. 2012a. Constructions périphrastiques du passif en purepecha.
Une explication multifactorielle du changement linguistique. Changement linguistique
et langues en contact: approches plurielles du domaine prédicatif, ed. by Claudine
Chamoreau and Laurence Goury, 71–99. Paris, CNRS Editions.
Chamoreau, Claudine. 2012b. Contact-Induced Change as an Innovation. Dynamics of
Contact-Induced Language Change, ed. by Claudine Chamoreau and Isabelle Léglise,
53–76. Berlin, Mouton de Gryuter.
Chamoreau, Claudine. 2013. Classificateurs numéraux en purepecha: entre perte de vital-
ité et motivation pragmatique. La Linguistique 49, no. 2: 51–66.
Chamoreau, Claudine. 2014. Enclitics in Purepecha: Variation and Split Localization.
Patterns in Mesoamerican Morphology, ed. by Jean-Léo Léonard and Alain Kihm,
119–143. Paris: Michel Houdiard éditeur.
Crawford, James. 1976. A Comparison of Chimariko and Yuman. Hokan Studies: Papers
from the First Conference on Hokan Languages, ed. by Margaret Langdon and Shirley
Silver, 177–191. The Hague: Mouton.
Cuturi, Flavia and Maurizio Gnerre. 2005. Concomitance in Huave. Conference on Oto-
manguean and Oaxacan Languages (COOL), ed. by Rosemary Beam de Azcona and
Mary Paster, Survey of California and Other Indian Languages Report 13, 51–86.
Berkeley, CA: UC Berkeley Department of Linguistics.
Davidson, Lisa and Rolf Noyer. 1997. Loan Phonology in Huave: Nativization and
the Ranking of Faithfulness Constraints. Proceedings of WCCFL 15, ed. by Brian
Agbayani and Sze-Wing Tang, 65–80. Stanford, CSLI.
De Wolf, Paul. 1989. Estudios Lingüísticos sobre la lengua P’orhé. Mexico City: Colegio
de Michoacán.
De Wolf, Paul. 1991. Curso básico del Tarasco hablado. Zamora/Morelia, El Colegio de
Michoacán – Gobierno del Estado de Michoacán.
De Wolf, Paul. 2013. El idioma tarasco: Sinopsis de la estructura gramatical. LINCOM
Studies in Native American Linguistics, 70, München, LINCOM.
Language isolates of Mesoamerica and Northern Mexico 253
Di Peso, Charles and Daniel Matson. 1965. The Seri Indians in 1692 as described by
Adamo Gilg, S.J. Arizona and the West 7: 33–56.
Dixon, Roland and Alfred Kroeber. 1919. Linguistic Families of California. University
of California Publications in American Archaeology and Ethnology 16, no. 3: 47–118.
Drucker, Susana, Roberto Escalante Hernández, and Roberto Weitlaner. 1969.The Cuit-
latec. Handbook of Middle American Indians: Ethnology: Part One, ed. by Evon Vogt,
565–576. Austin, TX: University of Texas Press.
Escalante Hernández, Roberto. 1962. El Cuitlateco. Mexico: INAH.
Evanini, Keelan. 2007. The Phonetic Realization of Pitch Accent in Huave. Proceedings
of the 33rd Annual Meeting of the Berkeley Linguistics Society, ed. by Zhenya Antić,
Charles Chang, Clare Sandy, and Maziar Toosarvandani, 53–65. Berkeley, CA, Berke-
ley Linguistics Society.
Felger, Richard and Mary Moser. 1973. Eelgrass (Zostera marina L.) in the Gulf of Cali
fornia: Discovery of Its Nutritional Value by the Seri Indians. Science 181: 355–356.
Felger, Richard and Mary Moser. 1985. People of the Desert and Sea: Ethnobotany of the
Seri Indians. Tucson: University of Arizona Press, Reprinted 1991.
Felger, Richard, Mary Moser, and Edward Moser. 1980. Seagrasses in Seri Indian Cul-
ture. Handbook of Seagrass Biology: An Ecosystem Perspective, ed. by Roland Phil-
lips and C. Peter McRoy, 260–276. New York: Garland STPM Press.
Farrell, Patrick, Stephen Marlett, and David Perlmutter. 1991. Notions of Subjecthood
and Switch Reference: Evidence from Seri. Linguistic Inquiry 22: 431–456.
Foster, Mary. 1969. The Tarascan Language. University of California publications in
linguistics 56. Berkeley, University of California Press.
Friedrich, Paul. 1969. On the Meaning of the Tarascan Suffixes of Space. Indiana Univer-
sity publications in anthropology and linguistics, Baltimore: Waverly Press.
Friedrich, Paul. 1970. Shape in Grammar. Language 46, no. 2, Part 1: 379–407.
Friedrich, Paul. 1971a. The Tarascan Suffixes of Locative Space: Meaning and Morpho-
tactics. Indiana University: Bloomington.
Friedrich, Paul. 1971b. Dialectal Variation in Tarascan Phonology. IJAL 37, no. 3:
164–187.
Friedrich, Paul. 1971c. Distinctive Features and Functional Groups in Tarascan Phonol-
ogy. Language 47, no. 4: 849–865.
Friedrich, Paul. 1973. Review of: The Tarascan Language by Mary LeCron Foster. Lan-
guage 49, no. 1: 238–245.
Friedrich, Paul. 1975. A Phonology of Tarascan. Chicago: University of Chicago, Depart-
ment of Anthropology.
Gatschet, Albert. 1877. Der Yuma-Sprachstamm nach den neuesten handschriftlichen
Quellen. Zeitschrift für Ethnologie 9: 341–350, 365–418.
Gatschet, Albert. 1883. Der Yuma-Sprachstamm nach den neuesten handschriftlichen
Quellen. Zeitschrift für Ethnologie 15: 123–147.
Gatschet, Albert. 1886. Der Yuma-Sprachstamm nach den neuesten handschriftlichen
Quellen. Zeitschrift für Ethnologie 18: 97–122.
Gatschet, Albert. 1892. Der Yuma-Sprachstamm nach den neuesten handschriftlichen
Quellen. Zeitschrift für Ethnologie 24: 1–18.
Gatschet, Albert. 1900b. The Waikuru, Seri, and Yuman Languages. Science 12: 556–558.
Gilberti, Maturino. 1559. Vocabulario en lengua de Mechuacan, Transcripción paleográ-
fica de Agustín Jacinto Zavala, 1997, Zamora, Michoacán, México, El Colegio de
Michoacán.
254 Raina Heaton
Greenberg, Joseph. 1987. Language in the Americas. Stanford: Stanford University Press.
Greenberg, Joseph. and Merritt Ruhlen. 2007. An Amerind Etymological Dictionary
(12th ed.). Stanford: Department of Anthropological Sciences, Stanford University.
Griffin, William. 2001. Camaron i cultura en Oaxaca. Cuadernos del Sur, vol. 16. Mex-
ico: Oaxaca.
Haas, Mary. 1951. The Proto-Gulf Word for Water (With Notes on Siouan-Yuchi). IJAL
17: 71–79.
Haas, Mary. 1964b. California Hokan. Studies in Californian Linguistics, ed. William
Bright, UCPL, vol. 34, 73–87. Berkeley: University of California Press.
Halle, Morris and Alec Marantz. 1993. Distributed Morphology and the Pieces of Inflec-
tion. The View from Building 20, ed. by Ken Hale and Samuel Jay Keyser, 111–176.
Cambridge, MA: MIT Press.
Hendrichs Pérez, Pedro. 1939. Un estudio preliminar sobre la lengua cuitlateca de San
Miguel Totolapan, Gro. El México Antiguo 4: 329–362.
Hendrichs Pérez, Pedro. 1946. Por Tierras Ignotas: Viajes Y Observaciones en la Región
Del Río de Las Balsas, vol. 2. México: Editorial Cultura.
Huave, Endangered Languages 2015, The Linguist List at Eastern Michigan University
and The University of Hawaii at Manoa. 2/15/2015. www.endangeredlanguages.com/
lang/4292.
Instituto Nacional de Estadística, Geografía e Informática (INEGI). 2000. Tabulados
Básicos, Estados Unidos Mexicanos, XII Censo General de Población y Vivienda.
Instituto Nacional de Lenguas Indígenas (INALI), México, Lenguas indígenas nacionales
en riesgo de desaparición, Huave.
Kaufman, Terrence. 1988. A Research Program for Reconstructing Proto-Hokan: First
Groupings, unpublished manuscript, University of Pittsburgh.
Kaufman, Terrence. 2003. A Preliminary Mayan Etymological Dictionary, with the Assis-
tance of John Justeson, FAMSI report, www.famsi.org/reports/01051/index.html.
Kim, Yuni. 2008. Topics in the Phonology and Morphology of San Francisco del Mar
Huave. Ph.D. thesis, University of California, Berkeley.
Kim, Yuni. 2009. Alternancias causativas y estructura de eventos en el huave de San
Francisco del Mar [Causative alternations and event structure in S.F. del Mar Huave].
Proceedings of CILLA IV.
Kim, Yuni. 2010. Phonological and Morphological Conditions on Affix Order in Huave.
Morphology 20, no. 1: 133–163.
Kim, Yuni. 2011 [to appear], Fuentes de rasgos fonológicos de préstamos castellanos en
huave de San Francisco del Mar, Submitted to the Proceedings of the I Jornada de
Estudios Huaves.
Kim, Yuni. 2013. Estrategias de pasivización en la morfología verbal del huave. Amerin-
dia 37, no. 1: 273–298.
Klar, Kathryn. 1977. Topics in Historical Chumash Grammar. Ph.D. thesis, University
of California, Berkeley.
Kroeber, Alfred. 1915. Serian, Tequistlatecan, and Hokan. UCP AAE, no. 11, Berkeley,
University of California Press, pp. 279–290.
Kroeber, Alfred. 1931. The Seri. Southwest Museum Papers 6.
Lagunas, Juan Bautista de. 1574. Arte y diccionario con otras obras en lengua Michua-
cana, Intro. de J. Benedict Warren 1983, Morelia, Fimax.
Lamb, Sydney. 1959. Some Proposals for Linguistic Taxonomy. AL l, no. 2: 33–49.
Langdon, Margaret. 1974. Comparative Hokan-Coahuiltecan Studies: A Survey and
Appraisal, Janua Linguarum Series Critica no. 4. The Hague: Mouton.
Language isolates of Mesoamerica and Northern Mexico 255
Marlett, Stephen. 2011b. The Seris and the Comcaac: Sifting Fact from Fiction about the
Names and Relationships. Work Papers of the Summer Institute of Linguistics, Univer-
sity of North Dakota Session #51, pp. 1–20.
Marlett, Stephen. 2012b. Relative Clauses in Seri. In Relative Clauses in Languages
of the Americas: A Typological Overview, ed. by Bernard Comrie and Zarina
Estrada-Fernández, 213–242. Amsterdam: Benjamins.
Marlett, Stephen. 2013. A Bibliography for the Study of Seri History, Language and
Culture. April 2013 revision. www.und.nodak.edu/instruct/smarlett/Stephen_Marlett/
Publications_and_Presentations_files/SeriBibliography.pdf.
Marlett, Stephen. Forthcoming. Cmiique Iitom: The Seri language [provisional title].
Marlett, Stephen, F. Xavier Moreno Herrera, and Genaro G. Herrera Astorga. 2005. Illus-
trations of the IPA: Seri. JIPA 35, no. 1: 117–121.
Martínez Soto, Jorge Armando. 2003. Seguimiento de la referencia en el cuento seri.
Master’s thesis, Universidad de Sonora.
Matthews, Peter. 1972. Huave Verb Morphology: Some Comments from a Non-Tagmemic
Standpoint. IJAL 38: 96–118.
McGee, William. 1898. The Seri Indians: Seventeenth Annual Report of the Bureau of
American Ethnology to the Secretary of the Smithsonian Institution. Washington, DC.
1971 reprint by the Rio Grande Press, Glorieta, New Mexico.
McLendon, Sally. 1964. Northern Hokan (B) and (C): A Comparison of Eastern Porno
and Yana. Studies in Californian linguistics, ed. by William Bright, UCPL, vol. 34,
126–144. Berkeley, University of California Press.
McQuown, Norman. 1941. La fonémica del cuitlateco. El México Antiguo 5: 239–254.
Millán, Saúl. 2003. Huaves. Pueblos indígenas del México contemporáneo. Mexico City,
Comisión Nacional para el Desarrollo de los Pueblos Indígenas.
Montoya Pérez, Alberto. 2014a. Sistema vocálico del huave de Santa María del Mar,
Presented at the II Jornada de Estudios Huaves, Instituto de Investigaciones Antro-
pológicas, UNAM.
Montoya Pérez, Alberto. 2014b. Palatalización en el Huave de Santa María del Mar.
Presented at the X Coloquio de lingüística at ENAH.
Moser, Mary and Stephen Marlett. 2010 (2005). Comcáac quih yaza quih hant ihíip
hac: Diccionario seri-español-inglés, Hermosillo, Sonora, Universidad de Sonora and
Plaza y Valdés Editores, Editions 1 and 2.
Moser, Edward. 1963. Seri Bands. The Kiva 28, no. 3: 14–27.
Moser, Edward. 1973. Seri Basketry. The Kiva 38: 105–140.
Moser, Edward. and Mary Moser. 1961, Vocabulario seri: seri-castellano, castellano-
seri, Mexico City, Instituto Lingüístico de Verano.
Moser, Edward and Mary Moser. 1965. Consonant-Vowel Balance in Seri (Hokan) sylla-
bles’, Linguistics, vol. 16, pp. 50–67.
Moser, Edward. and Mary Moser. 1976. Seri Noun Pluralization Classes. Hokan Studies,
ed. by Margaret Langdon and Shirley Silver, 285–296. The Hague, Mouton.
Moser, Mary. 1970a. Seri Elevated Burials. The Kiva 35: 211–216.
Moser, Mary. 1970b. Seri: From Conception Through Infancy. The Kiva 35: 201–210.
Moser, Mary. 1978. ‘Articles in Seri’, Occasional Papers on Linguistics, vol. 2: 67–89.
Moser, Mary and Stephen Marlett. 1999. Seri kinship terminology. SIL Electronic Work-
ing Papers (1999–1005).
Monzón, Cristina. 2004. Los morfemas espaciales del p’urhépecha; significado y morfos-
intaxis. El Colegio de Michoacán, Zamora.
Language isolates of Mesoamerica and Northern Mexico 257
Moseley, Christopher (ed.). 2010. UNESCO Atlas of the World’s Languages in Danger.
UNESCO Publishing. www.unesco.org/culture/en/endangeredlanguages/atlas.
Moser, Mary B. 1988. Seri History (1904): Two Documents. Journal of the Southwest
30: 469–501.
Munguía Duarte, Ana Lidia. 2004. Relaciones de marcación y armonía relativa. Memorias
del VII Encuentro Internacional de Lingüística en el Noroeste, Tomo 1, ed. by Isabel
Barreras Aguilar and Mirna Castro Llamas, 65–84. Hermosillo: Universidad de Sonora.
Munguía Duarte, Ana Lidia. 2005. Morfofonología del konkaak [sic]: Aplicación de
la investigación lingüística en la educación indígena. Ph.D. thesis, Universidad
Autónoma de Sinaloa, Culiacán.
Munguía Duarte, Ana Lidia. 2006. ‘Alternancias vocálicas en posición intermorfémica en
el konkaak [sic]: control de predominancia morfológica’ in Memorias del VIII Encuen-
tro Internacional de Lingüística en el Noroeste, Tomo 2, ed. by Rosa María Ortiz
Ciscomani, 295–320. Hermosillo: Universidad de Sonora.
Munguía Duarte, Ana Lidia and Gerardo López Cruz. 2009. De la fonología a la práctica
ortográfica: Hacia un sistema de escritura en el konkaak [sic]. Lexicografía y escrit-
ura en lenguas mexicanas, ed. by Andrés Acosta Félix and Zarina Estrada-Fernández,
195–214. Editorial UniSon, Hermosillo.
Nava, E. Fernando. 1994. Los clasificadores numerales del p’urhépecha prehispánico.
Anales de antropología, vol. 61, 299–309, UNAM, México.
Nava, E. Fernando. 2004. La voz media en p’urhepecha un estudio de formas y significa-
dos. Ph.D. thesis, FFL-UNAM, Mexico.
Noyer, Rolf. 1991. Tone and stress in the San Mateo dialect of Huave. Proceedings of
ESCOL 1991, 277–288.
Noyer, Rolf. 1993. Mobile affixes in Huave: optimality and morphological well-
formedness. Proceedings of the Twelfth West Coast Conference on Formal Linguistics,
ed. by Eric Duncan, Donka Farkas, and Philip Spaelti, 67–82. Stanford: CSLI.
Noyer, Rolf. 1997. Features, Positions and Affixes in Autonomous Morphological Struc-
ture. Garland: New York.
Noyer, Rolf. 2003. A Generative Phonology of Huave, unpublished manuscript, Univer-
sity of Pennsylvania.
Olmsted, David. 1964. A History of Palaihnihan Phonology. University of California
Publications in Linguistics, vol. 35, Berkeley: University of California Press.
O’Meara, Carolyn. 2008. ‘Basic Locative Construction in Seri’ in Memorias del IX
Encuentro Internacional de Lingüística en el Noroeste, Tomo 2, ed. by Rosa María
Ortiz Ciscomani, 253–269. Hermosillo: Universidad de Sonora.
O’Meara, Carolyn. 2010. Seri Landscape Classification and Spatial Reference. Ph.D.
thesis, University at Buffalo.
O’Meara, Carolyn. 2011a. Frames of Reference in Seri. Language Sciences 33, no. 6:
1025–1046.
O’Meara, Carolyn. 2011b. The locative definite article hac in Seri. Fonología, morfología
y tipología semántico-sintática, ed. by Ana Lidia Munguía Duarte, Estudios Lingüísti-
cos 1, Hermosillo, Sonora, Mexico: Editorial Universidad de Sonora.
O’Meara, Carolyn. 2014a. Verbos de movimiento en seri y la expresión de trayectoria.
Verbos de movimiento en lenguas de América: Léxico, sintaxis y pragmática, ed. by
Lilián Guerrero, 207–236. México, Instituto de Investigaciones Filológicas, UNAM.
O’Meara, Carolyn. 2014b. Entre lugares, estrellas y vientos: Descripciones de rutas y
narraciones del paisaje en seri. Mapas del cielo y de la tierra. Espacio y territorio en la
258 Raina Heaton
Stairs, Emily and Barbara Hollenbach. 1969. Huave verb morphology. IJAL 35: 38–53.
Stairs, Emily and Barbara Hollenbach. 1981. Gramática huave. Diccionario huave de San
Mateo del Mar, ed. by Glen Stairs and Emily Stairs, 283–391. SIL, Mexico.
Stairs, Glen and Emily Stairs. 1981. Diccionario huave de San Mateo del Mar, Serie
de vocabularios y diccionarios indígenas “Mariano Silva y Aceves” 24, SIL, Mexico.
Stolz, Thomas. 1996. Some Instruments Are Really Good Companions – Some Are Not.
On Syncretism and the Typology of Instrumentals and Comitatives. Theoretical Lin-
guistics 23, no. 1–2: 113–200.
Suaréz, Jorge. 1975. Estudios Huaves. Collección Lingüistica 22, INAH, Mexico.
Swadesh, Morris. 1960. The Oto-Manguean Hypothesis and Macro-Mixtecan. IJAL 26:
79–111.
Swadesh, Morris. 1966. Porhé y Maya. Anales de Antropología 3: 173–204.
Swadesh, Morris. 1967a. Lexicostatistic Classification. Linguistics, ed. Norman McQuown,
vol. 5 of HMAI, ed. by Robert Wauchope, 79–115. Austin: University of Texas Press.
Swadesh, Morris. 1969. Elementos del tarasco antiguo. México: UNAM.
Turner, Paul. 1967. Seri and Chontal (Tequistlatec). IJAL 33, no. 3: 235–239.
Valiñas Coalla, Leopoldo, Mario Cortina Borja, and Miguel Mireles Padilla. 1984. Notas
sobre el Cuitlateco. Anales de Antropologia 21, no. 1, Instituto de Investigaciones
Antropológicas, UNAM.
Vázquez Rojas Maldonado, Violeta. 2012. The Syntax and Semantics of Purépecha Noun
Phrases and the Mass/Count Distinction. Ph.D. thesis, New York University.
Villavicencio Zarza, Frida. 2006. P’orhépecha kaso sïrátahenkwa: Desarrollo del sistema
de casos del Purépecha. Mexico: DF, Colegio de México, Centro de Investigaciones
Superiores en Antropología Social.
Voegelin, Carl and Florence Voegelin. 1965. Classification of American Indian Lan-
guages. AL 7, no. 7: 121–150.
Wares, Alan. 1956. Suffixation in Tarascan. Master’s thesis, Indiana University.
Wares, Alan. 1972. Review of: The Tarascan Language, by Mary L. Foster. Journal of
Linguistics vol. 8: 190–196.
Weitlaner, Robert. 1939. Notes on the Cuitlatec Language. El México Antiguo 4: 363–373.
Wimbish, John. 1989. ‘WORDSURV: A Program for Analyzing Language Survey Word
Lists. Occasional Publications in Academic Computing, vol. 13. Dallas: Summer Insti-
tute of Linguistics.
CHAPTER 10
LANGUAGE ISOLATES IN
SOUTH AMERICA
Frank Seifart and Harald Hammarström
1 INTRODUCTION
South America is the continent with the highest proportion of language isolates: as much
as 60% of the lineages are isolates (no other continent surpasses 50%) and more than
10% of South American languages are isolates (65 out of 574 languages), compared to an
average of less than 2.5% on other continents (Table 10.1). But it is not only the number
of isolates that is reflective of the genealogical diversity in South America. More gen-
erally, this continent exhibits more two-member families, more three-member families,
fewer very large families and so on, compared to the other continents. Entropy (as in
Table 10.1) is a systematic measure of the diversity of a distribution (here, the division
of languages into lineages), and South America shows the highest entropy, which is also
reflected in the average of only about 5 languages per lineage, compared to an average of
about 25 languages per lineage in other continents.
The fact that the proportion of isolates (and linguistic diversity more generally) is so
much greater in South America than on other continents becomes even more intriguing
when considering that South America was the continent that was the last to be populated
by humans, i.e. languages had less time to diverge there than on other continents. Nich-
ols (1990) pushed the argument that diversity can only be the result of early settlement,
implying that the Americas must have been settled several dozen millennia ago, i.e. much
earlier than previously assumed. Specifically regarding South America, if one assumes
that the Americas were settled first by passing through the Bering Strait and further into
South America mainly via the land route whose most narrow stretch is in Panama, then
this idea becomes incongruent with the linguistic diversity in South America, which is
higher than in North America, yet with a strictly later settlement. Nettle (1999), on the
other hand, argues that diversity is the expected result of a relatively recent migration into
an unoccupied area. In this model, diversity results from initial fissioning in a novel area
rich in resources, and lack of diversity arises when there is sufficient time for later expan-
sions to obliterate the diversity from the initial settlement. Blench (2012) is consistent
with this scenario, arguing that the most powerful obliterative expansions are the ones
linked to agriculture and that these happened relatively late in South America. Nettle’s
(1999) model is clearly more consistent with the linguistic as well as archaeological data
for the Americas. But on a world-level, a simple equation of late settlement with high
diversity is difficult to reconcile with the archaeological and linguistic facts of, e.g. the
New Guinea area, so probably more parameters than settlement-depth need to be taken
into account to explain the emergence of linguistic diversity on a global level.
Specifically regarding South American isolates, Dahl et al. (2011) propose a novel
suggestion for their emergence. First, migration routes are calculated using detailed geo-
graphical/geomorphological datasets, such that, if entry into South America was in the
northwest and the migrating humans were aiming to reach the southern tip, what would
be the shortest/least cumbersome way to get there? This procedure yields a route through
the Andes to the southern tip with various wrinkles along the way. Dahl et al. (2011)
note impressionistically that the geographical distribution of language isolates seems to
be concentrated along this route, as if they were ‘dropped-off’ on migrations along the
route. While attractive, the assumption behind obtaining the route in question, namely
that the migrations were destined to reach the southern tip, is more convenient than it is
realistic. Also, objective measures as to the geographical concentration of isolates are still
lacking, as are comparisons to even simpler geologically based co-variation than migra-
tion routes, e.g., simply being in a foothills area between the Andes and the rainforest.
In recent work, Epps (in press and personal communication) links South American
linguistic diversity, and thus also the high number of isolates, to distinctly South Ameri-
can patterns of social organization. She argues that Amazonian people, and maybe South
American people in general (at least prior to the Inca expansion), present particular
dynamics of interaction and corresponding linguistic ideologies. Consistent with Eriksen
(2011), who uses a Geographic Information System (GIS) to reconstruct ancient ethnoge-
netic processes from archaeology, linguistics, geography, and ethnohistory, Epps argues
that Amazonian societies “developed a set of socio-economic practices in which different
groups formed complementary parts of larger systems” rather than being organized in
hierarchical, top-down social structures. A prototypical example of such a multiethnic
and multilingual system is the Vaupés, which encompasses a couple dozen ethnolinguis-
tic groups. Crucially for linguistic diversity, “differences are viewed as essential to the
functioning of the articulated whole” within such systems. This is prototypically exem-
plified in the institutionalized linguistic exogamy in the Vaupés regional system but is
consistent also with remarkably low rates of lexical borrowing across South America
(at least Amazonia) (Bowern et al. 2011; Bowern et al. 2014). This distinctly South Amer-
ican indigenous social structure implies historically relatively little language shift (prior
to the arrival of Europeans), which would wipe out linguistic diversity, and it provides a
motivation for the maintenance of genealogical distinctions with little subsequent diver-
sification, resulting in the long term in a high number of isolates.
If isolates are the result of purely historical processes of language expansions and
language extinction, there is little reason to suspect that language isolates should be struc-
turally different from non-isolates. But the historical processes may be conditioned by
factors that leave structural commonalities of isolates as epiphenomena, for example, if
isolates were more likely to be spoken by hunter-gatherers or if isolates are remnants of
a linguistic area. However, these lines of research have yet to be systematically investi-
gated for the South American continent.
262 Frank Seifart and Harald Hammarström
the Caribbean coast and islands, but there are also some Tupian languages, and a num-
ber of smaller families. This area hosts 13 isolates, 6 of which are extinct. Many of the
isolates of this area were first mentioned, and often documented in more or less a short
wordlist, by Theodor Koch-Grünberg, who travelled the area extensively in the early 20th
century.
2.5 Taruma
The Taruma people lived near the mouth of the Rio Negro River in Brazil in the late 17th
century (Rivière 1966) but subsequently moved to the Southern Guianas where the tribe
diminished and ceased to exist as a separate ethnolinguistic group sometime around the
1920s. Without a separate ethnolinguistic identity, the language was presumed extinct
until three surviving speakers were found living among the Wapishana (Carlin and Mans
2014:82–85). Only one speaker remains today who is no longer completely fluent (Sérgio
Meira, personal communication, 2015). Eithne Carlin and Sérgio Meira have worked
with the last (semi-)speakers and some unpublished textual data collected in the 1920s
has survived, which promises that at least some of the grammatical characteristics of
Taruma will be known. Until now, the only published data consists of wordlists (Lou-
kotka 1949). An ISO 639–3 code for this language has recently been requested.
in an inaccessible region in central Venezuela, contacted first only in the 1970s. There
are descriptions of a number of Yuwana grammatical features (Vilera Díaz 1985; 1987;
Quatra 2008a), and there is a dictionary (Quatra 2008b). Yuwana has a system of nominal
classification reminiscent of the Sáliban languages, and once the Yuwana system has
been sufficiently described, a systematic comparison can be undertaken. Jolkesky (2009)
is an initial comparison involving lexical and some grammatical morphemes of Yuwana,
Sáliban, Andoque (Section 4.1), and Tikuna.
to Crevels 2007) and may be the largest isolate in South America in terms of numbers of
speakers. A number of good grammatical descriptions are available on this language (Slo-
cum 1986, Jung 1989, Rojas Curieux 1998). Páez grammar is noteworthy, among other
things, for its complex phoneme inventory, with various series of consonants (involving
aspiration, prenasalization, and palatalization) and various series of vowels (involving
length, glottalization, and aspiration).
and eventually became extinct in the early, 20th century. There is a certain amount of doc-
umentation of this language, e.g. Middendorf (1892), who did his own fieldwork, as well
as Brüning (2004) and Hovdhaugen (2004) which are based on a more thorough analysis
of older sources, which include the very early de la Carrera (1880 [1644]).
The Andaqui [ana] were once a numerous group in what is now Southern Colombia.
They perished along with their language, of the same name, in fierce warfare against the
Spaniards. Similarities with Chibchan languages (Rivet 1924) and with the neighbouring
isolate Páez (see Section 3.1) have been noted (Adelaar 2004:140), but neither of these
substantiate genealogical relatedness. Linguistic documentation is limited to two (rather
extensive) wordlists (Anónimo 1928b; Vergara y Vergara and Delgado 1860).
Four extinct isolate languages were spoken on the slopes of the Andes: Atacame, Yuru-
manguí, Sechuran, and Tallán. Atacame or Esmeraldeño became extinct in the 19th cen-
tury. The only available Atacame data was collected in 1877 by J. M. Pallares, which was
reproduced and discussed in later publications (e.g., Seler 1902; Jijón y Caamaño 1945).
Atacame had as an interesting grammatical feature classifying prefixes that refer to shape
(example 2), a feature common in Amazonian language. By the time it was documented,
the Atacame language was spoken by a population of predominantly African descent,
which raises the possibility that it is an African language rather than an (adopted) Amer-
indian language. Cursory searches for resemblances with mainly West African languages
(e.g., in terms of classifying prefixes) have been carried out by various individuals, so far
without interesting results.
The only record of Yurumanguí is one wordlist from the 18th century. It was used
by Rivet (1942) to propose a genealogical relation with the putative Hokan languages of
North America, a proposal which has since been rejected (e.g., Constenla Umaña 1991).
The main source for the extinct isolate Sechuran of the coastal plain of Northern Peru
is a wordlist collected in 1863 by Richard Spruce and published by von Buchwald (1919).
Even less material survives of neighbouring Tallán (Ramos Cabredo 1950). There are
occasional lexical links between Tallán and Sechuran (Adelaar 2004:398–400), but the
268 Frank Seifart and Harald Hammarström
very limited data available is not compelling for a genealogical relationship. Culli was
spoken in the Central Andes late into the 20th century (Adelaar 1988), surrounded by
Quechua, and is documented in two wordlists, one of them published by Rivet (1949).
Mutual influence of Culli with surrounding varieties of Quechua can be shown (Adelaar
2004:401–404).
4 WESTERN AMAZON
In the Western Amazon a number of Arawakan and Tupian languages are spoken, and
it is the home of a number of (more or less) small families such as Jivaroan, Zaparoan,
Witotoan, and Tucanoan. In this setting, there are also nine language isolates, which
include three known extinct isolates. It is safe to assume that many more isolates existed
but vanished without leaving traces, given the later onset of colonization in the lowlands,
when compared to the Andes. Missionary sources contain long lists of ethnonyms which
are never heard of in later accounts and are likely to include some isolates that disap-
peared without a discernible linguistic footprint.
Adelaar 2004:456). The language is presumed to be extinct (Lewis, Simons and Fennig
2015), although Michael and Beier (2012) have located a few semi-speakers from whom
they collected another wordlist and did a phonemic analysis.
tikoy-kay-a=sne
b. os mimi:di
kill-inverse-linker =3.fem.absolutive article.neuter.past snake
‘The snake killed her.’
Language isolates in South America 271
6 CENTRAL AMAZON
The Central Amazon region is home to many Tupian languages, interspersed with, among
others, six known isolated languages, two of which are extinct.
referred to as Mura, probably consisting of various dialects, that moved through a vast
territory in Central and Western Amazonia, as far Northwest as the Caquetá River. Pirahã
is known for its small phoneme inventory and complex prosody, which facilitates whis-
tled and hummed speech (Everett 1985). Pirahã appears to have extremely simple clause
structure and is also claimed to lack recursion, numerals, and colour terms (Everett 2005).
Thomason and Everett (2001) argued that all of Pirahã personal pronouns are borrowed
from Nhengatú (Tupian).
7 EASTERN AMAZON
The Eastern Amazon, along the Atlantic coast of Brazil, was once dominated by lan-
guages from the Macro-Gé family. Many of them became extinct during the relatively
early occupation by Europeans. Consequently, only one of the five isolates that are known
from this area now has any known living speakers.
REFERENCES
Adelaar, Willem F.H. 1988. Search for the Culli Language. Continuity and Identity in
Native America: Essays in Honor of Benedikt Hartmann, vol. I, ed. by M. Jansen, Peter
van der Loo and R. Manning, 111–131. (Indiaanse Studies). Leiden: E. J. Brill.
Adelaar, Willem F.H. 2004. The Languages of the Andes. Cambridge: Cambridge Uni-
versity Press.
Alicea, Neftalí. 1975a. Análisis fonémico preliminar del idioma taushiro. Datos
Etno-Lingüísticos 23: 1–65.
Alicea, Neftalí. 1975b. Análisis preliminar de la gramática del idioma Taushiro. (Datos
Etno-Lingüísticos 24). Lima: Instituto Lingüístico de Verano. www.sil.org/americas/
peru/html/pubs/show_work.asp?id=3413.
Anderson, Loretta and Mary Ruth Wise. 1963. Contrastive Features of Candoshi Clause
Types. Studies in Peruvian Indian Languages 1: 67–102. (Summer Institute of Lin-
guistics: Publications in Linguistics 9). The Summer Institute of Linguistics and the
University of Texas at Arlington.
Angenot-de-Lima, Geralda. 2002. Description Phonologique, Grammaticale et Lexicale
du Moré, Langue Amazonienne de Bolivie et du Brésil. Leiden: Rijksuniversiteit te
Leiden.
Anónimo. 1928a. Traduccion de algunas voces de la lengua Guama. Lenguas de América
6: 382–393. (Manuscritos de La Real Biblioteca, Vol. 1, Catálogo de La Real Biblio-
teca). Madrid.
Anónimo. 1928b. Vocabulario Andaqui-Español. Lenguas de América 6: 175–195. (Man-
uscritos de La Real Biblioteca, Vol. 1, Catálogo de La Real Biblioteca). Madrid.
Aragon, Carolina Coelho. 2014. A Grammar of Akuntsú, a Tupían Language. Mānoa:
University of Hawai’i at Mānoa.
Arinterol, Basilio. 2000. Warao. Manual de Lenguas Indígenas de Venezuela (Serie Ori-
genes), Esteban Emilio Mosonyi and Jorge Carlos Mosonyi, 116–183. Caracas: Fun-
dación Bigott.
Arrieta, Anita E. 1993. Tipología morfosintactica del timote. Revista de Filología y
Lingüística de la Universidad de Costa Rica XIX, no. 2: 99–110.
Arrieta, Anita E. 1998. Tipología fonológica del timote. Revista de Filología y Lingüística
de la Universidad de Costa Rica XXIV, no. 1: 85–100.
Language isolates in South America 277
Carrera, Fernando de la. 1880. Arte de la lengua Yunga de los valles del Obispado de
Trujillo. Lima: Imprenta Liberal.
Casamiquela, Rodolfo M. 1983. Nociones de Gramática del Gününa Küne: Présentation
de la langue des Tehuelche Septentrionaux Australs (Patagonie Continentale). Édi-
tions du Centre National de la Recherche Scientifique.
Castellví, Marcelino de. 1940. La Lengua Tinigua. Journal de la Société des Américan-
istes XXXII: 93–101.
Constenla Umaña, Adolfo. 1991. Las lenguas del áreas intermedia. Indroducción a su
estudio areal. San José: Editorial de la Universidad de Costa Rica.
Costa, Januacele da. 1999. Ya:thê, a última língua nativa no Nordeste do Brasil: aspectos
morfofonológicos y morfo-sintáticos. Recife: Universidade Federal de Pernambuco.
Cox, Doris. 1957. Candoshi Verb Inflection. International Journal of American Linguis-
tics 23: 129–140.
Créqui-Montfort, Georges de and Paul Rivet. 1913. Linguistique Bolivienne: La Famille
Linguistique Čapakura. Journal de la Société des Américanistes X: 119–172.
Crevels, Mily. 2002. Itonama o Sihnipadara, Lengua no Clasificada de la Amazonía Boli
viana. (Estudios de Lingüística 16). Departamento de Filología Española, Lingüística
General y Teoría de Literatura, Universidad de Alicante.
Crevels, Mily. 2007. South America. Encyclopedia of the World’s Endangered Lan-
guages, ed. by Christopher Moseley, 103–196. London and New York: Routledge.
Crevels, Mily. 2012a. Itonama. Ambito Andino, vol. 2, ed. by Mily Crevels and Pieter
Muysken, 233–294. (Lenguas de Bolivia). La Paz: Plural Editores.
Crevels, Mily. 2012b. Canichana. Amazonía, vol. 2, Mily Crevels and Pieter Muysken,
415–449. (Lenguas de Bolivia). La Paz: Plural Editores.
Crevels, Mily and Hein van der Voort. 2008. The Guaporé-Mamoré Region as a Lin-
guistic Area. From Linguistic Areas to Areal Linguistics, ed. by Pieter Muysken, 151–
179. (Studies in Language Companion Series 90). Amsterdam, Philadelphia: John
Benjamins.
Crevels, Mily and Pieter Muysken. 2012. Cayubaba. Ambito Andino, ed. by Mily Crevels
and Pieter Muysken, vol. 2, 341–374. (Lenguas de Bolivia). La Paz: Plural Editores.
Croese, Robert A. 1985. Mapuche Dialect Survey. South American Indian Languages:
Retrospect and Prospect, ed. by Harriet E. Manelis Klein and Louisa Stark, 784–801.
Austin: Texas University Press.
Dahl, Östen, Christopher Gillam, David G. Anderson, José Iriarte and Silvia M. Copé.
2011. Linguistic Diversity Zones and Cartographic Modeling: GIS as a Method for
Understanding the Prehistory of Lowland South America. Ethnicity in Ancient Amazo-
nia: Reconstructing Past Identities from Archaeology, Linguistics, and Ethnohistory,
ed. by Alf Hornborg and Jonathan D. Hill, 211–224. Boulder, CO: University Press of
Colorado.
Epps, Patience. In press. Amazonian Linguistic Diversity and Its Sociocultural Cor-
relates. Language Dispersal, Diversification, and Contact: A Global Perspective, ed.
by Mily Crevels and Pieter Muysken. Oxford: Oxford University Press.
Eriksen, Love. 2011. Nature and Culture in Prehistoric Amazonia: Using G.I.S. to Recon-
struct Ancient Ethnogenetic Processes from Archaeology, Linguistics, Geography, and
Ethnohistory. Lund, Sweden: Lund University.
Everett, Daniel L. 1985. Syllable Weight, Sloppy Phonemes, and Channels in Pirahã
Discourse. Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics
Society, ed. by Mary Niepokuj, Mary van Clay, Vassiliki Nikiforidou and Deborah
Feder, 408–416. Berkeley: Berkeley Linguistics Society.
Language isolates in South America 279
Hargreaves, Inês. 2007. Lista de palavras transcritas por Inês Hargreaves, de dois gru-
pos ao norte do Parque Aripuanã, RO.
Haude, Katharina. 2006. A Grammar of Movima. Nijmegen University Ph.D. disserta-
tion. http://repository.ubn.ru.nl/bitstream/2066/41395/1/41395_gramofmo.pdf.
Haude, Katharina. 2009. Hierarchical Alignment in Movima. International Journal of
American Linguistics 75, no. 4: 513–532.
Hervás y Panduro, Lorenzo. 1787. Saggio Pratico delle lingue. (Idea dell’Universo XXI).
Cesena: Gregorio Biasini all’Insengna di Pallade.
Hervás y Panduro, Lorenzo. 1800. Lenguas y naciones Americanas. (Catálogo de Las
Lenguas de Las Naciones Conocidas, Y Numeracion, Division, Y Clases de Estas I).
Madrid: Imprenta de la Administración del real arbitrio de beneficencia.
Hervás y Panduro, Lorenzo. 1971. Elementos Grammaticales de la lengua Maipure [sic!].
Aportes Jesuiticos a la Filología Colonial Venezolana: Tomo II Documentos, vol. 4/5,
ed. by José del Rey Fajardo, 277–288. (Lenguas Indígenas de Venezuela). Caracas:
Universidad Católica Andres Bello.
Hervás y Panduro, Lorenzo. no date. Lingua Lule.
Hovdhaugen, Even. 2004. Mochica. (Languages of the World/Materials 433). München:
Lincom.
Howard, Linda. 1967. Camsa Phonology. Phonemic Systems of Colombian Languages,
ed. Viola G. Waterhouse, 73–87. Norman: Summer Institute of Linguistics.
Howard, Linda. 1977. Camsa: Certain Features of Verb Inflection as Related to Paragraph
Types. Discourse Grammar: Studies in Indigenous Languages of Colombia, Panama,
and Ecuador, part 2, ed. by Robert E. Longacre and Frances Woods, 273–296. (Sum-
mer Institute of Linguistics Publications in Linguistics and Related Fields 52(2)).
Arlington: Summer Institute of Linguistics and University of Texas at Arlington. www.
sil.org/acpub/repository/15975.pdf.
Hugo, Vitor. 1959. Desbravadores. São Paulo: Missão Salesiana de Humaitá.
Ihering, Hermann von. 1907. A anthropologia do estado de São Paulo. Revista do Museu
Paulista VII: 202–257.
Jahn, Alfredo. 1927. Los Aborígenes del Occidente de Venezuela: Su Historia, Etnografía
y Afinidades Lingüísticos. Caracas: Lit. y Tip. del Comerio.
Jamioy Muchavisoy, José Narciso. 1989. Morfología del verbo Kamëntsa. Santafé de
Bogotá: Universidad de los Andes.
Jamioy Muchavisoy, José Narciso. 1992. Tiempo, aspecto y modo en kamentsa. Memo-
rias del II Congreso del CCELA, vol. 2, 199–207. (Memorias). Bogotá: Universidad
de los Andes, CCELA.
Jamioy Muchavisoy, José Narciso. 1999. Estructuras predicativas del kamëntsa. Con-
greso de Lingüística Amerindia y Criolla. Lenguas Aborígenes de Colombia, 251–284.
(Memorias 6). Santafé de Bogotá: CCELA, UNIANDES.
Jijón y Caamaño, Jacinto. 1941. El Ecuador interandino y occidental antes de la con-
quista castellana, vol. 2. Quito: Editorial Ecuatoriana.
Jijón y Caamaño, Jacinto. 1945. Las lenguas del Ecuador preincáico. Antropología pre-
hispánica del Ecuador, 69–94. Quito: La prensa catolica.
Jolkesky, Marcelo. 2009. Macro-Daha: reconstrução de um tronco lingüístico do noroeste
amazônico. Paper presented at the ROSAE - I Congresso Internacional de Lingüística
Histórica, Salvador do Bahia.
Juajibioy Chindoy, Alberto. 1962. Breve Estudio preliminar del grupo Aborigen de
Sibundoy y su lengua Kamsa en el sur de Colombia. Boletín del Instituto de Antro-
pología [Universidad de Antioquia] II, no. 8: 3–33.
Language isolates in South America 281
Jung, Ingrid. 1989. Grammatik des Paez: Ein Abriss. Osnabrück: Universität Osnabrück.
Kaufman, Terrence. 1990. Language History in South America: What We Know and
How to Know More. Amazonian Linguistics. Studies in Lowland South American Lan-
guages, ed. by Doris L. Payne, 13–73. (Texas Linguistics Series). Austin: University
of Texas Press.
Kaufman, Terrence. 1994. The Americas. Atlas of the World’s Languages, ed. by Christo-
pher Moseley and R.E. Asher, 1–76. Cambridge: Cambridge University Press.
Kerke, Simon van de. 2000. Case Marking in the Leko Language. Ensaios sobre len-
guas indígenas de las tierras bajas de Sudamérica: Contribuciones al 49o Congreso
Internacional de Americanistas en Quito 1997, vol. 1, Hein van der Voort and Simon
van de Kerke, 25–37. (Lenguas Indígenas de América Latina (ILLA)). Leiden:
Research School of Asian, African and Amerindian Studies (CNWS), Universiteit
Leiden.
Kerke, Simon van de. 2002. Complex Verb Formation in Leko. Current Studies on South
American Languages, vol. 3, ed. by Mily Crevels, Simon van de Kerke, Sérgio Meira
and Hein van der Voort, 241–254. (Lenguas Indígenas de América Latina (ILLA)).
Leiden: Research School of Asian, African and Amerindian Studies (CNWS), Univer-
siteit Leiden.
Kerke, Simon van de. 2006. Object Cross-Reference in Leko. What’s in a Verb?, ed.
by Grażyna J. Rowicka and Eithne B. Carlin, 171–188. (LOT Occasional Series 5).
Utrecht: LOT.
Kerke, Simon van de. 2009. El Leko. Ambito Andino, vol. 1, Pieter Muysken and Mily
Crevels, 287–332. (Lenguas de Bolivia). La Paz: Plural Editores.
Key, Harold. 1963. Morphology of Cayuvava. Austin: University of Texas.
Key, Harold. 1974. Cayuvava texts. (Language Data Amerindian Series 4). Dallas: Sum-
mer Institute of Linguistics.
Key, Harold. 1975. Lexicon-Dictionary of Cayuvava-English. (Language Data Amerin-
dian Series 5). Dallas: Summer Institute of Linguistics.
Key, Harold H. 1967. Morphology of Cayuvava. (Janua Linguarum: Series Practica LIII).
Berlin: Mouton de Gruyter.
Koch-Grünberg, Theodor. 1913. Abschluß meiner Reise durch Nordbrasilien zum Ori-
noco, mit besonderer Berücksichtigung der von mir besuchten Indianerstämme.
Zeitschrift für Ethnologie 45: 448–474.
Koch-Grünberg, Theodor. 1922. Die Völkergruppierung zwischen Rio Branco, Orinoco,
Rio Negro und Yapurá. Festschrift Eduard Seler dargebracht zum 70: Geburtstag von
Freunden, Schülern und Verehrern, ed. by W. Lehmann, 205–266. Stuttgart: Stecker
und Schröder.
Koch-Grünberg, Theodor. 1928a. Auake. Sprachen, 308–313. (Von Roroima Zum Ori-
noco: Ergebnisse Einer Reise in Nordbrasilien Und Venezuela in Den Jahren 1911–
1913 4). Stuttgart: Strecker und Schröder.
Koch-Grünberg, Theodor. 1928b. Sapará, Purukotó, Wayumará. Sprachen, 257–272.
(Von Roroima Zum Orinoco: Ergebnisse Einer Reise in Nordbrasilien Und Venezuela
in Den Jahren 1911–1913 4). Stuttgart: Strecker und Schröder.
Landaburu, Jon. 1979. La langue des Andoke. 36: SELAF.
Landaburu, Jon. 2000. La Lengua Andoque. Lenguas indígenas de Colombia: una visión
descriptiva, ed. by María Stella González de Pérez and María Luisa Rodríguez de
Montes, 275–288. Santafé de Bogotá: Instituto Caro y Cuervo.
Lapenda, Geraldo. 1968. Estrutura da Língua Iatê: Falada pelos índios Fulniôs em Per-
nambuco. Recife: Imprensa Universitaria, Universidade Federal de Pernambuco.
282 Frank Seifart and Harald Hammarström
Lapenda, Geraldo Calábria. 1962. O dialecto Xucuru. Doxa (Revista Oficial do Departa-
mento de Cultura do Diretório Acadêmico da Faculdade de Filosofia de Pernambuco
da Universidade do Recife). X, no. 10: 11–23.
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig (eds.). 2015. Ethnologue:
Languages of the World, Eighteenth edition. Dallas, TX: SIL International. www.
ethnologue.com (13 June 2013).
Loukotka, C̆ estmír. 1968. Classification of South American Indian Languages, ed.
Johannes Wilbert. Los Angeles: Latin American Center, University of California.
Loukotka, Čestmír. 1949. La Langue Taruma. Journal de la Société des Américanistes
XXXVIII: 53–82.
Loukotka, Čestmír. 1955. Les langues non-Tupí du Brésil du Nord-Est. Anais do XXXI
Congresso Internacional de Americanistas 31, São Paulo, 1954, vol. II, ed. by Herbert
Baldus, 1029–1054. São Paulo: Anhembi.
Lozano, Elena. 2006. Textos Vilelas (con notas gramaticales y etnográficas), ed. by
Lucía A. Golluscio. Buenos Aires: Instituto de Lingüística, Universidad de Buenos
Aires.
Luzena, Gerónimo Josef de. no date. Traducion de la lengua española á la otomaca,
taparita y yarura.
Machoni de Cerdeña, Antonio. 1877 [1732]. Arte y vocabulario de la lengua lule o tono-
coté. Buenos Aires: Coni.
Maciel, Iraguacema. 1991. Alguns aspectos fonológicos e morfológicos da língua Máku.
Brasilia: Universidade de Brasília.
Matallana, Baltasar de and Cesareo de Armellada. 1943. Exploración del Paragua. Boletín
de la Sociedad Venezolana de ciencias naturales VIII, no. 53: 61–110.
Meader, Robert E. 1978. Indios do Nordeste: Levantamento Sobre Os Remanescentes
Tribais do Nordeste Brasileiro. (Série Lingüística 8). Brasília: Summer Institute of
Linguistics.
Meland, D. 1968. Fulniô Grammar. (Arquivo Lingüístico 26). Brasilia: Brasilia: ILV.
Michael, Lev and Christine Beier. 2012. Phonological sketch and classification of Aʔɨwa
[ISO 639: ash].
Middendorf, E.W. 1892. Das Muchik oder die Chimu-Sprache mit einer einleitung über
die culturvölker, die gleichzeitig mit den Inkas und Aimaraàs in Südamerika lebten
und einem Anhang über die Chibcha-Sprache. (Die Einheimischen Sprachen Perus).
Leipzig: F. A. Brockhaus.
Migliazza, Ernesto C. 1965. Fonología Makú. Boletim do Museu Paraense Emílio Goeldi,
Série Antropologia 25: 1–17.
Migliazza, Ernesto C. 1966. Esbôço sintático de um corpus da língua Makú. Boletim do
Museu Paraense Emílio Goeldi, Série Antropologia 32: 1–38.
Migliazza, Ernesto C. 1978. Maku, Sape and Uruak Languages: Current Status and Basic
Lexicon. Anthropological Linguistics XX, no. 3: 133–140.
Migliazza, Ernesto C. 1980. Languages of the Orinoco-Amazon Basin: Current Status.
Antropológica 53: 95–162.
Migliazza, Ernesto C. 1983. Lenguas de la Región Orinoco Amazonas: Estado Actual.
América Indígena 43: 703–784.
Migliazza, Ernesto C. 1985. Languages of the Orinoco-Amazon Region: Current Status.
South American Indian Languages: Retrospect and Prospect, ed. by Harriet E. Manelis
Klein and Louisa Stark, 17–139. Austin: Texas University Press.
Migliazza, Ernesto C. 2008. Máku. Paper presented at the 4th Conference on Endangered
Languages and Cultures of Native America, University of Utah.
Language isolates in South America 283
Monserrat, Ruth Maria Fonini. 2000. A língua do povo Mỹky. Rio de Janeiro: Universi-
dade Federal do Rio de Janeiro.
Monserrat, Ruth Maria Fonini and Elizabeth R. Amarante. 1995. Dicionário Mỹky-
Português. Rio de Janeiro: Editora Sepeei/SR-5/UFRJ.
Mosonyi, Esteban Emilio. 1966. Morfología del verbo Yaruro. Caracas: Universidad
Central de Venezuela.
Mosonyi, Esteban Emilio and Jorge Ramón García. 2000. Yaruro (Pumé). Manual de
Lenguas Indígenas de Venezuela, ed. by Esteban Emilio Mosonyi and Jorge Carlos
Mosonyi, 544–593. (Serie Origenes). Caracas: Fundación Bigott.
Muysken, Pieter. 1994. Callahuaya. Mixed Languages: 15 Case Studies in Language
Intertwining, Peter Bakker and Maarten Mous, 207–211. (Studies of Language and
Language Use 13). Amsterdam: Amsterdam: IFOTT.
Muysken, Pieter. 1997. Callahuaya. Contact Languages: A Wider Perspective, Sarah Grey
Thomason, 427–447. (Creole Language Library 17). Amsterdam: John Benjamins.
Nettle, Daniel. 1999. Linguistic Diversity of the Americas can be reconciled with a
recent colonization. Proceedings of the National Academy of Sciences of the USA 96:
3325–3329.
Nichols, Johanna. 1990. Linguistic Diversity and the First Settlement of the New World.
Language 66, no. 3: 475–521. doi:10.2307/414609.
Nimuendajú, Curt. 1925. As Tribus do Alto Madeira. Journal de la Société des Améri-
canistes XVII: 137–172.
O’Hagan, Zachary J. 2011. Informe de campo del idioma omurano.
Obregón Muñoz, Hugo and Jorge Díaz Pozo. 1989. Morfología Yarura. Maracay: Insti-
tuto Universitario Pedagógico Experimental de Maracay.
Olawsky, Knut. 2005. Urarina – Evidence for OVS Constituent Order. Leiden Papers in
Linguistics 2(2): 43–68.
Olawsky, Knut. 2006. A Grammar of Urarina. (Mouton Grammar Library 37). Berlin:
Mouton de Gruyter.
Oramas, Luis. 1916. Materiales para el estudio de los dialectos Ayamán, Gayón, Jira-
jara, Ajagua. Caracas: Litografía del Comercio.
Oré, Luis Jerónimo de. 1607. Rituale seu Manuale Peruanum. Napels.
Orphão de Carvalho, Fernando. 2009. On the Genetic Kinship of the Languages Tikúna
and Yurí. Revista Brasileira de Linguística Antropológica 1, no. 2: 247–268.
Palácio, Adair P. 1984. Guató: a língua dos índios canoeiros do rio Paraguai. Campinas:
São Paulo: Universidade Estadual de Campinas. http://libdigi.unicamp.br/document/
?code=vtls000051737.
Palau, Mercedes and Blanca Saiz. 1989. Moxos: Descripciones exactas e historia fiel de
los indios, animales y plantas de la provincia de Moxos en el virreinato del Perú por
Lázaro de Ribera, 1786–1794. Madrid: El Viso.
Payne, David L. 1981. Bosquejo fonológico del Proto-Shuar-Candoshi: evidencias para
una relación genética. Revista del Museo Nacional 45: 323–377.
Payne, David L. 1989. On proposing deep genetic relationships in Amazonian languages:
The case of Candoshi and Maipuran Arawakan languages. Society for the Study of
Indigenous Languages of the Americas.
Peeke, Catherine. 1973. Preliminary Grammar of Auca. (Summer Institute of Linguis-
tics: Publications in Linguistics 39). The Summer Institute of Linguistics and the Uni-
versity of Texas at Arlington.
Peeke, M. Catherine. 1968. Preliminary grammar of Auca (Ecuador). Bloomington: Indi-
ana University.
284 Frank Seifart and Harald Hammarström
Perozo, Laura, Ana Liz Flores, Abel Perozo and Mercedes Aguinagalde. 2008. Esce-
nario histórico y sociocultural del alto Paragua, Estado Bolívar, Venezuela. Evalu-
ación rápida de la biodiversidad de los ecosistemas acuáticos de la cuenca alta del
río Paragua, estado Bolívar, ed. by Josefa Celsa Señaris, Carlos A. Lasso and Ana
Liz Flores, 169–180. (Boletín RAP de Evaluación 49). Arlington, VA: Conservation
International.
Pike, Kenneth L. and Rachel Saint. 1962. Auca phonemics. Studies in Ecuadorian
Indian Languages, vol. 1, ed. by Benjamin Elson, 2–30. Norman: Summer Institute
of LInguistics.
Postigo, Adriana Viana. 2009. Alguns apontamentos bibliográficos sobre a língua guató
(Macro-Jê). LIAMES 9: 99–106.
Proyecto de Documentación del Idioma Muniche. 2009. Una Breve Descripción del Idi-
oma Muniche. Cabeceras Aid Project.
Quadros, Francisco R. Ewerton. 1892. Memoria sobre os trabalhos de exploração e
observação efetuada pela secção da comissão militar encarregada da linha telegráfica
de Uberaba a Cuiabá, de fevereiro a junho de 1889. Revista do Instituto Histórico e
Geográfico Brasileiro 55, no. 1: 233–260.
Quatra, Miguel Marcello. 2008a. Estructura básica del verbo jodï. Caracas: Ediciones
IVIC.
Quatra, Miguel Marcello. 2008b. Bajkewa jkwïkïdëwa-jya jodï ine – Dodo ine. Dicciona-
rio básico Castellano – Jodï. Caracas: Ediciones IVIC.
Querales, Ramón. 2008. El Ayamán (Ensayo de reconstrucción de un idioma indígena
venezolano). Barquisimeto: Concejo Municipal de Iribarren.
Ramos Cabredo, Josefina. 1950. Ensayo de un vocabulario de la lengua Tallán o Tallanca.
Cuadernos de Estudio del Instituto de Investigaciones Históricas 3, no. 8. 11–55.
Rivet, Paul. 1924. La Langue Andakí. Journal de la Société des Américanistes XVI:
99–110.
Rivet, Paul. 1942. Un dialecte Hoka Colombien: Le Yurumangí. Journal de la Société des
Américanistes 34: 1–59.
Rivet, Paul. 1949. Les langues de l’ancien diocèse de Trujillo. Journal de la Société des
Américanistes de Paris 38: 1–51.
Rivet, Paul and Constant Tastevin. 1920. Affinités du Makú et du Puináve. Journal de la
Société des Américanistes XII: 69–82.
Rivière, Peter G. 1966. Some ethnographic problems of Southern Guyana. Folk 8–9:
301–312.
Rojas Curieux, Tulio Enrique. 1998. La lengua Paez. Bogotá: Ministerio de Cultura.
Romero-Figeroa, Andrés. 1985. OSV as the basic order in Warao. Lingua 66(2–3). 115–
134. doi:10.1016/S0024–3841(85)90281–90285.
Romero-Figeroa, Andrés. 1997. A Reference Grammar of Warao. (LINCOM Studies in
Native American Linguistics 6). München: Lincom.
Rosenblat, Angel. 1936. Los Otomacos y Taparitas de los llanos de Venezuela. Estudio
etnográfico y lingüístico. Tierra Firme 1, no. 4: 227–377.
Sakel, Jeanette. 2004. A Grammar of Mosetén. (Mouton Grammar Library 33). Berlin:
Mouton de Gruyter.
Sakel, Jeanette. 2009. Mosetén and Chimane (Tsimane’). Ambito Andino, vol. 1, ed. by
Mily Crevels and Pieter Muysken, 333–375. (Lenguas de Bolivia). La Paz: Plural
Editores.
Schmidt, Max. 1905. Indianerstudien in Zentralbrasilien: Erlebnisse und ethnologische
Ergebnisse einer Reise in den Jahren 1900 bis 1901. Berlin: Dietrich Reimer.
Language isolates in South America 285
Seifart, Frank and Juan Alvaro Echeverri. 2014. Evidence for the Identification of Cara-
bayo, the Language of an Uncontacted People of the Colombian Amazon, as Belong-
ing to the Tikuna-Yurí Linguistic Family. PLoS One 9, no. 4e: 94814.
Seifart, Frank, Doris Fagua, Jürg Gasché and Juan Alvaro Echeverri (eds.). 2009. A Multi-
media Documentation of the Languages of the People of the Center. Online Publication
of Transcribed and Translated Bora, Ocaina, Nonuya, Resígaro, and Witoto Audio
and Video Recordings with Linguistic and Ethnographic Annotations and Descrip-
tions. Nijmegen: The Language Archive. https://hdl.handle.net/1839/00-0000-0000-
001C-7D64-2@view.
Seler, Eduard. 1902. Die Sprache der Indianer von Esmeraldas. Gesammelte Abhandlun-
gen zur amerikanischen Sprach- und Alterthumskunde, vol. I, 49–64. Berlin: A. Asher.
Slocum, Marianna C. 1986. Gramática Páez. Loma Linda: Editorial Townsend. www.sil.
org/americas/colombia/pubs/abstract.asp?id=20108.
Sturtevant, William C. 2005. History of Research on the Native Languages of the South-
east. Native Languages of the Southeastern United States (Studies in the anthropology
of North American Indians), ed. by Heather Kay Hardy and Janine Scancarelli, 8–65.
Lincoln: University of Nebraska Press.
Tadmor, Uri, Martin Haspelmath and Bradley Taylor. 2010. Borrowability and the notion
of basic vocabulary. Diachronica 27, no. 2: 226–246. doi:10.1075/dia.27.2.04tad.
Tessmann, Günter. 1930. Die Indianer Nordost-Perus: grundlegende Forschungen für
eine systematische Kulturkunde. Hamburg: Friederichsen, de Gruyter.
Teza, Emile. 1868. Saggi Inediti di lingue Americane. Annali delle Università Toscane
(parte prima): Scienze Neologiche X. 117–143.
Thomason, Sarah G. and Daniel L. Everett. 2001. Pronoun borrowing. Proceedings of the
Berkeley Linguistic Society 27: 301–315.
Tobar Gutiérrez, María Elena. 1995. Modo, aspecto y tiempo en Cofán. Bogotá: Univer-
sidad de los Andes.
Tobar Ortiz, Nubia. 2000. La Lengua Tinigua: Anotaciones fonológicas y morfológicas.
Lenguas indígenas de Colombia: una visión descriptiva, ed. by María Stella González
de Pérez and María Luisa Rodríguez de Montes, 669–679. Santafé de Bogotá: Instituto
Caro y Cuervo.
Torero, Alfredo. 2002. Idiomas de los Andes: lingüística e historia. Lima: IFEA, Instituto
Francés de Estudios Andinos : Editorial Horizonte.
Trivero Ribera, Alberto. 2005. Los primeros pobladores de Chiloé. (Working Paper
Series 25). Uppsala: Ñuke Mapuförlaget.
Tuggy, John C. 1966. Vocabulario candoshi de Loreto. (Serie Lingüística Peruana 2).
Yarinacocha: Instituto Lingüístico de Verano. www.sil.org/americas/peru/html/pubs/
show_work.asp?id=2444.
Vaquero, Antonio. 1965. Idioma Warao: Morfología, Sintaxis, Literatura. (Estudios
Venezolanos Indígenas). Caracas: Editorial Sucre.
Vasconcelos, Ione P. 2004. Aspectos da fonologia e morfologia da língua Aikanã. Maceió:
Universidade Federal de Alagoas.
Vergara y Vergara, Jose Maria and Evaristo Delgado. 1860. The Indians of Andaqui, New
Grenada. Bulletin American Ethnological Society I: 53–72.
Viegas Barros, José Pedro. 1990. Dialectología Qawasqar. Amerindia 15: 43–73.
Viegas Barros, José Pedro. 2005. Voces en el viento: Raíces lingüísticas de la Patagonia.
Buenos Aires: Ediciones Mondragon.
Viegas Barros, José Pedro. 2001. Evidencias de la relación genética lule-vilela. LIAMES
1: 107–126.
286 Frank Seifart and Harald Hammarström
Viegas Barros, José Pedro. 2004. Guaicurú no, macro-Guaicurú sí: Una hipótesis sobre
la clasificación de la lengua Guachí (Mato Grosso do Sul, Brasil). Buenos Aires:
CONICET – Instituto de Lingüística, Universidad de Buenos Aires. Manuscript.
Viegas Barros, José Pedro. 2006a. Reconstruyendo la morfosintaxis del proto-Chon.
Paper presented at the “Avances en Lingüística Histórico-Comparativa Aborigen
Sudamericana” en el 52º Congreso Internacional de Americanistas, Sevilla, 17–21 de
julio de 2006.
Viegas Barros, José Pedro. 2006b. Proto-Chon Cultural Reconstructions from the Voca
bulary. Paper presented at the Historical Linguistics and Hunter-Gatherer Populations
in Global Perspective, Workshop at the Max Planck Institute for Evolutionary Anthro-
pology, Leipzig 10–12/08/2006.
Vilera Díaz, Diana. 1985. Introducción morfológica de la lengua Hödi. Universidad Cen-
tral de Venezuela.
Vilera Díaz, Diana. 1987. Introducción a morphosintaxis de la lengua Hoti: el lexema
nominal. Boletín de lingüística 6: 79–99.
Villarejo, Avencio. 1959. Idiomas y dialectos antiguos y actuales. La selva y el hombre,
171–180. Lima: Editorial Ausonia.
Vilte Vilte, Julio. 2004. Diccionario Kunza-Español: Español-Kunza. Codelco Chile.
Voort, Hein van der. 2004. A Grammar of Kwaza. (Mouton Grammar Library 29). Berlin:
Mouton de Gruyter.
Yost, James A. 1981. Twenty Years of Contact: The Mechanisms of Change in Wao
(“Auca”) Culture. Cultural Transformations and Ethnicity in Modern Ecuador, ed. by
Norman E. Whitten Jr., 677–704. Chicago: University of Illinois Press.
Zamponi, Raoul. 2002. Notes on Betoi Verb Morphology. International Journal of Ame
rican Linguistics 68, no. 2: 216–241.
Zamponi, Raoul. 2003. Betoi. (Languages of the World/Materials 428). München:
Lincom.
Zamponi, Raoul. 2008. Sulla fonologia e la rappresentazione ortografica del lule. Arte y
vocabulario de la lengua Lule y Tonocoté, ed. by Antonio Maccioni, xxi–lviii. Cagliari:
Centro di Studi Filogici Sardi.
CHAPTER 11
LANGUAGE ISOLATES IN
THE NEW GUINEA REGION
Harald Hammarström
1 INTRODUCTION
The Greater New Guinea area holds a large number of language isolates, belonging to the
most diverse and isolate regions of the world (Table 11.1, using figures from Hammar-
ström et al. 2015). In the present understanding, as many as 55 languages in this region
are not demonstrably related to any other language. A much lower number of isolates for
New Guinea emerges from the overviews of Foley (2000), Ross (2006), Wurm (1982)
since these authors tend to give the benefit of the doubt in the opposite direction, or, in
the case of Wurm (1982), have far more generous criteria for considering languages to
be genealogically related (cf. Shafer 1965). Taking such, more ‘lumping’, views on the
grouping of the languages of the New Guinea area is not without reason. New Guinea
is the least studied region both in terms of documentation and genealogical relations
(Hammarström and Nordhoff 2012), and there is therefore the expectation that languages
which are not obviously related to their neighbours will prove to be so, once they are
better documented and their potential relations are studied more intensively. However,
empirical evidence from the Americas (Hammarström 2014) suggest that increased doc-
umentation and study does not necessarily lead to a drastically different understanding of
genealogical relations than that of an initial assessment based on the comparison of basic
vocabulary. For this reason we have chosen to adopt a rather strict criterion in the present
survey, whereby a language has to have a bone-fide demonstration of relatedness (cf.
Campbell and Poser 2008) with other language(s) for it not to be considered an isolate.
Every entry, however, does have an individual explanation of why is it considered an
isolate as well as a commentary on the possible genealogical links.
All the languages listed in the present survey have an attestation that exceeds at least
a wordlist of basic vocabulary (Tadmor et al. 2010).1 Usually a sociolinguistic survey
or a vocabulary comparison underlies the language/dialect divisions adopted here (cf.
Hammarström 2015) which determines whether a set of varieties count as an isolate or as
a small family of more than one language and hence not included in the present paper.2
The number of isolates, and the linguistic diversity more generally, has bewildered
every generation of Papuan language researchers. The classic view is that the diversity
is due to some combination of ancient settlement (49,000 years ago) see Summerhayes
et al. 2010 and geography (mountains, forests, swamps, etc.). This view is rarely artic-
ulated (but see Axelsen and Manrubia 2014, Gavin and Stepp 2014, Nettle 1999), and
explanatory models have yet to be worked out. A different view, argued for the New
Guinea area foremost by Laycock (1969, 1982a, b), who had considerable fieldwork
and surveying experience from the Sepik region, is that the key to the diversity lies
in a conscious ideology on the part of the speakers to keep and accentuate linguistic
288 Harald Hammarström
2 EAST NUSANTARA
2.2 Maybrat
Maybrat is spoken by a sizeable population (~ 20,000) in the central area of the Bird’s
Head of Indonesian Papua. The language has a divergent dialect known as Karon
Dori (Dol 2007:8), which is sometimes counted as a separate language. Maybrat is
described in a modern grammar (Dol 2007). The language has long been hypothe-
sized as belonging to a larger grouping in some constellation with other Bird’s Head
languages, but the lexical and grammatical evidence is insufficient for concluding a
genealogical relation (see Klamer and Holton in press and references therein). Like
many other Bird’s Head languages, Maybrat is a very isolating SVO language, and is
famous for lacking grammaticalized tense or aspect (Dahl 2001).
Language isolates in the New Guinea region 289
of Abinomn are the notes on the segmental inventory (Donohue 2007:529) and overt
dual marking on nouns (Donohue and Musgrave 2007:365–366) extracted from Mark
Donohue’s field notes.
1971:70–72, Burung 2000, Laycock 1977). All these researchers had difficulties eliciting
a full pronoun system for Elseng, leading Laycock (1977) to infer the lack of pronominal
distinctions beyond ‘me’ versus ‘the rest’, and this made its way into some secondary
literature as the smallest known pronoun inventory (e.g., Mühlhäusler 1990). However,
Mark Donohue (personal communication, 2008) was able to elicit a minimal-augmented
pronoun system for Elseng (cf. Harbour 2014:133–134). Only a modicum of data on
Elseng grammar is available, but this is enough to gauge that Elseng is an SOV language
(Voorhoeve 1971:72, Burung 2000).
isolate within his Tor-Lakes-Plain stock using unpublished lexical data (of which 40
words were later published). This classification has been retained in all later listings
(e.g. Lewis et al. 2015) except that the Lakes Plain languages were later excised (Clouse
1997), leaving Mawes remaining in a subfamily with Tor and Orya. To be a family-level
isolate (Voorhoeve 1975b:16) within the Tor-Lakes-Plain stock means that the language
“shares 12%-27% cognates on a 100-word list” with at least one other Tor-Lakes-Plain
language. However, the cognate identifications supporting this classification were never
published and fail to reproduce using modern lexical data (Foley in press-a). Indeed,
another independent count (Wambaliau 2006b) has Mawes cognate percentages never
exceeding 6% with any Tor language (nor with any other language in the immediate
region). Therefore, it seems best to consider Mawes an isolate until proven otherwise
(Hammarström 2010a).
A substantial wordlist was finally published in Smits and Voorhoeve (1994) of which
20 words (Galis 1955) and 40 words (Voorhoeve 1975b) had appeared before. An SIL
Indonesia survey report will include 250 words and 15 sentences (Wambaliau 2006b).
The sentences show SOV word order.
Though the speaker number is not low (ca. 850), Mawes is under pressure from Indo-
nesian and can be considered an endangered language (Wambaliau 2006b).
In a shifting situation like this, small language groups may be gradually assimilated
and disappear entirely. In the early 1950s Moraori (Boelaars 1950) of southern Irian
Jaya was spoken by only about forty people, and the tribe was surrounded by the
Language isolates in the New Guinea region 295
numerically much larger and culturally aggressive Marind tribe. All Moraori were
bilingual in Marind, and Marind influence on the language was extensive (Drabbe
1954). It is now likely that Moraori is extinct, or nearly so.
On the contrary, the language was not that quick to disappear. When Donohue (no
date:10) visited a decade later, he reported that 150 out of the 200 inhabitants of the vil-
lage knew the language and conjectured,
It might be that the very history of being surrounded by numerically superior out-
siders has made the Moraori more resistant to the sorts of cultural and linguistic
decay that now face all the ethnolinguistic groups in the Wasur national park region:
a long history of being in contact with a larger group has built in safeguards against
rapid assimilation, and has given them a strong sense of local identity that was less
essential in a large and culturally aggressive group with ties to many different areas.
Nevertheless, the language could ultimately not withstand the pressure, and by the time
of Arka’s (2012:151) fieldwork in 2008, the onset of broken transmission had finally come.
Presently, the village has 119 inhabitants, and while there are older fully fluent speakers,
young Maroris no longer actively speak their language, showing varying degrees of pas-
sive competence and shift to Indonesian/Malay and Marind (Arka 2012:151).
Wurm (1975a:327–335) classified Marori as part of his Trans-Fly stock, a subsec-
tion of Trans New Guinea including the neighbouring Kanum and Yei languages, on
the basis of lexicostatistical figures. The underlying data and cognate judgements were
never published, however, and newer independent assessments show much lower lexi-
costatistical figures with Kanum and Yei (Donohue no date:8). There is in fact a higher
number of (near-identical) matches with Marind, which are presumably loans, given the
sociolinguistic situation. There remains the possibility that Marori is remotely related to
the Trans New Guinea languages (Evans et al. in press), at least that is suggested in its
pronoun forms.
Fieldwork by I Wayan Arka has so far resulted in three publications with nuggets
of Marori grammar. Arka (2013) shows that nominals in Marori can take a completive
aspect clitic =on/=en to express a past property or relation. Marori shows a three-way
number system (singular, dual, and plural) where dual is expressed by a combination
of non-singular and non-plural morphology rather than by dedicated dual morphology
(Arka 2011). The argument marking system of Marori has an unusual combination in that
there is a clitic =i which marks patients, recipients as well as affected participants Arka
(2012:153–154).
between Dibiyaso and Kaluli. These contain a few fairly convincing comparisons where
Dibiyasu p corresponds to Kaluli f. The items in question are common to the entire Bosavi
Watershed group (not just Kaluli), but none are found in the Etoro-Bedamini group. This
suggests, that we are dealing with loans between Dibiyaso and the Bosavi watershed
group. Similarly, Turumsa and Dibiyaso are said to share as much as 19% lexicostatistical
similarity (Tupper 2007), but from a look at the items in question and the sociolinguistic
situation, we find a loan scenario preferable to a genealogical one.
No information on the grammar of Dibiyaso is available, and documentation is thus
imperative.
is a rational analysis of early Purari scripture materials furnished to him by Holmes, while
Holmes himself was struggling to fit Purari grammar into a Latinate frame. For exam-
ple, Holmes (1913:130) marvels at the lack of comparative and superlative constructions
of the kind he was used to seeing from European languages. Further documentation of
Purari grammar can be found in Kairi and Kolia (1977) and Dutton (1979), but despite
the long history of interaction, many aspects of Purari grammar remain to be described.
Purari has a small consonant inventory of only eight stops/liquids and two glides featured
in native words (Kairi and Kolia 1977:3, Dutton 1979:7–8). Like so many Papuan lan-
guages, Purari is SOV with postpositions and has a richer morphology for verbs than for
nouns (Dutton 1979:6–7).
A pidgin language used by the Purari for trading with the seaborn Austronesian-
speaking Motu is documented thanks to the efforts of Dutton (1979). The main lexifier
for the pidgin is Motu.
strongly favour a borrowing scenario. The so-called sound shifts alluded to by Franklin
(1995) are, in fact, perfectly predictable loan renderings given the phonemic systems
of Eleman (which has no n/l/r-phonemic distinction) and Kaki Ae (which has no t/k
distinction).
Clifton (1997) provides a sketch of Kaki Ae grammar. The verb agrees with the subject
and object in person and number (for third person singular objects agreement is optional),
and the subject may take ergative marking. The verb does not have dedicated tense mark-
ing, but a marker (labelled irrealis by Clifton 1997) is used under negation as well as for
future reference. The relative clause precedes the noun.
1P.SG no ’I’
2P.SG ne ’thou’
3P.SG one ’he/she/him’
1P.DU tota ’we two’
1P.PL toto ’we all’
2/3P.DU kita ’you two/they two’
2/3P.PL kiwi ’you all/they all’
5 SEPIK
journey to the Gapun village. When Höltker (1938) visited, he counted only 33 village
inhabitants. Laycock and Z’Graggen (1975:739) report 74 speakers. The detailed socio-
linguistic fieldwork by Kulick and Stroud (1990) counted exactly 89 fluent Taiap speak-
ers, all multilingual to various degrees, but already by then, no child under 10 had an
active command of Taiap.
Laycock and Z’Graggen (1975:757) classified Taiap into Laycock’s wide-ranging
Sepik-Ramu family. The evidence adduced was essentially typological, clearly insuffi-
cient for concluding a genealogical relation.
The sketch by Kulick and Stroud (1992) is an excellent summary of Taiap grammati-
cal features. Taiap is an ergative SOV language with postpositions. Relative clauses fol-
low the head noun. Only animate nouns take obligatory morphological number marking
(singular/dual/plural). Taiap nouns have a male/female gender distinction as revealed by
agreement.
Taiap distinguishes male/female speech in some aspects of its lexicon and verb mor-
phology (Kulick 1987:130).
The Asaba language is still being transmitted to children (Roger Lohmann, personal
communication, 2009).
(Conrad and Dye 1975:14, Aikhenvald 2008). No other argument for a Sepik affiliation
in offered (Laycock and Z’Graggen 1975:738), and Yerakai is not mentioned in Foley’s
re-consideration of the Sepik family (Foley 2005).
There must be an (unpublished) SIL wordlist of Yerakai underlying Conrad and Dye
(1975:14), and there are unpublished field notes by Laycock (no date). No data on Yer-
akai is published.
Doriot (1991) who trekked in parts of the Yetfa-speaking area in April-May 1991. Some-
time between the 14th edition of the Ethnologue (Grimes 2000) and the 15th (Gordon
2005), it was realised that Yetfa and Biksi are so close as to be regarded as one language.
Biksi (by implication Biksi-Yetfa) was placed in the putative Sepik language family
languages by Laycock and Z’Graggen (1975:740–741), and this has often been repeated
since (Lewis 2009). Biksi-Yetfa was not considered by Foley in his re-assessment of
the Sepik family for lack of data (Foley 2005:126–127). The lexical matches adduced
by Laycock to various Sepik languages are sporadic and look more like loans or chance
resemblances than the outcome of genetic inheritance (Hammarström 2008a). The lexical
relations were also investigated independently by Conrad and Dye (1975:19) who found
that Biksi shared no more than 4% probable cognates with any of the languages in the
vicinity to the east, including Abelam.10 (This lexical comparison includes numerals but
no demonstratives or pronouns.) Yetfa-Biksi also shows similarly low figures with lan-
guages neighbouring to the west such as Kimki (Kim 2006).
Scanty notes on grammar can be found in Laycock and Z’Graggen (1975:740–741),
and short wordlists are published in Conrad and Dye (1975), Laycock (1972). An unpub-
lished SIL Indonesia survey contains 250 Yefta words from 5 locations along with 15
sentences (Kim 2006). There are further unpublished wordlists from several locations
collected by Doriot (1991). The sentences of Kim (2006) show that Yetfa-Biksi is an
SOV language.
At this time, Yetfa is still being transmitted to children and so is not an endangered
language (Kim 2006).
5.11 Ap Ma [kbx]
Ap Ma, also known as Botin, Kambot, or Kambrambo, is spoken by some 10,000 people
in 15 villages scattered in the area south of the Sepik River between the Keram and Yuat
Rivers in East Sepik Province, Papua New Guinea. The Ap Ma people have long been
known to German missionaries operating in the Keram area (e.g., Speiser 1944).
Ap Ma shows some typological similarities to the Grass, Banaro, and Ramu languages,
but there is little lexical evidence for any possible genealogical relationship (Z’graggen
1969:168–169, Foley in press-b). For this reason, contra Foley (in press-b), we find insuf-
ficient grounds to classify Ap Ma as genealogically related to any or all of the mentioned
languages.
Published data on the Ap Ma language consists of a wordlist (Z’graggen 1972); two
articles on specific grammatical topics (Pryor and Farr 1989, Pryor 1990); and a fairly
long master’s thesis grammatical description (Wade 1984). Like many Papuan languages,
Ap Ma is an SOV language with medial verbs. The system of medial verbs is particularly
extensive in Ap Ma, realizing nine different tense/aspect relationships with either same
or different subject markers. Furthermore, different subject medial verbs in combination
with various particles realize negation.
6 EAST PAPUAN
Though there are some typological parallels with other non-Austronesian languages
of the East Papuan islands, there are insufficient grounds for concluding a genealogical
relationship for Kuot with any or all of them (Dunn et al. 2002).
The grammar of Kuot is relatively well known thanks to the work of Chung and Chung
(1996) and Lindström (2002). Kuot is famous for being the only non-Austronesian VSO
language of the entire New Guinea Area. Kuot also has other word order feature typically
associated with VSO languages such as prepositions, adjectives following the noun and
postposed relative clauses. Kuot nouns have a covert male/female gender distinction and
overt number marking distinguishing singular, dual, and plural.
Bilua is sometimes grouped with the other Central Solomons languages and beyond
(Wurm 1975b), but closer inspection shows that a genealogical relation is not demonstra-
ble (Dunn and Terrill 2012, Terrill 2011).
A grammar is available for Bilua (Obata 2003). Bilua is an SVO language with post-
positions and adjective-noun order.
dictionary (Henderson and Henderson 1987, 1999). A longer draft grammar is in prepa-
ration by Stephen Levinson, drawing on the earlier work by the Hendersons as well as
long-time fieldwork on the island. A number of papers on specialised topics have already
appeared (e.g., Levinson 2006).
Like many Papuan languages, Yele is an ergative SOV language with postpositions,
noun-adjective order, and postposed relative clauses. Beyond this, Yele has the reputation
of being an extraordinarily complex language, on many levels. The phoneme inventory
includes doubly articulated consonants and additionally distinguishes palatalized and
labialized variants. If analyzed as single segments, the total number of distinctive seg-
ments for Yele is over 90 – the largest phonemic inventory of any non-click language in
the world. The single-segment analysis is justified durationally, as the coarticulated seg-
ments are not different from simplex consonants, but they probably derive recently from
consonant clusters, as Yele otherwise has no consonant clusters (Levinson ms). Gram-
matical categories such as tense, aspect, mood, and person/number of subject and object
expressed in a huge inventory of portmanteau morphemes that are largely unsegmentable
into constituent morphemes. Verbs come in a multitude of irregular paradigms and many
verbs are suppletive across a range of categories (Levinson ms).
7 ACKNOWLEDGEMENTS
The underlying classification of Papuan languages has been elaborated during a long
period of time during which it has benefited from comments and discussions with Tim
Usher, Mark Donohue, Matthew Dryer, and Andrew Pawley. The usual disclaimers apply.
8 ABBREVIATIONS
U undergoer
AUX auxiliary
PST past tense
SPEC specific
ERG ergative
VISUAL visual evidential
SENSORY sensory evidential
RESULT resultative evidential
REASONING inferred evidential
REPORTED reported evidential
SRCE source
FOC focus
IMPER imperative
FPST far past
NOTES
1 Three unclear cases are worth noting. Fabritius (1855) encountered a tribe at the
mouth of Mamberamo and noted down two words (the numerals ‘one’ and ‘two’) of
their language. No language subsequently documented in that (or any other) area has
matching forms, but we exclude it from the present listing since it is clearly insuf-
ficiently attested. Abom is a moribund language encountered on a survey of Tirio
languages at the mouth of the Fly River. A 200-item wordlist was collected but it is
310 Harald Hammarström
difficult to know if some crucial lexical items (and some tiny details of grammar) are
inherited cognates with the Tirio languages or the reflection of language shift (Jore
and Alemãn 2002). The language of Kembra near the confluence of the Sobger and
Nawa Rivers is attested with only a short wordlist taken down in challenging circum-
stances (Doriot 1991). From this it may be guessed that the language is related to
Lepki and Murkim spoken further south.
2 The status of Kaure [nxu] and Narau [bpp] is worth noting. Indications from the
field suggest that the two are in fact intelligible varieties of the same language
(Dommel and Dommel 1991:1–3) but the region in question is poorly surveyed so
we have refrained from asserting that this is the case for the purposes of the present
chapter.
3 From the iso-639-3 code longer lists of alternative names can easily be retrieved.
4 Mor is not to be confused with the Austronesian language with the same name found
on the islands northeast of the city of Nabire (Kamholz 2014). The two languages are
too far away from each other to have had any direct interaction and the homophony
of the names is a coincidence.
5 Not to be confused with several other places and languages in Indonesian Papua also
called Tanahmerah (literally ‘brown earth’ in Malay).
6 There is now a full New Testament translation (Damal people and CMA 1988)
7 Doriot (1991) refers to an unpublished wordlist of Kimki from Mot, but Mot is listed
in survey maps as Murkim speaking (Wambaliau 2004).
8 Mawes spoken in Mawes Wares is not to be confused with the Wares [wai], once a
warlike tribe on the upper Biri River (Oosterwal 1961:26–27) that had to flee to the
coast from their original territory in the 1950s (Koentjaraningrat 1965:135–136).
9 I am indebted to Tim Usher for bringing to my attention how different Wiru actually
is from Engan.
10 The exact languages in question are Yerakai (0%), Chenapian (0%), Bahinemo (1%),
Washkuk (1%), Yessan-Mayo (4%), Abelam (1%), Namie (0%), Abau (0%), May
River Iwam (1%), Musan (0%), Amto (1%), Rocky Peak (0%), Ama (0%), Nimo
(1%), Bo (0%), Iteri (0%), Owiniga (2%), Woswari (0%), Walio (0%), Paupe (0%),
South Mianmin (0%), Nagatman (0%), Busan (1%), and Pyu (1%).
11 Thurston (1982) argues that this is probably the result of an originally Anem-like
population adopting an Austronesian language because many of the Austronesian
features in Lusi appear in a ‘simplified’ form.
REFERENCES
Aikhenvald, Alexandra Y. 2008. Language Contact along the Sepik River. Anthropolog-
ical Linguistics 50: 1–66.
Anceaux, Johannes Cornelis. 1958. Languages of the Bomberai Peninsula: Outline of a
Linguistic Map. Nieuw-Guinea Studiën 2: 109–121.
Arka, I. Wayan. 2011. Constructive Number Systems in Marori and Beyond. The Pro-
ceedings of the International Lexical Functional Grammar (LFG2011) Conference,
University of Hong Kong 19 July 2011, ed. by Miriam Butt and Tracy Holloway King,
5–25. Stanford, CA: CSLI Publications.
Arka, I Wayan. 2012. Projecting Morphology and Agreement in Marori, an Isolate of
Southern New Guinea. Melanesian Languages on the Edge of Asia: Challenges for
the 21st Century, Nicholas Evans and Marian Klamer (Language Documentation and
Conservation Special Publication 5), 150–173. Honolulu: University of Hawaii Press.
Language isolates in the New Guinea region 311
Arka, I Wayan. 2013. Nominal Aspect in Marori. Proceedings of the LFG13 Confer-
ence, ed. by Miriam Butt and Tracy Holloway King, 27–47. Stanford, CA: CSLI
Publications.
Austen, Leo. 1921a. Vocabularies Daru Station, Western Division: Name of Tribe,
Tapapi, Names of Villages, Ubaroniara, Bogabwi. British New Guinea Annual Report
1919–1920: 122–122.
Austen, Leo. 1921b. Vocabularies Daru Station, Western Division: Names of Villages,
Hibaradai, Madawai, and Eriga. British New Guinea Annual Report 1919–1920:
122–122.
Austen, Leo. 1921c. Vocabularies Daru W.D. Station, Western Division: Name of Tribe,
Hiwi, Name of Villages, Kaibenapi, Genapi, Wagumi, Bobonapi, Wariadai and Sarau.
British New Guinea Annual Report 1919–1920: 123–123.
Axelsen, Jacob Bock and Susanna Manrubia. 2014. River Density and Landscape Rough-
ness Are Universal Determinants of Linguistic Diversity. Proceedings of the Royal
Society of London B: Biological Sciences 281(1784): 1–9.
Baron, Wietze. 1983. Kwomtari Survey. Unpublished manuscript, SIL Survey office,
Ukarumpa, now posted at www.kwomtari.net/kwomtari_survey.pdf (15 December
2008).
Berry, Keith and Christine Berry. 1999. A Description of Abun: A West Papuan Language
of Irian Jaya (Pacific Linguistics: Series B 115). Canberra: Research School of Pacific
and Asian Studies, Australian National University.
Bevan, Theodore F. 1890. Toil, Travel, and Discovery in British New Guinea. London:
Kegan Paul, Trench, Trubner.
Boelaars, J.H.M.C. 1950. The Linguistic Position of South-Western New Guinea. Leiden:
E. J. Brill.
Brown, Herbert A. 1973. The Eleman Language Family. The Linguistic Situation in the
Gulf District and Adjacent Areas, Papua New Guinea (Pacific Linguistics: Series
C 26), ed. by Karl J. Franklin, 281–376. Canberra: Research School of Pacific and
Asian Studies, Australian National University.
Brown, Loo N. 1921. Vocabularies Kikori Staion, Delta Division: Name of Tribe, Aurama,
Name of Village, Uo-Ho. British New Guinea Annual Report 1919–1920: 124–124.
Burung, Wiem. 2000. A Brief Note on Elseng. SIL International, Dallas. SIL Electronic
Survey Reports 2000–2001. www.sil.org/silesr/ abstract.asp?ref=2000-001.
Campbell, Carl and Jody Campbell. 1987. Yade Grammar Essentials. Ukarumpa: Unpub-
lished Manuscript, Summer Institute of Linguistics.
Campbell, Lyle and William J. Poser. 2008. Language Classification: History and
Method. Cambridge: Cambridge University Press.
Carter, John, Katie Carter, Bonnie MacKenzie, Janell Masters, Brian Paris, and Hannah
Paris. 2012. A Sociolinguistic Survey of Anem (SIL Electronic Survey Reports 2012–
2041). SIL International.
Chung, Kyung-Ja and Chul-Hwa Chung. 1996. Kuot Grammar Essentials. Two Non-
Austronesian Grammars from the Islands (Data Papers on Papua New Guinea Lan-
guages 42), ed. by John M. Clifton, 1–75. Ukarumpa: Summer Institute of Linguistics.
Clancy, D.J. 1962. Through the Strickland Gorge. Australian Territories 2, no. 1, 12–19.
Clifton, John M. 1994. Stable Multilingualism in a Small Language Group: The Case of
Kaki Ae. Language and Linguistics in Melanesia 25, no. 2: 107–24.
Clifton, John M. 1995. A Grammar Sketch of the Kaki Ae Language. University of North
Dakota Session (Work Papers of the Summer Institute of Linguistics 39), 33–80. Grand
Forks, North Dakota: Summer Institute of Linguistics.
312 Harald Hammarström
Dunn, Michael and Angela Terrill. 2012. Assessing the evidence for a Central Solomons
Papuan family using the Oswalt Monte Carlo Test. Diachronica 29, no. 1: 1–27.
Dunn, Michael, Ger Reesink, and Angela Terrill. 2002. The East Papuan Languages:
A Preliminary Typological Appraisal. Oceanic Linguistics 41, no. 1: 28–62.
Dunn, Michael, Stephen C. Levinson, Eva Lindström, Ger Reesink and Angela Terrill.
2005a. Structural Phylogeny in Historical Linguistics: Methodological Explorations
Applied in Island Melanesia. Language 84, no. 4: 710–759.
Dunn, Michael, Angela Terrill, Ger Reesink, Robert A. Foley and Stephen C. Levinson.
2005b. Structural Phylogenetics and the Reconstruction of Ancient Language History.
Science 309: 2072–2075.
Dutton, Tom. 1979. Simplified Koriki: A second trade language used by the Motu in the
Gulf of Papua. Kivung 12, no. 1: 3–73.
Dye, Wayne, Patricia Townsend and W. Townsend. 1968. The Sepik Hill Languages:
A Preliminary Report. Oceania 39, 146–156.
Evans, Nicholas, Wayan Arka, Matthew Carroll, Yun Jung Choi, Christian Döhler,
Volker Gast, Eri Kashima, Emil Mittag, Bruno Olsson, Kyla Quinn, Dineke Schok-
kin, Philip Tama, Charlotte van Tongeren, and Jeff Siegel. In press. The Languages
of Southern New Guinea. Bill Palmer (ed.), Papuan Languages and Linguistics. Ber-
lin: Mouton.
Fabritius, G. J. 1855. Anteekeningen omtrent Nieuw-Guinea. Tijdschrift voor Indische
Taal-, Land- en Volkenkunde IV. 209–215.
Feuilleteau de Bruyn, W.K.H., J.V.L. Opperman, L. Doorman, and J. Th. Stroeve. 1915.
Ethnographische gegevens betreffende de inboorlingen in het stroomgebied van de
Mamberamo. Tijdschrift van het Koninklijk Aardrijkskundig Genootschap 32: 655–672.
Flint, L.A. 1917–1918. Vocabulary: Name of Tribe, Karami. People. Name of Village,
Kikimairi and Aduahai. Commonwealth of Australia. Papua: Annual Report for the
Year 1917–1918, 96–96.
Foley, William A. 1986. The Papuan Languages of New Guinea (Cambridge language
surveys). Cambridge: Cambridge University Press.
Foley, William A. 2000. The Languages of New Guinea. Annual Review of Anthropology
29, no. 1: 357–404.
Foley, William A. 2005. Linguistic Prehistory in the Sepik-Ramu Basin. Papuan Pasts:
Studies in the Cultural, Linguistic and Biological History of the Papuan-Speaking
Peoples (Pacific Linguistics 572), ed. by Andrew Pawley, Robert Attenborough, Jack
Golson, and Robin Hide, 109–144. Canberra: Research School of Pacific and Asian
Studies, Australian National University.
Foley, William A. In press-a. The languages of Northwest New Guinea. Papuan Lan-
guages and Linguistics, ed. by Bill Palmer. Berlin: Mouton.
Foley, William A. In press-b. The Languages of the Sepik. Papuan Languages and Lin-
guistics, ed. by Bill Palmer. Berlin: Mouton.
Frahm, Roxanne Margaret. 1998. Baniata Serial Verb Constructions. MA thesis, Univer-
sity of Auckland.
Franklin, Karl J. 1968. Languages of the Gulf District: A Preview. Papers in New Guinea.
Linguistics No. 8 (Pacific Linguistics: Series A 16), 19–44. Canberra: Research School
of Pacific and Asian Studies, Australian National University.
Franklin, Karl J. 1973a. Appendices. The Linguistic Situation in the Gulf District and
Adjacent Areas, Papua New Guinea (Pacific Linguistics: Series C 26), ed. by Karl J.
Franklin, 539–592. Canberra: Research School of Pacific and Asian Studies, Austra-
lian National University.
314 Harald Hammarström
Franklin, Karl J. 1973b. Other Language Groups in the Gulf District and Adjacent Areas.
The Linguistic Situation in the Gulf District and Adjacent Areas, Papua New Guinea
(Pacific Linguistics: Series C 26), ed. by Karl J. Franklin, 263–277. Canberra: Research
School of Pacific and Asian Studies, Australian National University.
Franklin, Karl J. 1975a. Comments on Proto-Engan. New Guinea Area Languages and
Language Study Vol 1: Papuan Languages and the New Guinea Linguistic Scene
(Pacific Linguistics: Series C 38), ed. by Stephen A. Wurm, 263–276. Canberra:
Research School of Pacific and Asian Studies, Australian National University.
Franklin, Karl J. 1975b. Isolates: Gulf District. New Guinea Area Languages and Lan-
guage Study Vol 1: Papuan Languages and the New Guinea linguistic scene (Pacific
Linguistics: Series C 38), ed. by Stephen A. Wurm, 891–896. Canberra: Research
School of Pacific and Asian Studies, Australian National University.
Franklin, Karl J. 1995. Some further comments on Kaki Ae. Language and Linguistics in
Melanesia 26: 195–198.
Franklin, Karl J. 2001. Kutubuan (Foe and Fasu) and Proto Engan. The Boy from
Bundaberg: Studies in Melanesian Linguistics in Honour of Tom Dutton (Pacific
Linguistics 514), ed. by Andrew Pawley, Malcolm Ross and Darrell Tryon, 143–
154. Canberra: Research School of Pacific and Asian Studies, Australian National
University.
Franklin, Karl J. and C.L. Voorhoeve. 1973. Languages Near the Intersection of the Gulf,
Southern Highlands and Western Districts. The Linguistic Situation in the Gulf District
and Adjacent Areas, Papua New Guinea (Pacific Linguistics: Series C 26), ed. by Karl
J. Franklin, 149–186. Canberra: Research School of Pacific and Asian Studies, Austra-
lian National University.
Galis, Klaas Wilhelm. 1955. Talen en dialecten van Nederlands Nieuw-Guinea. Tijd-
schrift Nieuw-Guinea 16: 109–118, 134–145, 161–178.
Galis, Klaas Wilhelm. 1956. Ethnologische Survey van het Jafi-district (Onderafdeling
Hollandia) volume 102. Hollandia (Jayapura): Gouvernement van Nederlands Nieuw-
Guinea, Kantoor voor Bevolkingszaken.
Gavin, Michael C. and John Richard Stepp. 2014. Rapoport’s Rule Revisited: Geograph-
ical Distributions of Human Languages. PLoS One 9, no. 9: e107623. 1–8.
Gordon, Raymond G. Jr. 2005. Ethnologue: Languages of the World (15th ed.). Dallas:
SIL International.
Grace, George W. 1956. 1955–1956 Fieldnotes: Notebook 47. Ms.
Graham, Glenn H. 1981. A Sociolinguistic Survey of Busa and Nagatman. Sociolinguis-
tic Surveys of Sepik Languages (Workpapers in Papua New Guinea Languages 29), ed.
by Richard Loving, 177–192. Ukarumpa: Summer Institute of Linguistics.
Grimes, Barbara F. (ed.). 2000. Ethnologue: Languages of the World (14th ed.). Dallas:
SIL International.
Hammarström, Harald. 2008a. A Reclassification of Some West Papua Languages.
Paper Presented at the International Workshop on Minority Languages in the Malay/
Indonesian Speaking World, 28 June 2008 Leiden, The Netherlands.
Hammarström, Harald. 2008b. Two Hitherto Unnoticed Languages from Sobger River,
West Papua, Indonesia. Submitted.
Hammarström, Harald. 2010a. The Genetic Position of the Mawes Language. Paper pre-
sented at the Workshop on the Languages of Papua 2, 8–12 February 2010, Manok-
wari, Indonesia.
Hammarström, Harald. 2010b. The Status of the Least Documented Language Families
in the World. Language Documentation and Conservation 4: 177–212.
Language isolates in the New Guinea region 315
Laycock, Don. 1972. Looking Westward: Work of the Australian National University on
Languages of West Irian. Irian 1(2): 68–77.
Laycock, Donald C. 1969. Melanesia Has a Fourth of the World’s Linguistic Diversity.
Pacific Islands Monthly 9: 71–76.
Laycock, Donald C. 1973. Sepik Languages: Checklist and Preliminary Classification
(Pacific Linguistics: Series B 25). Canberra: Research School of Pacific and Asian
Studies, Australian National University.
Laycock, Donald C. 1975a. Isolates: Sepik Region. New Guinea Area Languages and Lan-
guage Study Vol 1: Papuan Languages and the New Guinea Linguistic Scene (Pacific
Linguistics: Series C 38), ed. by Stephen A. Wurm, 879–886. Canberra: Research
School of Pacific and Asian Studies, Australian National University.
Laycock, Donald C. 1975b. Sko, Kwomtari and Left May (Arai) Phyla. New Guinea Area
Languages and Language Study Vol 1: Papuan Languages and the New Guinea linguis-
tic scene (Pacific Linguistics: Series C 38), ed. by Stephen A. Wurm, 849–858. Can-
berra: Research School of Pacific and Asian Studies, Australian National University.
Laycock, Donald C. 1977. Me and You Versus the Rest: Abbreviated Pronoun Systems in
Irianese/Papuan Language. Irian 7: 33–41.
Laycock, Donald C. 1982a. Linguistic Diversity in Melanesia: A Tentative Explanation.
GAVA’: Studien zu austronesischen Sprachen und Kulturen Hans Kähler gewidmet, ed.
by Rainer Carle, 31–37. Berlin: Reimer.
Laycock, Donald C. 1982b. Melanesian Linguistic Diversity: A Melanesian Choice?
Melanesia: Beyond Diversity, vol. I, R.J. May and Hank Nelson, 33–38. Canberra:
Research School of Pacific Studies, Australian National University.
Laycock, Donald C. and John A. Z’Graggen. 1975. The Sepik-Ramu Phylum. New
Guinea Area Languages and Language Study Vol 1: Papuan Languages and the New
Guinea Linguistic Scene (Pacific Linguistics: Series C 38), ed. by Stephen A. Wurm,
731–764. Canberra: Research School of Pacific and Asian Studies, Australian National
University.
Laycock, Donald C. (no date). Notebook D28. Ms.
Le Roux, C.C.F.M. 1950. Alphabetische woordenlijst: Dèm – Nederlands, Zinnen en uit-
drukkingen: Nederlands – Dèm. De Bergpapoea’s van Nieuw-Guinea en hun Woonge-
bied, vol. II, 852–862, 892–895. Leiden: E. J. Brill.
Le Roux, C.C.F.M. 1950a. 15: Tellen en Rekenen – Maten – Tijdrekening – Windstreken –
Kennis van Kleuren – Gebarentaal. De Bergpapoea’s van Nieuw-Guinea en hun Woonge-
bied, vol. II, 528–553. Leiden: E. J. Brill.
Le Roux, C.C.F.M. 1950b. 25: Taalkundige Gegevens. De Bergpapoea’s van Nieuw-
Guinea en hun Woongebied, vol. II, 776–900. Leiden: E. J. Brill.
Lean, Glendon A. 1986. Sandaun Province (Counting Systems of Papua New Guinea 7).
Port Moresby: Papua New Guinea University of Technology. Draft Edition.
Levinson, Stephen C. 2006. The language of space in Yélî Dnve. Grammars of Space:
Explorations in Cognitive Diversity, ed. by Stephen C. Levinson and David P. Wilkins,
157–203. Cambridge: Cambridge University Press.
Levinson, Stephen C. (ms) A grammar of Yele.
Lewis, Paul M. 2009. Ethnologue: Languages of the World (16th ed.). Dallas: SIL
International.
Lewis, Paul M., Gary F. Simons, and Charles D. Fennig. 2015. Ethnologue: Languages
of the World (18th ed.). Dallas: SIL International.
Lindström, Eva. 2002. Topics in the Grammar of Kuot. Doctoral dissertation, Stockholm
University.
318 Harald Hammarström
Pryor, John. 1990. Deixis and Participant Tracking in Botin. Language and Linguistics
in Melanesia 21: 1–29.
Rascher, Matthias. 1903–1904. Die Sulka: Ein Beitrag zur Ethnographie von Neu-
Pommern. Archiv für Anthropologic, N. F. 1: 209–235.
Ray, Sidney H. 1907. Grammar Notes on the Namau Language Spoken in the Purari
Delta. Linguistics (Reports of the Cambridge Anthropological Expedition to Torres
Straits III), ed. by Sidney H. Ray, 325–332. Cambridge: Cambridge University Press.
Ray, Sidney H. 1919. A New Linguistic Family. The Bible in the World 15: 149–150.
Ray, Sidney H. 1927. The Non-Melanesian Languages of the Solomon Islands. Fest-
schrift Publication d’hommage of ferte au P. W. Schmidt, ed. by Wilhelm Koppers,
123–126. Vienna: Mechitharisten-Congregations- Buchdruckerei.
Reesink, Ger. 2005. Sulka of East New Britain: A Mixture of Oceanic and Papuan Traits.
Oceanic Linguistics 44, no. 1: 145–193.
Reesink, Ger P. 1976. Languages of the Aramia River Area. Papers in New Guinea Lin-
guistics 19 (Pacific Linguistics: Series A 45), 1–37. Canberra: Research School of
Pacific and Asian Studies, Australian National University.
Rentoul, Alex C. 1924. Vocabulary of Words Obtained from a Native of the Dibiasu
Tribe Living in the Country in the Vicinity of the Upper Bamu (or Woi-Woi) Western
Division. Commonwealth of Australia. Papua: Annual Report for the Year 1924,
74–74.
Robide van der Aa, Pieter Jan Baptist Carel. 1879. Reizen naar Nederlandsch Nieuw-
Guinea ondernomen op last der Regeering van Nederlandsche Indie in de jaren 1871,
1872, 1875–1876 door de Heeren P. van Crab en J.E. Teysmann, J.G. Coornengel, A.J.
Langeveldt van Hemert en P. Swaan. The Hague: Martinus Nijhoff.
Ross, Malcolm. 2001. Is There an East Papuan Phylum? Evidence from Pronouns. The
Boy from Bundaberg: Studies in Melanesian Linguistics in Honour of Tom Dutton
(Pacific Linguistics 514), ed. by Andrew Pawley, Malcolm Ross, and Darrell Tryon,
301–321. Canberra: Research School of Pacific and Asian Studies, Australian National
University.
Ross, Malcolm. 2006. Clues to the Linguistic Situation in Near Oceania before Agricul-
ture. Paper for presentation at the symposium Historical linguistics and hunter-gatherer
populations in global perspective, MPI-EVA Leipzig, 10–12 August 2006.
Ross, Malcolm D. 2005. Pronouns as a Preliminary Diagnostic for Grouping Papuan
Languages. Papuan Pasts: Studies in the Cultural, Linguistic and Biological History of
the Papuan-Speaking Peoples (Pacific Linguistics 572), ed. by Andrew Pawley, Robert
Attenborough, Jack Golson, and Robin Hide, 15–66. Canberra: Research School of
Pacific and Asian Studies, Australian National University.
Routamaa, Judy. 1994. Kamula Grammar Essentials. Ms. www.sil.org/pacific/png/
abstract.asp?id=50209 (1 August 2008).
Rumaropen, Benny. 2004. Draft Survei Sosiolinguistik pada ragam Bahasa Kimki di
Bagian Tenggara Gunung Ji, Papua, Indonesia. To appear in the SIL Electronic Survey
Reports.
Rumaropen, Benny. 2005. Laporan sosiolinguistik bahasa Poulle di Kampung Molof dan
Waley, Kabupaten Keerom, Papua, Indonesia. To appear in the SIL Electronic Survey
Reports.
Rumaropen, Benny. 2006. Draft Survey Report on the Kapauri Language of Papua. To
appear in the SIL Electronic Survey Reports.
San Roque, Lila. 2008. An Introduction to Duna Grammar. Doctoral dissertation, Aus-
tralian National University.
320 Harald Hammarström
San Roque, Lila and Robyn Loughnane. 2012. Inheritance, Contact and Change in the
New Guinea Highlands Evidentialitv Area. History, contact and classification of Pap-
uan languages (LLM Special Issue 2012), ed. by Harald Hammarström and Wilco van
den Heuvel, 387–427. Port Moresby: Linguistic Society of Papua New Guinea.
Schellong, Otto. 1890. Weitere Wörterverzeichnisse (Einzelbeiträge zur allgemeinen und
vergleichenden Sprachwissenschaft 7). Leipzig: Wilhelm Friedrich.
Schlatter, Tim. 2003. Tabo Language Grammar Sketch (Aramia River). Ms.
Schmidt, Wilhelm. 1904. Eine Papuasprache auf Neupommern. Globus LXXXVI: 79–80.
Schneider, Joseph. 1962. Grammatik der Sulka-Sprache (Neubritannien) (Micro-Biblioteca
Anthropos 36). Posieux: Anthropos Institut.
Shafer, Robert. 1965. Was New Guinea the Graveyard of 100 South Asian and Pacific
Cultures? Orbis 14, no. 2: 312–385.
Shaw, Daniel R. 1973. A Tentative Classification of the Languages of the Mt. Bosavi
Region. The Linguistic Situation in the Gulf District and Adjacent Areas, Papua New
Guinea (Pacific Linguistics: Series C 26), ed. by Karl J. Franklin, 189–215. Canberra:
Research School of Pacific and Asian Studies, Australian National University.
Sillitoe, Paul. 1993. The Bogaia of the Muller Ranges, Papua New Guinea. Sydney:
University of Sydney.
Silzer, Peter J. and Heljä Heikkinen. 1984. Index of Irian Jaya Languages. Irian XII:
1–124.
Silzer, Peter J. and Heljä Heikkinen-Clouse. 1991. Index of Irian Jaya Languages (Spe-
cial Issue of Irian: Bulletin of Irian Jaya) (2nd ed.). Jayapura: Program Kerjasama
Universitas Cenderawasih and SIL.
Smits, Leo and C.L. Voorhoeve. 1994. The J. C. Anceaux Collection of Wordlists of Irian
Jaya Languages B: Non-Austronesian (Papuan) Languages (Part I) (Irian Jaya Source
Material No. 9 Series B 3). Leiden-Jakarta: DSAL-CUL/IRIS.
Smits, Leo and C.L. Voorhoeve. 1998. The J. C. Anceaux Collection of Wordlists of
Irian Jaya Languages B: Non-Austronesian (Papuan) Languages (Part II) (Irian Jaya
Source Material No. 10 Series B 4). Leiden-Jakarta: DSAL-CUL/IRIS.
Speiser, Felix. 1944. Eine Initiationszeremonie in Kambrambo am Sepik Neuguinea. Eth-
nologischer Anzeiger 4: 153–157.
Stebbins, Tonya N. 2009. The Papuan Languages of the Eastern Bismarcks: Migration,
Origins and Connections. Discovering History Through Language: Papers in Honour
of Malcolm Ross (Pacific Linguistics 605), ed. by Bethwvn Evans, 223–243. Canberra:
Research School of Pacific and Asian Studies, Australian National University.
Strong, Marsh W. 1911. Note on the Tate Language of British New Guinea. Man 11,
no. 101: 178–181.
Summerhayes, Glenn R., Andrew Fairbairn, Matthew Leavesley, Herman Mandui, Judith
Field, Anne Ford, and Richard Fullagar. 2010. Human Adaptation and Plant Use in
Highland New Guinea 49,000 to 44,000 Years Ago. Science 330, no. 6000: 78–81.
Tadmor, Uri, Martin Haspelmath, and Bradley Taylor. 2010. Borrowability and the
Notion of Basic Vocabulary. Diachronica 27, no. 2: 226–246.
Terrill, Angela. 2003. A Grammar of Lavukaleve (Mouton Grammar Library 30). Berlin:
Mouton de Gruyter.
Terrill, Angela. 2011. Languages in Contact: An Exploration of Stability and Change in
the Solomon Islands. Oceanic Linguistics 50, no. 2: 312–337.
Terrill, Angela and Michael Dunn. 2003. Orthographic Design in the Solomon Islands:
The Social, Historical, and Linguistic Situation of Touo (Baniata). Written Language
and Literacy 6, no. 2: 177–192.
Language isolates in the New Guinea region 321
LANGUAGE ISOLATES
OF AUSTRALIA
Claire Bowern
1 INTRODUCTION
Language isolates are languages with no demonstrable relationship to other languages
(Campbell, this volume), in effect, language families with a single member. Under
this definition, strictly applied, there are six language isolates in Australia. However,
the languages of Australia have long been assumed to be all ultimately related to one
another (Wurm 1972; Dixon 1980) at some level of remoteness. Indeed, one might argue
that one of the recurrent “tropes” of work on Australian Aboriginal languages is not
their diversity but the insistence on their similarity, either through shared inheritance
or through longstanding language contact. It is not difficult to find statements that a
particular language is “typical” of the languages in the region or continent. Compare,
for example, Goddard (1985:167), Dixon (1977:1), Butcher (2006:187), and Tsunoda
(2012:1), amongst others.
Despite these assumptions of large-scale similarity, it has not always been the case that
all the languages of Australia were assumed to be related, even within the Pama-Nyun-
gan family. Statements of this type are not hard to find in the literature. Dixon (1976:2)
writes “It has long been believed (although, of course, it has not yet been proved) that
all or most Australian languages are genetically related.” Here, Dixon mentions two lan-
guages that were treated as possible exceptions: Mbabaram and Anewan (also known as
Nganyaaywana). Harvey’s (2001) discussion of relations in the Darwin region, discussed
in Section 2.5, also explicitly presupposes a single family of Australian languages. Aus-
tralian languages are thus a good illustration of Campbell’s point in the introduction to
this volume that there is nothing linguistically special about language isolates, since their
status as isolates may change as our knowledge of language history changes. Overview
and discussion of those languages, particularly Mbabaram, Anewan (Nganyaaywana),
Tiwi, and Anindilyakwa, are provided in Section 2. I also discuss the languages of Tas-
mania in Section 3, since they have sometimes been treated as a single language isolate.
Another striking feature of the Australian continent is the number of primary subgroups
in the major family of the area (Pama-Nyungan; Bowern and Atkinson 2012; Bowern and
Koch 2004), including languages, such as Warumungu, Anewan, and the Torres Strait
language, where the most recent common ancestor is Proto-Pama-Nyungan. That is, they
do not seem to share innovations with any intermediate proto-language. They might be
considered “subgroup-level isolates”. I review a few of these cases in Section 4.
The degree of language attrition and extinction is such in Australia that there are far
fewer languages currently spoken than at the time of European settlement (beginning in
the late 18th century). This has produced quite a few modern isolates; that is, languages
where all known relatives have ceased to be spoken.
324 Claire Bowern
the 20th century.) In the interests of space, I concentrate here primarily on the languages
repeatedly identified as isolates across classifications and on the languages that have fea-
tured in discussions about the universal relatedness of Australian languages.
Several languages of off-shore islands, particularly Tiwi (Bathurst and Melville
Islands) and Anindilyakwa (Groote Eylandt), are very divergent from their nearest geo-
graphic neighbors on the mainland. It is often assumed that long periods of diversification
in isolation have led to languages that share very few features with their neighbors.
Just as some putative isolates elsewhere in the world have been shown to be members
of small families (compare the case of Japanese ~ Japonic discussed in Campbell, this
volume), so too in Australia, the “isolate” status of some languages depends on the anal-
ysis of them as being either a single language with internal dialect diversity, or a cluster
of closely related languages. In all too many cases, there is only one well-documented
language in the family. For example, Wardaman forms a cluster with Yangman and Dag-
oman (Merlan 1994), but the materials for the latter two varieties are so sparse that we
cannot tell how distinct they are from each other, or from Wardaman. An example of a
case where an isolate has been shown to be, in fact, a cluster of related languages is Giim-
biyu. Treated as a single language in Dixon (2002), a recent description (Birch 2006)
shows that there are three languages – Urninangk, Erre, and Mengerre(dji) – that make up
the subgroup. Harvey (2002) treats these as dialects of a single language but on the basis
of admittedly small amounts of material. At no point in the language descriptions do the
authors consider relationships outside this family.
There are also cases of languages which are too poorly described to be classified with
certainty at this point, and some known by name only. These include the Minkin language
of Bourketown (discussed by Evans (1990)), where evidence points to some relationship
with Tangkic, but a remote one at best.
I review below the classifications of each of the languages that Campbell (2013:168)
lists as isolates in Australia. All the languages except for Tiwi and Anindilyakwa are now
extinct. All are found in the Top End region of Australia’s Northern Territory. Map 12.1
gives the region and languages found there; isolates are highlighted.
2.1 Anindilyakwa/Enindhilyakwa
Anindilyakwa (also spelled Enindhilyakwa) is spoken on Groote Eylandt. It has approx-
imately 1300 speakers, according to the Australian national census from 2006. This
number is supported by estimates in van Egmond (2012). The ISO-639 code is aoi; the
Glottolog identification number is anin1240, and the AIATSIS code is N151.
Anindilyakwa is claimed to be an isolate by Rademaker (2014), following O’Grady
et al. (1966). Evans (2005:250) argues that, although there are some similarities between
the language and Wubuy (Heath 1984), there is sufficiently little shared vocabulary that
the language should be considered an isolate, at least at the language family level (that is,
a language family comprising a single language, but perhaps related to other Australian
languages at a more remote level).
Van Egmond (2012) reviews the evidence and suggests, contra previous claims, that
Anindilyakwa is fairly closely related to other Gunwinyguan languages of the area, includ-
ing Wubuy. The linguistic evidence includes partly typological information, including
the presence of lamino-dentals (van Egmond 2012:21, 318). Lexical cognates include –
m+ad̪ aŋkwa ‘flesh’ (< Proto-Gunwinyguan *d̪aŋku ‘meat’ (Harvey 2003a)), – m+akuʎa
‘skin’ < *kuɭak, and ʎang ‘head’ (incorporated element, not the free form) < *Long ~
rong (L is a lateral unspecified for retroflection). Van Egmond (2012:313) suggests that
Pama-Nyungan
Language boundary
Dialect boundary
Pama-Nyungan
about 40% of Anindilyakwa’s basic vocabulary is cognate with items in Wubuy. The
sound correspondences presented by van Egmond (2012) are regular and are attested in
lexical items from a variety of semantic fields (though concentrated in basic vocabulary).
Moreover, Wubuy and Anindilyakwa show a few shared innovations in sound change,
including Proto-Gunwinyguan *d̪ > l̪ . Wubuy and Anindilyakwa share a number of com-
plex predicates, as presented in Table 12.1 (from van Egmond 2012:315ff).
Van Egmond (2012:348ff) also presents evidence that shows that Wubuy, Ngandi
(another Gunwinyguan language), and Anindilyakwa have matches in verb conjugation
membership; that is, the putatively cognate verb roots belong to parallel conjugation
classes, further indicative of genetic relationships.
The linguistic evidence for genetic relationships is corroborated by the archaeological
evidence, which implies that settlement on Groote Eylandt was sporadic after 4000 BP
and increased only about 1300 years ago. This would imply that there was relatively little
time for linguistic diversity between Groote Eylandt and the mainland to accrue and that
it is plausible that Anindilyakwa’s closest linguistic relatives may still be represented in
the region. Note that the composition of the Gunwinyguan family itself is controversial,
with Alpher, Evans, and Harvey’s (2003) and Harvey’s (2003a) proposals being incom-
patible with Green’s (2003) proposal for a macro-family of the Non-Pama-Nyungan lan-
guages of Arnhem Land. Green’s proposal, however, includes only some of the languages
traditionally classed as Gunwinyguan.
Whether or not one treats Anindilyakwa as an isolate or member of the Gunwinyguan
family, the language has a number of interesting typological features and puzzles for lin-
guistic analysis. Anindilyakwa’s phoneme inventory is given in Table 12.2. The different
treatments of the language vary extensively in the contrastive segments they recognize;
Manner of Stop p t ʈ t̪ c k kʷ
articulation Nasal m n ɳ n̪ ɲ ŋ ŋʷ
Lateral l ɭ l̪ ʎ
Vibrant ɾ
Glide ɹ j w
Nasal + stop mp nt ɳʈ n̪ t̪ ɲc ŋk ŋkʷ
Complex kp
segments ŋp
ŋm
a thorough summary is given in van Egmond (2012). Previous treatments recognize one,
two, or four vowels; in the last case, the qualities of the vowels also vary. Van Egmond
argues for an asymmetric, four-way vowel distinction between i, ɛ, ə, and a (that is, with
no phonemic rounded vowels in the language). Stokes (1981:154) also finds four pho-
nemes (i, u, ɛ, and a); Leeding (1989) argues for just ɨ and a as contrastive vowels. The
crux of the difficulty lies in the extent to which vowels and consonants are coarticulated
and thus the degree to which one analyzes distinctive features on the consonant, when
they are also (or even primarily) realized on the following vowel.
Anindilyakwa, like other Non-Pama-Nyungan languages of the region (including lan-
guages of the Gunwinyguan family), is morphologically complex. It shows both noun-
class prefixation and multiple agreement morphology on verbs. The voice system is also
complex, including an applicative prefix mən- and reflexive, reciprocal, and causative
morphology. The language exhibits noun incorporation and prefixation for quantification.
In summary, it is clear from van Egmond’s evidence that Anindilyakwa is a member of
the Gunwinyguan family.
2.2 Kungarakany
In some cases, the classification of languages shifts depending on the catalogue but
in no case is evidence presented in favor or against a particular classification. We are
thus left to rely on the word of the linguists themselves for the claims of relationship.
An example of the problem is given here for Kungarakany (ISO-639 ggk, Glottolog
kung1259, AIATSIS N14), a now-extinct language of the region south of Darwin. Kun-
garakany is treated as an isolate in Wurm (1972), but in the Gungaraganyan subgroup
of Gunwinyguan in Ethnologue (Lewis, Simons, and Fennig 2013), and a member of
the “Arnhem” group in Dixon (2002). None of these sources provides evidence for their
classifications. Evans (1988:92) quotes Harvey (1986) as identifying a discontinuous
Non-Pama-Nyungan family comprising Kungarakany and Warray but does not give evi-
dence for this. Harvey’s (1990: 14–15) grammar of Warray summarizes Harvey (1986)
and says the following:
In brief Warray is a member of the large Kunwinjkuan language family. Within that
family it was probably most closely related to the extinct Wulwulam language. The
most closely related living language is Jawoyn. From the little information available
on Uwiynmil it is clear that it and Warray are reasonably closely connected. Gung-
arakayn is also a member of the Kunwinjkuan language family, but the connection
between it and Warray is much more distant than those between Warray and its east-
ern neighbours. Warray is not related to the other neighbouring languages, except in
so far as these languages are members of the Australian language family.
Parish (1983:1) quotes Tryon (1968:23) as grouping Kungarakany with Larakia and War-
ray in a ‘Northern’ branch of Daly languages; none of these languages is included in the
later publication of the Daly survey (Tryon 1981). Tryon’s (1968:24) putative cognate
percentages between Kungarakany and other languages are in single digits, except for
Warrai [= Warray] (12.8%), Matngala [= Matngele] (11.2%), and Kamor (15.3%). Data
for Larrakia is not presented, and no discussion is given for why Kungarakany should be
grouped with Warrai here, rather than Kamor [=Kamu].
Kungarakany is poorly described. The only available descriptive materials to
my knowledge are in Parish (1983), though Parish mentions that several other lin-
guists have collected materials on the language. The pronouns are structured in a
minimal-augment (Ilocano) system (Greenberg 1988; McGregor and Greenberg 1989).
Both subject and object markers are prefixes on the verb; there is some variation in
person marking depending on tense. The ordering of agreement markers depends on
the person hierarchy, with first person preceding second or third. This is illustrated in
example (1):
(1) a ar-in-kiɲfin
1min.S-2aug.O.nf-leave
‘I left you [pl].’
b kan-i-kiɲfin
1min.O-2aug.S.nf-leave
‘You [pl] left me.’ (Parish 1983:17, bold emphasis added)
The language has extensive morpho-phonological alternations, like many of the lan-
guages of the region. Kungarakany has phonemic /f/, relatively rare for Australian lan-
guages but well represented in the Daly region (about 25% of the languages in Gasser
and Bowern’s 2014 survey of phonological patterns in Australian languages have phone-
mic fricatives). Kungarakany has other features that are shared across many Australian
languages, such as the verb roots bu- ‘hit’ and ni- ‘sit’. Kungarakany also has features
which are found in other Australian languages, but which are not meaningful for estab-
lishing genetic relatedness. For example, verbal negation involves marking the verb with
potential mood and using a preverbal clausal negator moɹoŋ (Parish 1983:38). The same
negation strategy is used in Bardi and other Nyulnyulan languages, as well as a number of
other Non-Pama-Nyungan languages. But irrealis/potential and negation marking is also
common outside Australia, and so its presence in several different areas of the country
should not be seen to signal a particularly close genetic affiliation. We must therefore
treat Kungarakany’s affiliation as “uncertain” at this point.
2.3 Tiwi
Perhaps the prototypical linguistic isolate within the Australian continent is Tiwi (ISO-
639 tiw, Glottolog code tiwi1244, AIATSIS code N20), spoken by about 3000 people
on Bathurst and Melville Islands and in the neighboring city of Darwin. Evans (2005)
says explicitly that no one would doubt that it should be classified as a primary branch
of Proto-Australian (assuming that one believes in the unity of Australian languages).
Dixon (1980:225) includes Tiwi and Djingili as the two languages whose genetic affili-
ation with the rest of Australia remains unresolved. He says that “Australian” under this
definition should be taken as excluding Tiwi and Djingili, but he does, in fact, use data
330 Claire Bowern
1 *ŋa- ŋi-
1/2 *mV- mu-
min
2 *cV- ɲi- (NP), ci-
3 *ka (NP), ø- a- (NP), yi-, ci-
1 *ɲV-rV- ŋi-
1/2 *ŋV-rV- ŋa-
aug
2 *nV-rV-, *ku-rV- ɲi-
3 *pV-rV- wu- (NP), pi-
from Tiwi in adducing support for certain Proto-Australian reconstructions, such as *ŋin
‘2sg’ (Tiwi ŋinhtha) and *ŋayu ‘1sg’ (Tiwi ŋiya). He also states that “Tiwi has a normal
Australian phonemic system” (Dixon 1980:487). Wilson (2013:17) compares Tiwi bound
pronoun markers with those reconstruction for a number of Northern families by Harvey
(2003b:500). That evidence is summarized in Table 12.3. Harvey’s Non-Pama-Nyungan
pronominal reconstructions are presented along with the Tiwi forms that are assumed to
be cognate.
While the resemblances are striking, they are also only a single segment in most cases.
Tiwi has a morpheme rri- which occurs in the same position as the non-singular marker
reconstructed by Harvey (2003b) for Proto-Non-Pama-Nyungan, but it denotes past tense
rather than subject number. While the semantic connections between tense and number
are difficult, there are examples elsewhere in Australia of changes in paradigm struc-
tures where sequences of morphemes have been reanalyzed, such that morphemes for-
merly associated with one category come to be analyzed as denoting something else. For
example, Bowern (2012a) describes a case in the Nyulnyulan family where transitivity
marking is reanalyzed as the exponent of tense. However, in that case, the languages are
sufficiently close that one can easily find matches across other paradigms, as well as in
the lexicon. The reconstructions presented as evidence for Tiwi’s connections to other
Non-Pama-Nyungan languages are much sparser.
Tiwi is perhaps most famous within the literature on linguistics for its ‘young people’s
variety’ (Lee 1987). That is, there is a sharp division between the ‘traditional’ language
and the contact variety that emerged in the 1960s as familiarity with English increased.
Differences between the traditional language and modern Tiwi center on the verb mor-
phology. Traditional Tiwi morphology is considerably more complex than the Modern
variety, where the number of distinctions marked on the verb, as well as their nature, is
much reduced. For example, the minimal-augment system of Traditional Tiwi, which
includes, by nature, distinctions in clusivity, has been lost, and the Modern Tiwi sys-
tem includes only singular and plural. The first person inclusive and exclusive markers
have been refunctionalized (Smith 2008) as tense markers. Traditional Tiwi exhibits noun
incorporation, but the modern language does not allow this. Tiwi has thus been important
in Australian languages as a case study for the types of “simplifying” changes that mor-
phologically complex languages can undergo and how morphological distinctions may
be refunctionalised or obliterated. There is no evidence at this point that Tiwi is closely
related to other Australian languages.
Language isolates of Australia 331
2.4 Mangarrayi
Mangarrayi is another Non-Pama-Nyungan language. It is now extinct but was formerly
spoken inland along the Roper River in the Northern Territory. The ISO-639 code is
mpc; the Glottolog code is man1381, and the AIATSIS code is N78. There is conflicting
classification. Alpher, Evans, and Harvey (2003) include it in the Gunwinyguan family,
but Merlan (1982:x) suggests that it is a member of the same family as Mara, Alawa, and
Warndarrang. Dixon (2002) includes Mangarrayi as the sole member of one of thirteen
primary divisions in his ‘Arnhem Land Group [NB]’. This group includes the members
of both Gunwinyguan and Maran families, but Dixon does not recognize these fami-
lies per se. Merlan’s primary evidence for including Mangarrayi within Maran is shared
archaisms in derivational verbal morphology. Further evidence from shared noun-class
morphology is discussed in Merlan (2003).
Conversely, Alpher, Evans, and Harvey (2003) suggest that Mangarrayi’s verb para-
digms are sufficiently similar to other Gunwinyguan languages that it should be treated as
a branch of Gunwinyguan. Harvey (2012), moreover, casts doubt on the status of Maran,
by suggesting that most of the features shared between Mara and Warndarrang (two of
the four languages of the Maran family) have been borrowed, rather than inherited from a
recent common proto-language. It should be noted that Gunwinyguan itself has been the
subject of some debate. Evans (2003:13) succinctly summarizes the competing positions.
One of the most contentious issues is the inclusion] of Mangarrayi (as discussed above),
Anindilyakwa (for which see van Egmond 2012 and Section 2.1 above), Wardaman and
its close relatives, and Wagiman. The issue for Mangarrayi, then, as Merlan (2003) notes,
is whether all these similarities point to a remote relationship between both Maran and
Gunwinyguan (including Mangarrayi), or whether some of the shared features are better
explained through language contact. Certainly, Mangarrayi has been in fairly intensive
contact with its Gunwinyguan and Maran neighbors for a considerable period of time.
These questions are at this point unresolvable without further systematic study and recon-
struction of the relevant languages.
2.5 Gaagudju
Gaagudju, a language of the Darwin hinterland, is also extinct. Harvey (2002:15–16)
summarizes his position on relationships between Gaagudju and its neighbors as
follows:
Note that Harvey appears to have reversed ‘east’ and ‘west’ in this description, since
Gaagudju was spoken to the east of Darwin, not west, and the other languages Harvey
mentions as being part of the Darwin region Sprachbund are all coastal between Darwin
and Jabiru, but to the east of Darwin.
Just as in the case of Mangarrayi discussed in the previous section, longstanding lan-
guage contact potentially obscures remote relations. Note also that the features that Har-
vey mentions in the quote above are all typological features, and unlikely to be diagnostic
of genetic relationships in the absence of systematic similarities in phonological form.
However, these types of features are similar to what some have used to argue for genetic
relationships (though not Harvey, whose claims for classification are usually based on
shared morphology and structures in complex verbal paradigms). Thus we should proba-
bly treat Gaagudju as another isolate within Australia.
The final set of languages to discuss in this region are Umbugarla and Ngombur.
Umbugarla is poorly attested; the only description to my knowledge is Davies (1989),
a synthesis of brief field notes from Gavan Breen, Mark Harvey, Nicholas Evans, and
Frances Morphy. Harvey (2001:9) suggests that Umbugarla and Ngombur formed their
own family; Davies (1989) contains no information about classification. They remain as
isolates given the lack of further information (and none is likely to be forthcoming in the
future).
Anewan shows reflexes of sufficiently many terms that are securely reconstructed
within Pama-Nyungan, that there should be no doubt that it is a Pama-Nyungan lan-
guage. Its position within Pama-Nyungan is uncertain, however. Bowern and Atkinson
(2012) did not include Anewan in their sample due to lack of data; subsequent expansions
of Pama-Nyungan phylogenetic work have included data from Anewan, but results are
inconclusive. Most frequently, it appears in the tree as a sister to Gumbayngirric (Gumba-
yngirr and Yaygirr), but with fairly low posterior probability (0.7). This is because most
of the cognate items are retentions from Proto-Pama-Nyungan, rather than indicative of
334 Claire Bowern
a closer relationship with other languages in the family. All but one of the items listed
above in (3), for example, are reconstructible to Proto-Pama-Nyungan (the exception is
ikana ‘snow’).
Back in the Non-Pama-Nyungan region, Djingulu (Jingulu, Jingili) was also treated as
an isolate in some early work. Harvey (2008) provides the clearest evidence that Jingulu
is a primary branch of Mirndi and thus related (though fairly distantly) to Jaminjungan
and Ngurlun languages (including Wambaya). This follows earlier work by Chadwick
(1984; 1997).
3 TASMANIAN LANGUAGES
The third ‘language’ discussed by Wurm (1971; 1972) as potentially not related to other
Australian languages was Tasmanian. From the earliest descriptions of the languages
of Tasmania, there have been doubts about the number of languages represented by the
sources. Tasmanian languages are attested by about 10,000 words of vocabulary and a
few short sentences (from a lingua franca used on the Flinders Island Mission). The
materials were recorded between 1770 and the early 1900s. Tasmanian languages were
not historically named, though local bands were; this has exacerbated the confusion about
the number of distinct languages represented in the sources.
The extant records for Tasmania were published in Plomley (1976). Earlier works on
the languages, including Schmidt (1952), Roth (1899), and Jones (1974), came to different
conclusions about the number of languages represented in the sources, and their relation-
ships to the languages of the Australian mainland. Some early work suggested that Tasma-
nian languages were related to those of the Australian mainland, because of certain lexical
resemblances. However, those resemblances stem from a single source – a Ben Lomond
vocabulary recorded by Charles Robinson (son of George Augustus Robinson). Amery
(1996) has shown conclusively that the words in that list are Kaurna, a Pama-Nyungan
(Thura-Yura) language of the Adelaide Plains in South Australia, and were most likely
recorded from a sealer. Moreover, those words are all on a single manuscript page and in
a different handwriting from the rest of the Ben Lomond vocabulary. Once those words
are removed from the Tasmanian dataset, there is no resemblance between the languages
of Tasmania and those of the Australian mainland.
In Bowern (2012b), I discuss previous family divisions within Tasmania, including the
number of languages represented in the data. I found evidence that the extant materials
for Tasmania cover at least 12 distinct languages, from 4 or 5 different families. Only
26 ‘cognates’ are attested across each of the families, and these words are either clearly
introduced items, such as cattle, flora/fauna terms, such as boobyalla ‘native willow
(Acacia longiflora) – which is probably a loan – or mythological terms which might also
be expected to be subject to diffusion. While there was some evidence that the Southeast
and Eastern families were related more distantly, there was no evidence of a single Tas-
manian language family, let alone a single Tasmanian language.
Dixon (1980; 2002). Just like the Non-Pama-Nyungan families with multiple one- or
two-language families, so too in Pama-Nyungan there are numerous subgroups with one
or two languages and little evidence of where they appear in the family.
Work by Bowern and Atkinson (2012) and subsequent work on further language
relationships by Bowern has resolved many (but not all) of these family-level isolates
within Pama-Nyungan. Bowern and Atkinson (2012) found four coordinate branches of
Pama-Nyungan but did not have sufficient evidence to reduce the number of primary
branches further. Subsequent work (cf. Bowern 2015) suggests that there are two main
branches of Pama-Nyungan: a Western branch (also found by Bowern and Atkinson 2012)
that includes the Yolngu, Warluwaric, Pilbara, Ngumpin-Yapa, Wati, and Nyungic sub-
groups and an Eastern branch with further divisions. A tree based on the Stochastic Dollo
model of cognate evolution for 104 representative languages is given in Figure 12.1.
Subsequent work on the internal structure of Pama-Nyungan has clarified the clas-
sification of a number of these “isolates” within the family. I here briefly discuss some
illustrative languages: Warumungu, Bigambal, Anewan, and the Western Torres Strait
language (Kala Lagaw Ya and related dialects).
Dharumbal
Bindal
Wulguru
Mbabaram
Dyirbalic
Maric
Language isolates of Australia 337
closely related varieties spoken on the western islands of the Torres Strait (between Cape
York Peninsula and the Papua New Guinea mainland). Surrounding languages are ‘Pap-
uan’ to the east and north, and the Northern Paman group of Paman languages to the
south. Though the Western Torres Strait language is clearly Pama-Nyungan, it is not
closely related to its nearest neighbors. It shares none of the sound changes that charac-
terize Paman languages, for example, especially the complicated stratigraphies of North-
ern Paman (Hale 1964). Lexically, it suffers in classification from the same problems as
Bigambal and Nganyaaywana: viz., that its shared features are shared retentions from a
high level in the family. In Bowern and Atkinson (2012), its nearest relative is Kukatj.
Kukatj is usually classified as Paman but without further discussion or with contradic-
tory statements. Breen (1992:2), for example, says both that Kukatj is the sole member
of a Flinders Paman subgroup, which is most closely related to Norman Paman and that
there are no higher level subgroups that include Norman Paman, Kukatj, and no further
languages (which implies by definition that Norman Paman is therefore not the most
closely related subgroup to Kukatj). It is consistent with previous cursory classifications
that Kukatj is a sister to Paman (along with Western Torres) rather than a subgroup within
Paman. The relative closeness of Kukatj and Western Torres in Bowern and Atkinson
(2012) is interesting, since the two languages are spoken on opposite sides of the Paman
subgroup. This possibly implies that the Paman group was an expansion in Cape York
Peninsula that replaced earlier linguistic communities.
Davis (2004:234) suggests that the Western Torres Strait Islands have been contin-
uously settled and fished for the last 3,000 years (see also Carter 2001), implying that
Western Torres is probably not a recently introduced language to the region.
4.3 Warumungu
Warumungu is still spoken; speakers live in the Northern Territory, in the area of Tennant
Creek. There is a Warumungu sketch grammar (Simpson and Heath 1982) and learn-
er’s guide (Simpson 2002) but little other available materials. Simpson (2008) treats the
classification of Warumungu as uncertain, since the language is bordered by several sub-
groups of Pama-Nyungan, as well as Non-Pama-Nyungan languages. As in the case of
other family-level isolates discussed above, Warumungu has features that set it apart from
its neighbors. For example, phonologically, Warumungu has a fortis-lenis stop contrast,
which is absent from other Pama-Nyungan languages in the region.
REFERENCES
Alpher, Barry. 2004. Pama-Nyungan: Phonological Reconstruction and Status as a Phy-
logenetic Group. Australian Languages: Classification and the Comparative Method,
ed. by Claire Bowern and Harold Koch, 93–126. Amsterdam: John Benjamins.
Alpher, Barry, Geoffrey N. O’Grady, and Claire Bowern. 2008. Western Torres Strait
Language Classification and Development. Morphology and Language History: In
Honour of Harold Koch, ed. by Claire Bowern, Bethwyn Evans, and Luisa Miceli,
1–15. Amsterdam: John Benjamins.
Alpher, Barry, Nicholas Evans, and Mark Harvey. 2003. Proto Gunwinyguan Verb Suf-
fixes. The Non-Pama-Nyungan Languages of Northern Australia: Comparative Stud-
ies of the Continent’s Most Linguistically Complex Region, ed. by Nicholas Evans.
Canberra: Pacific Linguistics, 305-352
Amery, R. 1996. Kaurna in Tasmania: A Case of Mistaken Identity. Aboriginal History
20: 24–50.
Bani, Ephraim. 1976. The Language Situation in Western Torres Strait. Languages of
Cape York Peninsula, Queensland, ed. by Peter Sutton, 3–6. Canberra: Australian
Institute of Aboriginal Studies.
Bani, Ephraim and Terry Jack Klokeid. 1972. Kala Lagau Langgus-Yagar Yagar: The
Western Torres Strait Language. Manuscript. Canberra, ms.
Barrett, Bevan. 2005. Historical Reconstruction of the Maric Languages of Central
Queensland. Canberra: Australian National University MA Thesis.
Birch, Bruce. 2006. Erre, Mengerrdji, Urningangk: Three Languages from the Alligator
Rivers Region of North Western Arnhem Land, Northern Territory, Australia. Jabiru:
Gundjeihmi Aboriginal Corporation.
Blake, Barry J. 1990a. Languages of the Queensland/Northern Territory border: Updating
the Classification. Language and History: Essays in Honour of Luise A. Hercus, ed.
by Peter Austin, R.M.W. Dixon, Tom Dutton, and Isobel White, 49–66. (Pacific Lin-
guistics C-116). Canberra: Dept. of Linguistics, Research School of Pacific Studies,
Australian National University.
Blake, Barry J. 1990b. The Significance of Pronouns in the History of Australian Lan-
guages. Linguistic Change and Reconstruction Methodology, ed. by Philip Baldi,
Language isolates of Australia 339
Harvey, Mark. 2001. A Grammar of Limilngan: A Language of the Mary River Region,
Northern Territory, Australia. Pacific Linguistics, Research School of Pacific and
Asian Studies, the Australian National University.
Harvey, Mark. 2002. A Grammar of Gaagudju. Berlin: Walter De Gruyter.
Harvey, Mark. 2003a. An Initial Reconstruction of Proto Gunwinyguan Phonology. The
Non-Pama-Nyungan Languages of Northern Australia: Comparative Studies in the
Continent’s Most Linguistically Complex Region, ed. by Nicholas Evans, 205–268.
Canberra: Pacific Linguistics.
Harvey, Mark. 2003b. Reconstruction of Pronominals Among the Non-Pama-Nyungan
Languages. The Non-Pama-Nyungan Languages of Northern Australia: Compara-
tive Studies of the Continent’s Most Linguistically Complex Region. Canberra: Pacific
Linguistics.
Harvey, Mark. 2008. Proto Mirndi: A Discontinuous Language Family in Northern Aus-
tralia. Canberra: Pacific Linguistics.
Harvey, Mark. 2012. Warndarrang and Marra: A Diffusional or Genetic Relationship?
Australian Journal of Linguistics 32, no. 3: 327–360. doi:10.1080/07268602.2012.7
05578.
Harvey, Mark. nd. Northern Australian Aboriginal Languages. Unpublished map.
Heath, Jeffrey. 1984. Functional Grammar of Nunggubuyu. Canberra: Australian Insti-
tute of Aboriginal Studies.
Heath, Jeffrey. 1990. Verbal Inflection and Macro-Subgroupings of Australian Lan-
guages: The Search for Conjugation Markers in Non-Pama-Nyungan, Linguistic
Change and Reconstruction Methodology, ed. by Phillip Baldi, 403–417. The Hague:
Mouton.
Jones, R. 1974. Tasmanian tribes. Aboriginal Tribes of Australia, ed. by Norman Tindale,
319–354. Berkeley, CA: University of California Press
Lee, J. 1987. Tiwi Today: A Study of Language Change in a Contact Situation. Canberra:
Pacific Linguistics.
Leeding, Velma. 1989. Anindilyakwa Phonology and Morphology. PhD thesis, Depart-
ment of Anthropology, University of Sydney.
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig. 2013. Ethnologue: Languages
of the World (17th ed.). Dallas, TX: SIL International.
Ling Roth, H. 1899. The Aborigines of Tasmania. Halifax: F King and Sons.
McGregor, W.B. 2002. Verb Classification in Australian Languages. Berlin: Walter de
Gruyter.
McGregor, William B. and J.H. Greenberg. 1989. Greenberg on the First Person Inclu-
sive Dual: Evidence from Some Australian Languages. Studies in Language, 13, no.
2: 437–458.
Merlan, Francesca. 1979. On the Prehistory of Some Australian Verbs. Oceanic Linguis-
tics 18, no. 1: 33–112.
Merlan, Francesca. 1982. Mangarrayi. Amsterdam: North Holland Publishing Company.
Merlan, Francesca. 1994. A Grammar of Wardaman: Language of the Northern Territory
of Australia [Wardaman/English]. Berlin: Mouton de Gruyter.
Merlan, Francesca. 2003. The Genetic Position of Mangarrayi: Evidence from Pronomi-
nal Prefixation. The Non-Pama-Nyungan Languages of Northern Australia: Compar-
ative Studies of the Continent′ s Most Linguistically Complex Region, ed. by Nicholas
Evans, 353–367. Canberra: Pacific Linguistics. www.jstor.org/stable/24046742 (10
March 2016).
342 Claire Bowern
O’Grady, Geoffrey N., C.F. Voegelin, and F.M. Voegelin. 1966. Languages of the World:
Indo-Pacific Fascicle Six. Anthropological Linguistics 8, no. 2: 1–197.
Oates, W.J. and Lynette F. Oates. 1970. A Revised Linguistic Survey of Australia.
(Australian Aboriginal Studies/Linguistic Series 33/12). Canberra: Australian Institute
of Aboriginal Studies.
Parish, Lucy. 1983. Some Aspects of Kungarakany Verb Morphology. MA Thesis, Austra-
lian National University, Canberra.
Plomley, N.J.B. 1976. A Word-List of the Tasmanian Aboriginal Languages. N. Plomley
in association with the Government of Tasmania. Hobart.
Rademaker, Laura. 2014. Language and Australian Aboriginal History: Anindilyakwa
and English on Groote Eylandt. History Australia 11, no. 2: 222.
Ray, Sidney H. and Alfred C. Haddon. 1891. A Study of the Languages of Torres Straits,
with Vocabularies and Grammatical Notes (part I). Proceedings of the Royal Irish
Academy (1889–1901) 2: 463–616.
Schmidt, W. 1952. Die Tasmanischen Sprachen. Utrecht: Spectrum.
Schmidt, Wilhelm. 1919. Die Gliederung der australischen Sprachen: Geographische,
bibliographische, linguistische Grundzüge der Erforschung der australischen
Sprachen. Vienna: Mechitharisten-Buchdruckerei.
Sharpe, Margaret C. 1994. An All-Dialect Dictionary of Banjalang, an Australian Lan-
guage No Longer in General Use. First Asia International Lexicography Conference,
Manila, Philippines-1992, ed. by Bonifacio Sibayan and Leonard Newell, 35–48.
Manila. http://www-01.sil.org/asia/Philippines/ling/Margaret_C._Sharpe._An_all-
dialect_dictionary_of_Banjalang,_an_Australian. . . . pdf (18 February 2014).
Simpson, Jane. 2002. A Learner’s Guide to Warumungu: Mirlamirlajinjjiki Warumun-
guku apparrka. Alice Springs, Australia: IAD Press.
Simpson, Jane. 2008. Reconstructing pre-Warumungu pronominals. Morphology and
Language History: In Honour of Harold Koch, ed. by Claire Bowern, Bethwyn Evans,
and Luisa Miceli, 71–87. (Current Issues in Linguistic Theory 298). Philadelphia: John
Benjamins.
Simpson, Jane and Jeffrey Heath. 1982. Warumungu Sketch Grammar. Cambridge, MA:
MIT and Harvard University. Australian Institute of Aboriginal and Torres Islander
Studies MS 1860, ms.
Smith, John Charles. 2008. The Refunctionalisation of First Person Plural Inflection in
Tiwi. Morphology and Language History, 341–348. Amsterdam: John Benjamins.
Stokes, Judith. 1981. Anindilyakwa Phonology from Phoneme to Syllable. Australian
Phonologies: Work Papers of Summer Institute of Linguistics–Australian Aborigines
Branch, Series A 5. 139–181.
Tindale, Norman Barnett and Joseph Benjamin Birdsell. 1941. Tasmanoid Tribes in North
Queensland. Hassell Press.
Tryon, Darrell T. 1968. The Daly River Languages: A Survey. Pacific Linguistics. Series
A. Occasional Papers (14): 21–46.
Tryon, Darrell T. 1981. Daly Family Languages, Australia. repr. of 1974. Vol. 32. (Pacific
Linguistics : Series C, Books). Canberra: Department of Linguistics, Research School
of Pacific Studies, The Australian National University.
Tsunoda, Tasaku. 2012. A Grammar of Warrongo. Berlin: Walter de Gruyter.
Wafer, Jim and Amanda Lissarrague. 2008. A Handbook of Aboriginal Languages of New
South Wales and the Australian Capital Territory. Nambucca Heads, NSW: Muurrbay
Aboriginal Language and Culture Co-operative.
Language isolates of Australia 343
Wilson, Aidan. 2013. Tiwi Revisited: A Reanalysis of Traditional Tiwi Verb Morphology.
MA Thesis, University of Melbourne, Melbourne.
Wurm, S.A. 1971. Classifications of Australian Languages, Including Tasmanian. Part 1,
721–778. (Current Trends in Linguistics 8). The Hague: Mouton.
Wurm, S.A. 1972. Languages of Australia and Tasmania. The Hague: Mouton.
CHAPTER 13
ENDANGERMENT OF
LANGUAGE ISOLATES
Eve Okura
1 INTRODUCTION
The death of a language is great loss to humanity on scientific, personal, social, and cul-
tural levels. In Michael Krauss’ words:
In this circumstance, there is a certain tragedy for the human purpose. The loss of
local languages, and of the cultural systems that they express, has meant irretrievable
loss of diverse and interesting intellectual wealth, the priceless products of human
mental industry.
(Krauss 1992:36)
When that dying language is an isolate, the loss is even more dramatic. An entire system
for coding human knowledge disappears from the earth with no way of retrieving that
information. Although the loss of any language is a tragedy, in some cases, there are at
least genetically related languages – surviving sister languages – that may share and thus
preserve in a sense some of the features and richness lost in the extinct language. In the
case of isolates, there are no surviving relatives.
How many of the world’s language isolates are in danger of disappearing? What is
the urgency of the situation? This chapter defines language endangerment and how it
relates to language isolates. It reports on the current endangerment status of the world’s
language isolates and discusses factors contributing to language loss and why we should
care, and it considers what is being done in attempts to reverse language loss with regard
to isolates, with suggestions for what can be done. It also reviews language isolates that
have recently become dormant.
While there are differences in opinion regarding whether some languages are isolates
or not, making compilation of a definitive comprehensive list of the world’s language
isolates very difficult, there is general agreement about the majority of language isolates
of the world and their total approximate number. This paper relies on the list of language
isolates in Campbell (this volume). That list contains 159 language isolates. Of the 159
language isolates in the world, 4 are regarded as “safe” (2.5%) – meaning that approxi-
mately 97.5% of the world’s language isolates are either extinct or endangered.
The LEI gives a rating for each of these four factors for a language and combines the
scores to determine an overall level of endangerment. In addition to an endangerment
level, each language’s endangerment status is also given a percent of certainty, based on
for how many of the four factors information was available. If data is not available for all
four factors for a given language, then the endangerment level is assigned based on the
information that is available, but with a lower percent of certainty.
Each of the four factors is ranked 0–5 (“Safe” to “Critically endangered”). Lee and Van
Way emphasize that the LEI measures endangerment as opposed to vitality, so the higher
the score, the greater the language’s risk.
The level of endangerment is calculated using the four factors in the following way:
Intergenerational transmission is given twice as much weight as the other factors, since
it is the most significant factor in determining whether or not a language remains in use.
The LEI categories are:
Safe: 0%
Vulnerable: 1–20%
TABLE 13.1 LANGUAGE ENDANGERMENT INDEX (LEI)
Critically Severely endangered Endangered (+3) Threatened (+2) Vulnerable (+1) Safe (0)
endangered (+5) (+4)
Absolute no. of 1–9 speakers 10–99 speakers 100–999 speakers 1,000–9,999 speakers 10,000–99,999 >100,000 speakers
native speakers speakers
Intergenerational “There are only “Many of the “Some adults in “Most adults in “Most adults and some “All members of the
transmission a few elderly grandparent the community are the community are children are speakers.” community, including
speakers.” generation speak speakers, but the speakers, but children children, speak the
the language, but language is not are generally not.” language.”
the younger people spoken by children.”
generally do not.”
Speaker number “A small percentage “Less than half “About half of “A majority of Most community “Almost all
trends of the community of the community community members community members members “speak the community members
speaks the language, speaks the speak the language.” speak the language. language. Speaker speak the language,
and speaker numbers language, and Speaker numbers Speaker numbers are numbers may be and speaker numbers
are decreasing very speaker numbers “decreasing steadily, gradually decreasing.” decreasing, but very are stable or
rapidly” are decreasing at an but not at an slowly.” increasing.”
accelerated rate.” accelerated rate”
Domains of use “Used only if a “Used mainly just “Used mainly just “Used in some non- “Used in most “Used in most
few very specific in the home and/or in the home and/or official domains along domains except for domains, including
domains” (e.g. with family”; “may family, but remains with other languages, official ones” (e.g. official ones” (e.g.
“ceremonies songs, not be the primary the primary language and remains the “government, mass “government, mass
prayer. . . limited language even in of these domains for primary language used media, education, media, education,
domestic activities”) these domains for many” in the home for many” etc.”) etc.”)
many”
Threatened: 21–40%
Endangered: 41–60%
Severely endangered: 61–80%
Critically endangered: 81–100%
If a language has 100,000 native speakers or more, but no information was available for
the other three factors, it is labeled “at risk” in ELCat, as even a language with 100,000
speakers will disappear from use within a couple of generations if it is not being passed
on to children (the “intergenerational transmission” factor). The LEI category “dormant”
is for languages that have no known native speakers, that are sometimes referred to as
“extinct.” Many prefer to avoid the term “extinct” because it may discourage language
group members who otherwise might attempt to revive their heritage languages. Many
find the term “extinct” offensive because, though attributed to a people’s language, it
can get misconstrued to mean that the people and their identity are also extinct. So the
term “dormant” is used in the Catalogue of Endangered Languages. Some communities
actually prefer the term “extinct” for their language because it elevates awareness of their
plight. Both terms are used in this chapter, reflecting both views.
Africa (6)
Isolate ISO 639–3 Endangerment status # of L1 speakers Source and notes
1 Bangi Me dba threatened 2000 Blench 2005, ELCat 2015
2 Hadza hts threatened ~800 Blench, this volume
3 Jalaa* (=Cun Tuum) cet dormant 0? Blench, this volume; 10–99. A small number
of elderly people among the Dijim speak this
language (ELCat 2015, Blench 2011)
4 Laal gdm endangered 300 Mous 2003
5 Ongota bxe critically endangered 8 Graziano 2003, ELCat 2016
6 Sandawe sad vulnerable 60,000 Brenzinger 2011, ELCat 2016
Australia (7)
7 Bachamal* wdj dormant 0 Campbell, this volume
8 Gaagudju* gbu dormant 0 Bowern, this volume
9 Kungarakany* ggk dormant 0 Bowern, this volume; Evans 2001
10 Mangarrayi* mpc dormant 0 Bowern, this volume
11 Tiwi tiw threatened (ELCat) 3000 Bowern, this volume
12 Umbugarla* (Ngurmbur) umr dormant 0 Bowern, this volume; Evans 2001
13 Wagiman* (Wageman) waq dormant 0 Campbell, this volume
Central America and Mexico (4)
14 Cuitlatec* – dormant 0 Heaton, this volume
15 Huave hue, huv, hvv, hve endangered 15,993 Heaton, this volume, citing INALI
(2008–2012)
16 Purépecha (Tarascan) tsz, pua threatened 117,221 Capistrán 2015, ELCat 2016
17 Seri sei endangered 760? 800–900 Heaton, this volume
Eurasia (11)
18 Ainu* ain dormant 0 Campbell, this volume
19 Basque eus safe 714,136 Gobierno Vasco 2012, ELCat 2015
20 Burushaski bsk threatened 100,300 Munshi 2014, ELCat 2016
21 Elamite* elx dormant 0 Campbell 2013
22 Hattic* xht dormant 0 Campbell 2013
23 Hruso-Aka hru threatened <3,000 Van Driem 2011, ELCat 2016
24 Kassite* – dormant 0 Michawolski, this volume
25 Kusunda kgg critically endangered 2 Gautam 2012, ELCat 2016; some competent
speakers in year 2004; Georg, this volume
26 Nihali nll threatened 1,000–2,000 Hammarström 2010, ELCat 2016
27 Nivkh (Gilyak) niv severely endangered 700 14% of ethnic population Georg, this volume;
200 in 2010 census
28 Sumerian* sux dormant 0 Campbell 2013
North America (23)
29 Adai* xad dormant 0 Mithun in this volume
30 Alsea* aes dormant 0 Mithun in this volume
31 Atakapa* aqp dormant 0 Mithun in this volume
32 Beothuk* bue dormant 0 Mithun in this volume
33 Cayuse* xcy dormant 0 Mithun, this volume
34 Chimariko* cid dormant 0 1950s Martha Zigler
35 Chitimacha* ctm awakening 0 Galla 2009: 176
36 Coahuilteco* xcw dormant 0 Mithun, this volume
37 Cotoname* xcn dormant 0 Mithun, this volume
38 Esselen* esq awakening 0 Shaul 2014:56 1st CA lang ext?
(Continued )
TABLE 13.2 (CONTINUED)
39 Haida (Xaad Kil) hai critically endangered 30 Ignace 2006, ELCat 2016
40 Karankawa* zkk dormant 0 Campbell 2013
41 Karuk kyh critically endangered <12 Golla 2011, ELCat 2016
42 Kootenai (Kutenai) kut severely endangered 26 FPCC; ELCat 2016
43 Natchez* ncz awakening 0 K.T. Fields 2016, personal communication
44 Siuslaw* sis dormant 0 Mithun, this volume
45 Takelma* tkm dormant 0 Mithun, this volume
46 Tonkawa* tqw dormant 0 Campbell 2013
47 Tunica* tun awakening 0 Tunica Tulane Language Project 2012; ELCat
2016
48 Washo was severely endangered ~20 Golla 2011, ELCat 2016
49 Yana* ynn dormant 0 Mithun, this volume
50 Yuchi (Euchee) yuc critically endangered 5 Austin 2008, ELCat 2016
51 Zuni zun threatened ~9,000 ELCat 2016
Pacific (54)
52 Abinomn bsa critically endangered ~300 Hammarström, this volume
53 Abun kgr threatened ~3,000 Hammarström, this volume
54 Afra ulf threatened 115 Hammarström, this volume
55 Anêm anz safe ~843 Hammarström, this volume
56 Ap Ma kbx vulnerable ~10,000 Hammarström, this volume; ELCat 2016,
Lewis, Simons, and Fennig 2016 “developing”
57 Asaba seo threatened 180 Hammarström, this volume, citing Little
2008:2
58 Baiyamo ppe threatened 70 Hammarström, this volume
59 Banaro byz threatened 2,569 Laycock 1973, ELCat 2016
60 Bilua blb threatened 8,740 Hammarström, this volume
61 Bogaya boq endangered 300 Hammarström, this volume
62 Burmeso bzu endangered 250 Lewis, Simons, and Fennig 2016
63 Busa (Odiai) bhf endangered 240 Hammarström, this volume; citing Graham
1981
64 Damal (Uhunduni, Amung) uhn threatened 4,000–5,000 ELCat 2016 citing Wurm 2007
65 Dem dem threatened 1,000 Hammarström 2010, ELCat 2016
66 Dibiyaso dby threatened 1,950 Hammarström, this volume
67 Duna duc safe 25,000 San Roque 2008; “developing”; Lewis,
Simons, and Fennig 2016
68 Elseng (Morwap) mrf endangered 300 Laycock 1973, ELCat 2016
69 Fasu faa threatened 1,200 Hammarström, this volume
70 Guriaso grx endangered 160 Hammarström, this volume
71 Kaki Ae tbd endangered 266 Wurm 2007, ELCat 2016
72 Kamula xla endangered 800 Hammarström, this volume
73 Kapauri khp vulnerable ~200 Hammarström 2010, ELCat 2015
74 Karami* xar dormant 0 Hammarström, this volume; Campbell, this
volume
75 Kehu khh severely endangered 25 Lewis, Simons, and Fennig 2016
76 Kibiri-Porom prm threatened 1,180 Hammarström, this volume
77 Kimki sbt at risk Hammarström 2010
78 Kol kol threatened 4,000 Hammarström, this volume
79 Kosare kiq endangered ~250 Hammarström, this volume; Lewis, Simons,
and Fennig 2016
(Continued )
TABLE 13.2 (CONTINUED)
80 Kuot kto threatened 2,400 Hammarström, this volume; Lewis, Simons,
and Fennig 2016
81 Lavukaleve lvk threatened 1,700 ELCat 2016 citing Terrill 2003
82 Masep mvs severely endangered 25 Lewis, Simons, and Fennig 2016
83 Mawes mgk endangered ~850 ELCat 2016 citing Hammarström 2010
84 Maybrat ayz safe ~20,000 Hammarström, this volume; Ethnologue
status: “developing”
85 Mor (of Bomberai) moq severely endangered 60 Wurm 2007 (data from 1977)
86 Moraori (Marori) mok endangered 413 Arka 2012 ELCat 2016; according to
UNESCO (2010) there are 50 speakers (which
would make it severely endangered)
87 Mpur akc threatened 7,000 Odé 2002, ELCat 2016
88 Pawaia pwa threatened 4,000 Hammarström, this volume
89 Pele-Ata ata threatened ~2,000 Hammarström, this volume
90 Powle-Ma (Molof) msl endangered 200 Lewis, Simons, and Fennig 2016
91 Purari iar threatened 7,000 Hammarström, this volume
92 Pyu pby endangered ~100 Hammarström, this volume
93 Sause sao endangered 300 Wurm 2007, ELCat 2016
94 Savosavo svs threatened 2,500 Wegener 2008, ELCat 2016
95 Sulka sua threatened 2,500 Hammarström, this volume
96 Tabo (Waia) knv threatened 3,000 Hammarström, this volume
97 Taiap gpn severely endangered 75 Hammarström, this volume
98 Tambora* xxt dormant 0 Hammarström, this volume
99 Tanahmerah tcm endangered 500 Hammarström, this volume
100 Touo tqu threatened 1,870 Hammarström, this volume
101 Wiru wiu vulnerable 15,300 Hammarström, this volume; Lewis, Simons,
and Fennig 2016
102 Yale (Yalë, Nagatman) nce endangered ~600 Hammarström, this volume; Lewis, Simons,
and Fennig 2016
103 Yele (Yélî Dnye) yle threatened 6,000 Hammarström, this volume
104 Yerakai yra endangered 380 Lewis, Simons, and Fennig 2016
105 Yetfa-Biksi yet threatened 1,000 Hammarström, this volume
South America (54)
106 Aikanã tba vulnerable 175–200 van der Voort 2013
107 Andaqui* ana dormant 0 Siefart and Hammarström, this volume
108 Andoque ano endangered 597 Crevels 2012
109 Arára do Rio Branco* axg dormant 0 Crevels 2012, ELCat 2016
(Arára do Beiradão, Mato
Grosso)
110 Awaké* (Uruak, Arutani) atx dormant 0 Campbell; ELCat 2016
111 Betoi-Jirara* dormant 0 Siefart and Hammarström, this volume
112 Camsá (Kamsá) kbh threatened 4,773 Crevels 2012
113 Candoshi-Shapra cbu vulnerable 1,586 Crevels 2012
114 Canichana* dormant 0 3 rememberers in year 2000
115 Cayubaba* (Cayuvava) cyb dormant <10 Siefart and Hammarström, this volume
116 Chiquitano cax threatened ~4,665 Crevels 2012
117 Chono* – dormant 0 Siefart and Hammarström, this volume
118 Cofán (A’ingaé) con threatened 1,017 Crevels 2012, ELCat 2016
(Continued )
TABLE 13.2 (CONTINUED)
level for these “safe” or “dormant” language isolates that were not in ELCat. Where more
recent data was available (e.g. from the chapters in this volume), I used the more recent
source for numbers of native speakers.
As the other chapters in this volume detail the geography, the histories, and the unique
typological features of individual isolates, in this chapter, I concentrate only on endanger-
ment statuses and numbers of speakers. Languages are presented in alphabetical order by
region (also in alphabetical order). The content of this chapter concerns endangerment of
language isolates rather than the nuances of which languages are in fact isolates, which
are unclassified, and which are perhaps small families. For these distinctions and discus-
sions of the status of particular languages, I defer to the other chapters in this volume.
Dormant: 55
Critically endangered: 11
Severely endangered: 10
Endangered: 21
Threatened: 43
Vulnerable: 10
Safe: 4
At risk: 1
Awakening: 4
____
Total: 159 language isolates
One of the challenges in keeping up with a language’s endangerment level is that its sta-
tus changes: it is a moving target. In some cases, even the most recent published speaker
number data for a language is already several years old. When there are only a few speak-
ers left or when the only speakers are very elderly, in those few years the numbers of
speakers may have changed significantly. Unfortunately, unless active efforts are made to
turn the tides of language loss, those changes usually result in an even more precarious
situation for the language. Even when there appear to be recent sources, sometimes the
data cited in these publications is from a much earlier source. There is a constant need for
more up-to-date information on the situation of these languages.
The following section provides a brief analysis of the endangerment status of the lan-
guage isolates for each of nine categories, the five LEI endangerment levels and the
four additional statuses, that is: dormant, awakening, critically endangered, severely
endangered, endangered, threatened, vulnerable, safe, and “at risk.” Each of the sections
discusses specific examples of language isolates from Table 13.2 and explains them in
greater detail. This breakdown shows concrete examples of ratings of the four factors and
how these were used to calculate the overall endangerment status of the language isolates.
It may be that there are many more extinct language families than the 95 of which we are
certain but that we only have clear information about these. It is also possible that some of
those 55 dormant language isolates were actually related to other now-extinct languages
about which no information has survived, and so based on the information available to us
today, they remain dormant language isolates.
In any event, this amount of extinction reveals a huge loss of the linguistic knowledge.
With 55 out of the 159 isolates already dormant, only 104 language isolates (65%) remain
in use today in any form, and as seen in Figure 13.1 and in Table 13.2, many of the still
surviving language isolates are highly endangered.
It may be instructive to examine a relatively recent case of language extinction in order
to understand the process from language endangerment to extinction and the rate at which
this process can occur. The dormant isolate Munichi was spoken in Peruvian Amazonia,
in the “southwestern part of the departamento of Loreto” (Michael et al. 2013). The lan-
guage is categorized as dormant, as there are no known fluent speakers. However, ELCat
does cite “3 rememberers” – those who are not currently fluent in the language, but who
do remember aspects of the language (ELCat 2015, Michael et al. 2013). In 2008 and
2009 linguists conducted fieldwork on the language. At that time, there were ten remem-
berers, none of whom self-identified as fluent in the language, and only three of whom
remembered any significant amounts of the language. These three last rememberers who
worked with field linguists were Alejandrina Chanchari Icahuate (approximately 90 years
old at the time), Melchor Sinti Saita (approximately 70 years old at the time), and Donalia
Icahuate Baneo (57 years old at the time of the fieldwork, born in 1951). However, of
even these three strongest rememberers, all of them “either had not used the language for
most of their adult life or were never fully fluent in the language” and were more com-
fortable speaking in either Spanish or Quechua (Michael et al. 2013: 311). The eldest,
Alejandrina, had not used the language in decades. Melchor had never been fully fluent.
According to oral histories, the last fluent speakers were born from 1915–1925. Around
1930 the language had already become moribund.1
There are other similar examples of language isolates that have gone from having a
couple of rememberers to no speakers very rapidly. Some of these other dormant isolates
threatened: 27%
dormant: 35%
endangered: 13%
critically endangered: 7%
severely endangered: 6%
are recognized as having had no known L1 speakers for some decades. Takelma, of Ore-
gon, was extinct by the 1940s, and at least one speaker of Alsea, also of Oregon, survived
into the 1940s. Chief Benjamin Paul and Delphine Decloux Stouff were the last two
speakers of Chitimacha, in Louisiana; Mrs. Stouff died last, in 1940. Sesostrie Youch-
igant was the last speaker of Tunica, in Louisiana; died after 1950. Madeline England,
the last speaker of Kungarakany (Australia), listed as dormant, passed away in 1989
(Frawley 1997: 119). Although not included in the list of 159 isolates in this chapter due
to purposes of consistency, Tuxá of Brazil (Seifart and Hammarström, this volume) had
two rememberers in 1961 (Moseley 2010). By the 1970s the language was found to have
no known native speakers (Meader 1978). (Because of the very limited attestation of this
language, many consider it not an isolate, but unclassified.)
“Today” in the quotation refers to 2005. In sources 11 years later, the numbers have
dropped. According to the most recent count, Kwaza has lost 72% of its speakers since
2005. As of the year 2016, there are 7 people living who speak the Kwaza language (Sei-
fart and Hammarström, this volume). This more recent number actually changes Kwaza’s
status from severely endangered to critically endangered. Ofayé (Opayé) has 12 speakers.
The count for Trumai was 51 speakers in 2007. There may be even fewer now.
the label for the specific LEI status between “threatened” and “severely endangered.”
In Section 3.4 the term “endangered” refers to the specific LEI status. Laal is the only
endangered isolate (in the LEI sense of the word), spoken in Africa.
The majority of language isolates categorized as “endangered” (in LEI’s technical
sense) – 16 out of 21 – are spoken in the Pacific. They are: Bogaya (300 speakers), Bur-
meso (250 speakers), Busa (Odiai) with 238 speakers, Guriaso with 160 speakers, Kaki Ae
(266 speakers), Kamula (800 speakers), Kosare (about 250 speakers), Mawes with about
850 speakers, Moraori with 50 speakers, Morwap (Elseng) with 300 speakers, Powle-Ma
(Molof) with 200 speakers, Pyu (about 100 speakers), Sause (300 speakers), Tanahmerah
(500 speakers), Yale with about 600 speakers, and Yerakai (380 speakers). (See Hammar-
ström, this volume.)
Two of the 21 “endangered” isolates are spoken in South America. Andoque is spoken
by about 597 speakers in southern Colombia. According to data from 2001, Jotí was spo-
ken by 767 speakers in central Venezuela, 100% of the ethnic population.
Huave and Seri are the two isolates spoken in the Mexico with the LEI level “endan-
gered.” Huave is spoken in the state of Oaxaca along the Pacific by about 15,993 accord-
ing to the 2010 census. Seri is spoken by about 760 people in Sonora, Mexico, near the
Gulf of California and just west of Hermosillo.
3.6 Vulnerable
There are ten vulnerable isolates: one in Africa, Sandawe; there are three in the Pacific:
Ap Ma, Kapauri, and Wiru. The majority of vulnerable isolates – six – are in South Amer-
ica: Aikanã, Candoshi-Shapra, Moseten-Chimane, Páez, Pirahã, and Warao in northeast-
ern Venezuela. The geographic distribution of vulnerable isolates reveals that there are
no vulnerable isolates in North America, Central America, Mexico, Australia, or Eurasia.
While vulnerable is still a level of endangerment, it is the most vital below “safe.” It
may be that those isolates that are located in more geographically remote areas are more
“protected” from outside influence, enabling them to maintain a higher level of vitality.
Although there may still be pressure from dominant and neighboring languages to shift,
the pressure has not been as pronounced.
Endangerment of language isolates 361
3.7 Awakening
Currently several language isolates are “awakening,” including Chitimacha in South-
ern Louisiana, Esselen in California, Natchez in Oklahoma (originally from Louisiana),
and Tunica in Louisiana. Natchez, usually reported as dormant in some of the litera-
ture, is currently being awakened (Kent T. Fields, personal communication, October 27,
2016). Kent Fields, who has some knowledge of the language, has developed the Natchez
Nation’s online dictionary, and he is adding to it daily.
As a recently awakening isolate, Tunica has no native speakers. All who have some
degree of proficiency in it are learning it as a second language. Heaton (in press) has
analyzed and described some of the unique grammatical features of Tunica, including
its interaction between gender, “animacy, definiteness, and number” (Heaton in press).
One such feature is that Tunica is “one of the few languages of the world that shows
evidence of an unmarked feminine gender,” as opposed to the masculine gender being
the unmarked one (Heaton in press). This is one example of the significant contributions
to language typology that isolates can offer. When a language isolate is endangered, in
addition to the personal human and cultural losses, we are at risk of losing a greater than
normal piece of the puzzle in the field of linguistics as a whole.
Endangered
endangered
endangered
endangered
Threatened
Awakening
Vulnerable
number of
Critically
Dormant
Severely
isolates
number
At risk
Total
Total
Safe
Africa 1 0 1 0 1 2 1 0 0 6 5
Australia 6 0 0 0 0 1 0 0 0 7 1
Central America/ 1 0 0 0 2 1 0 0 0 4 3
Mexico
Eurasia 5 0 1 1 0 3 0 1 0 11 5
North America 13 4 3 2 0 1 0 0 0 23 10
Pacific 2 0 1 4 16 24 3 3 1 54 49
South America 27 0 5 3 2 11 6 0 0 54 27
Totals: 55 4 11 10 21 43 10 4 1 159 100
While South America and the Pacific have the same number of language isolates,
there are 27 extinct isolates in South America and only 2 extinct isolates in the Pacific.
There are many variables, so the reason for this extreme difference in vitality rates of
isolates in these two regions cannot be known with certainty. However, as mentioned
previously, the relative geographic isolation of islands in the Pacific may be a factor in
preserving the vitality of language isolates, especially in New Guinea, where there are
many languages and where contact with the outside world has been more recent and
less intense.
How does the distribution of endangered isolates compare with endangered languages
in general? Table 13.4 shows the count of endangered languages in each region accord-
ing to the Catalogue of Endangered Languages.2 The number in parentheses next to
the region name is the number of endangered language isolates in the region to show a
comparison.
1 Natural disasters
2 Disease
4 Famine
5 Economic (e.g. environmental changes, deforestation)
6 Political (war, genocide)
(Crystal 2000:70–76)
The following section provides examples of language isolates that have gone extinct due
to natural disasters and disease. Cases involving natural disasters are particularly tragic
because of loss of human life and language all at once. Their unpredictable nature and
devastating effects can cause a vibrant community and language to disappear almost
instantaneously. One historical example of this is Tambora, a language isolate that was
spoken on Sumbawa Island in Indonesia. It was spoken by roughly 11,000 people. In
1815 Mount Tambora erupted, killing virtually every person in the area, including all
speakers of Tambora (Raffles 1817, Donohue 2007).
Although located in Indonesia, Tambora was clearly not an Austronesian language
(Donohue 2007). All is known of the Tambora language are a few lexical items.
Taushiro is a critically endangered language isolate spoken in Peru. One of the primary
factors causing it to become critically endangered was disease, in combination with cul-
tural factors:
Due to an epidemic disease in the same decade and to the fact that most survivors
have intermarried with non-Taushiro speakers and have adopted Spanish or a variety
of Quechua, the language is now on the brink of extinction with 1 speaker out of an
ethnic group of 20.
(Crevels 2012: 213; ELCat 2015)
Taushiro has only one speaker and an ethnic population of only 20 individuals. As can
be seen in this case, disease and other physical dangers not only reduce language use,
but they put the entire ethnic population of a group at risk. This can be contrasted with
metadata for languages at risk due to cultural dangers.
All members of a language community have the right to interrelate with and receive
attention from the public authorities in their own language. This right also applies to
central, territorial, local and supraterritorial divisions which include the territory to
which the language is specific.
(UNESCO 1996:7)
6.1.2 Education
Appropriate policies regarding minority language use in education could also help curb
the loss of language isolates and languages generally. Section II, Article 23, of the Uni-
versal Declaration on Linguistic Rights states that:
(2) Education must help to maintain and develop the language spoken by the lan-
guage community of the territory where it is provided; (3) Education must always
Endangerment of language isolates 365
be at the service of linguistic and cultural diversity and of the harmonious relations
between different language communities throughout the world.
(UNESCO 1996:9)
Currently, the opposite of this is happening in many locations – where official policies
favor compulsory education in the majority language, further pushing out endangered
languages.
However, Romaine (2002) points out that the creation of language policy alone is not
enough to ensure the maintenance of a language. Many language policies (including the
Universal Declaration on Language Rights) sought ideals, but the policies are not imple-
mented or enforced. The number one factor resulting in language death is a lack of inter-
generational transmission – i.e., parents do not teach the language to their children in the
home or children do not attempt to learn their parents’ language, so the next generation
does not speak the language.
The solution may not be to abandon efforts to create effective language policies (e.g.
granting minority languages official status, providing for multilingual education). Rather,
efforts could be supplemented with a focus on implementing policies. In the past, govern-
ment and institutional policies have often played a major role in the demise of languages
(e.g. the well-known cases of boarding schools in the US, Canada, and Australia, where
indigenous children were forbidden to speak their native language or were punished for
doing so).
Nonetheless, changes in official policies have also opened up the way for language revi-
talization programs in some contexts. For example, in 1978, an amendment to the state
of Hawai‘i constitution designated Hawaiian as an official language on par with English
(Lucas 2000, Romaine 2002).3 This opened the way for the Hawaiian language revital-
ization movement. The next three decades saw the development of Hawaiian immersion
schools, Hawaiian immersion programs at the university level, students writing masters
theses completely in Hawaiian, and Hawaiian language announcements at local airports,
among other breakthroughs. Official status for and education in the language can assist
the revitalization of a language. However, while official status and formal education can
assist language maintenance, these do not guarantee intergenerational transmission. They
do, however, provide the possibility of expansion of domains where the language can be
spoken, increasing the likelihood of use.
Hinton’s (2013) Bringing Our Languages Home emphasizes the importance of inter-
generational transmission within the home. There have been examples of successful
efforts of intergenerational transmission within the home, without assistance from official
status or formal education policies, e.g. the Baldwin family’s successful revitalization of
the Miami language (Baldwin et al. 2013, Hinton 2013).
6.2.1 Karuk
Karuk, spoken in northern California, is another isolate that is being revitalized. There
are fewer than a dozen native speakers of Karuk; however, many members of the Karuk
366 Eve Okura
tribe are studying and learning the language. However, in contrast to Tunica, Karuk
revitalization efforts began before its last fully fluent native speakers passed away, so
although it is critically endangered, it has never gone through a period of being com-
pletely dormant (i.e. of having no living speakers).
This desire for the upcoming generation to have access to the Karuk language has
motivated development of Karuk language programs for children. There is currently a
curriculum for kindergarten to third grade (approximately 5 year olds to 8 year olds)
focused primarily on language revitalization, but instruction also includes traditional
cultural knowledge, values, and practices. The tribe is working on a curriculum for kin-
dergarten to 12th grade (for those 5 years old to 18 years old) for any schools that are
interested in incorporating it. The community is working with a linguist. In addition,
from 2008 to 2011 the Karuk tribe received an ANA (Administration for Native Amer-
icans) grant for a Master-Apprentice program. There are also online resources, includ-
ing a Karuk-English online dictionary and Karuk texts4 (Beck, personal communication,
August 20, 2015; Cramblit 2011).
6.2.2 Natchez
The Natchez Nation is working to awaken the Natchez language. Kent T. “Hutke” Fields
is heading the effort with the compilation of an online English-Natchez dictionary.5 The
language is considered dormant, now awakening, and has had no known native speakers
since at least 1965. Fields has some knowledge of the language. The Natchez Nation also
participates in the Breath of Life workshop held at the Smithsonian Institution as part of
their efforts to recover as much of the language as possible.
6.2.3 Tunica
The Tunica community and linguists are collaborating to revitalize the Tunica language,
one of five awakening isolates. The Tunica-Biloxi Tribe of Louisiana and Tulane Uni-
versity work together to develop the Tunica Language Project. Efforts began in 2010.
Since then, the partnership has developed summer language camps since 2012, a practical
orthography, children’s books, and audio recordings in Tunica6 (Maxwell 2014).
he cites are language isolates or possible language isolates.7 Table 13.4 reports levels of
documentation for the least documented isolates according to Hammarström (2010).
The reality is, if the language has not been recorded and becomes extinct, there is
no extant scientific/linguistic method to retrieve that lost knowledge. If there has been
linguistic documentation prior to the death of the last fluent speaker (video and audio
recordings with transcriptions and morpheme-by-morpheme glosses, dictionaries, and
grammars are best; scant wordlists and grammar sketches are better than nothing), then it
may be possible to attempt to “wake” the language up.
Dem, Kapauri, Kimki, and Nihali are categorized as less at risk than some of the oth-
ers. However, even though Kapauri (Kapori) is categorized as “vigorous” by Ethnologue
(due to high intergenerational transmission), its total number of speakers is only about
200. (The Catalogue of Endangered Languages considers it “vulnerable.”) As was seen
in the case of Tambora, even a language that has several thousand speakers could disap-
pear in an instant if catastrophe were to strike.
7 CONCLUSION
The vast majority of the world’s living language isolates – 100 out of 104 (96%) – are in
danger of disappearing. Approximately 35% of all known language isolates have already
368 Eve Okura
become extinct. Of the living language isolates, even within the handful of “safe” or
“vigorous” languages (of which there are only four), Anêm, Duna, and Maybrat may be
considered to have relatively small numbers of speakers.
Intergenerational transmission in the home is the primary way for a language to be per-
petuated. While official policies can be helpful, even without such policies, individuals,
families, and communities can collaborate to revitalize a language on their own. However,
official policies – including guaranteeing language rights and formal education – when
implemented, can provide external reinforcement of efforts from within the home and the
community.
For linguists (including linguistic students) looking for a language to document, some
of the languages that most urgently need to be documented are isolates that are both
endangered and among the least documented language families of the world. The most
urgent ones include: Busa, Guriaso, Kapauri, Kimki, Mawes, Mor of Bomberai, and
Sause. Their situation is dire because: (1) there is little to no linguistic description of
these languages; (2) they are on the verge of disappearing, which would eliminate any
chance of ever learning about these languages; and (3) they are one-member language
families. Once these isolates disappear, not only do we lose all chance at learning about
them directly, but there are no other languages related to them to give us any idea of the
genetic or typological uniqueness of the language or of the richness of information they
might have contained and provided.
NOTES
1 Campbell defines “moribund” as having fewer than ten speakers (1997: 107). Gener-
ally, those few speakers are very elderly, and no children speak the language anymore
(Crystal 2000: 21).
2 Data is from the August 2015 version of ELCat. There may have been minor changes
to the catalogue since then.
3 www.unesco.org/most/vl4n2romaine.pdf.
4 http://linguistics.berkeley.edu/~karuk/; see also: http://linguistics.berkeley.edu/~karuk/
links.php.
5 www.natcheznation.com/Language.html, accessed 11/1/16.
6 http://tunica.wp.tulane.edu/about/about-the-tunica-language-project/ (see also: http://
tulane.edu/liberal-arts/newsletter/tunica-jan-2014.cfm).
7 Hammarström (2010) lists 5 of the 23 least documented language families as possi-
bly isolates: (1) Shom Pen, (2) Asaba, (3) Baiyomo, (4) Mor, and (5) Tanahmerah.
Lepki in Hammarström’s (2010) list may be considered unclassified (cf. Campbell, this
volume).
REFERENCES
Arka, Wayan. 2012. Projecting morphology and agreement in Marori, an isolate of south-
ern New Guinea. Language Documentation & Conservation Special Publication No. 5
(December 2012) Melanesian Languages on the Edge of Asia: Challenges for the 21st
Century, ed. by Nicholas Evans and Marian Klamer, 150–173. https://scholarspace.
manoa.hawaii.edu/bitstream/10125/4563/1/arka.pdf
Austin, Peter, ed. 2008. 1000 languages: The worldwide history of living and lost tongues.
London: Thames & Hudson.
Endangerment of language isolates 369
Blench, Roger. 2005. Baŋgi me, a language of unknown affiliation in Northern Mali.
Cambridge: Cambridge University Press.
Blench, Roger. 2011. An Atlas of Nigerian Languages. [Available at http://www.
rogerblench.info/Language/Africa/Nigeria/Atlas%20of%20Nigerian%20
Languages-%20ed%20III.pdf] [accessed 12–26–2016.]
Blench, Roger. 2016. Language isolates in Africa. Language Isolates, ed. by Lyle Camp-
bell, Alex Smith, and Thomas Doughtery. London: Routledge.
Baldwin, Daryl, Karen Baldwin, Jessie Baldwin, and Jarrid Baldwin. 2013. Starting from
Zero. Bringing Our Languages Home, ed. by L. Hinton. California: Heyday.
Brenzinger, Matthias. 2011. The twelve modern Khoisan languages. Khoisan Languages
and Linguistics, ed. by Alina Witzlack-Makarevich and Martina Ernszt. Proceedings of
the 3rd International Symposium, July 6-10, 2008, Riezlern/Kleinwalsertal.
Campbell, Lyle. 1997. American Indian Languages: The Historical Linguistics of Native
America. Oxford: Oxford University Press.
Campbell, Lyle. 2013. Historical Linguistics (3rd ed.). Cambridge: MIT Press.
Campbell, Lyle. In press. Languages of South America. Atlas of the World’s Languages,
ed. by Christopher J. Moseley and Ronald E. Asher. London: Routledge.
Capistrán, Alejandra. 2015. Multiple Object Constructions in P’orhépecha. Leiden &
Boston: BRILL. http://csh.izt.uam.mx/sistemadivisional/SDIP/proyectos/archivos_rpi/
dea_25585_39_467_1_2_1.%20Galera%20_Capistran%20Garza_text_proof-02.pdf.
Cramblit, André. 2011. Karuk Language Restoration Committee. Native News Network.
http://nativenewsnetwork.posthaven.com/karuk-language-restoration-committee
(20 August 2015).
Crevels, Mily. 2012. Language Endangerment in South America: The Clock Is Ticking.
The Indigenous Languages of South America: A Comprehensive Guide, ed. by Lyle
Campbell and Verónica Grondona, 167–234. Berlin: Mouton de Gruyter.
Crystal, David. 2000. Language Death. Cambridge: Cambridge University Press.
Donohue, Mark. 2007. The Papuan Language of Tambora. Oceanic Linguistics 46, no. 2:
520–537.
Evans, Nicholas. 2001. The Last Speaker Is Dead-Long Live the Last Speaker! Linguistic
Fieldwork, ed. by Paul Newman and Martha Ratliff, 250–281. Cambridge: Cambridge
University Press.
Fishman, Joshua. 1991. Reversing Language Shift. Clevendon: Multilingual Matters.
Frawley, William J. 1997. International Encyclopedia of Linguistics. Oxford University
Press.
Galla, Candace K. 2009. Indigenous language revitalization and technology from tradi-
tional to contemporary domains. Indigenous Language Revitalization. Encouragement,
Guidance & Lessons Learned, ed. by Jon Reyhner and Louise Lockard. Flagstaff, AZ:
Northern Arizona University.
Gautam, Bimal. 2012. Nepal’s mystery language on the verge of extinction. British
Broadcasting Corporation (BBC). 13 May 2012. http://www.bbc.com/news/world-
asia-17537845.
Gobierno Vasco (Basque Government). 2012. V. Inkesta Soziolinguistikoa. Servicio Central
de Publicaciones del Gobierno Vasco. http://www.euskara.euskadi.eus/contenidos/
noticia/inkesta_soziol_2012/es_berria/adjuntos/Euskal_Herria_Inkesta_Sozioling
uistikoa11_es.pdf.
Golla, Victor. 2011. California Indian Languages. Berkeley: University of California
Press.
370 Eve Okura
Moseley, Christopher (ed.). 2010. Atlas of the World’s Languages in Danger. Paris: The
United Nations Educational, Scientific and Cultural Organization. [Online version:
www.unesco.org/culture/languages-atlas/en/atlasmap.html. 28 April 2016.
Odé, Cecilia. 2002. A Sketch of Mpur. In Ger P. Reesink (ed.) Languages of the Eastern
Bird’s Head, 45–107. Canberra: Australian National University.
Olawsky, Knut. 2006. A Grammar of Urarina. (Mouton Grammar Library, 37.) Berlin,
New York: Mouton de Gruyter.
Raffles, Stamford. 1817 [1830]. History of Java, Vol. 2, appendix F, 198–199. London:
Black, Parbury and Allen (1817); London: J. Murray (1830).
Rogers, Christopher and Lyle Campbell. 2015. Endangered Languages. Oxford Research
Encyclopedia of Linguistics, ed. by Mark Aronoff. http://linguistics.oxfordre.com/.
Romaine, Suzanne. 2002. The Impact of Language Policy on Endangered Languages.
International Journal on Multicultural Societies. 4(2). www.unesco.org/most/vl4n
2romaine.pdf.
San Roque, Lila. 2008. An introduction to Duna grammar. Ph.D. Dissertation. Australian
National University. [Available at www.academia.edu/10281818/An_introduction_to_
Duna_grammar [accessed 3–14–2016.]
Shaul, David L. 2014. Linguistic Ideologies of Native American Language Revitaliza-
tion: Doing the Lost Language Ghost Dance. New York: Springer.
Silva, Léia de Jesus. 2010. Diagnóstico sociolinguístico do povo rikbaktsa. Levantam-
ento realizado no quadro do Projeto de Documentação da Língua Rikbaktsa. Setembro/
outobro de 2010. Museu do Indio/UNESCO. http://prodoclin.museudoindio.gov.br/
images/conteudo/rikbaktsa/produtos_pesquisadores/Diagnóstico_sociolin
gu%C3%ADstico-fim_recebido_lea.pdf
UNESCO (United Nations Educational, Scientific, and Cultural Organization). 1996. Uni-
versal Declaration on Linguistic Rights. Barcelona: United Nations. Available online:
http://unesdoc.unesco.org/images/0010/001042/104267e.pdf [accessed 4-12-2016.]
van der Voort, Hein. 2005. Kwaza in a Comparative Perspective. International Journal of
American Linguistics. October 2005, 71(4): 365–412.
van der Voort, Hein. 2013. Fossilised fictive quotation: Future tense in Aikan~a. Boletim
do Museu Paraense Em’ilio Goeldi. Ciências Humanas. 8(2). May/Aug 2013. 359–377
http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1981-81222013000200009.
van Driem, George. 2011. Tibeto-Burman subgroups and historical grammar. Himalayan
Linguistics, 10 (1) [Special Issue in Memory of Michael Noonan and David Watters]:
31–39. http://www.himalayanlanguages.org/files/driem/pdfs/2011TBsubgroups.pdf.
Wegener, Claudia U. 2008. A grammar of Savosavo: A Papuan language of the Solomon
Islands. Radboud Universiteit Nijmegen: Ph.D. thesis. http://pubman.mpdl.mpg.de/
pubman/item/escidoc:102834:4/component/escidoc:2300986/Wegener_2008_A%20
grammar%20of%20Savosavo.pdf.
Wurm, Stephen. 2007. Australasia and the Pacific. Encyclopedia of the World’s Endan-
gered Languages, ed. by Christopher Moseley, 425–577. London: Routledge.
INDEX
Catalogue of Endangered Languages 117, 232, Elamite 6, 26, 29 – 32, 33, 35 – 6, 347, 349
236, 243, 345, 347, 356, 357, 358, 359, 361, Elseng 7, 291 – 2, 351, 360
362, 367 endangered languages xii, xiii, 117, 149, 229, 232,
Caucasian languages 29, 32, 62, 68, 118, 140 235, 236, 242 – 3, 247, 250, 263, 273, 292, 293,
Cayubaba 9, 270, 353 301, 344 – 68
Cayuse 7, 198, 204, 206, 207, 210, 349 Enindhilyakwa see Anindilyakwa
Celtic languages 5, 14, 41, 42, 63 Esmeralda 9, 267, 354
Chadic languages 167, 174, 175, 181 Esselen 8, 199, 207, 209, 210, 211, 212, 238,
chance resemblance 165 – 6, 198, 203, 217, 218, 349, 361
264, 303, 305 Eteocretan 4, 43
Chapacuran languages 270, 273 Ethnologue 4, 162, 165, 178, 231, 232, 236, 242,
Chibchan languages 264, 265, 267, 244 243, 305, 328, 347, 361, 367
Chilanga 9, 229, 244 Etruscan 3, 7, 14, 24, 40 – 1
Chimané see Mosetén-Chimané Euchee see Yuchi
Chimariko 8, 199, 207 – 8, 210, 211, 212, 349 Evans, Nicholas 295, 324, 325, 327, 328, 329,
Chiquimulilla 9, 229 331, 333, 358
Chiquitano 9, 272, 353 extinct languages 4, 6, 7, 10, 11, 20, 105, 139,
Chitimacha 8, 214, 215, 216, 217, 232, 349, 358, 140, 150, 152, 157, 165, 167, 168, 179, 209,
361 213, 229, 246, 247, 248, 261, 262, 263, 264,
Chono 9, 275, 353 265, 269 – 70, 273, 274, 275, 288, 295, 300,
Chukchi-Kamchatkan languages 6, 139, 150 – 1 323, 325, 328, 331, 344, 347, 356, 357, 358,
Coahuilteco 8, 9, 194, 212, 213 – 14, 241, 349 362, 363, 367 – 8
Cofán 9, 268, 353
cognates 2, 62, 63, 73, 146, 148, 149, 163, 164, Fasu 7, 296, 351
165, 166, 177, 180 – 1, 183, 195, 202, 203, 204, Finno-Ugric languages 40, 83, 139, 141, 149
205, 207, 212, 216, 217, 218, 233, 238, 239, Foley, William A. 287, 292, 293, 294, 297,
244, 245, 249, 274, 293, 295, 298, 299, 300, 304, 305
303, 305, 325, 327, 329, 330, 332, 333, 334, Frachtenberg, Leo 196, 204 – 5, 206, 207
335, 336 Fulniôulniô 9, 274, 354
comparative method 11, 15, 59, 60, 62, 63, 71, 78,
104, 118, 153, 157, 233, 238, 249, 262, 308 Gaagudju 6, 331 – 2, 348
contact see language contact Gatschet, Albert 8, 193, 194, 195, 197, 204, 206,
Cotoname 8, 194, 213, 214, 349 208, 214, 215, 235, 238
counting see numeral systems Gayón 10, 265
Coyón see Gayón Georg, Stefan 2, 6, 16, 142, 144, 146, 151, 349
Crawford, James 197 – 8, 216, 238 Germanic languages 2, 3, 59, 118
Crevels, Mily 266, 270, 271, 273, 358, 359, 363 Gilyak see Nivkh
Cuica see Timote-Cuica Glottolog 165, 262, 288, 325, 328, 329, 331
Cuitlatec 9, 229, 230, 244, 246 – 50, 348 Goddard, Ives 118, 193, 194, 198, 200
Culli 9, 268, 354 Gorrochategui, Joaquín 2, 12, 71
cuneiform 19, 20, 21, 26, 27, 28, 30, 33 – 4, 35, grammaticalization 75 – 8, 80, 82, 83, 112,
44; see also writing systems 270, 289
Cunza see Kunza Great Andamanese 6, 140, 151, 152, 157 – 9
Cushitic languages 165, 167, 177, 178 Greenberg, Joseph 73, 100, 162, 163 – 5, 171, 179,
180, 237, 244, 329
Dahl, Östen 261, 288 Guachí 9, 275, 354
Damal 7, 291, 351 Guamó 9, 265, 354
dead languages see extinct languages Guanche 6, 167, 168, 183, 184
Dem 7, 291, 351, 367 Guató 9, 273, 354
Dene languages 29, 60, 118, 140, 199, 200, 201, Guazacapán 9, 229
202, 210, 211 Güldemann, Tom 164, 165, 167, 180, 181
de Saussure, Ferdinand 11, 12 Gulf hypothesis 197, 215, 216, 232, 233
Dibiyaso 7, 296 – 7, 351 Gumuz 6, 166, 167, 168, 177, 178
diffusion xiii, 13, 164, 204, 264, 268, 334, 338; Gününa-Küne 10, 275
see also areal linguistics Gunwinyguan languages 7, 324, 325, 327,
Dimmendaal, Gerrit 71, 165, 182 328, 331
Dixon, R.M.W. 71, 164, 165, 323, 324, 325, 328, Guriaso 7, 303, 351, 360, 367, 368
329, 330, 331, 333, 335
Dixon, Roland 199, 207, 208, 209, 210, 211, 238 Haas, Mary 83, 202, 210, 214 – 16, 232, 238
documentation see language documentation Hadza 6, 165, 167, 168, 171 – 2, 348, 360
Dompo 6, 167, 181 – 2, 183 Haida 8, 200 – 2, 350
Dravidian languages 24, 26, 32, 38, 73, 74, 139, Hammarström, Harald 5, 7, 9 – 10, 262, 263, 265,
155, 156, 157, 165 287, 288, 289, 292, 293, 305, 358, 359, 360,
Duna 7, 298 – 9, 351, 361, 368 366, 367
Index 375
250, 269, 270, 290, 291, 297, 301, 302, 303, Non-Pama-Nyungan languages 324, 326, 327,
304, 305, 306, 307, 325, 327, 332, 338, 356, 328, 329, 330, 331, 332, 334, 335, 337
361, 368 Northern Picene 4, 41
loanwords xiii, 11 – 13, 25, 39, 63 – 4, 69, 72, 79, Nostratic hypothesis 24, 32, 163
100 – 2, 166, 177, 182, 195, 197, 202, 207, 216, numeral systems 111, 148 – 9, 206, 209, 229, 239,
236, 244, 269, 276, 288, 289, 291, 295, 296, 241, 245, 250, 270, 274, 291, 296, 305
297, 298, 300, 302, 303, 305, 324, 334; see also
borrowing Odiai see Busa
Loukotka, C̆ estmír 262, 264, 269, 274 Odulic see Yukaghir
Lule 11, 14, 275 Ofayé 10, 354, 359
lumping vs. splitting 162, 163 – 4, 287 O’Grady, Geoffrey 325, 333, 334, 335
Omotic languages 165, 166, 178
Macro-Gé languages 270, 274, 359 Omurano 10, 269
Máku 10, 265, 354 Onge 157 – 8
Mangarrayi 6, 331, 332, 348 Ongota 6, 167, 168, 178, 348
Marlett, Stephen 230, 234, 235, 236, 237, 238, onomatopoeia 69, 198, 238, 244
239, 240, 241 Opayé see Ofayé
Marori 7, 294 – 5, 352 Oropom 5, 6, 167, 179
Masep 7, 290, 352, 359 orthography see writing systems
Matanawí 10, 274, 354 Otanabe see Muniche
Mato Grosso Arara 10, 274, 353 Otomaco 10, 265
Mawes 7, 292 – 3, 352, 360, 367, 368
Maybrat 7, 288, 352, 361, 368 Páez 10, 265 – 6, 267, 355, 360
Mbabaram 323, 333, 336 Pama-Nyungan languages 14, 323, 324, 326, 333,
Meillet, Antoine 11, 12, 62, 75, 83 334 – 7
Meroitic 6, 167, 168, 179, 347 Pankararé 5, 274
Michelena, Luis see Mitxelena, Koldo Papuan languages 287, 297, 298, 300, 305, 309,
Minoan Linear A 4, 39, 40, 42 – 3, 44 324, 337
Mithun, Marianne 8, 195, 200, 202, 212, 217 Pawaia 7, 297, 352
Mitxelena, Koldo 61, 62, 64 – 6, 68, 69 – 70, 71, Payaguá 10, 275, 355
72, 76, 79, 80, 81, 82, 83 Pele-Ata 7, 306, 352
Mochica 10, 266 – 7, 354 Penutian hypothesis 199 – 200, 205, 206, 210 – 12,
Molala 9, 198, 200, 204, 206, 207, 210, 211 213, 232
Molof see Powle-Ma personal names see proper names
Mongolic languages 139, 151 Philistine 39 – 40
Mor 7, 289, 352, 359, 367, 368 phylogenetic relationship see genetic relationship
moribund languages see endangered languages Picean see Northern Picene
Morwap see Elseng Pinche see Taushiro
Mosetén-Chimané 10, 272, 354, 360 Pirahã 10, 273 – 4, 355, 360
Movima 10, 270, 354 place names see toponyms
Mpra 6, 167, 168, 181 – 3 Poser, William xiii, 14, 15, 262, 287
Mpur 7, 289, 352 Powell, John Wesley 195, 197, 198, 199, 202,
Munda languages 74, 155, 156 204, 205, 206, 207, 208, 209, 210, 213, 214,
Muniche 10, 269, 354, 357 215, 216, 238
Mura 10, 274 Powle-Ma 7, 294, 352, 360
Muskogean languages 197, 215, 216, 217, proper names 2, 11, 12, 19, 25, 26, 29, 32, 33, 34,
232 35, 37, 39, 44, 60, 182
mutual intelligibility xi, 3 – 4, 104, 146, 195 – 6, Puelche see Gününa-Küne
198, 200, 217, 231, 347 Puinave 10, 263, 355
Muysken, Pieter 10, 11, 266, 270 Pumé 10, 264, 355
Puquina 10, 266, 355
Nagatman see Yale Purari 7, 297 – 8, 352
Nahali 6, 140, 155 – 7, 349, 367 Purépecha 9, 229, 230, 242 – 6, 250, 348,
Namau see Purari 360
Natchez 8, 214, 215, 216, 217, 232, 350, 361, Purí-Coroado 10, 355
365, 366 Pyu 7, 301, 352, 360
Nettle, Daniel 260, 261, 287
Nganyaaywana see Anewan Quechuan languages 244, 265, 266, 268, 363
Niger-Congo languages 163, 164, 170, 183
Nihali see Nahali Raetic 4, 41
Nilo-Saharan languages 163, 165, 166, 167, 175, revitalization see language revitalization
177, 178, 179 Rigsby, Bruce 198, 206, 207
Nivkh 6, 100, 102, 139, 146 – 9, 349, 359 Rikbaktsá 10, 355, 359
Index 377
Romance languages 2, 12, 59, 61, 64, 66, 71, 79, Tartessian 5, 14, 41 – 2
80, 82 Taruma 10, 264, 355
Ross, Malcolm 287, 289, 296, 297, 306, 308 Tasmanian languages 333, 334
Ruhlen, Merritt 118, 237 Taushiro 10, 269, 355, 358, 363
Ryukyuan languages 3, 151 Tawasa 9, 197
Tequiraca 10, 269, 355, 367
Sabela see Waorani Thomason, Sarah 59, 274
Salinan 9, 199, 207, 209 – 10, 211, 212, 238, Tibeto-Burman languages 24, 73, 139, 154, 156
239 – 40 Timote-Cuica 10, 265
Salvadoran Lenca see Chilanga Timucua 9, 193, 197 – 8, 214, 216, 244, 264
Samoyedic languages 83, 139, 140, 149, 157 Tinigua 11, 266
Sandawe 6, 165, 167, 168, 171, 179 – 80, 348, 360 Tiwi 6, 323, 325, 329 – 30, 348, 360
Sands, Bonny 165, 167, 171, 172, 173, 180 Tlingit 199, 201 – 2
Sapé 10, 263, 355 Tol 9, 229; see also Jicaque
Sapir, Edward 196 – 9, 201, 202, 205, 206, 207, Tonkawa 8, 193, 194, 214, 350
208, 209, 210, 211, 212, 213, 214, 215, 216, toponyms 4, 8, 11, 12, 14, 25, 35, 41, 61, 67,
232, 238 80, 100, 101, 193, 206, 248, 262; see also
Sause 7, 294, 352, 360, 367, 368 proper names
Savosavo 7, 308, 352 Touo 7, 308, 352
Schmidt, Wilhelm 307, 308, 324, 334 Trans New Guinea languages 289, 293, 294, 295,
Schuchardt, Hugo 60, 63, 67 297, 298, 299, 301
Sec see Sechura Trask, Robert 2, 3, 12, 13, 59, 62, 67, 70, 72, 73,
Sechuran, Sechura 10, 267 76, 78, 85, 87
Seifart, Frank 9, 10, 268, 276, 358, 359 Trumai 10, 274, 355, 359
Semitic languages 19, 20, 24, 38, 39, 40, 43, 44, Tungusic languages 102, 139, 140, 146, 151
45, 59, 62 – 3, 67 Tunica 8, 214, 215, 216, 217, 232, 350, 358, 361,
Sentinelese 5, 6, 158 365, 366
Sepik languages 302, 303 – 4, 305 Tupian languages 263, 268, 270, 272, 273,
Seri 9, 199, 210, 229, 230, 234 – 41, 250, 348, 360 274, 275
Shabo 6, 166, 167, 168, 177 – 8 Turkic languages 2, 72, 73, 139, 140, 142, 151
Sibundoy see Camsá typology see linguistic typology
Sinitic languages 67, 139
Sino-Tibetan languages 139, 140 Uhlenbeck, Cornelius 60, 63, 64, 68, 78, 82
Siouan-Yuchi hypothesis 200, 215, 216 Uhunduni see Damal
Siuslaw 8, 196, 204 – 5, 210, 212, 350 unclassified languages xi, 4, 5, 6, 7, 8, 9, 10, 11,
Slavic languages 2, 63, 118 15, 20, 32, 33, 39, 40 – 5, 167, 193 – 5, 213, 262,
Solano 4, 8, 194, 214 273, 292, 293, 294, 304, 333, 356, 358
sound correspondences 62, 165, 166, 199, 204, uncontacted languages xii, 5, 152, 158, 269, 276
205, 206, 215, 238, 239, 291, 327 undeciphered scripts 4, 5, 32, 35, 36, 38, 42 – 5,
Sprachbund see linguistic area 179
spurious languages 5, 37, 167, 179 UNESCO 232, 243, 364, 365
Starostin, Sergej A. 118, 151, 166 Uralic languages 24, 62, 67, 72, 73, 83, 139, 140,
Suaréz, Jorge 12, 231, 232, 233 141, 142, 149, 232
Suarmin see Asaba Urarina 10, 269, 355
substrate languages 27, 82, 162, 167, 176, Uruak see Arutani
177, 184 Usku see Afra
Sulka 7, 307, 352 Utian languages 199 – 200, 210, 212
Sumerian 6, 20 – 4, 26, 27, 33, 34, 35, 37, 43, Uto-Aztecan languages 200, 212, 213, 236
140, 349
Swadesh, Morris xiii, 196, 205, 214, 215, 232, Vajda, Edward 2, 142, 144, 146
239, 243, 244, 249 van Egmond, Marie-Elaine 325, 327, 328, 331
Swanton, John 193, 194, 197, 214, 215 Viegas Barros, José Pedro 10, 11, 14, 275, 276
Vilela 11, 14, 275
Tabo 7, 296, 352 Vovin, Alexander 12, 100, 101, 104, 105, 106, 151
Taiap 7, 301 – 2, 352, 359
Tai-Kadai languages 100, 139 Waia see Tabo
Takelma 8, 200, 205 – 6, 210, 211, 212, 350, 358 Waorani 10, 269, 355
Takelman hypothesis 200, 205 Warao 10, 244, 264, 355, 360
Tallán 267 Warumungu 323, 335, 336, 337
Tambora 7, 288, 352, 363, 367 Washo 8, 199, 207, 208 – 9, 210, 211, 212, 238,
Tanahmerah 7, 289, 352, 360 350, 359
Taparita 10 Westphal, E.O.J. 165, 167, 180
Tarascan see Purépecha Wiru 7, 301, 353, 360
378 Index