Paleo-Balkan languages

Although it's been a fascinating hobby of mine for many years to follow the story of the Indo-Europeans as a linguistic, cultural, and even genetic group, it's the fairly recent studies of Corded Ware genetics and other archaeogenetic work that have really brought it back into my consciousness, and no doubt you've noticed that I've posted more about it recently than I have in quite a while.  I'm not necessarily posting about stuff that's new even to me; it's just that it's been rolling around in my head for months (or longer) and is finally coalescing into actual posts.  I realize that probably not very many of the people who come my way are really interested in it, although the spread of Indo-European with the maps post does have a fair number of page-views, so maybe I don't know what I'm talking about anyway.

So... the Balkans.  Holy cow, the Paleo-Balkan linguistic situation is difficult.  What in the world was going on?  There are a number of Neolithic cultures throughout the Balkans and surrounding area that are not believed by anyone (except proponents of the non-mainstream Anatolian Urheimat theories) to be Proto-Indo-European, although some of them would have been neighbors of the speakers of early Proto-Indo-European in the Dnieper-Donets and Samara cultures, from whom they differed in everything from skeletal robustness to almost every aspect of material culture.  In fact, as these Balkan-Danubian Neolithic cultures developed, they actually became the most populous cultures on the planet at the time, and vast (by Neolithic standards) cities of up to 50,000 people or even more are posited to have been built there.  They thrived the most during the Atlantic period, which was a climatic optimum for much of Europe (warmer and wetter than today; tell that to your climate change catastrophe Green Cult friends) which, of course, made it easy for them to thrive.  They are characterized as having had relatively little social stratification or specialization and as having practiced some subsistence farming, supplemented by some hunting and gathering.  The nearly 3,000 year long Cucuteni-Tripolye cultural complex is the apex of this Old European expression.  The best guess for what happened to it is the Blytt-Sernander sub-boreal climate phase, which was like a Neolithic Dust Bowl: colder and drier conditions for many years devastated the ability of the territory to support, agriculturally, the population that lived there.  On the steppes, pastoral nomadism increased sharply, also believed to be a response to this climate change, and paved the way for Yamnaya expansion at the expense of the Cucuteni-Tripolyans, in turn leading to the spread of Indo-European dialects into the Balkans, and the Old European way of life was gone for good.  I have not yet seen any archaeogenetic studies that suggest for sure that there was significant population replacement, but there may have been.

Regardless, there was population mingling going on for some time.  Prior to the collapse of the Balkan-Danubian Old Europe, the Bug-Dniester culture, which would have been early Proto-Indo-European originally, is believed by David Anthony to have been absorbed into the Old European Criş farmer system and been "Old Europeanized."  On the other hand, the Suvorovo-Novodanilovka complex is supposed to have been a number of Sredny Stog early PIE speakers who moved into and dominated the lower Danubian region, which had formerly been Old European; this movement is supposed to represent the splitting off of the Anatolian language family.  Later, the Usatovo culture came out of the steppes into the Balkan region, possibly representing "Indo-Europeanization" and the installation of elites and client kings from the Yamnaya horizon in the region.  Later still, genuine folk migration followed into this area, but here we get more into just-so stories that are difficult to verify.  Whether the Corded Ware horizon came from this interaction, or from some other development happening independently to the north, is difficult to determine.  It's not necessary to derive all of the Indo-European languages of Europe from movement into the Balkan-Danube region, but Anthony kind of does.

Needless to say, however, the Old European linguistic and cultural horizons clearly disappeared, and cultures descended from the Yamnaya common PIE speakers prevailed.  Whether it was population replacement alone, or population replacement supplemented with absorption and client relationships, that transformed formerly Old European peoples into Indo-European peoples culturally and linguistically is unclear, but one or both of them clearly happened.  And the modern languages of Europe came out of the mix.

Probably the languages of Northern Europe came from the Corded Ware horizon rather than the cultures that remained in the Balkan-Danubian region.  The Germanic, Baltic and Slavic languages probably have their genesis here.  The Celtic and Italic may have come out of the Balkans, but moved further west fairly early and missed out on subsequent development within the Balkans and links between the Balkans and the steppes.  There's believed to have been a lingering continuum of some IE dialects into a very late PIE that had already shed the "fringe" languages to the west, the northwest and the far east.

However, discounting these families, there are still a lot of groups of IE that have to come out of the Balkans over the next couple thousand years, and how that happened is often difficult to determine.  Exactly how those languages relate to each other is equally difficult to determine.  They may belong to families, or they may be individualized families that are only known from (poorly attested) single languages.  We just honestly don't know.  There are, however, three languages that appear to have originated in this paleo-Balkan complex that still survive, although with varying degrees of distortion due to political and linguistic pressure from Hellenization, Romanization, Slavicization, Iranianization, Turkification, etc.: Albanian, Greek, and Armenian.  Tying these to para-historical languages, however, is fraught with difficulty.

Let's explore the landscape just a bit and see if we can at least see what the scope of the problem is.  Let's start with languages that are derived from the Balkans, but best attested elsewhere:

  • Anatolian.  Although they are attested in Anatolia (duh) they are presumed to have arrived there from the Balkans, and are part of the first wave of IE languages to leave the steppes and assorted steppe river valleys.  That said, when subsequent waves hit the Balkan-Danubian region, languages related to those that later appeared in the Anatolian area were probably still spoken here, along with some non-Indo-European languages.  There are calls to relate some later-appearing languages (like Mysian) to the Anatolian language family.  Strabo himself calls Mysian a blend of Lydian and Phrygian.  The supremely poorly attested Paeonian language, probably related to Mysian in some way, may be an Anatolian relative that survived in the Balkans to (just barely) be noted in the historical epoch in its original homeland.  Others have suggested that it's probably just related to Phrygian and not a member of the Anatolian family at all.
  • Armenian.  This is supposed to have come from the Balkans largely because of historical testimony rather than archaeological or linguistic testimony.  Given that ancient writers didn't have our same linguistic paradigm, they could be wrong.  However, Armenian is often considered to have originally been quite close to Greek, either as part of a Sprachbund or as a genetic close cousin.  It's obviously also had a lot of late contact influence from Iranian languages, particularly Parthian and Persian, but it's been suggested by some that a very late PIE continuum containing the Indo-Iranian languages, the Graeco-Phrygian languages and Armenian may have persisted and had some unique development.
  • Phrygian.  The language of the Phrygians, who invaded Anatolia after the fall of the Hittite Empire during the Bronze Age Collapse.  According to Herodotus, they were a south Balkan tribe called the Bryges before being known as the Phrygians.  Although a very poorly known language, it has a number of similarities with Greek in particular, and is usually presumed to have been relatively closely related to Greek.  It shares the augment, a supposed late isogloss that affected Greek, Armenian, Phrygian and Indo-Iranian, and has some sound changes that appear to be common with Germanic, although that theory has been in and out of favor, and exactly what it means is, needless to say, very unclear.
  • Mysian.  As noted above, Mysian was called, by Strabo, a mix of Lydian (an Anatolian language) and Phrygian, and the Mysians are recorded as living just to the east of the Troad along the Dardanelles and Propontis coast.  It may have been an Anatolian language, or some other Paleo-Balkan language; Athenaeus wrote that it was related to the Paeonian language, spoken north of Macedonia.
  • Messapic.  First attested in the boot of Italy, Messapic is clearly an Indo-European language, and not related to the Italic family.  Some believe it shows links to the Illyrian language, but both are too poorly known to say this with much confidence.
  • Greek.  And let's not forget the most famous, best-known, and one of the earliest attested of the languages to have come out of the Balkan complex: Greek itself.  The earliest actual Greek texts, from the Bronze Age Mycenaean palace civilization, show the language already well established in Greece, but archaeologically it's not hard to trace the arrival of kurgan-like burial rites and material culture from the Balkans.  This, along with apparent connections to other languages like Phrygian and Armenian that were also derived from the Balkans, as well as late shared isoglosses that appear to come from contact with Indo-Iranian, means that the Greeks most likely came from the Balkans as well.
And of the languages that are later attested as native to the Balkans when they first show up, we have yet more:
  • Macedonian.  This is a poorly known language that clearly underwent a lot of influence from Classical Greek, but some suggest that it is not merely a northern dialect of Greek but a separate language altogether.  Suggestions include that it is a close sister language of Greek within a Hellenic family; a creole of sorts between Illyrian and Greek; a language that unites Greek with maybe Thracian or Phrygian; or perhaps just part of a Sprachbund that included very archaic proto-Greek, Phrygian, Thracian and Illyrian.
  • Thracian.  The Thracians are well known from the southeastern Balkans, and lived there long enough to be familiar to both Greeks and Romans, although neither bothered to record much of their language, much to our disappointment when it comes to trying to classify it.  While it's often been compared to Illyrian, Phrygian or other Balkan languages, for the most part it's now only considered to have (probably) been closely related to Dacian.  There are some interesting proposals that there may well have been a dialect continuum between Thracian and Scythian, stretching from Thrace all the way into the steppes, and some have proposed the even more poorly known language of the historical Cimmerians (who destroyed Phrygian power in Anatolia) as the "missing link" between Thracian and Iranian.
  • Dacian.  While most presume that the language of the Getae and Dacians (said by various ancient historians to be the same language) is closely related to Thracian (Strabo seems to have believed so) some modern linguists say that this cannot be determined and spot what they believe to be significant differences between the scanty remnants we have of Thracian and Dacian, making their close association impossible.  This is the minority opinion, however—Dacian is usually believed to be the northwestern extension of a Daco-Thracian language family.
  • Illyrian.  The Illyrians were a fairly populous group, known to both the Greeks and the Romans, and often traditional rivals of the northern Greeks; both Alexander and Philip before him fought Illyrians on numerous occasions.  It's worth noting, however, that the Greek concept of Illyrioi and the Roman concept of Illyricum were quite different, and may not have covered exactly the same bodies of people.  Some linguists have suggested that this marks rather notable dialectal differences, but it may have been more a case of familiarity; the Greeks applied the name only to the tribes that bordered them directly.  Making up a broad band along the eastern shore of the Adriatic Sea and inland a fair bit from there, the Illyrian region bordered Greek on the southeast tip, Thracian north of that, Dacian north of that, and various Celtic tribes to the direct north.  Illyrian is believed to be related to the Venetic language, which appeared in northeastern Italy to the northwest of the Illyrian band, but this is uncertain.  The Liburnian language was spoken on a small portion of the coast surrounded on land by the Illyrian area, near the Venetic area in the northwest; it is also believed (based on geographical convenience rather than sound linguistic data, of which there is practically none) to have been closely related to Illyrian, if not a dialect of it.  Few linguists are comfortable linking Illyrian to any other Balkan language, except through obvious contact with Celts, Thracians, and Greeks.  Most likely it was a once significant family in its own right, with Venetic and Liburnian the only two languages that stand out from a vast unknown sea of "Illyrianness," and, again based on geographic convenience, Albanian is probably descended from it.  But all of that is really quite speculative.
  • Paeonian.  As mentioned above, this was spoken between Macedonian and Illyrian and is referred to by ancient historians as being similar to the Mysian language spoken near Troy.  If that's true, then it could be an Anatolian language, or close relative, that never made it into Anatolia and retained its historical Balkan-Danubian location.  Of course, it's also certainly possible that Mysian was not an Anatolian language at all, but one related to Greek, Macedonian and Phrygian.
All in all, the Paleo-Balkan linguistic situation is quite a mess.  Curiously, the very earliest proto-writing in the world, predating Sumerian cuneiform by a thousand years, is found in the Balkans: the Old European Vinča script.  But that was Old European, and after what was probably a climate change crisis, followed by a social crisis and an invasion of foreign pastoralists representing the influx of Indo-European language, culture, and economy, the region went illiterate and did not really become literate again until it became a series of Roman provinces.  When that happened, however, the native languages were ignored, except for a few curiosities noted by some historians, and the writing was all done in Latin or Greek.  Waves of linguistic and political influence from Greek (ancient, Classical and later Byzantine), Latin, the expansion of the south Slavic tribes, the migrations of the Huns and later the Magyars, and the final political domination by the Ottomans and the Austro-Hungarian empire have obscured whatever was once there.  It is now impossible to sort out what was going on linguistically for much of this period, and the best we can do is extrapolate with our fingers crossed based on the identification of tribal names and peoples referred to by the Greeks and Romans, Ottomans and Byzantines.  By the early Middle Ages, most of these languages were either already extinct or fast heading that way, with the exception of those that are still spoken today (Greek, Armenian and Albanian), to be replaced by south Slavic languages, by descendants of Vulgar Latin like Romanian and Moldovan, or by intrusions from further east like Hungarian.

The curious thought here, though, about the Paleo-Balkan mess, is that we have just enough information to discern what a mess it is and how much fluctuation of peoples and languages was going on, but not enough to decipher it.  Does this mean that our view of the Corded Ware horizon staidly evolving into Baltic, Slavic and Germanic, etc. without interruption, or of the Andronovo evolving into Iranian, etc., is too simplistic, and that we just have no idea how to even discern how much churn and fluctuation was going on there, which may well have been as bad as it was in the Balkans?  Interesting question.  But we don't know what we don't know, and we don't know how to know any more than we do now, barring the completely unexpected discovery of some new ancient texts buried somewhere.

Based on all of that, do I dare say what I think the linguistic situation may have looked like?  Sure, with the clear caveat that my opinion is fairly speculative, and I'm just a fan of the discipline, not a professional.  I think the Old European languages made up a family that was related in some way or other to the Minoan language or the Etruscan language (or maybe those represent two separate waves of linguistic penetration).  During the height of their success, they were actually drawing early proto-Indo-Europeans into their orbit rather than the other way around; the Bug-Dniester culture is an example.

Later, an early wave of proto-Indo-Europeans moved into the area following changes in agriculture and culture, and was probably also prompted by some climate change.  This is the source of the Anatolian language family.  Although it probably did not completely replace the Old European languages on its way to Anatolia, it did leave populations of Anatolian language speakers in the area, which left some of the poorly attested substrates that later waves of Indo-Europeanization dominated.

As subsequent migration moves into the area, population hybridization picks up so much of the Indo-European genetics that, like the Corded Ware situation, although probably not as drastically, we start to get a fairly thorough Indo-Europeanization.  This relatively early post-PIE dialect continuum stretches from the new frontier in the Balkans to equally expanded new frontiers in the east, out beyond the Volga and across the distant steppes of Central Asia, where the Afanasevo culture has recently appeared.  There is continued movement west by dialects of late PIE that will later emerge as Celtic and Italic (and maybe Nordwestblock, or other groups unknown today), although those that remain in the Balkans and just north of them still retain enough contact with their original homeland in the steppes that linguistic isoglosses can be shared with the later PIE that will develop into Indo-Iranian.  Even the earliest northern Corded Ware cultures that will develop into Slavic and Baltic have some lingering linguistic contacts.  The languages that remain in the Balkans start to split a bit and undergo further development.  From a continuum that includes pre-Greek, pre-Armenian and pre-Phrygian, first Greek and then Armenian and Phrygian eventually migrate out of the Balkans and establish colonies in Greece, Armenia and Anatolia.  They may have been pushed southward by population pressure (possibly associated ultimately with the migrations of the Sea Peoples), possibly from the arrival of a large Thraco-Dacian language group from further north, which may explain observed similarities between what little we know of Thraco-Dacian and both Baltic and Iranian languages.  The Thraco-Dacians might well have moved into the area due to population pressure from even further away, including the expansion of the early Celts, as the La Tène culture had a much larger geographic extent than its ancestor, the Hallstatt.  Another group, the Illyrian (and Venetic, etc.) languages, hugs the east coast of the Adriatic Sea and is not closely related to either the Greek-Armenian-Phrygian group or the more recently arrived Thraco-Dacian group.

For many generations, echoes of the former languages still linger.  Pelasgian, Minoan, Lemnian, etc. might be remnants of the Old European languages that prevailed here in the earlier Neolithic.  Anatolian substrates might have lingered in the form of the Paeonian and Mysian languages.  Some forms of the family that broke apart and gave us Greek, Phrygian and Armenian lingered, perhaps, also to be seen in Mysian and Macedonian.  The Illyrians manage to withstand pressure from all comers to some extent, but the rest of the region is eventually swamped by the Thraco-Dacian group.  These are in turn pushed by the Celts and then decimated linguistically by the Romans in historical times, but the Romans are unable to continue to dominate politically.  While some languages descended from Latin still remain (Romanian, Moldovan), and the continued spread of later Byzantine Greek is politically important, linguistically the spread of south Slavic tribes becomes the most important "recent" development, as it swamps much of the Latin and possibly Greek that had replaced the Thraco-Dacian.  Illyrian survived, much distorted by pressure from various languages, as Albanian.  I do like the idea of Albanian being instead a much distorted Thracian or Dacian language, one that originally came from outside the region, replacing and displacing language groups like Greek, Armenian, Phrygian, etc., because it offers a possible explanation for surprising correspondences between Albanian and Germanic and Balto-Slavic.  But I think the fact that Albania is located smack dab in the same territory as Illyricum makes Illyrian descent the better null hypothesis.

Finally, squashed between Latin and Byzantine Greek, swamped by Slavic migrants, and later dominated by both the Austrians and the Ottomans (as well as having to suffer further invasions by the Huns, the Avars, the Magyars, etc.), none of the original Paleo-Balkan languages survive in the Balkan-Danubian region with the exception of Albanian.

