25000 entries!

Quite the milestone. The 25000th entry is the Turkmen word for “strawberry”: ýer tudanasy (yer tūdanasï). It’s a compound, with ýer meaning “ground”. The tudana (note the long vowel) part is somehow related to tut “mulberry” and refers to the fruit of the mulberry tree. It’s one of those words that was likely passed back and forth between languages.

I’ve added a few new references and have gone back to add long vowels into the Turkmen data. Turkmen preserves long vowels in places that other languages don’t, so it’s vital for reconstruction.

I’ve already got a list of 27 potential new glosses to add like “chalk” and “jaw” and “knot”, as well as a ton of verbs of motion like “get on”, “enter”, “exit”, “arrive”. Maybe once I’ve finished with the data from the last 50 glosses I’ll add these new ones.

24000 entries!

While going through all of my sources again, I realized that I hadn’t entered much for Dolgan. This has been pretty quick going, thanks to Stachowski’s Dolganischer Wortschatz. Today I reached 24000 entries. Entry 24000 is the Dolgan word for “glass” – hǟrkälä. This word is pretty interesting – it’s ultimately a borrowing from Russian зеркало, which means mirror. It means mirror, too, in Dolgan, but also means glass. I haven’t been able to find any other forms with that meaning for Dolgan. Russian зеркало was borrowed into Sakha as сиэркилэ, where it also means mirror. I’m not convinced the Dolgan form is descended from the Sakha form, as the vowels are pretty different, but there is likely some relationship. It is likely that early Russian traders traded manufactured goods like mirrors with the locals, who borrowed the word. Apply some vowel harmony and Dolgan’s strong dislike for /s/-sounds, and you get hǟrkälä. As glass was the only unknown component of these traded mirrors, the terms became conflated.

As a side note, I chose the term “glass” because I wanted to see if the early Turks had access to this technology. Most Turkic languages either use terms for manufactured products (such as bottles or mirrors) to mean “glass”, or borrow from other languages. This indicates that glass was unknown to them in ancient times. Also, glassmaking was developed only about 4000 years ago in Mesopotamia and only in the 5th Century CE in China. So any glass objects that the oldest Turkic civilizations would have had would have to come from the Middle East or Europe, and would not have been made locally. This may tell us something about their metallurgical practices, as it is believed that glass was discovered as a byproduct of metallurgy, when hot metal came into contact with sand.


Naturally, after adding 50 new glosses to the database I’ve run across a new one that I’d like to add: chalk. Chuvash has пурӑ, пур; Kazakh and Kumyk have бор.

Wiktionary suggests that the Kazakh form comes from Russian бор “boron”, but this is clearly conflating the boron meaning with the chalk meaning. Fedotov says this is a native Turkic term and ties it to Sakha буор “earth, clay”. (Tuvan has пор and Tòfa has бор for clay as well.) In Bashkir, the form is either бур or аҡбур, suggesting that the original term may have referred to crumbly stone or soil, with color terms used to distinguish between chalk, clay, etc.

I’ve entered forms for the latest 50 glosses for Turkish, Tuvan, Dzhungar Tuvan, Sakha, and Chuvash, and I’m working on Azerbaijani. Once I’ve made my rounds, I may add chalk to the database, plus whatever else I find.

As you can see above, there are a lot of cases where it could be useful to suggest related terms. Knowing that chalk is related to earth and clay could be beneficial. I may work on this in the near future as well.
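A rough sketch of how related-term suggestions might work, in Python. Everything here is hypothetical (the names `RELATED_PAIRS`, `build_related_index`, and `related_glosses` are invented for illustration and say nothing about how the site's database is actually built); only the chalk/earth/clay grouping comes from the discussion above.

```python
from collections import defaultdict

# Hypothetical links between glosses. Only the chalk/earth/clay grouping
# is taken from the post; the structure itself is an assumption.
RELATED_PAIRS = [
    ("chalk", "earth"),
    ("chalk", "clay"),
    ("earth", "clay"),
]


def build_related_index(pairs):
    """Return a dict mapping each gloss to the set of glosses linked to it."""
    index = defaultdict(set)
    for a, b in pairs:
        index[a].add(b)
        index[b].add(a)  # links are treated as symmetric
    return dict(index)


def related_glosses(index, gloss):
    """Return related glosses in alphabetical order (empty list if none)."""
    return sorted(index.get(gloss, ()))


index = build_related_index(RELATED_PAIRS)
print(related_glosses(index, "chalk"))  # ['clay', 'earth']
```

Storing the links as pairs keeps them easy to add one at a time while working through a source, and the lookup stays symmetric, so searching for clay would also surface chalk.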

Kinship terminology

Something that has been irking me is the inability (so far) to have kinship terminology on this site. The problem is that English, Russian, German, and French have kinship systems that are a bit more basic than those found in Turkic. Many of the languages I am aware of employ complex systems that distinguish maternal or paternal relationship, relative age, and gender. This means that many grammars and dictionaries will translate a term simply as brother (rather than older or younger brother) or aunt (rather than father’s sister or mother’s brother’s wife).

A second issue is that there is considerable variation both between languages/varieties and within languages. This makes comparison difficult. Also, many terms are borrowed from other languages, such as Turkish hala and teyze (paternal and maternal aunt, respectively), which were borrowed from Persian.

Perhaps I’ll work out a new scheme for more complicated lexemes and morphemes. Some day I’d like to have kinship terms, case morphology, verbal morphology and other forms; for now I’ll focus on more easily defined terms.
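One possible shape for such a scheme, sketched in Python: describe each kinship term by a set of features rather than by a single English gloss. This is only an assumption about how it could be done, not a description of the site's database. The class `KinTerm`, its field names, and `find_terms` are all invented for illustration; the three Turkish forms are common modern terms used as sample data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KinTerm:
    """Hypothetical feature-based record for one kinship term."""
    form: str
    language: str
    relation: str                        # e.g. "sibling", "parent's sibling"
    sex: Optional[str] = None            # "f", "m", or None if not distinguished
    side: Optional[str] = None           # "maternal", "paternal", or None
    relative_age: Optional[str] = None   # "older", "younger", or None


# Sample data only: three modern Turkish terms.
TERMS = [
    KinTerm("abla", "Turkish", "sibling", sex="f", relative_age="older"),
    KinTerm("amca", "Turkish", "parent's sibling", sex="m", side="paternal"),
    KinTerm("dayı", "Turkish", "parent's sibling", sex="m", side="maternal"),
]


def find_terms(terms, **features):
    """Return the forms whose fields match every given feature exactly."""
    return [
        t.form
        for t in terms
        if all(getattr(t, name) == value for name, value in features.items())
    ]


print(find_terms(TERMS, relation="parent's sibling", side="maternal"))  # ['dayı']
```

Leaving a feature as `None` records that a source did not specify it, which is exactly the situation with dictionaries that gloss a term as plain “brother” or “aunt”.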

23000 surpassed, considering even more glosses

I’ve nearly completed adding Karakhanid data from Dīwān Luγāt at-Turk. This has brought me to 23000 entries. This also means that I’m nearly out of sources to consult until I can visit a bigger library (which is still difficult due to COVID).

I have considered adding 50 new glosses, which would bring my total up to 500. I’m considering new body parts/functions (palm, feces, pus, sole, hoof, vein), some plants and animals (cockroach, juniper), directional/positional terms (top, bottom, interior, side), and a few random conceptual and cultural terms (wedding, color, thief). I have 38 terms so far; the Dīwān index was very helpful in choosing these. Once I have decided, I’ll post them here. I’ll also do some background work to ensure that I’m not going back to the same sources and looking for terms that aren’t in there. It’s frustrating.

As a side note, I have really enjoyed reading the following article: Janhunen, Juha. “Issues of Comparative Uralic and Altaic Studies (3): The Turkic Plural in *-s.” Altai Hakpo, 2017. He breaks down a lot of issues relating to the unusual number of paired items ending in /z/, the z~r controversy, and typological issues related to paired/plural items.

So close to 23000…

I’ve finished adding forms from both Azovian and Georgian Urum, as well as Iraqi Turcoman. I’m so close to 23000 entries, but have hit a bit of a block in terms of finding more sources for data. I don’t have access to the massive library collections that I used to, so I’m unable to get more data for Salar and other languages.

However, I am considering adding about 20-25 new glosses, including palm, glass, moustache, shovel, feces, poison, and coal. This is no small matter, as it means that I have to go back to all of my previously consulted sources and get new data. It’s a lot to keep track of and I’d prefer to have a large chunk to work on rather than a handful of easily mislaid words. Before I do any of this I’ll be updating the missing page to ensure that I’m focusing only on newly added glosses, rather than old ones that I know I can’t get translations for. We shall see. I may just try to come up with 50 so I don’t feel like I’m hopping between sources every day.
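The bookkeeping behind a “missing” list like this can be sketched as a simple set difference. The Python below is a minimal illustration under assumed data: the function name `glosses_to_look_up` and the sample lists are hypothetical and are not taken from the site's code or database.

```python
def glosses_to_look_up(all_glosses, entered, known_unavailable):
    """Return, sorted, the glosses still worth searching for in one language.

    A gloss is skipped if a form has already been entered for it or if it
    is already known that no available source has a translation.
    """
    return sorted(set(all_glosses) - set(entered) - set(known_unavailable))


# Hypothetical data for a single language.
all_glosses = ["iron", "gold", "palm", "glass", "coal"]
entered = ["iron", "gold"]       # forms already in the database
known_unavailable = ["coal"]     # already searched for, nothing found

print(glosses_to_look_up(all_glosses, entered, known_unavailable))
# ['glass', 'palm']
```

Keeping the “known unavailable” list separate from the “entered” list is what stops old dead ends from reappearing every time new glosses are added.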

Nobody but spam bots seems to ever read this, but if anyone has any leads on Kondoma or Upper Shor, I’d appreciate it. I’ve seen some scholars brush it under the rug as just an endangered dialect, but I think it’s a linchpin that holds together the classification of Turkic. Losing Lower Chulym was devastating, but I think that Kondoma Shor is similar enough that it could fill in the missing insights that I had hoped would come from Lower Chulym.

Even more new glosses

I’ve somehow found some free time and have added several more glosses to the database. These new glosses are: ankle, badger, bird cherry, blanket, brick, calf, candle, clay, cradle, cream, dough, fast, flour, footprint, frost, grow, hail, hedgehog, hemp, lead, maple, marmot, millet, moss, mulberry, naked, oak, oats, owl, pillow, pine, poplar, rib, rice, ring, rye, silk, sword, thread, tin, turtle, urine, wax, well, willow, wrist.

I was very inspired by Stachowski’s 2008 Names of cereals in the Turkic languages when I added all of the grain terms. I thought it would be interesting to do a few other common cultural terms, such as metals and tree names. The metals in particular follow similar Wanderwort-like patterns as they have made their way through Eurasia. I already had iron, gold, silver, and copper; I have now added lead and tin, completing the list of common Iron Age materials.

Bird cherry was added because I kept running across it every time I searched for “egg”. It’s very similar in form, which is intriguing. I have no theories on their relationship yet.

It’s been fascinating to see what technologies can be reconstructed based on vocabulary. The ancient Turks definitely had metallurgy, bricks for building permanent structures, and grain production. But they seem not to have had apiculture judging by the lack of reconstructable terms for wax. They liked honey, but didn’t work with wax. Interesting.

I’ve also added the latest update date to languages, so you can see if I have searched for the most recent glosses. Not all languages have this yet, as I haven’t been able to work on many of them.

New Glosses

I broke down and added 36 new glosses today. I’ve added forms that I’ve found interesting before, such as candle, mulberry, and silk; two metals (tin and lead); some body parts (ribs, wrist, and ankle); and a variety of other forms that I wanted for the reconstruction of Proto-Turkic. I’ve started looking up these forms, so we’ll see if they go anywhere.

2021 Updates

Work has been busy! I’ve been doing more research in the library world, so this site has not been updated as often as it used to be.

The maps page is all messed up. There’s something wrong with the script I had been using to create them, so I’ll have to fix that.

In my spare time I’ve been working on an artistic map showing the Turkic languages. Perhaps I’ll even offer it up for sale as prints. It’s very time-consuming as I’m making it in SVG using Inkscape. You can do some cool map effects, and I’m hopeful that it will look as good when it’s done as I imagine. I’m nearly done with the country borders and am working on the major bodies of water. After that, I’ll do rivers and lakes, cities and place names, then finally I’ll add the languages. I’ll likely have to create a couple of insets. I’m thinking of doing one for the Caucasus and one for the Altay-Sayan region.

I’m still collecting books and articles and other resources, so whenever I have time I’ll continue to add new entries.

I would love to turn this into a more etymologically-based project. It would be neat to see not only reconstructions of Proto-Turkic, but also reflexes of borrowings from various sources and points. We’ll see – I’d have to rethink the entire database structure.

Old Uyghur

There is a new freely(!) available glossary of Old Uyghur that has been published:

Wilkens, Jens. 2021. Handwörterbuch des Altuigurischen: Altuigurisch – Deutsch – Türkisch / Eski Uygurcanın El Sözlüğü: Eski Uygurca – Almanca – Türkçe. Göttingen: Akademie der Wissenschaften zu Göttingen. https://doi.org/10.17875/gup2021-1590.

Accordingly, I have begun to add entries from this source and have added Old Uyghur as a new language. It’s tricky because Orkhon Turkic is already in the database, and not all sources clearly differentiate between the two. They do appear to be different, especially as Old Uyghur has clear influences from Chinese and Sanskrit and was affiliated with very large kingdoms. Orkhon, on the other hand, is a bit more limited and found mostly on stelae in Mongolia. I may need to revisit this.

I’ve been occasionally working through the rest of Khorasani Turkic. It’s a bit of a mess. I’ve been very busy with work, so this project has become a low priority for the time being. I’ve added the first 50 words from Old Uyghur and should have this completed sometime soon.