A brief history, new glosses?

This project began many years ago. When I lived in Turkey, my host mother had a comparative dictionary of the Turkic languages, and I would marvel at how similar yet different they could be. Later, I began my own database of sorts on a series of notecards. I eventually tossed the notecards and moved on, partly because I hadn’t been consistent in my transcription and partly because they were cumbersome to transport and use.

More recently, I took a class on SQL, and once my instructor recommended I look into PHP, I realized I could publish data to the web. It was then that I began putting this all together – at first on my hard drive, and later on this website.

Thinking back to those original cards, I have come to remember that I once had a number of glosses that do not exist in this database. And looking through the many, many dictionaries, grammars, and field reports in my references, I have come to realize that other authors found some of these glosses to be important as well. Adding new glosses is no small task, as it can be annoying to have to revisit old sources. In some cases, I may have to wait weeks, as I obtained them through interlibrary loan.

For now, here is a preliminary list of the new glosses I have considered adding:

  • butterfly
  • fly
  • walnut
  • hammer
  • ax
  • bee
  • honey
  • wool
  • thread
  • footprint
  • penis
  • vulva
  • urine
  • feces

I was on the fence about the last four, given their taboo nature. However, they do show up in sources with surprising frequency. Even the Codex Cumanicus has them. If it’s good enough for Late Medieval Italians, it’s good enough for me.

Because I’m a bit obsessive (as the existence of this site shows), I might try to add a few more to achieve a nice, round number. However, adding these 14 will bring the total to 365, which is certainly nice.


I’ve been adding tons of forms here and there, finishing up certain languages and working towards finishing others (Uzbek is on the list right now).

I’ve also begun an interesting foray into Cuman, a medieval language spoken by an early Kipchak people in the steppes of Ukraine and Eastern Europe. Even in the 13th and 14th centuries there’s quite a bit of Persian influence.

Unfortunately, the Codex Cumanicus is written in very confusing medieval Latin, so there are bound to be tons of mistakes on my part. My favorite so far is lupi ceruerij. When I saw the Cuman gloss was silausun I thought “Huh, that looks like the Kipchak word for lynx.” Sure enough, a little searching reveals that lupi cervieri was a term that early Italian traders and costumiers used for lynx fur. The term (which literally just means “wolf-deer”) seems to have had other meanings, but finding those out is a bit beyond me now.

The transcription is very inconsistent, which makes figuring out the original form very difficult. The letter x, for example, seems to represent what I transcribe as z, č, and s. Basically, if I don’t have a modern word to check the Cuman form against, I can’t reconstruct anything.
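
To make the problem concrete, here is a minimal sketch of what that checking amounts to. It is only an illustration, not how I actually work through the Codex: the function names are my own, the only ambiguity it knows about is the letter x described above, and the spelling "axa" is a made-up placeholder rather than a real Cuman word.

```python
from itertools import product

# Readings of the Codex letter "x" named in the text above.
# Every other letter is passed through unchanged (a simplifying assumption).
AMBIGUOUS = {"x": ("z", "č", "s")}


def candidate_readings(spelling):
    """Return every transcription a Codex spelling could stand for."""
    options = [AMBIGUOUS.get(ch, (ch,)) for ch in spelling]
    return ["".join(combo) for combo in product(*options)]


def confirmed_readings(spelling, modern_forms):
    """Keep only the candidates that match a known modern form."""
    modern = set(modern_forms)
    return [r for r in candidate_readings(spelling) if r in modern]


# "axa" is a placeholder spelling, not an attested word.
print(candidate_readings("axa"))
print(confirmed_readings("axa", ["ača"]))
print(confirmed_readings("axa", []))
```

With no modern forms to compare against, the last call returns an empty list, which is exactly the situation where I can’t reconstruct anything. Each additional x in a word triples the number of candidates.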

13,000 forms!

Another month, another milestone. This time, it’s reaching 13,000 forms. Entry 13,000 is Karakalpak for ‘far’ – узақ, алыс, қашық.

I’ve mostly been working on Karakalpak, but have also added some new sources for Western Yugur and Fu-yü Gïrgïs.

Updates (and lack thereof)

It’s been a crazy few weeks, so I haven’t had time to update anything. I’m still trying to figure out what to do with a lot of Doerfer’s Iranian materials, and still need to find good data for a few other languages.

I had to put my cat Stereo to sleep – he’s the big gray boy with me in the photo on the Elegant Lexicon main page. I’ve been absolutely crushed, so I’ve decided to add the word ‘cat’ to my list of glosses in his memory. The original Turks seem not to have been very interested in cats – they don’t do well with a nomadic lifestyle. Modern Turks are wild about cats though, so here’s to my kedi, my mushuk.