Linguistics

Why am I interested in linguistics?

UPDATE on 30/09/2025: I've made a YouTube video about the basics of programming language design and implementation which is in the Latin language. In case you cannot open it, try opening this MP4 file. This video, for a reason that escapes me, went relatively viral, so I suppose you might like it as well.

So, how did I get interested in linguistics? Well, what played the biggest part was probably the funny things that happened during my Latin classes.
To me, the funniest thing that happened was this: The teacher asked a student "What was the name of Caesar's assistant in the calendar reform?", and he, after hesitating for a few seconds, responded "Gastritis?" Of course, the teacher said "Gastritis?! A man named Gastritis?!" Then we all started to laugh. After a few seconds, the teacher said: "If only it at least sounded similar, but it doesn't. He was called Sosigenes. Try associating it with the English word 'sausage'." Some 30 minutes later, the teacher asked another student "So, who was Gastritis?". The student responded: "He was Caesar's assistant in the calendar reform." Then the teacher said: "Well, you did very well if you didn't know his name." Sosigenes later got all sorts of funny names from the students. Syphilis, Schizogenic, Sisiguz (apparently derived from two Croatian vulgar words)...
The other funny thing that happened was that at the national competition in Latin (I was the 7th in Croatia), someone allegedly mistranslated "Ea, quae fertilissima totius Germaniae sunt, loca" (the most fertile places in the whole of Germany) as "the most fertile German woman" in the first sentence, and therefore mistranslated the whole text. The text we had to translate in the first part of the competition was about some forest in modern-day Germany. Those fertile places were around that forest. Later the text was about the northern deer, the animals which the ancient Romans believed to be endemic to that forest. The text later went "Ea nascuntur alces, animalia quae reliquis in locis visa non sint." (There (=in that forest) the northern deer are born, the animals which aren't seen in other places.) The student in question allegedly translated that as: "She gives birth to the northern deer." We were allowed to use a dictionary; it's just that, if you don't know the grammar, you can very well mistranslate the whole text, even though you know all the words.
The third funny thing happened when we had a Latin test and had to translate a text from Croatian to Latin using a dictionary. The student next to me complained that there was no "ide" (goes) in the dictionary. I told him: "Well, you have to look up the verbs in the infinitive." Then he asked: "What's an infinitive? I am not very good at grammar, you know." I said: "The verb form that ends in -ti or -ći. For example, the infinitive of 'je' (is) is 'biti' (to be), and the infinitive of 'ide' (goes) is 'ići' (to go)." The teacher commented on it: "You think that he will understand that?!" Then, after a few minutes, he asked me: "Hey, what's the infinitive of 'crveni' (red, as an adjective)? 'Crveniti' (red+'ti') or...?"

Plato said that if you keep people in a cave their entire lives, they will be skeptical of the real world. The same appears to be true of the modern education system.

So, I've spent years studying linguistics on the Internet, mostly on Wikipedia. I've figured out that linguistic education in schools is very bad. They are bombarding students with controversial statements that the students have no clue how to evaluate. That would be acceptable if they gave students even a basic idea of what linguistics is about, but they don't even do that. When I try to talk to those students about actual scientific linguistics, they are skeptical because of their ignorance.
If there was only one thing I could teach you about linguistics, it would be the concept of regular sound changes. That is, sound changes don't affect just one or a few words in a language; they affect the whole language. A sound change that would change "two" to be pronounced "dwo", but wouldn't affect any other word, isn't possible. But a sound change that would change every 't' at the beginning of a word to 'd' is likely. So, if a sound change occurs that changes "two" to "dwo", it will probably also change "ten" to "den". From that simple fact, much about languages can be derived.
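The idea that a sound change is a rule applied to the whole vocabulary, rather than to individual words, can be illustrated with a few lines of Python. This is only a minimal sketch: the word list and the function name `apply_sound_change` are made up for illustration, and it works on English spelling, whereas real sound changes operate on pronunciation.

```python
import re

def apply_sound_change(words, pattern, replacement):
    """Apply one regular sound change to every word in the lexicon."""
    return [re.sub(pattern, replacement, word) for word in words]

# Toy lexicon in ordinary spelling (an assumption made for simplicity).
lexicon = ["two", "ten", "tooth", "cat", "stone"]

# Hypothetical change: 't' at the beginning of a word becomes 'd'.
changed = apply_sound_change(lexicon, r"^t", "d")
print(changed)  # ['dwo', 'den', 'dooth', 'cat', 'stone']
```

Notice that the rule never mentions any particular word. "Two" becomes "dwo" only because it starts with 't', and for the same reason "ten" becomes "den".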
For instance, new phonemes are created by two successive sound changes, the second one re-creating the conditioning environment of the first one after the first one has ended. For instance, the Latin words "cibus" and "clamo" were both pronounced with a 'k' at the beginning. But let's look at their Italian descendants. The first relevant sound change that occurred was a change of 'k' before 'i' to 'ch' (written as 'c'). So, the Italian words after that were "chibo" and "klamo". Then a sound change occurred that changed 'l' in some conditions to 'y'. So the modern Italian words are "cibo" (pronounced "chee-baw") and "chiamo" (pronounced "kyah-maw"). And that's how 'k' and 'ch' came to be different phonemes. Allophones are created by a sound change that changes an existing phoneme in some condition to a sound that's not yet a phoneme in the language. For instance, 'k' and 'ch' were allophones before the second sound change.
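Here is a rough Python sketch of why the order of the two changes matters. The spellings are simplified and are my assumptions for the sake of the example: "ch" stands for the sound in "cheese" (not for the Italian spelling), the glide that replaces 'l' is written 'i', and the condition "after 'k'" is a stand-in for the real, more complicated conditions.

```python
import re

# Each sound change is a (pattern, replacement) pair applied to every word.
PALATALIZE = (r"k(?=i)", "ch")   # 'k' becomes 'ch' before 'i'
L_TO_GLIDE = (r"(?<=k)l", "i")   # 'l' becomes a glide after 'k' (simplified condition)

def evolve(word, changes):
    """Apply sound changes one after another, in chronological order."""
    for pattern, replacement in changes:
        word = re.sub(pattern, replacement, word)
    return word

# Simplified stand-ins for Latin "cibus" and "clamo".
ancestors = ["kibo", "klamo"]

historical = [evolve(w, [PALATALIZE, L_TO_GLIDE]) for w in ancestors]
reversed_order = [evolve(w, [L_TO_GLIDE, PALATALIZE]) for w in ancestors]

print(historical)      # ['chibo', 'kiamo']
print(reversed_order)  # ['chibo', 'chiamo']
```

In the historical order, the first change has already ended when the second one puts a new 'k' in front of an 'i'-like sound, so both "chi" and "ki" exist and the two sounds contrast. In the reversed order, every 'k' before 'i' would have become 'ch', and no new phoneme would have appeared.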
Accents are created primarily by sound changes merging two similar phonemes together. For instance, if a sound change in English merged 'p' and 'b' into a single phoneme, "pin" and "bin" wouldn't become homophones, but words with different accents. Namely, the 'i' is already pronounced slightly differently in those two words.
But here is something we aren't taught at school, even though it follows from regular sound changes. In related languages, one phoneme in one language regularly corresponds to another phoneme in a related language. For instance, the English 't' at the beginning of a word regularly corresponds to German 'z'. Here are some examples: two-zwei, ten-zehn, tooth-Zahn, tongue-Zunge, twig-Zweig, toe-Zeh, too-zu, tame-zahm, token-Zeichen, twinkle-zwinkern, twilight-Zwielicht... This rule was exceptionless at the time that sound change occurred in German. Since then, both German and English have changed significantly, but the rule is still unmistakable and undeniable. Another such rule is that English 'th' corresponds to German 'd', and not just at the beginning of a word, but everywhere: "three"-"drei", "thick"-"dick", "thin"-"dünn", "this"-"das", "that"-"dass", "there"-"da", "bath"-"Bad", "thirst"-"Durst", "thank"-"danken", "think"-"denken", "thing"-"Ding", "brother"-"Bruder", "thou"-"du", "then"-"dann", "thorn"-"Dorn", "thumb"-"Daumen", "both"-"beide", "mouth"-"Mund", "through"-"durch", "north"-"Nord", "south"-"Süd", "thunder"-"Donner", "earth"-"Erde"... Also, the general rule is that English 'd' corresponds to German 't' in native words ("day"-"Tag", "cold"-"kalt", "old"-"alt", "dream"-"Traum", "red"-"rot", "do"-"tun", "death"-"Tod", "dale"-"Tal", "ride"-"reiten", "blood"-"Blut", "dew"-"Tau", "deep"-"tief", "good"-"gut", "god"-"Gott", "bed"-"Bett", "drive"-"treiben", "drink"-"trinken", "drop"-"Tropfen", "dove"-"Taube", "deaf"-"taub", "daughter"-"Tochter", "bread"-"Brot", "door"-"Tür", "word"-"Wort", "world"-"Welt", "shoulder"-"Schulter", "hard"-"hart"...) and early Latin loanwords (for example, "desk"-"Tisch", both coming from Latin "discus"...). However, there are some exceptions, such as after 'n' ("blind"-"blind", "wind"-"Wind", "land"-"Land") and sometimes after 'l' ("wild"-"wild", but notice that's not true in the words for "cold" or "old".
I've opened a StackExchange question about that.). And the same goes for other languages, for instance, Latin 's' in the beginning of a word corresponds to Greek 'h': sex-hex (both meaning six), septem-hepta (both meaning seven), sal-hals (both meaning salt), sol-helios (both meaning sun), sus-hys (both meaning pig, the rare English word swine is from the same root), super-hyper (both meaning above), similis-homoios (both meaning similar), sollus-holos (both meaning whole), silva-hyle (both meaning wood), somnium-hypnos (both meaning sleep), sanguis-haima (both meaning blood, although it is a bit controversial to say those two words share the same root, I have asked a forum question about that), sudor-hidros (both meaning sweat), suavis-hedys (both meaning sweet)... Sometimes it even happens that a consonant in one language regularly corresponds to a vowel in another language. For example, Latin -em at the end of a word corresponds to Greek -a: septem-hepta (both meaning seven), decem-deka (both meaning ten), the Latin third-declension accusative singular ending -em and the Greek third-declension accusative singular ending -a... Such sound changes are called vocalizations.
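The regular correspondences between initial consonants listed above can be checked mechanically. The following Python sketch simply counts which initial letters line up in pairs of related words. It is only a rough illustration: it compares spelling rather than pronunciation, the function name `initial_correspondences` is made up, and the word pairs are a small selection of the examples given above.

```python
from collections import Counter

def initial_correspondences(pairs):
    """Count which initial letters line up across pairs of related words."""
    return Counter((a[0].lower(), b[0].lower()) for a, b in pairs)

# A few of the English-German and Latin-Greek pairs mentioned in the text.
english_german = [("two", "zwei"), ("ten", "zehn"), ("tongue", "Zunge"),
                  ("twig", "Zweig"), ("toe", "Zeh"), ("tame", "zahm"),
                  ("token", "Zeichen")]
latin_greek = [("sex", "hex"), ("septem", "hepta"), ("sal", "hals"),
               ("sol", "helios"), ("sus", "hys"), ("super", "hyper")]

counts = initial_correspondences(english_german)
print(counts.most_common(1))                                 # [(('t', 'z'), 7)]
print(initial_correspondences(latin_greek).most_common(1))   # [(('s', 'h'), 6)]
```

With real data one would of course feed in whole word lists, not hand-picked examples, and then see whether one correspondence dominates the counts.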
So, according to these rules, it's actually possible to reconstruct the ancient languages from which the modern languages were derived. It's important to understand that such rules where one sound in one language regularly corresponds to a dissimilar sound (the most extreme example being vocalizations) in another language cannot be explained as a result of borrowing, as then the sounds in question would be similar sounds, rather than dissimilar sounds. And they are also very unlikely to occur by chance (much less likely than if one were allowed to once match the 'p' in one language with a 'b' in another, once with an 'f', and once with a 'p'). Regular correspondences of dissimilar sounds can only be explained by the languages having a common ancestor, an ancient language from which both of them derive. And many of those ancient languages have been reconstructed, to a greater or lesser degree; it's just that the public is incredibly ignorant about it.
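To get a feeling for how much stricter a single fixed correspondence is than a loose one, here is a toy calculation in Python. It rests on strong simplifying assumptions that are mine: the initial consonants of unrelated words are treated as independent and equally likely, and the numbers (10 word pairs, 20 consonants) are made up purely for illustration.

```python
def chance_probability(n_pairs, n_consonants, allowed_matches=1):
    """Probability that n_pairs unrelated word pairs all fit a correspondence,
    assuming independent, equally likely initial consonants (a toy model)."""
    return (allowed_matches / n_consonants) ** n_pairs

# Hypothetical numbers: 10 word pairs, 20 possible consonants.
strict = chance_probability(10, 20)      # only one counterpart is accepted
loose = chance_probability(10, 20, 3)    # any of three counterparts is accepted

print(strict)
print(loose)
print(loose / strict)  # about 3**10 = 59049 times more likely under the loose rule
```

Under these assumptions, allowing three possible counterparts instead of one makes a chance match tens of thousands of times more likely, which is why strict, exceptionless rules are so much more convincing than loose resemblances.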
The linguistic theses I personally like to support on Internet forums are my alternative interpretation of the Croatian toponyms, that Illyrian (contrary to mainstream linguistics) didn't have sibilarization (UPDATE on 12/05/2024: Here is my YouTube video in which I present five arguments for Illyrian not having sibilarization. If you cannot open it, try downloading this MP4 and opening it in VLC or something similar.) and that the Indo-European language family and the Austronesian language family share a common ancestor. I've described them in great detail in the thread I linked to on the left.
I hope that has some educational value.

UPDATE on 15/12/2017: I am currently developing a web-game about linguistics.

UPDATE on 25/01/2018: The full version of the game is now available on-line. Special thanks to Boris Muminovic for helping me with informatics-related issues and to Daniel Ross for helping me with linguistics-related issues. I've wanted to make such a game for a long time, but I didn't have the knowledge needed to do it.

UPDATE on 06/07/2021: You may want to read the draft of the paper I am planning to publish about validating the algorithm used in Etymology Game. (UPDATE on 15/10/2022: I have also made a PowerPoint presentation about that.) (UPDATE on 28/11/2022: I have made a YouTube video (MP4) about it.)

UPDATE on 04/10/2021: I have written a C++ program that outputs an SQL database containing names of numbers in different languages. Maybe it will come in useful to somebody. I have also written a seminar about it in Croatian. In case you want to make it work in database engines other than SQLite, you can follow these instructions.

UPDATE on 25/02/2023: A few months ago, I published a paper in Valpovački Godišnjak which applies informatics to the names of places. It is mentioned in Glas Slavonije. It is basically this text, just a slightly different version.

UPDATE on 23/04/2023: I have shared a humorous anecdote about language learning, which got many upvotes on Reddit (I should warn you that it's a bit of dark humour):

My grandmother suddenly died without having written a will. According to the law, her relatives living in Germany could claim to own her house, and they had to explicitly say in court that they didn't want that so that we could sell the house. So I called one of her relatives living in Germany whose telephone number I happened to have. The conversation went something like this:

Me: Ich spreche kein Deutsch. Kroatisch vielleicht? (I speak no German. Croatian perhaps?)
He: Ne. (The Croatian word for "no")
Me: Englisch vielleicht? (English maybe?)
He: Nein, kein Wort. (No, not a word.)
...
Me: Was ist mit <name>? (What about <name>?)
He: Er ist tot. (He is dead.)
Me: Ah. Und was ist mit <name>? (Ah. And what about <name>?)
He: Sie ist auch tot. (She is also dead.)
...
Me: Ah. Tut mir lied dann. (It is supposed to be "Tut mir leid dann.", which means "I am sorry then." But "lied" actually means "song".)

So instead of telling him I am sorry that everybody I mentioned died, I said that that makes me sing.

UPDATE on 15/05/2024: I've opened a discussion about whether Albanian is descended from Illyrian on TextKit and r/latin. I wrote it in Latin; here it is in English translation:

Was Illyrian a "centum" or a "satem" language? Are Albanians native to the Balkans?

What do people on this forum think, was Illyrian a "centum" or a "satem" language? All Indo-European languages are divided into two groups: "centum" and "satem". In "centum" languages, the Indo-European phoneme 'kj' turns into 'k'. Latin is a "centum" language, and so are Greek and English. In English, 'kj' actually turns into 'h', but at some point, before Grimm's Law, 'kj' turned into 'k' in English, and that is why English is a "centum" language. In "satem" languages, 'kj' turns into 's'. Examples of "satem" languages are Croatian, Albanian and Sanskrit. James Patrick Mallory wrote in the Encyclopedia of Indo-European Culture that he thinks that whether Illyrian was "centum" or "satem" cannot be known from the data we have. Most linguists in Croatia, and elsewhere in the Balkans, think that Illyrian was a "satem" language and also that it is the ancestor of Albanian. But I think Illyrian was a "centum" language. The day before yesterday, I published a YouTube video in Croatian about it.
https://youtu.be/4QQ2iJZnyUk
In that video, I give five arguments for the idea that Illyrian was a "centum" language. Those arguments are:

The 'k'-'r' regularity in the names of rivers in Croatia. In many names of rivers in Croatia, the first consonant is 'k' and the second consonant is 'r': Krka, Korana, Krapina, Krbavica, Kravarščica, and two rivers named Karašica. Most linguists think that regularity is coincidental, but I think that information theory (the Birthday Paradox and Collision Entropy) teaches us that the probability of that regularity appearing by coincidence is between 1/300 and 1/17. You can find the calculation in my text "Etimologija Karašica", which I published in the almanac Valpovački Godišnjak in 2022. I think that the name "Karašica" comes from the Illyrian name Kurr-urr-issia, and that "kurr" meant "to flow" (probably from Indo-European *kjers, which meant "to run"), "urr" meant "water" (from Indo-European *weh1r), and "-issia" was a suffix in Illyrian, which is also found in the ancient name for Đakovo, "Certissia". In my view, the name "Kurrurrissia" passed from Illyrian into Proto-Slavic *Kъrъrьsьja, which gave "Karrasj-">"Karaš-ica" (-ica is a Croatian suffix) in modern Croatian. I also think that Krapina came from the Illyrian name Kar-p-ona, with "kar" from *kjers, "p" from *h2ep (water), and "ona" being a suffix in many Illyrian place names, among others "Salona" and "Albona". In my view, the name "Karpona" passed from Illyrian into Proto-Slavic *Korpyna, which gave "Krapina" in modern Croatian. And so on...
If Illyrian was a "centum" language, "Curicum", the ancient name for Krk, can be read as "caurus, the north wind", from Indo-European *(s)kjeh1weros (from which the Latin word "caurus" comes), and Krk is the northernmost island in our sea.
If Illyrian was a "centum" language, "Incerum", the ancient name for Požega, can be read as "heart of the valley", from the Indo-European roots *h1eyn (valley) and *kjer(d) (heart).
If Illyrian was a "centum" language, "Cibelae", the ancient name for Vinkovci, can be read as "strong house" or "fortress", from the Indo-European roots *kjey (house) and *bel (strong).
Many inscriptions in Illyrian begin with "klauhi zis", and that probably meant "May God hear...". "Klauhi" therefore probably comes from *kjlew (to hear), so *kj turns into *k in Illyrian.

Do those arguments sound compelling to you?

UPDATE on 03/06/2024: Many people on Internet forums are saying that the grammar-translation method of teaching a language (which is how Latin is usually taught) is bad. However, I don't think the grammar-translation method is always bad. I only understood the consecutio temporum (sequence of tenses) in my Latin classes, from the table presented in the Elementa Latina textbook, four years after I was introduced to the sequence of tenses in my English classes. The explanations in the English textbooks, which try to explain the sequence of tenses in English, are word salad to me.