If I asked you on the spot for the relative frequencies of the most common words in English, could you tell me? How much more common is the most common word, “the,” than the second most common word, “of”? If you can answer this question, you can answer the same question for most of the world’s languages.
Zipf’s law describes a probability distribution in which “the frequency of any word is inversely proportional to its frequency rank.” Informally, the second most common word is used half as often as the most common word, the third most common word is used a third as often as the most common word, and so on.
This law seems like a pretty big coincidence, but it holds approximately for English and most of the world’s languages (including intentionally designed languages like Esperanto). The word “the” is the most common word in English, making up about 7% of the words in English text. Zipf’s law predicts that “of” should occur half as often, or about 3.5% of the time, which closely matches what is observed. The third most common English word, “and,” appears about 2.8% of the time, which is reasonably close to Zipf’s prediction of about 2.3% (7% divided by 3).
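The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idealized rule (frequency at rank r equals the top frequency divided by r); the function name `zipf_prediction` is made up for this example, and the percentages are the approximate figures quoted in the text, not measurements from a particular corpus.

```python
def zipf_prediction(top_frequency: float, rank: int) -> float:
    """Idealized Zipf prediction: the rank-1 frequency divided by the rank."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return top_frequency / rank


# Approximate shares of English text quoted in the article (in percent),
# listed in rank order.
quoted = {"the": 7.0, "of": 3.5, "and": 2.8}

for rank, (word, share) in enumerate(quoted.items(), start=1):
    predicted = zipf_prediction(quoted["the"], rank)
    print(f"{rank}. {word}: predicted {predicted:.1f}%, quoted {share:.1f}%")
```

The only input the rule needs is the frequency of the top-ranked word; every other prediction follows from the rank alone.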
Why does this happen? How have the world’s languages developed such a consistent distribution of word frequencies? Even more mysterious is the law’s appearance in other rankings, such as the sizes of cities and the number of people watching a given TV channel.
There is no definitive explanation for the Zipf phenomenon, but there are several possible ones. One statistical explanation is that the Zipf pattern is a natural result of randomness. In one study, Wentian Li showed that randomly generated texts produce word frequencies that obey Zipf’s law.
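A random-text experiment of this kind is easy to try. The sketch below assumes a simple “monkey at a typewriter” setup in which every letter and the space character are equally likely; the function names, the four-letter alphabet, and the seed are illustrative choices, not details taken from Li’s study.

```python
import random
from collections import Counter


def random_text_word_counts(n_chars: int, alphabet: str = "abcd", seed: int = 0) -> Counter:
    """Type n_chars random symbols (letters plus space) and count the resulting 'words'."""
    rng = random.Random(seed)
    symbols = alphabet + " "  # every symbol, including the space, is equally likely
    text = "".join(rng.choices(symbols, k=n_chars))
    return Counter(text.split())  # split() drops the empty strings between repeated spaces


def ranked_frequencies(counts: Counter) -> list[float]:
    """Relative frequencies ordered from the most common word to the least common."""
    total = sum(counts.values())
    return [count / total for _, count in counts.most_common()]


counts = random_text_word_counts(200_000)
frequencies = ranked_frequencies(counts)

# Print a few ranks, spaced out roughly geometrically, to see how quickly frequency falls.
for rank in (1, 2, 4, 8, 16, 32, 64):
    if rank <= len(frequencies):
        print(f"rank {rank:>2}: {frequencies[rank - 1]:.4f}")
```

In this setup, shorter words are far more likely than longer ones, and all words of the same length are equally likely, so the ranked frequencies fall in steps rather than along a smooth curve. Plotting rank against frequency on log-log axes is the usual way to judge how closely the result follows a Zipf-like straight line.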
Zipf himself proposed that the law results from the principle of least effort, which states that, given a choice, the path of least resistance will be taken. More specifically, both the speaker and the listener aim for the path of least resistance. A language that has only one word is extremely easy for the speaker but extremely difficult for the listener. A language that uses a unique word for each concept is easier for the listener but difficult for the speaker. It is the compromise between these two extremes that creates the Zipf distribution of the world’s languages.





















