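Zipf's law is a statistical formula that describes many types of data observed in both the physical and social sciences. Applied to text, it says that a word's frequency is roughly inversely proportional to its rank, so the second most common word appears about half as often as the first. It can also be used to predict patterns in seemingly random occurrences.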
To learn more about Zipfian distributions, read about them here, or watch this video by Vsauce.
As it turns out, the most common words in the English language are:
To use this tool, type or paste words into the text box above. As with any statistical measure, the more words you use, the better. I would recommend a minimum of 100 words to start seeing interesting results, although long passages of many thousands of words are preferred. To help with this, I have included options to remove punctuation and Wikipedia references for those pasting in text from Wikipedia.
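The two clean-up options can be pictured roughly as follows. This is a minimal sketch, not the tool's actual code: the function name `clean_text`, its parameters, and the exact reference patterns it strips (numbered markers like `[1]` and `[citation needed]`) are assumptions made for illustration.

```python
import re
import string

def clean_text(text, strip_punctuation=True, strip_wiki_refs=True):
    """Hypothetical clean-up step: drop wiki-style markers and punctuation."""
    if strip_wiki_refs:
        # Assumed patterns: numbered references and "[citation needed]".
        text = re.sub(r"\[(?:\d+|citation needed)\]", "", text)
    if strip_punctuation:
        # Keep apostrophes so words like "Zipf's" stay in one piece.
        punctuation = string.punctuation.replace("'", "")
        # Swap punctuation for spaces so "end.Start" still splits into two words.
        table = str.maketrans(punctuation, " " * len(punctuation))
        text = text.translate(table)
    # Collapse any runs of whitespace left behind.
    return " ".join(text.split())

sample = "Zipf's law[1] holds, roughly, for many texts.[citation needed]"
cleaned = clean_text(sample)
print(cleaned)
```

Replacing punctuation with spaces, instead of deleting it outright, avoids gluing two words together when a sentence ends without a following space.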
On the results page, you will see a graph showing each distinct word, ordered by how often it occurs, with the percentage of the text it accounts for. If your passage follows a Zipfian distribution, you should see a curved graph, with about 80% of the word occurrences falling in the first 20% of the graph.
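The numbers behind such a graph can be sketched in a few lines. The function names `word_percentages` and `share_of_top_fifth` are made up for this example, and reading the "80% in the first 20%" rule as "the top fifth of distinct words covers most occurrences" is an assumption about how the graph is meant to be read.

```python
from collections import Counter

def word_percentages(text):
    """Return (word, percent) pairs, most frequent word first."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return [(word, 100.0 * count / total) for word, count in counts.most_common()]

def share_of_top_fifth(ranked):
    """Percent of all occurrences covered by the top 20% of distinct words."""
    cutoff = max(1, len(ranked) // 5)
    return sum(percent for _, percent in ranked[:cutoff])

ranked = word_percentages("the cat and the dog and the bird saw the fish")
for word, percent in ranked:
    print(f"{word:>5} {percent:5.1f}%")
print(f"top fifth covers {share_of_top_fifth(ranked):.1f}% of the text")
```

With a passage this short the top fifth covers far less than 80%, which is why longer texts give more interesting results.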
Following this, there will be a section with 'cards' giving you more information about the text. Included are the most common words found in your piece of text, and the most frequent 'uncommon' words in your text.
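As a rough idea of how the two lists on the cards could be built: the sketch below counts words, takes the ten most used, and then filters out everyday words to find the most frequent 'uncommon' ones. The `COMMON_WORDS` set is a small hypothetical placeholder, not the list this tool uses, and `top_words` is a name invented for the example.

```python
from collections import Counter

# Hypothetical placeholder; the tool's real list of common words is not shown here.
COMMON_WORDS = {"the", "of", "and", "a", "to", "in", "is", "it", "that", "was"}

def top_words(text, n=10):
    """Return (most_used, uncommon): two lists of (word, count) pairs."""
    counts = Counter(text.lower().split())
    most_used = counts.most_common(n)
    # Walk the full ranking and keep only words outside the common set.
    uncommon = [(w, c) for w, c in counts.most_common() if w not in COMMON_WORDS][:n]
    return most_used, uncommon

most_used, uncommon = top_words(
    "the law of zipf is the law of rank and the law of frequency"
)
print("most used:", most_used)
print("uncommon: ", uncommon)
```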
Try to use more words next time for more interesting results
The list below is the top ten most used words in this piece of text
The list below represents the top ten 'uncommon' words found in this passage
If your graph shows a steep drop in word use that gradually flattens out as you scroll, with each word's frequency roughly inversely proportional to its rank, then your text shows a Zipfian distribution in word use.
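If you would rather check this with a number than by eye, one common approach is to plot frequency against rank on logarithmic scales and fit a straight line: under Zipf's law, where frequency is proportional to 1/rank, the slope comes out close to -1. The sketch below is one way to do that with an ordinary least-squares fit. The function name `zipf_slope` is an assumption for illustration, and this is not necessarily how the tool itself decides.

```python
import math
from collections import Counter

def zipf_slope(text):
    """Least-squares slope of log(frequency) against log(rank)."""
    counts = sorted(Counter(text.lower().split()).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Synthetic text in which the word of rank r appears exactly 120 / r times.
synthetic = " ".join(f"w{r} " * (120 // r) for r in range(1, 7))
print(f"slope: {zipf_slope(synthetic):.3f}")
```

Real passages will not hit -1 exactly; a slope in that neighbourhood on a long text is the sign to look for, and short texts can give misleading values.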