The page A Corpus Linguistics Glossary is now complete. Check it out. Please feel free to contact me if there are any terms which you think should be included on the list, or if there are any corrections needed. Cheers.

word list

A list of all the types in a corpus, usually arranged by frequency with the most frequent at the top.

Drawn from a reference corpus, a word list can tell you which words are the most common within a language. Placed against a word list from another corpus of a different period (or one that is marked with usage information), it can tell you how the language has changed.
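To make the idea concrete, here is a minimal sketch of building a frequency-ordered word list in Python. The function name `word_list`, the sample sentence, and the very simple tokenisation rule (runs of letters and apostrophes, lowercased) are all assumptions for illustration; real concordancing software tokenises more carefully.

```python
import re
from collections import Counter

def word_list(text):
    """Return (type, frequency) pairs, highest frequency first."""
    # Naive tokenisation: lowercase, then keep runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Sort by descending frequency; break ties alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# A tiny made-up "corpus" for demonstration only.
sample = "The cat sat on the mat. The dog sat too."
wl = word_list(sample)

for word, freq in wl:
    print(f"{freq}\t{word}")
```

Running the same function over two corpora and comparing the ranks of a given type is one simple way to see the kind of change described above.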

How many apps have I downloaded?

Apple reports that downloads of iOS apps from the App Store have now reached 100 billion. Given that 1 billion devices have been sold, that is an average of 100 apps per device. And given that there are 1.5 million unique apps on the Store, that is an average of about 66,667 downloads per app.

Personally, I have downloaded at least a thousand apps across the two devices I own. That is about 500 per device, so I guess I am downloading five times as many as the average person. Yikes.
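For anyone who wants to check the arithmetic, here is a quick sketch in Python using the figures quoted above. The variable names are my own, and splitting my thousand apps evenly over two devices is an assumption.

```python
# Figures as quoted in the post.
total_downloads = 100_000_000_000  # 100 billion downloads
devices_sold = 1_000_000_000       # 1 billion devices
unique_apps = 1_500_000            # 1.5 million apps

per_device = total_downloads / devices_sold  # average apps per device
per_app = total_downloads / unique_apps      # average downloads per app

# My own numbers: at least 1,000 apps over 2 devices (assumed evenly split).
my_per_device = 1000 / 2
ratio = my_per_device / per_device

print(per_device, round(per_app), ratio)
```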


Case (lowercase and uppercase) aids reading, and therefore meaning, in written texts. It does not substantially change the pronunciation of a word. It is therefore a wholly written feature of language that is not apparent in spoken form.

Concordancing software often allows you to choose between being case-sensitive or not. At times, it may be desirable to distinguish uppercase from lowercase in corpus linguistic analysis. One example is a text in which the token Will, the nickname for William, is abundant: in a case-insensitive count, those tokens are merged with will, and the inflated frequency may be mistaken for that of the modal auxiliary.
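The effect of the case-sensitivity setting can be sketched in a few lines of Python. The function `count_tokens`, its `case_sensitive` flag, and the sample sentence are hypothetical and for illustration only; they are not taken from any particular concordancer.

```python
import re
from collections import Counter

def count_tokens(text, case_sensitive=True):
    """Count word tokens, optionally folding everything to lowercase."""
    tokens = re.findall(r"[A-Za-z]+", text)
    if not case_sensitive:
        tokens = [t.lower() for t in tokens]
    return Counter(tokens)

# Made-up sample in which "Will" is always the name.
sample = "Will said he will call. I hope Will will answer."

sensitive = count_tokens(sample)
insensitive = count_tokens(sample, case_sensitive=False)

print(sensitive["Will"], sensitive["will"])  # name and modal counted apart
print(insensitive["will"])                   # name and modal merged
```

Note that case alone is not a perfect filter: a sentence-initial modal (as in "Will you come?") is also capitalised, so a case-sensitive count would file it with the name.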

Evidence of “language” in birds

It took a while, but we now have the first evidence that an animal other than human beings uses meaningless sounds to convey meaning, similar to how we form words or grammar. It has always been foolish of us to think we are the only species that does so.