type-token ratio

The type-token ratio (or TTR) is used to compare two corpora in terms of lexical complexity. The formula is the number of types divided by the number of tokens. The closer to 1 the greater the complexity. The closer to 0 the greater the repetition of words. There is not a specific ratio which can be said to be ideal as such but that one corpus can be said comparably more or less complex to another only when they are of similar size. Therefore approximate size equivalence is an important criterion in using TTR.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s