What makes a word difficult?

Part of our task in developing the Wordbook vocabulary building program was to determine the difficulty level of more than a thousand words. We did this by putting the words into test items and administering each item to hundreds of students. A question that naturally occurs in work like this is: just what makes one word more difficult than another? Here, we look at two aspects of words that appear to have some influence on how difficult they are: frequency and part of speech.

Several major word counts in English have tabulated how often each word appears in samples of more than a million words of text. A word's frequency can be determined by consulting the results of such studies.

Curious? Check out wordfrequency.info

How do these frequencies compare with our difficulty estimates? The answer is that there is a moderately strong relationship: words that appear often in print tend to be learned earlier and more easily than words that appear rarely in print. If you see or hear a word often, you are likely to develop a sense of what it means; if you never see it, you are not likely to know its meaning.

It is perhaps surprising that the relationship is not even stronger than it is. Many frequently appearing words are difficult; many rare ones apparently give people little trouble. Part of the reason for this has to do with a limitation of the word counts. They do not differentiate among different words that have the same spelling – for example, bear, an animal, and bear, to carry – or among different meanings of the same word – rich, wealthy, and rich, full of calories. Frequency counts for such words will not accurately reflect how often a person is exposed to the word when it is used with a particular meaning.

The noun abuse is ranked #1548, and the verb abuse is ranked #3778, out of 5000 words.

In other cases, it is apparent that a high frequency, even of a word with one common meaning, does not guarantee that the word will be correctly understood. The word exceptional, for example, had a much higher difficulty level in our calculations than would have been predicted by its frequency. The reason was not hard to find: one of the incorrect choices on the test item was perfect, and many people selected this answer rather than the correct answer, unusual. What appears to have happened is that people usually see exceptional used in a positive sense, as in “an exceptional movie” or “an exceptionally good party,” and many never obtain a clear idea of the word's basic meaning, which is actually out of the ordinary, uncommon, or unusual. So some words might require more frequent appearances than others to become firmly fixed in a person's mind.

Other studies have shown that the relationship between frequency and order of difficulty becomes weaker as finer shades of meaning are required in the test items. The attractive but clearly incorrect choice of perfect in the previous example is a case in point. The moral is that frequency counts are no substitute for actual testing in determining a word's difficulty level.

A word's part of speech has much less effect on its difficulty level than its frequency does, but some relationship is discernible. Nouns, on average, are somewhat easier than other parts of speech. A possible explanation for this is that nouns in general make a concrete reference to the physical world: a pencil, for example, is something you can see and touch and use. Verbs, on the other hand, are more abstract, expressing relationships between things, so they may be harder to learn.

It's likely that how we learn words has to do with more than just frequency and part of speech. Further research may uncover other aspects of words that contribute to their difficulty levels. For example, some nouns are more concrete than others – you can see and touch a fern, but empathy is an abstraction. You can see examples of empathy, but you can't see empathy itself. Does that make the word harder to learn? What about the “age” of a word – the date that it entered the language? Are more recent words like cryonics and maven harder than older words? Perhaps our learning process changes and evolves as the language we learn continues to change and evolve.