Keyword (linguistics)

In corpus linguistics a key word is a word which occurs in a text more often than we would expect to occur by chance alone.[1] Key words are calculated by carrying out a statistical test (e.g., loglinear or chi-squared) which compares the word frequencies in a text against their expected frequencies derived in a much larger corpus, which acts as a reference for general language use. Keyness is then the quality a word or phrase has of being "key" in its context. Combinations of nouns with parts of speech that human readers would not likely notice, such as prepositions, time adverbs, and pronouns can be a relevant part of keyness. Even separate pronouns can constitute keywords.[2]

Compare this with collocation, the quality linking two words or phrases usually assumed to be within a given span of each other. Keyness is a textual feature, not a language feature (so a word has keyness in a certain textual context but may well not have keyness in other contexts, whereas a node and collocate are often found together in texts of the same genre so collocation is to a considerable extent a language phenomenon). The set of keywords found in a given text share keyness, they are co-key. Words typically found in the same texts as a key word are called associates.