In the context of information retrieval, a thesaurus (plural: "thesauri") is a form of controlled vocabulary that seeks to dictate semantic manifestations of metadata in the indexing of content objects. A thesaurus serves to minimise semantic ambiguity by ensuring uniformity and consistency in the storage and retrieval of the manifestations of content objects. ANSI/NISO Z39.19-2005 defines a content object as "any item that is to be described for inclusion in an information retrieval system, website, or other source of information".[1] The thesaurus aids the assignment of preferred terms to convey semantic metadata associated with the content object.[2]
A thesaurus serves to guide both an indexer and a searcher in selecting the same preferred term or combination of preferred terms to represent a given subject. ISO 25964, the international standard for information retrieval thesauri, defines a thesaurus as a “controlled and structured vocabulary in which concepts are represented by terms, organized so that relationships between concepts are made explicit, and preferred terms are accompanied by lead-in entries for synonyms or quasi-synonyms.”
A thesaurus is composed by at least three elements: 1-a list of words (or terms), 2-the relationship amongst the words (or terms), indicated by their hierarchical relative position (e.g. parent/broader term; child/narrower term, synonym, etc.), 3-a set of rules on how to use the thesaurus.