Search engine indexing

Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design draws on concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. In the context of search engines designed to find web pages on the Internet, the process is also called web indexing.

Popular search engines focus on the full-text indexing of online natural-language documents.[1] Other media types, such as pictures, video,[2] audio,[3] and graphics,[4] are also searchable.
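
As an illustration of full-text indexing, the following minimal sketch builds an inverted index, one common structure that maps each term to the documents containing it. The function names (tokenize, build_index, search), the tokenization rule, and the sample documents are assumptions made for this example and do not describe any particular search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Parse: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Store: map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Retrieve: return ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical three-document corpus used only for illustration.
docs = {
    1: "Search engines index web pages",
    2: "An index makes retrieval fast",
    3: "Web pages contain natural language text",
}
index = build_index(docs)
print(search(index, "web pages"))
```

Answering a query this way only intersects a few precomputed sets instead of scanning every document, which is why the storing step makes retrieval fast.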

Meta search engines reuse the indices of other services and do not store a local index, whereas cache-based search engines permanently store the index along with the corpus. Unlike full-text indices, partial-text services restrict the depth indexed in order to reduce index size. Larger services typically perform indexing at predetermined time intervals because of the time and processing costs involved, while agent-based search engines index in real time.
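
The trade-off between full-text and partial-text indexing can be sketched as follows. This example assumes that "depth" means the number of leading tokens indexed per document; real services may define the limit differently. The function index_document and its max_tokens parameter are hypothetical names chosen for illustration.

```python
import re


def index_document(text, max_tokens=None):
    """Return the set of terms indexed for one document.

    max_tokens=None indexes the full text; an integer limits indexing
    to the first max_tokens tokens (an assumed notion of "depth").
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if max_tokens is not None:
        tokens = tokens[:max_tokens]
    return set(tokens)


text = "Indexing stores terms so that retrieval is fast and accurate"

full_terms = index_document(text)                   # full-text: every term
partial_terms = index_document(text, max_tokens=4)  # partial-text: first 4 tokens

print(len(full_terms), len(partial_terms))
print("accurate" in full_terms, "accurate" in partial_terms)
```

The partial index is smaller, but a query for a term that appears only beyond the depth limit will not match the document.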

  1. ^ Clarke, C.; Cormack, G. (February 1995). Dynamic Inverted Indexes for a Distributed Full-Text Retrieval System. TechRep MT-95-01, University of Waterloo.
  2. ^ Sikos, L. F. (August 2016). "RDF-powered semantic video annotation tools with concept mapping to Linked Data for next-generation video indexing". Multimedia Tools and Applications. doi:10.1007/s11042-016-3705-7. S2CID 254832794.
  3. ^ http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
  4. ^ Jacobs, Charles E.; Finkelstein, Adam; Salesin, David H. (1995). Fast Multiresolution Image Querying. Department of Computer Science and Engineering, University of Washington. Verified December 2006.