Biomedical text mining

Biomedical text mining (including biomedical natural language processing, or BioNLP) is the study and application of text mining methods to texts and literature of the biomedical domain. As a field of research, it incorporates ideas from natural language processing, bioinformatics, medical informatics and computational linguistics. Strategies developed in this field have been applied to the biomedical literature available through services such as PubMed.

The scientific literature has shifted to electronic publishing, and the volume of information now available can be overwhelming. This shift has created a high demand for text mining techniques. These techniques include information retrieval (IR) and entity recognition (ER).[1] IR retrieves papers relevant to a topic of interest, e.g. through PubMed. ER identifies mentions of biological terms (e.g. proteins or genes) in text so that they can be processed further.
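
The two tasks can be illustrated with a minimal Python sketch. It is not a description of any particular system: the IR step only builds a search URL for the NCBI E-utilities `esearch` endpoint (no request is sent), and the ER step is a simple dictionary lookup, which is only one of several possible approaches. The function names `build_pubmed_query` and `recognize_entities`, and the tiny `LEXICON`, are hypothetical and chosen for illustration.

```python
import re
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (used here only to build a URL).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(topic, max_results=20):
    """IR step: build a URL asking PubMed for IDs of papers matching a topic."""
    params = {
        "db": "pubmed",
        "term": topic,
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


# Hypothetical toy lexicon (assumption for illustration only);
# practical systems rely on curated resources or trained models.
LEXICON = {"TP53": "gene", "BRCA1": "gene", "p53": "protein", "insulin": "protein"}


def recognize_entities(text, lexicon=LEXICON):
    """ER step: return (start, end, mention, type) for each lexicon term in text."""
    # Try longer terms first so a long name wins over a shorter one it contains.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return [
        (m.start(), m.end(), m.group(1), lexicon[m.group(1)])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    print(build_pubmed_query("breast cancer BRCA1", max_results=5))
    print(recognize_entities("Mutations in TP53 alter p53 binding."))
```

Matching here is case-sensitive and exact, so spelling variants and synonyms of a term are missed unless they are listed in the lexicon.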

  1. Jensen, Lars Juhl; Saric, Jasmin; Bork, Peer (February 2006). "Literature mining for the biologist: from information retrieval to biological discovery". Nature Reviews Genetics. 7 (2): 119–129. doi:10.1038/nrg1768. ISSN 1471-0056. PMID 16418747. S2CID 423509.