René Witte, Thomas Kappler, and Christopher J. O. Baker. Ontology Design for Biomedical Text Mining. In: Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences. Springer, 2007. ISBN: 978-0-387-48436-5.
Text mining in biology and biomedicine requires a large amount of domain-specific knowledge. Publicly accessible resources hold much of the information needed, yet integrating them into natural language processing (NLP) systems faces many practical hurdles, most notably the semantic disconnectedness of the various resources and components. Ontologies can provide the necessary framework for consistent semantic integration, while also bringing formal reasoning capabilities to NLP.
In this chapter, we address four important aspects of integrating ontologies with NLP: (i) an analysis of the different integration alternatives and their respective advantages; (ii) the design requirements for an ontology supporting NLP tasks; (iii) the creation and initialization of an ontology using publicly available tools and databases; and (iv) the connection of common NLP tasks with an ontology, including technical aspects of ontology deployment in a text mining framework. A concrete application example, text mining of enzyme mutations, is provided to motivate and illustrate these points.