CHronological information Extraction SyStem (CHESS)
Date of Award
Doctor of Philosophy (PhD)
Stuart A. Sutton
Information extraction, Question-answer systems, Visual information browsing, Language processing
This research focuses on the use of computational linguistics methods to process unstructured text and automatically extract meaningful information, in order to provide users with a variety of information access tools appropriate to their specific needs. Recognizing that different types of information need exist (Taylor, 1968) and that traditional information retrieval (IR) systems alone might not satisfy them, this research is founded on the premise that sophisticated information access methods, used in conjunction with a traditional IR system, can satisfy users' varying information needs better than exclusive reliance on an IR system. A second premise is that automatic information extraction processes enable sophisticated information access methods, such as question answering and visual browsing of unstructured textual information.
To fully demonstrate these premises, a natural language processing (NLP) based, domain-independent information extraction framework is needed that can convert the semantic content of unstructured texts into structured knowledge representations. Such domain-independent, automatic semantic analysis of text, comparable to human capability, has been beyond the computational capabilities of NLP. Previous state-of-the-art work on automatic semantic processing and information extraction has been confined to very narrow and specific domains (MUC-3, MUC-4, MUC-5, and MUC-6), where it was used to populate pre-configured databases.
The research described here attempts to incorporate a domain-independent semantic interpretation of text, based on the Conceptual Graph model (Sowa, 1984), as an integral part of an automatic information extraction system, in order to enable multiple novel modes of information access for users. The information extraction framework described herein has been shown to be useful for populating knowledge bases for automated reasoning applications, for question answering when users have very specific information needs, and for populating a visual browser when users' needs are vague.
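As a rough illustration of what a structured knowledge representation in the spirit of the Conceptual Graph model might look like, the following minimal Python sketch stores concepts and the relations linking them, then answers a simple question against that structure. It is not the CHESS implementation: the class names, the `query` and `linear_form` helpers, the example sentence ("A cat sits on a mat"), and the relation labels `Agnt` and `Loc` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Concept:
    """A concept node: a type label plus an optional referent (hypothetical design)."""
    type_label: str
    referent: Optional[str] = None

    def __str__(self) -> str:
        if self.referent is None:
            return f"[{self.type_label}]"
        return f"[{self.type_label}: {self.referent}]"


@dataclass(frozen=True)
class Relation:
    """A labeled, directed link between two concepts."""
    label: str
    source: Concept
    target: Concept


@dataclass
class ConceptualGraph:
    """A very small graph store; illustrative only."""
    relations: List[Relation] = field(default_factory=list)

    def add(self, label: str, source: Concept, target: Concept) -> None:
        self.relations.append(Relation(label, source, target))

    def query(self, label: str, source_type: str) -> List[Concept]:
        """Return concepts reached from a source of the given type via the given relation."""
        return [
            r.target
            for r in self.relations
            if r.label == label and r.source.type_label == source_type
        ]

    def linear_form(self) -> List[str]:
        """Render each relation as a line of bracket-and-arrow text."""
        return [f"{r.source} -> ({r.label}) -> {r.target}" for r in self.relations]


# Assumed example sentence: "A cat sits on a mat."
cat, sit, mat = Concept("Cat"), Concept("Sit"), Concept("Mat")
graph = ConceptualGraph()
graph.add("Agnt", sit, cat)  # the agent of the sitting is a cat
graph.add("Loc", sit, mat)   # the location of the sitting is a mat

# A question such as "Where does the sitting take place?" becomes a lookup.
answer = graph.query("Loc", "Sit")
```

Once the extracted content is held as explicit concept and relation records, the same structure could in principle feed each access mode the abstract lists: a lookup like `query` for specific questions, or a traversal of `relations` to drive a browser or populate a knowledge base.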
The approach is limited to extracting the subject and factual content of textual information. Logical content, often revealed by quantifiers and modals within text, is important for hypothetical or counterfactual reasoning and for autonomous problem-solving systems, but is beyond the scope of this research.
Paik, Woojin, "CHronological information Extraction SyStem (CHESS)" (2000). iSchool Information Science and Technology - Dissertations. Paper 55.