Title

Information retrieval by plausible inferences: An application of the theory of plausible reasoning of Collins and Michalski

Date of Award

1995

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical Engineering and Computer Science

Advisor(s)

Robert N. Oddy

Keywords

Computer science, Information Systems

Subject Categories

Information and Library Science

Abstract

This work explores the possibility of using plausible inferences as a means of retrieving relevant documents. An attempt is made to simulate the common-sense reasoning of a human being (e.g. a reference librarian) when (s)he is asked to provide references on selected subjects. Collins and Michalski developed a theory of plausible reasoning for question-answering situations; we modify their theory to accommodate information retrieval. Methods are proposed to represent document contents by logical terms and statements, and queries by incomplete logical statements. Extensions to plausible inferences are discussed, and examples of reference retrieval using these extensions are given. Several relationships, namely Broader-Narrower, ISA, X, Y, REF, AUTH and CITE, are defined. Some techniques are developed to extract these relationships from text. Methods of converting these relationships to forms suitable for reasoning, such as logical terms, statements and mutual dependencies, are proposed.

Two versions of the extended plausible reasoning system were implemented: one using dominance weights, as described in detail in this dissertation, and the other using tf.idf (Term Frequency Inverse Document Frequency) weights. A vector space retrieval system was also implemented for purposes of comparison. Experiments were conducted using the titles and abstracts of the CACM collection. Each item in the collection was scanned for REF, X, Y and Broader-Narrower relationships, and the extracted relations were stored in a semantic network. A standard set of queries was posed against this corpus by each system. The Wilcoxon matched-pairs signed-ranks test was used to assess the statistical significance of the results: both versions of the extended system performed better than the vector space model, and the system using dominance weights performed better than the system using tf.idf weights.
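
For readers unfamiliar with the comparison baseline, the following is a minimal sketch of vector space retrieval with tf.idf weights. It is not the dissertation's implementation: the whitespace tokenizer, the logarithmic idf variant log(N/df), the cosine similarity measure, and every function name here are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop list.
    return text.lower().split()


def build_index(docs):
    """Return (idf, vectors): idf per term and one tf.idf vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    # Assumption: idf = log(N / df); other variants are common.
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return idf, vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    idf, vectors = build_index(docs)
    # Query terms unseen in the collection get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    # Ties are broken by document index so the ordering is deterministic.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

In this sketch a term that occurs in every document receives an idf of zero and so contributes nothing to the ranking; that behavior follows from the idf variant assumed here and may differ from the weighting used in the experiments.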

Our plausible reasoning techniques seem to be more general and more robust than other logics applied to information retrieval. This was further investigated by using plausible inferences to simulate the behavior of several well-known retrieval models. It is thereby demonstrated that plausible reasoning as a formalism encompasses retrieval models such as semantic networks using spreading activation inferences like GRANT or I³R, all logic-based IR systems including the Boolean model and Van Rijsbergen's conditional logic, mathematical models like the vector space model, and Thomas, a cognitive approach to information retrieval.

Access

Surface provides description only. Full text is available to ProQuest subscribers. Ask your Librarian for assistance.

http://libezproxy.syr.edu/login?url=http://proquest.umi.com/pqdweb?did=742139241&sid=2&Fmt=2&clientId=3739&RQT=309&VName=PQD