Document Type
Report
Date
12-2002
Embargo Period
10-18-2010
Keywords
Evidence Extraction and Link Discovery, EELD, Transformation Based Learning
Language
English
Disciplines
Library and Information Science | Linguistics
Description/Abstract
As part of our Evidence Extraction and Link Discovery (EELD) project, we proposed to use Transformation Based Learning (TBL) to learn domain-specific specializations for generic event extractions. The primary goal of our learning task was to reduce the amount of human effort required to specialize generic event extractions to new, specific domains. Three initial annotation cycles and one annotation review and correction cycle, involving a total of 70 documents, were completed; the entire annotation effort required slightly over 32 hours. Where possible, each annotation cycle started with bootstrapped files produced by applying the TBL rules learned after the prior annotation cycle. A five-fold evaluation was completed using the annotated files as the gold standard. When our analysis was limited to specialized event types with 10 or more examples available for training, we achieved 67.93% Coverage, 88.93% Accuracy, and an F-score of 77.02%. Several conclusions can be drawn from our study: (1) the use of TBL to learn specializations of generic extractions to specific domains is possible, (2) the use of TBL leads to a significant reduction in the human effort involved in specializing to a new domain, (3) sparsity of training data has a large impact on the results of learning, and (4) with more training instances, coverage and accuracy would improve.
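The abstract does not state how the F-score was computed. The reported figures are, however, consistent with the balanced F-score, i.e. the harmonic mean of Coverage and Accuracy, with Coverage playing the role of recall and Accuracy the role of precision. The sketch below checks that arithmetic under that assumption; the function name `f_score` is illustrative and does not come from the report.

```python
def f_score(coverage: float, accuracy: float) -> float:
    """Balanced F-score: harmonic mean of two rates given as percentages.

    Assumption: Coverage is treated like recall and Accuracy like
    precision; the report's abstract does not spell out the formula.
    """
    if coverage + accuracy == 0:
        return 0.0  # avoid division by zero when both rates are zero
    return 2 * coverage * accuracy / (coverage + accuracy)


# Figures reported for event types with 10 or more training examples.
result = f_score(coverage=67.93, accuracy=88.93)
print(f"{result:.2f}%")
```

Under this assumption the computed value rounds to the 77.02% given in the abstract.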
Recommended Citation
Mary D. Taffet, Nancy J. McCracken, Eileen E. Allen, and Elizabeth D. Liddy. 2002. Transformation Based Learning for Specialization of Generic Event Extractions. CNLP Technical Report. December 2002.