"MetaExtract: an NLP system to automatically assign metadata" by Ozgur Yilmazel, Christina M. Finneran et al.

Center for Natural Language Processing

Title

MetaExtract: an NLP system to automatically assign metadata

Authors/Contributors

Ozgur Yilmazel, Syracuse University, School of Information Studies, Center for Natural Language Processing
Christina M. Finneran, Syracuse University, School of Information Studies, Center for Natural Language Processing
Elizabeth D. Liddy, Syracuse University, School of Information Studies, Center for Natural Language Processing

Document Type

Article

Date

2004

Keywords

digital libraries, artificial intelligence, natural language processing, digital library standards, digital library system issues, digital library user issues

Language

English

Disciplines

Artificial Intelligence and Robotics | Library and Information Science

Description/Abstract

We have developed MetaExtract, a system that automatically assigns Dublin Core + GEM metadata using extraction techniques from our natural language processing research. MetaExtract comprises three distinct processes: the eQuery and HTML-based Extraction modules and a Keyword Generator module. We conducted a Web-based survey in which users evaluated the quality of each metadata element. Only two elements, Title and Keyword, showed a significant difference in quality between automatically and manually assigned metadata, with the manual quality slightly higher. The remaining elements for which we had enough data to test showed no significant difference: Description, Grade, Duration, Essential Resources, Pedagogy-Teaching Method, and Pedagogy-Group.

Recommended Citation

Yilmazel, Ozgur; Finneran, Christina M.; and Liddy, Elizabeth D., "MetaExtract: an NLP system to automatically assign metadata" (2004). Center for Natural Language Processing. Paper 4.
http://surface.syr.edu/cnlp/4
