Can Document-Genre Metadata Improve Information Access to Large Digital Collections?
information retrieval, digital libraries, document genres, searching
Library and Information Science
We discuss the issues of resolving the information-retrieval problem in large digital collections through the identification and use of document genres. Explicit identification of genre seems particularly important for such collections because any search usually retrieves documents with a diversity of genres that are undifferentiated by obvious clues as to their identity. As well, because most genres are characterized by both form and purpose, identifying the genre of a document provides information as to the document's purpose and its fit to the user's situation, which can be otherwise difficult to assess. We begin by outlining the possible role of genre identification in the information-retrieval process. Our assumption is that genre identification would enhance searching, first because we know that topic alone is not enough to define an information problem and second because search results containing genre information would be more easily understandable. Next, we discuss how information professionals have traditionally tackled the issues of representing genre in settings where topical representation is the norm. Finally, we address the issues of studying the efficacy of identifying genre in large digital collections. Because genre is often an implicit notion, studying it in a systematic way presents many problems. We outline a research protocol that would provide guidance for identifying Web document genres, for observing how genre is used in searching and evaluating search results, and finally for representing and visualizing genres.
Crowston, K. & Kwasnik, B. H. Can document-genre metadata improve information access to large digital collections? Library Trends, 52(2), 345–361. Available from http://hdl.handle.net/2142/8533
Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.