Document Type




Embargo Period



question-answering, QA, question-answering evaluation, open-domain systems, TREC QA, restricted-domain system, test question development, answer key creation, test collection construction




Library and Information Science


Question-Answering (QA) evaluation efforts have largely been tailored to open-domain systems. The TREC QA test collections contain newswire articles, and the accompanying queries cover a wide variety of topics. While some apprehension about the limitations of restricted-domain systems is no doubt justified, the strict promotion of unlimited-domain QA evaluations may have some unintended consequences. Simply applying the open-domain QA evaluation paradigm to a restricted-domain system poses problems in the areas of test question development, answer key creation, and test collection construction. This paper examines the evaluation requirements of restricted-domain systems. It incorporates evaluation criteria identified by users of an operational QA system in the aerospace engineering domain. While the paper demonstrates that user-centered, task-based evaluations are required for restricted-domain systems, these evaluations are found to be equally applicable to open-domain systems.

Creative Commons License

This work is licensed under a Creative Commons Attribution 3.0 License.