DIMACS TR: 2002-10

Use of virtual semantic headers to improve the coverage of natural language question answering domains

Authors: Boris Galitsky


We report on the knowledge representation mechanism designed for a natural language question-answering system to function in such poorly formalized and heterogeneous domains as the financial, legal, pharmaceutical and psychological ones. The system is oriented toward providing customized expert advice, given a pre-designed set of textual templates and a database that contains customers’ profiles and preferences, parameters of products and services, etc. Question answering is based on matching a formal representation of a query against the formalized representations of the answers’ essential ideas (the semantic headers of these answers). Semantic headers are designed to be independent of the query phrasing and serve as the means of approximate reasoning while generating the most relevant advice. A semantic skeleton of an answer includes semantic headers and deductive links between them and their entities, based on common-sense domain knowledge. A semantic skeleton also includes virtual semantic headers, which do not have to be explicitly programmed but are generated on the fly, using the clauses of the semantic skeleton, to be matched with a question.
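The matching scheme described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the tuple encoding of semantic headers, the "?"-prefixed variables, the single-condition clauses, and all names and sample data (match, virtual_headers, answer, "ans_home_office", and so on) are assumptions made for illustration only. Each answer carries explicit headers plus clauses; virtual headers are derived from the clauses at query time, and a query selects every answer with a matching explicit or virtual header.

```python
from typing import Dict, List, Optional, Tuple

Term = Tuple[str, ...]            # e.g. ("deduct", "home_office", "self_employed")
Clause = Tuple[Term, Term]        # (head, body): head holds whenever body holds
Bindings = Dict[str, str]


def is_var(token: str) -> bool:
    # Assumed convention: tokens starting with "?" are variables.
    return token.startswith("?")


def match(pattern: Term, fact: Term) -> Optional[Bindings]:
    """One-way match of a pattern (may contain variables) against a header."""
    if len(pattern) != len(fact):
        return None
    bindings: Bindings = {}
    for p, f in zip(pattern, fact):
        if is_var(p):
            # A variable must bind to the same value everywhere it occurs.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def substitute(term: Term, bindings: Bindings) -> Term:
    return tuple(bindings.get(t, t) for t in term)


def virtual_headers(explicit: List[Term], clauses: List[Clause]) -> List[Term]:
    """Derive headers from the clauses until nothing new appears."""
    known = list(explicit)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            for fact in list(known):
                bindings = match(body, fact)
                if bindings is None:
                    continue
                derived = substitute(head, bindings)
                if derived not in known:
                    known.append(derived)
                    changed = True
    # Only the derived (virtual) headers are returned.
    return [h for h in known if h not in explicit]


def answer(query: Term, skeletons: Dict[str, dict]) -> List[str]:
    """Return ids of answers whose explicit or virtual headers match the query."""
    hits = []
    for answer_id, skeleton in skeletons.items():
        headers = skeleton["headers"] + virtual_headers(
            skeleton["headers"], skeleton["clauses"]
        )
        if any(match(query, h) is not None for h in headers):
            hits.append(answer_id)
    return hits


# Hypothetical tax-domain data, invented for this sketch.
SKELETONS = {
    "ans_home_office": {
        "headers": [("deduct", "home_office", "self_employed")],
        # Common-sense link: whoever can deduct something reduces their tax.
        "clauses": [(("reduce", "tax", "?who"), ("deduct", "?what", "?who"))],
    },
    "ans_ira": {
        "headers": [("contribute", "ira", "employee")],
        "clauses": [],
    },
}
```

In this sketch, the query ("reduce", "tax", "?who") reaches "ans_home_office" even though no such header was written by hand: it is derived from the clause when the question arrives. This is the sense in which a virtual header widens coverage without extra explicit programming.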

We present the evaluation of the released question-answering system, which has advised the customers of H&R Block and CBS Market Watch on various taxes and associated legal issues since 1999. The domain development and maintenance implications of the semantic header technique, answer accuracy, meaning deviations and overall customer impressions are analyzed.

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-10.ps.gz
