DIMACS TR: 99-61
A Geometric Model of Information Retrieval Systems
Author: Myung Ho Kim
ABSTRACT
This decade has seen a great deal of progress in the development of
information retrieval systems. Unfortunately, we still lack a systematic
understanding of the behavior of the systems and their relationship with
documents. In this paper we present a completely new approach to
understanding information retrieval systems. Recently, it has been
observed that retrieval systems in TREC 6 show some remarkable patterns in
retrieving relevant documents. Based on the TREC 6 observations, we
introduce a geometric linear model of information retrieval systems. We then
apply the model to predict the number of relevant documents retrieved by the
systems. The model also scales to much larger data sets. Although the
model was developed from the TREC 6 routing test data, we believe it is
readily applicable to other information retrieval systems. In the Appendix,
we explain a simple and efficient way of building a better system from
existing systems.
Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-61.ps.gz