
Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval

S.E. Robertson and S. Walker
Centre for Interactive Systems Research, Department of Information Science, City University
Northampton Square, London EC1V 0HB, UK

Abstract:

The 2-Poisson model for term frequencies is used to suggest ways of incorporating certain variables in probabilistic models for information retrieval. The variables concerned are within-document term frequency, document length, and within-query term frequency. Simple weighting functions are developed, and tested on the TREC test collection. Considerable performance improvements (over simple inverse collection frequency weighting) are demonstrated.





Steve Robertson
Mon May 13 18:33:21 BST 1996