Discussion

On the short queries, without a query term frequency component, the best version of BM11 increases average precision by about 50% over the baseline Robertson/Sparck Jones weighting BM1, with somewhat smaller improvements in the other statistics. On the long queries the proportionate improvement is much greater still. To put this in perspective, the best results reported here (Table 4, row 8) are similar to the best reported by any of the TREC-2 participants at the time of the conference in September 1993.

Many experimental runs were also carried out on two other sets of 50 topics and on two other databases: TREC disk 3, and a subdatabase of disks 1 and 2 consisting entirely of Wall Street Journal articles. The absolute values of the statistics varied quite widely, but the rank order of treatments was very similar to that shown in the tables here.
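The distinction between absolute values and rank order can be illustrated with a small sketch. The treatment labels (T1 to T4) and the scores below are hypothetical placeholders, not figures from these experiments, and Spearman's rank correlation is only one possible way of quantifying agreement between rankings; it is not claimed to be the comparison used here.

```python
def rank_order(scores):
    """Return treatment names sorted from best (highest score) to worst."""
    return sorted(scores, key=scores.get, reverse=True)


def spearman_rho(scores_a, scores_b):
    """Spearman rank correlation between two score tables.

    Assumes both dicts have the same keys, at least two entries,
    and no tied scores within either table.
    """
    names = list(scores_a)
    n = len(names)
    rank_a = {name: r for r, name in enumerate(rank_order(scores_a), start=1)}
    rank_b = {name: r for r, name in enumerate(rank_order(scores_b), start=1)}
    # Sum of squared rank differences across treatments.
    d2 = sum((rank_a[name] - rank_b[name]) ** 2 for name in names)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical average-precision values for four treatments on three
# collections. The absolute levels differ widely between collections.
collection_a = {"T1": 0.20, "T2": 0.24, "T3": 0.27, "T4": 0.30}
collection_b = {"T1": 0.09, "T2": 0.11, "T3": 0.12, "T4": 0.15}  # same order as A
collection_c = {"T1": 0.09, "T2": 0.12, "T3": 0.11, "T4": 0.15}  # T2 and T3 swapped

rho_ab = spearman_rho(collection_a, collection_b)
rho_ac = spearman_rho(collection_a, collection_c)

print(rank_order(collection_a), rank_order(collection_b), rho_ab)
print(rank_order(collection_a), rank_order(collection_c), rho_ac)
```

In this toy case the scores on the second collection are roughly half those on the first, yet the ordering of treatments is identical, so the correlation is at its maximum; swapping one adjacent pair lowers it only slightly.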

Steve Robertson
Mon May 13 18:33:21 BST 1996