The Shape of the Effect

Many different functions have been used to allow within-document term frequency ($tf$) to influence the weight given to a particular document on account of the term in question. In some cases a linear function has been used; in others, the effect has been dampened by a suitable transformation such as $\log(tf)$.

Even if we do not use the full equation 5, we may allow it to suggest the shape of an appropriate, but simpler, function. In fact, equation 5 has the following characteristics: (a) it is zero for $tf = 0$; (b) it increases monotonically with $tf$; (c) but to an asymptotic maximum; (d) which approximates to the Robertson/Sparck Jones weight that would be given to a direct indicator of eliteness.
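Characteristics (a) to (d) can be checked numerically. The sketch below is illustrative only: it assumes the usual two-Poisson form of the weight, with $p'$ and $q'$ the probabilities of eliteness given relevance and non-relevance, and $\lambda$ and $\mu$ the Poisson means for elite and non-elite documents. The function names and the parameter values are hypothetical, chosen only to make the shape visible.

```python
import math


def two_poisson_weight(tf, p, q, lam, mu):
    """Term weight under an ASSUMED two-Poisson form of equation 5.

    p, q    -- P(elite | relevant), P(elite | non-relevant)  (assumed meaning)
    lam, mu -- Poisson means for elite / non-elite documents, with mu < lam
    """
    r = (mu / lam) ** tf * math.exp(lam - mu)  # shrinks to 0 as tf grows
    e = math.exp(mu - lam)                     # small when lam is well above mu
    numerator = (p + (1 - p) * r) * (q * e + (1 - q))
    denominator = (q + (1 - q) * r) * (p * e + (1 - p))
    return math.log(numerator / denominator)


def rsj_eliteness_weight(p, q):
    """Robertson/Sparck Jones weight for a direct indicator of eliteness."""
    return math.log(p * (1 - q) / (q * (1 - p)))


# Hypothetical parameter values, not taken from the paper.
P, Q, LAM, MU = 0.8, 0.1, 8.0, 0.5

weights = [two_poisson_weight(tf, P, Q, LAM, MU) for tf in range(41)]
limit = rsj_eliteness_weight(P, Q)

for tf in (0, 1, 2, 5, 10, 40):
    print(f"tf={tf:2d}  w={weights[tf]:.4f}  (RSJ limit {limit:.4f})")
```

With these values the weight starts at zero, rises steeply over the first few occurrences, and then flattens just below the Robertson/Sparck Jones value, which is the saturating shape a simpler function would need to imitate.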

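Only in an extreme case, where eliteness is identical to relevance, is the function linear in $tf$. These points can be seen from the following re-arrangement of equation 5:

$$ w = \log \frac{\left(p' + (1-p')\,(\mu/\lambda)^{tf}\, e^{\lambda-\mu}\right)\left(q'\, e^{\mu-\lambda} + (1-q')\right)}{\left(q' + (1-q')\,(\mu/\lambda)^{tf}\, e^{\lambda-\mu}\right)\left(p'\, e^{\mu-\lambda} + (1-p')\right)} $$

$\mu$ is smaller than $\lambda$. As $tf \to \infty$ (to give us the asymptotic maximum), $(\mu/\lambda)^{tf}$ goes to zero, so those components drop out. $e^{\mu-\lambda}$ will be small, so the approximation is:

$$ w \approx \log \frac{p'\,(1-q')}{q'\,(1-p')} $$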

(The last approximation may not be a good one: for a poor and/or infrequent term, $e^{\mu-\lambda}$ will not be very small. Although this should not affect the $q' e^{\mu-\lambda}$ component in the numerator, because $q'$ is likely to be small, it will affect the $p' e^{\mu-\lambda}$ component in the denominator.)
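The caveat about the last approximation can be illustrated with a small calculation. The sketch below assumes the asymptotic form $\log \frac{p'(q' e^{\mu-\lambda} + 1 - q')}{q'(p' e^{\mu-\lambda} + 1 - p')}$, that is, the two-Poisson weight after the $(\mu/\lambda)^{tf}$ components have dropped out, and compares it with the Robertson/Sparck Jones value for a well-separated term and a poorly separated one. The function name and all parameter values are hypothetical.

```python
import math


def asymptote_terms(p, q, lam, mu):
    """Compare the ASSUMED exact asymptote with its RSJ approximation.

    Returns (exact, rsj, num_shift, den_shift), where the two shifts are the
    relative sizes of the neglected components in numerator and denominator.
    """
    e = math.exp(mu - lam)
    exact = math.log(p * (q * e + (1 - q)) / (q * (p * e + (1 - p))))
    rsj = math.log(p * (1 - q) / (q * (1 - p)))
    num_shift = q * e / (1 - q)  # neglected q' e^(mu-lam) relative to (1 - q')
    den_shift = p * e / (1 - p)  # neglected p' e^(mu-lam) relative to (1 - p')
    return exact, rsj, num_shift, den_shift


# Hypothetical terms: the same p' and q', but different separation of the means.
good = asymptote_terms(0.8, 0.1, lam=8.0, mu=0.5)
poor = asymptote_terms(0.8, 0.1, lam=1.5, mu=0.5)

for label, (exact, rsj, num_shift, den_shift) in (("good", good), ("poor", poor)):
    print(f"{label}: exact={exact:.4f} rsj={rsj:.4f} "
          f"numerator shift={num_shift:.4f} denominator shift={den_shift:.4f}")
```

For the poorly separated term the neglected numerator component stays a small fraction of $1 - q'$, while the neglected denominator component is larger than $1 - p'$ itself, so the true asymptote falls well short of the Robertson/Sparck Jones value.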



Steve Robertson
Mon May 13 18:33:21 BST 1996