Many different functions have been used to allow within-document term
frequency $tf$ to influence the weight given to the particular
document on account of the term in question. In some cases a linear
function has been used; in others, the effect has been dampened by using
a suitable transformation such as $\log tf$.
Even if we do not use the full equation 5, we may allow it
to suggest the shape of an appropriate, but simpler, function. In fact,
equation 5 has the following characteristics:
(a) It is zero for $tf = 0$;
(b) it increases monotonically with $tf$;
(c) but to an asymptotic maximum;
(d) which approximates to the
Robertson/Sparck Jones weight that would be given to a direct indicator
of eliteness.
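
Characteristics (a) to (c) can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the weight is the log odds ratio of a two-component Poisson mixture, comparing frequency tf against frequency 0. The function name, the parameter names (`p_elite`, `q_elite`, `lam`, `mu`) and all numerical values are illustrative assumptions.

```python
import math


def two_poisson_weight(tf, p_elite, q_elite, lam, mu):
    """Term weight for within-document frequency tf under an assumed 2-Poisson model.

    p_elite: assumed P(document is elite for the term | relevant)
    q_elite: assumed P(document is elite for the term | non-relevant)
    lam, mu: assumed Poisson means of tf in elite and non-elite documents
    """
    def mixture(prob_elite, k):
        # Unnormalised mixture of two Poisson likelihoods; the 1/k! factor
        # is common to every term and cancels in the ratio below.
        return (prob_elite * lam ** k * math.exp(-lam)
                + (1.0 - prob_elite) * mu ** k * math.exp(-mu))

    # Log odds ratio comparing frequency tf against frequency 0.
    numerator = mixture(p_elite, tf) * mixture(q_elite, 0)
    denominator = mixture(q_elite, tf) * mixture(p_elite, 0)
    return math.log(numerator / denominator)


# Illustrative parameter values only (not taken from the text).
weights = [two_poisson_weight(tf, 0.6, 0.1, 5.0, 0.5) for tf in range(13)]
for tf, w in enumerate(weights):
    print(f"tf={tf:2d}  w={w:.4f}")
```

With these assumed values the printed weights start at zero, rise steeply over the first few occurrences, and then flatten out, which is the shape a simpler substitute function would need to imitate.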
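Only in an extreme case, where eliteness is identical to relevance, is
the function linear in $tf$. These points can be seen from the
following re-arrangement of equation 5:

$$ w = \log \frac{\left(p' + (1-p')\,(\mu/\lambda)^{tf}\, e^{\lambda-\mu}\right)\left(q'\, e^{\mu-\lambda} + (1-q')\right)}{\left(q' + (1-q')\,(\mu/\lambda)^{tf}\, e^{\lambda-\mu}\right)\left(p'\, e^{\mu-\lambda} + (1-p')\right)} \tag{6} $$

$\mu$ is smaller than $\lambda$. As $tf \to \infty$ (to give us the asymptotic
maximum), $(\mu/\lambda)^{tf}$ goes to zero, so those components
drop out. $e^{\mu-\lambda}$ will be small, so the approximation is:

$$ w = \log \frac{p'(1-q')}{q'(1-p')} \tag{7} $$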
(The last approximation may not be a good one: for a poor and/or
infrequent term, $e^{\mu-\lambda}$ will not be very small. Although
this should not affect the component in the numerator, because $q'$ is
likely to be small, it will affect the component in the denominator.)
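
The caveat about the final approximation can be illustrated with a small sketch. It assumes the limiting weight for large tf has the form log[p'(q'e^(mu-lam) + 1 - q') / (q'(p'e^(mu-lam) + 1 - p'))], and that the approximation drops the exponential terms as well. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def asymptotic_weight(p_elite, q_elite, lam, mu):
    """Limit of the weight as tf grows, keeping the exp(mu - lam) terms."""
    decay = math.exp(mu - lam)
    numerator = p_elite * (q_elite * decay + (1.0 - q_elite))
    denominator = q_elite * (p_elite * decay + (1.0 - p_elite))
    return math.log(numerator / denominator)


def rsj_eliteness_weight(p_elite, q_elite):
    """Approximation obtained by also dropping the exp(mu - lam) terms."""
    return math.log(p_elite * (1.0 - q_elite) / (q_elite * (1.0 - p_elite)))


# Illustrative values only: the same p', q' with a well-separated and a
# poorly separated pair of Poisson means.
for label, lam, mu in [("good term", 5.0, 0.5), ("poor term", 1.0, 0.5)]:
    exact = asymptotic_weight(0.6, 0.1, lam, mu)
    approx = rsj_eliteness_weight(0.6, 0.1)
    print(f"{label}: exact limit={exact:.3f}  approximation={approx:.3f}")
```

Under these assumptions the two values nearly coincide when the means are well separated, while for the poorly separated pair the approximation noticeably overstates the limit. Almost all of that gap comes from the denominator term.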