The first component of equation 11 is:
Expanding this on the basis of term
independence assumptions, and also making the assumption that eliteness
is independent of document length (on the basis of the Verbosity
hypothesis), we can obtain a formula for the weight of a term t which
occurs tf times, as follows:
Analysis of the behaviour of this function with varying tf and
d is a little complex. The simple function used for the experiments
(formula 10) exhibits some of the correct properties, but not all.
In particular, formula 14 shows that increasing d exaggerates the
S-shape mentioned in section 4.2; formula 10 does not have this
property. It seems that there may be further scope for development of a
rough model based on the behaviour of formula 14.