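Referring back to the basic weighting function (1), we may include document length as one component of the vector $\mathbf{x}$. However, document length does not have an obvious ``natural'' zero (an actual document of zero length is a pathological case). Instead, we may use the average document length, written $\Delta$ here, for the corresponding component of the reference vector $\mathbf{0}$. We would then expect a formula in which the document length component vanishes for a document of average length, but not for other lengths. The weighting formula becomes:

\[
w(\mathbf{x}, d) = \log \frac{P(\mathbf{x}, d \mid R)\, P(\mathbf{0}, \Delta \mid \bar{R})}{P(\mathbf{x}, d \mid \bar{R})\, P(\mathbf{0}, \Delta \mid R)}
\]

where $d$ is document length, and $\mathbf{x}$ represents all other information about the document. This may be decomposed into the sum of two components, $w(\mathbf{x}, d) = w_1(\mathbf{x}, d) + w_2(d)$, where

\[
w_1(\mathbf{x}, d) = \log \frac{P(\mathbf{x} \mid R, d)\, P(\mathbf{0} \mid \bar{R}, d)}{P(\mathbf{x} \mid \bar{R}, d)\, P(\mathbf{0} \mid R, d)},
\qquad
w_2(d) = \log \frac{P(\mathbf{0}, d \mid R)\, P(\mathbf{0}, \Delta \mid \bar{R})}{P(\mathbf{0}, d \mid \bar{R})\, P(\mathbf{0}, \Delta \mid R)}
\]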
These two components are discussed separately.
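The decomposition can be checked numerically. The sketch below is illustrative only: it assumes a toy model in which x is a single binary feature, document length takes three bucketed values, and the joint probabilities are invented for the example. The names `AVG_LEN`, `P_REL`, `P_NONREL`, `cond`, `w_full`, `w1`, and `w2` are hypothetical and do not come from the text. It relies on the identity P(0, d | R) = P(0 | R, d) P(d | R), and the same for non-relevant documents, which is what lets the length-conditioned component and the pure length component add up to the full weight.

```python
import math

# Hypothetical toy model: x is one binary feature (0 is its reference value),
# d is a bucketed document length, AVG_LEN plays the role of the average length.
AVG_LEN = 2
LENGTHS = (1, 2, 3)

# Made-up joint probabilities P(x, d | R) and P(x, d | not R); each sums to 1.
P_REL = {(0, 1): 0.10, (0, 2): 0.20, (0, 3): 0.10,
         (1, 1): 0.10, (1, 2): 0.20, (1, 3): 0.30}
P_NONREL = {(0, 1): 0.30, (0, 2): 0.30, (0, 3): 0.20,
            (1, 1): 0.05, (1, 2): 0.10, (1, 3): 0.05}


def cond(joint, x, d):
    """Return P(x | class, d), derived from the joint table P(x, d | class)."""
    p_d = sum(p for (_, length), p in joint.items() if length == d)
    return joint[(x, d)] / p_d


def w_full(x, d):
    """Full weight: log odds of (x, d) relative to the reference (0, AVG_LEN)."""
    num = P_REL[(x, d)] * P_NONREL[(0, AVG_LEN)]
    den = P_NONREL[(x, d)] * P_REL[(0, AVG_LEN)]
    return math.log(num / den)


def w1(x, d):
    """First component: weight of x conditional on the document length d."""
    num = cond(P_REL, x, d) * cond(P_NONREL, 0, d)
    den = cond(P_NONREL, x, d) * cond(P_REL, 0, d)
    return math.log(num / den)


def w2(d):
    """Second component: weight of length d alone, with x at its reference value."""
    num = P_REL[(0, d)] * P_NONREL[(0, AVG_LEN)]
    den = P_NONREL[(0, d)] * P_REL[(0, AVG_LEN)]
    return math.log(num / den)


if __name__ == "__main__":
    for x in (0, 1):
        for d in LENGTHS:
            print(x, d, round(w_full(x, d), 6), round(w1(x, d) + w2(d), 6))
```

Running the script prints the full weight next to the sum of the two components for every (x, d) pair. In this toy model the second component is zero at `AVG_LEN`, matching the expectation that the length contribution disappears for a document of average length.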