Boolean Model

A data retrieval model based on a binary decision criterion: a document either satisfies the query or it does not.

A query is a Boolean expression, which can be represented as a disjunction of conjunctive vectors (disjunctive normal form).

Example query: (intelligent ∧ system) ∨ (information ∧ retrieval)
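As a minimal sketch of how such a query could be evaluated, the snippet below stores the example query as a list of conjunctions and returns every document that satisfies at least one of them. The toy collection `docs`, the whitespace tokenization, and the helper names `matches` and `boolean_retrieve` are assumptions made for illustration only.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    "d1": "an intelligent system for planning",
    "d2": "information retrieval with boolean queries",
    "d3": "intelligent information agents",
}

# The example query in disjunctive normal form: a list of conjunctions,
# each conjunction being the set of terms that must all be present.
query = [{"intelligent", "system"}, {"information", "retrieval"}]


def matches(text, dnf_query):
    """Return True if the text satisfies at least one conjunction."""
    terms = set(text.lower().split())  # assumption: whitespace tokenization
    return any(conjunction <= terms for conjunction in dnf_query)


def boolean_retrieve(collection, dnf_query):
    """Return the ids of all matching documents, sorted by id (not ranked)."""
    return sorted(doc_id for doc_id, text in collection.items()
                  if matches(text, dnf_query))


print(boolean_retrieve(docs, query))  # ['d1', 'd2']
```

Document d3 contains one term from each conjunction but satisfies neither in full, so it is not retrieved at all. The decision is binary: no partial credit and no ordering among the matches.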


  • Advantage:
    clean formalism, simplicity
  • Disadvantage:
    exact matching may lead to retrieval of too few or too many documents



Vector Model

In the vector model, a weight w_{i,j} is associated with each pair of term k_i and document d_j (or query q).



  • Advantage:
    its term weighting scheme (tf-idf) improves retrieval performance
    its partial matching strategy allows retrieval of documents that approximate the query conditions
    its cosine ranking formula sorts the documents according to their degree of similarity to the query (see Degree of similarity below)
  • Disadvantage:
    the assumption of mutual independence between index terms may be unrealistic in practice.
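The tf-idf weighting mentioned above can be sketched as follows. These notes do not give the exact formula, so the code assumes one common variant: term frequency normalized by the most frequent term in the document, multiplied by log(N / n_i), where N is the number of documents and n_i the number of documents containing term k_i. The collection `docs` and the function name `tf_idf_weights` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy collection: document id -> text.
docs = {
    "d1": "intelligent system intelligent agent",
    "d2": "information retrieval system",
    "d3": "information retrieval information",
}


def tf_idf_weights(collection):
    """Return {doc_id: {term: weight}} using one common tf-idf variant.

    Assumes every document contains at least one term.
    """
    tokenized = {doc_id: text.lower().split()
                 for doc_id, text in collection.items()}
    n_docs = len(tokenized)
    # n_i: number of documents in which term k_i appears.
    doc_freq = Counter(term for tokens in tokenized.values()
                       for term in set(tokens))
    weights = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        max_count = max(counts.values())
        weights[doc_id] = {
            # normalized term frequency times inverse document frequency
            term: (count / max_count) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


weights = tf_idf_weights(docs)
print(weights["d1"])
```

In d1 the term "intelligent" is both frequent within the document and rare across the collection, so it receives a higher weight than "system", which also appears in d2.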



Degree of similarity

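The cosine of the angle θ between the document vector d_j and the query vector q is adopted as sim(d_j, q):

sim(d_j, q) = (d_j · q) / (|d_j| |q|)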
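A minimal sketch of cosine ranking, assuming documents and the query are stored as sparse term-to-weight dictionaries: the similarity is the dot product of the two vectors divided by the product of their lengths. The weights in `doc_vectors` and `query_vector` are made-up illustrative values, and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(doc_weights, query_weights):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(w * query_weights.get(term, 0.0)
              for term, w in doc_weights.items())
    doc_norm = math.sqrt(sum(w * w for w in doc_weights.values()))
    query_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    if doc_norm == 0.0 or query_norm == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (doc_norm * query_norm)


# Hypothetical weights, chosen only to illustrate the ranking.
doc_vectors = {
    "d1": {"information": 0.5, "retrieval": 0.5},
    "d2": {"information": 0.8, "system": 0.6},
    "d3": {"intelligent": 1.0},
}
query_vector = {"information": 1.0, "retrieval": 1.0}

# Sort document ids by decreasing similarity to the query.
ranking = sorted(doc_vectors,
                 key=lambda d: cosine_similarity(doc_vectors[d], query_vector),
                 reverse=True)
print(ranking)  # ['d1', 'd2', 'd3']
```

Document d2 shares only one of the two query terms, yet it still receives a nonzero score and is ranked above d3. This is the partial matching that the Boolean model lacks.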