Bi-gram Language Modeling
Gaurav Arora
Abstract
The bi-gram language modeling approach to information retrieval has been shown to outperform the three traditional IR approaches. Apart from better retrieval performance, a bi-gram language model yields a rich resource: the bi-grams extracted from the collection, which can be used for phrase searching, diversifying search results, and suggesting query reformulations to the user. A bi-gram language model would make Xapian a more powerful library for research in information retrieval.
Additional Information
The bi-gram language model deviates from traditional ranking models. A language model treats each document as a sample of language and ranks documents by the probability that the document's language model generates the query.
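As an illustration of ranking by query generation probability in the bi-gram case, one common formulation (an assumption here, since the proposal does not spell out its estimator) conditions each query term on the term before it. The symbols Q = q_1 ... q_n for the query and D for the document are introduced only for this sketch:

```latex
P(Q \mid D) = P(q_1 \mid D) \prod_{i=2}^{n} P(q_i \mid q_{i-1}, D)
```

Documents would then be ranked in decreasing order of P(Q | D). In practice the conditional probabilities need smoothing, because most bi-grams of a query never occur in any single document.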
Given a relevant document, a query is assumed to be generated by explicitly choosing important terms and unimportant terms. The important terms are assumed to be drawn at random from the document, and the unimportant terms are assumed to be drawn at random from the full collection.
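The generation process described above can be read as a two-component mixture: with some probability a query term comes from the document, otherwise from the collection. The following is a minimal Python sketch of that idea for single terms, not the proposal's implementation and not Xapian's API. The function name `query_log_likelihood`, the mixing weight `lam`, and the toy documents are all hypothetical, and probabilities are plain relative frequencies.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection, lam=0.5):
    """Return log P(query | doc) under a document/collection mixture.

    `lam` (hypothetical parameter) is the assumed probability that a
    query term is "important", i.e. drawn from the document rather
    than from the full collection.
    """
    doc_counts = Counter(doc)
    coll_counts = Counter(term for d in collection for term in d)
    doc_len = len(doc)
    coll_len = sum(coll_counts.values())

    score = 0.0
    for term in query:
        # Relative frequency of the term in the document and in the collection.
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = coll_counts[term] / coll_len if coll_len else 0.0
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection: each document is a list of tokens.
docs = [
    "xapian is a search engine library".split(),
    "language models rank documents by query likelihood".split(),
]
query = "query likelihood".split()

# Rank document indices by decreasing log-likelihood of the query.
ranked = sorted(
    range(len(docs)),
    key=lambda i: query_log_likelihood(query, docs[i], docs),
    reverse=True,
)
```

Because the collection component gives every term seen anywhere in the collection a non-zero probability, a document that lacks a query term is penalized rather than excluded. A bi-gram variant would apply the same mixing to conditional probabilities of a term given its predecessor.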
Code samples
| File name | Size | Date submitted |
|---|---|---|
| Gaurav_Arora.tar.gz | 171.5 KB | September 02 2012 16:22 UTC |
