Document Re-ranking by Generality
Xin Yan, Dr. Xue Li, Dr. Dawei Song

This event took place on 8th March 2006 at 9:00am (09:00 GMT)
Knowledge Media Institute, Berrill Building, The Open University, Milton Keynes, United Kingdom, MK7 6AA

Document ranking is a fundamental feature of an information retrieval (IR) system. Traditional document ranking methods often measure how relevant a document is to a query based on the similarity between them. Due to the information explosion and the popularity of WWW information retrieval, however, the sufficiency of using similarity alone for ranking has been questioned by context-sensitive information retrieval. This paper argues that the factor of "generality" should be taken into account. As a complement to traditional similarity-based ranking, generality measures how general a document or a query is in describing a certain topic. For example, a query may aim to find review articles with a broader scope rather than technical papers. Moreover, given a large set of relevant documents retrieved by an IR system, the user may sometimes expect the jargon-free documents to be moved to the top of the list, so that a layman can gain an easy understanding of a topic before getting into documents with more technical terms. This is particularly the case in domains such as biomedical IR. To address this problem, we propose to re-rank the retrieved documents by generality. A novel approach for calculating document generality is developed based on quantifying the scope and semantic cohesion of a document. The generality of a query can be estimated in the same way. The documents are then re-ranked by comparing the closeness of the documents' generality scores to the query's. Experiments have been conducted on a large-scale biomedical text corpus, OHSUMED, a subset of the MEDLINE collection containing 348,566 medical journal references and 101 test queries. Our approach has demonstrated encouraging performance.



The webcast was open to 50 users

Duration: 25 minutes
