
Learning to reformulate long queries

by Neha Gupta




Institution: MIT
Department: Electrical Engineering and Computer Science
Degree: MS
Year: 2010
Keywords: Electrical Engineering and Computer Science.
Record ID: 1889717
Full text PDF: http://hdl.handle.net/1721.1/60164


Abstract

Long search queries are useful because they let users specify their search criteria in more detail. However, users often receive poor results for long queries from today's Information Retrieval systems. For a document to be returned as a relevant result, the system requires every query term to appear in it. This makes searching especially challenging for users who lack domain knowledge or have limited search experience, since they have difficulty selecting the exact keywords for their search. The goal of our research is to bridge that gap, so that the search engine can help novice users formulate queries in the vocabulary that appears in the index of the relevant documents. We present a machine learning approach that automatically summarizes long search queries, using word-specific features that capture the discriminative ability of particular words for a search task. Instead of using hand-labeled training data, we automatically evaluate a search query using a query score specific to the task. We evaluate our approach on the task of searching for related academic articles.
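
The problem described in the abstract can be illustrated with a small sketch. It is not the method of the thesis: the toy corpus, the function names (`conjunctive_search`, `query_score`, `best_subquery`), and the F1-based score are all assumptions made for illustration. Exhaustive enumeration of sub-queries stands in for the learned, feature-based summarizer. The sketch shows only that a long query can return nothing under all-terms-must-match retrieval, while a shorter sub-query chosen by an automatically computed task score retrieves the related documents.

```python
from itertools import combinations

# Hypothetical toy index: document id -> set of index terms.
DOCS = {
    "d1": {"query", "reformulation", "learning", "retrieval"},
    "d2": {"query", "retrieval", "ranking"},
    "d3": {"image", "segmentation", "learning"},
}


def conjunctive_search(terms, docs=DOCS):
    """Return ids of documents that contain every query term."""
    wanted = set(terms)
    return sorted(doc_id for doc_id, words in docs.items() if wanted <= words)


def query_score(terms, relevant, docs=DOCS):
    """Assumed task score: F1 of the retrieved set against known related docs."""
    retrieved = set(conjunctive_search(terms, docs))
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def best_subquery(long_query, relevant, size, docs=DOCS):
    """Brute-force stand-in for a learned summarizer: pick the best sub-query."""
    candidates = combinations(long_query, size)
    return max(candidates, key=lambda q: query_score(q, relevant, docs))


long_query = ["learning", "query", "reformulation", "retrieval", "users"]
related = {"d1", "d2"}  # known related articles, used instead of hand labels

print(conjunctive_search(long_query))             # [] because no doc has every term
print(best_subquery(long_query, related, size=2)) # ('query', 'retrieval')
```

Enumerating every sub-query is only feasible for short queries; a learned model that scores individual words, as the abstract describes, avoids that combinatorial cost.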