Electronic Theses and Dissertations

Matches Made in Heaven or Somewhere: Personalized Query Refinement Gold Standard Generation Using Transformers

Yogeswar Lakshmi Narayanan, University of Windsor

Date of Award

10-4-2023

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Conditional Transformers; Information Retrieval; Personalized Query Reformulation

Supervisor

Hossein Fani

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Search engines, the foremost means of information retrieval, struggle to search knowledge repositories such as the web because they are not tailored to users' differing information needs. User queries are, more often than not, under-specified or contain ambiguous terms, and so also retrieve irrelevant documents. Query refinement is the process of transforming a user's query into a refined version, without semantic drift, to enhance the relevance of search results. Prior query refiners have been benchmarked on ad-hoc web retrieval datasets under the weak assumption that users' input queries improve gradually within a search session. Existing methods have also employed additional metadata, such as session history or users' click-throughs, to enrich the query context. However, one crucial contextual cue has been overlooked: the user context. Moreover, personalized query refinement remains largely unexplored in light of recent advancements in transformers and large language models in general. To overcome these problems, (i) we contribute RePair, an open-source configurable toolkit that generates large-scale gold-standard benchmark datasets from a variety of domains for the task of query refinement. RePair takes a dataset of queries and their relevance judgements (e.g., msmarco or aol), a sparse or dense information retrieval method (e.g., bm25 or colbert), and an evaluation metric (e.g., map), and outputs refined versions of the queries, each guaranteed to improve relevance under the retrieval method in terms of the evaluation metric. RePair uses the text-to-text transfer transformer (t5) to generate gold-standard datasets for any input query set and is designed with extensibility in mind. Out of the box, we have generated and publicly shared gold-standard datasets for aol and msmarco.passage, benchmarked these datasets with state-of-the-art supervised query suggestion models, and explored t5 as an alternative model for query suggestion. (ii) We propose leveraging t5 to incorporate user context by adding a user-tailored pretext to the input sequence as a prior condition for generating personalized query reformulations in the output sequence. Our experiments on the aol query log demonstrate the effectiveness of t5 in personalized query reformulation, without loss of generality to other conditional transformers. Our codebase is publicly available at https://github.com/fani-lab/RePair.
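The selection step described in the abstract (keep only those refined queries that improve the evaluation metric under a given retriever) can be illustrated with a minimal sketch. This is not RePair's code: the function names, the toy term-overlap retriever standing in for bm25 or colbert, and the three-document corpus are all assumptions made for illustration, and the candidate refinements are hard-coded rather than generated by t5.

```python
from typing import Callable, Dict, List, Set, Tuple


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the relevant doc ids."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def overlap_retriever(corpus: Dict[str, str]) -> Callable[[str], List[str]]:
    """Hypothetical stand-in for a real retriever: rank by shared terms."""
    def retrieve(query: str) -> List[str]:
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(text.lower().split())), doc_id)
            for doc_id, text in corpus.items()
        ]
        # Higher overlap first; ties broken by doc id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for score, doc_id in scored if score > 0]
    return retrieve


def select_refinements(
    query: str,
    candidates: List[str],
    relevant: Set[str],
    retrieve: Callable[[str], List[str]],
    metric: Callable[[List[str], Set[str]], float] = average_precision,
) -> Tuple[float, List[Tuple[str, float]]]:
    """Keep candidates whose metric score strictly beats the original query."""
    baseline = metric(retrieve(query), relevant)
    improved = []
    for candidate in candidates:
        score = metric(retrieve(candidate), relevant)
        if score > baseline:
            improved.append((candidate, score))
    improved.sort(key=lambda pair: -pair[1])
    return baseline, improved


# Toy data, invented for this example only.
corpus = {
    "d1": "jaguar car dealership prices",
    "d2": "jaguar big cat habitat",
    "d3": "jaguar animal diet and habitat",
}
relevant_docs = {"d2", "d3"}
candidates = ["jaguar habitat", "jaguar prices"]  # would come from a generator

baseline, kept = select_refinements(
    "jaguar", candidates, relevant_docs, overlap_retriever(corpus)
)
print(baseline, kept)
```

In this toy setting only "jaguar habitat" survives, because it ranks both relevant documents ahead of the irrelevant one, while "jaguar prices" scores no better than the original query and is discarded.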

Recommended Citation

Lakshmi Narayanan, Yogeswar, "Matches Made in Heaven or Somewhere: Personalized Query Refinement Gold Standard Generation Using Transformers" (2023). Electronic Theses and Dissertations. 9213.
https://scholar.uwindsor.ca/etd/9213

Included in

Computer Sciences Commons
