Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Guglielmo Faggioli

Perspectives on Large Language Models for Relevance Judgment

Apr 13, 2023
Guglielmo Faggioli, Laura Dietz, Charles Clarke, Gianluca Demartini, Matthias Hagen, Claudia Hauff, Noriko Kando, Evangelos Kanoulas, Martin Potthast, Benno Stein, Henning Wachsmuth

Figure 1 for Perspectives on Large Language Models for Relevance Judgment

Figure 2 for Perspectives on Large Language Models for Relevance Judgment

Figure 3 for Perspectives on Large Language Models for Relevance Judgment

Figure 4 for Perspectives on Large Language Models for Relevance Judgment

When asked, current large language models (LLMs) like ChatGPT claim that they can assist us with relevance judgments. Many researchers think this would not lead to credible IR research. In this perspective paper, we discuss possible ways for LLMs to assist human experts along with concerns and issues that arise. We devise a human-machine collaboration spectrum that allows categorizing different relevance judgment strategies, based on how much the human relies on the machine. For the extreme point of "fully automated assessment", we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing two opposing perspectives - for and against the use of LLMs for automatic relevance judgments - and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers. We hope to start a constructive discussion within the community to avoid a stale-mate during review, where work is dammed if is uses LLMs for evaluation and dammed if it doesn't.

Via

Access Paper or Ask Questions

Query Performance Prediction for Neural IR: Are We There Yet?

Feb 20, 2023
Guglielmo Faggioli, Thibault Formal, Stefano Marchesin, Stéphane Clinchant, Nicola Ferro, Benjamin Piwowarski

Figure 1 for Query Performance Prediction for Neural IR: Are We There Yet?

Figure 2 for Query Performance Prediction for Neural IR: Are We There Yet?

Figure 3 for Query Performance Prediction for Neural IR: Are We There Yet?

Figure 4 for Query Performance Prediction for Neural IR: Are We There Yet?

Evaluation in Information Retrieval relies on post-hoc empirical procedures, which are time-consuming and expensive operations. To alleviate this, Query Performance Prediction (QPP) models have been developed to estimate the performance of a system without the need for human-made relevance judgements. Such models, usually relying on lexical features from queries and corpora, have been applied to traditional sparse IR methods - with various degrees of success. With the advent of neural IR and large Pre-trained Language Models, the retrieval paradigm has significantly shifted towards more semantic signals. In this work, we study and analyze to what extent current QPP models can predict the performance of such systems. Our experiments consider seven traditional bag-of-words and seven BERT-based IR approaches, as well as nineteen state-of-the-art QPPs evaluated on two collections, Deep Learning '19 and Robust '04. Our findings show that QPPs perform statistically significantly worse on neural IR systems. In settings where semantic signals are prominent (e.g., passage retrieval), their performance on neural models drops by as much as 10% compared to bag-of-words approaches. On top of that, in lexical-oriented scenarios, QPPs fail to predict performance for neural IR systems on those queries where they differ from traditional approaches the most.

Via

Access Paper or Ask Questions

Towards Feature Selection for Ranking and Classification Exploiting Quantum Annealers

May 09, 2022
Maurizio Ferrari Dacrema, Fabio Moroni, Riccardo Nembrini, Nicola Ferro, Guglielmo Faggioli, Paolo Cremonesi

Figure 1 for Towards Feature Selection for Ranking and Classification Exploiting Quantum Annealers

Figure 2 for Towards Feature Selection for Ranking and Classification Exploiting Quantum Annealers

Figure 3 for Towards Feature Selection for Ranking and Classification Exploiting Quantum Annealers

Figure 4 for Towards Feature Selection for Ranking and Classification Exploiting Quantum Annealers

Feature selection is a common step in many ranking, classification, or prediction tasks and serves many purposes. By removing redundant or noisy features, the accuracy of ranking or classification can be improved and the computational cost of the subsequent learning steps can be reduced. However, feature selection can be itself a computationally expensive process. While for decades confined to theoretical algorithmic papers, quantum computing is now becoming a viable tool to tackle realistic problems, in particular special-purpose solvers based on the Quantum Annealing paradigm. This paper aims to explore the feasibility of using currently available quantum computing architectures to solve some quadratic feature selection algorithms for both ranking and classification. The experimental analysis includes 15 state-of-the-art datasets. The effectiveness obtained with quantum computing hardware is comparable to that of classical solvers, indicating that quantum computers are now reliable enough to tackle interesting problems. In terms of scalability, current generation quantum computers are able to provide a limited speedup over certain classical algorithms and hybrid quantum-classical strategies show lower computational cost for problems of more than a thousand features.

* Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022)
* Source code is available on Github https://github.com/qcpolimi/SIGIR22_QuantumFeatureSelection.git

Via

Access Paper or Ask Questions