Barlacchi, Gianni
Strong and Efficient Baselines for Open Domain Conversational Question Answering
Coman, Andrei C., Barlacchi, Gianni, de Gispert, Adrià
Unlike the Open Domain Question Answering (ODQA) setting, the conversational (ODConvQA) domain has received limited attention when it comes to reevaluating baselines for both efficiency and effectiveness. In this paper, we study the State-of-the-Art (SotA) Dense Passage Retrieval (DPR) retriever and Fusion-in-Decoder (FiD) reader pipeline, and show that it significantly underperforms when applied to ODConvQA tasks due to various limitations. We then propose and evaluate strong yet simple and efficient baselines, by introducing a fast reranking component between the retriever and the reader, and by performing targeted finetuning steps. Experiments on two ODConvQA tasks, namely TopiOCQA and OR-QuAC, show that our method improves on the SotA results while reducing the reader's latency by 60%. Finally, we provide new and valuable insights into the development of challenging baselines that serve as a reference for future, more intricate approaches, including those that leverage Large Language Models (LLMs).
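The retriever, reranker, and reader stages described in this abstract can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in: `token_overlap` replaces the learned DPR and reranker scores, `read` replaces the FiD reader, and the function names and the `k` and `n` parameters are not taken from the paper. The sketch only shows the control flow, in which the reranker narrows the retrieved candidates before the reader sees them. That the reader benefits from receiving fewer passages is an assumption here, since the abstract does not say how the latency reduction is obtained.

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]


def token_overlap(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query tokens that occur in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def retrieve(query: str, corpus: List[str], scorer: Scorer, k: int) -> List[str]:
    """First stage: keep the k highest-scoring passages from the whole corpus."""
    return sorted(corpus, key=lambda p: scorer(query, p), reverse=True)[:k]


def rerank(query: str, candidates: List[str], scorer: Scorer, n: int) -> List[str]:
    """Second stage: rescore the candidates and keep only the top n for the reader."""
    return sorted(candidates, key=lambda p: scorer(query, p), reverse=True)[:n]


def read(query: str, passages: List[str]) -> str:
    """Placeholder reader: returns the best passage instead of generating an answer."""
    return passages[0] if passages else ""


def answer(query: str, corpus: List[str], k: int = 3, n: int = 1) -> str:
    """Run retrieve, then rerank, then read (hypothetical defaults for k and n)."""
    candidates = retrieve(query, corpus, token_overlap, k)
    shortlist = rerank(query, candidates, token_overlap, n)
    return read(query, shortlist)
```

In a real system the two stages would use different models, typically a cheap retriever over the full corpus and a more accurate scorer over the short candidate list; the same toy scorer is reused here only to keep the example self-contained.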
Deep Learning for Human Mobility: a Survey on Data and Models
Luca, Massimiliano, Barlacchi, Gianni, Lepri, Bruno, Pappalardo, Luca
The study of human mobility is crucial due to its impact on several aspects of our society, such as disease spreading, urban planning, well-being, pollution, and more. The proliferation of digital mobility data, such as phone records, GPS traces, and social media posts, combined with the outstanding predictive power of artificial intelligence, triggered the application of deep learning to human mobility. In particular, the literature focuses on three tasks: next-location prediction, i.e., predicting an individual's future locations; crowd flow prediction, i.e., forecasting flows in a geographic region; and trajectory generation, i.e., generating realistic individual trajectories. Existing surveys focus on single tasks, single data sources, or mechanistic and traditional machine learning approaches, while a comprehensive description of deep learning solutions is missing. This survey provides: (i) basic notions on mobility and deep learning; (ii) a review of data sources and public datasets; (iii) a description of deep learning models; and (iv) a discussion of relevant open challenges. Our survey is a guide to the leading deep learning solutions to next-location prediction, crowd flow prediction, and trajectory generation. At the same time, it helps deep learning scientists and practitioners understand the fundamental concepts and the open challenges of the study of human mobility.
Modeling Taxi Drivers' Behaviour for the Next Destination Prediction
Rossi, Alberto, Barlacchi, Gianni, Bianchini, Monica, Lepri, Bruno
Taxi destination prediction is an important task for optimizing the efficiency of electronic dispatching systems, with clear advantages for both taxi companies and customers. During periods of high demand, there is often a taxi whose current ride will end near the pickup location requested by a new customer. If an electronic dispatcher knows in advance where all taxi drivers will end their current rides, it can allocate its resources better by identifying which taxi to assign to each call. Moreover, automatic systems for taxi mobility monitoring collect data that, integrated with other information sources, can help in understanding daytime human mobility routines. In this paper, we introduce a novel approach to the taxi destination prediction problem, based on Recurrent Neural Networks (RNNs) applied in a regression setting. The RNNs are trained on the individual drivers' history and on geographical information (i.e., points of interest), using only the starting point of each ride (with no knowledge of the whole trajectory). The proposed approach was tested on the dataset of the ECML/PKDD Discovery Challenge 2015, based on the city of Porto, where it obtained better results than the competition winner while using less information, and on Manhattan and San Francisco datasets.
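Because the approach treats destination prediction as regression, each prediction is a coordinate pair that can be compared with the true drop-off point. The abstract does not name the error measure, so the following Python sketch assumes great-circle (haversine) distance, a common choice for scoring predicted coordinates. The function name, the spherical-Earth radius, and the sample coordinates are illustrative and not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative values only: a predicted and a true destination, both in Porto.
predicted = (41.1496, -8.6110)
actual = (41.1579, -8.6291)
error_km = haversine_km(*predicted, *actual)
```

Averaging `error_km` over all test rides would give a single score for comparing models under this assumed metric.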