Goto

Collaborating Authors

 Information Retrieval


Ta'keed: The First Generative Fact-Checking System for Arabic Claims

arXiv.org Artificial Intelligence

This paper introduces Ta'keed, an explainable Arabic automatic fact-checking system. While existing research often focuses on classifying claims as "True" or "False," there is a limited exploration of generating explanations for claim credibility, particularly in Arabic. Ta'keed addresses this gap by assessing claim truthfulness based on retrieved snippets, utilizing two main components: information retrieval and LLM-based claim verification. We compiled the ArFactEx, a testing gold-labelled dataset with manually justified references, to evaluate the system. The initial model achieved a promising F1 score of 0.72 in the classification task. Meanwhile, the system's generated explanations are compared with gold-standard explanations syntactically and semantically. The study recommends evaluating using semantic similarities, resulting in an average cosine similarity score of 0.76. Additionally, we explored the impact of varying snippet quantities on claim classification accuracy, revealing a potential correlation, with the model using the top seven hits outperforming others with an F1 score of 0.77.


SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval

arXiv.org Artificial Intelligence

Multi-modal information retrieval (MMIR) is a rapidly evolving field, where significant progress, particularly in image-text pairing, has been made through advanced representation learning and cross-modality alignment research. However, current benchmarks for evaluating MMIR performance in image-text pairing within the scientific domain show a notable gap, where chart and table images described in scholarly language usually do not play a significant role. To bridge this gap, we develop a specialised scientific MMIR (SciMMIR) benchmark by leveraging open-access paper collections to extract data relevant to the scientific domain. This benchmark comprises 530K meticulously curated image-text pairs, extracted from figures and tables with detailed captions in scientific documents. We further annotate the image-text pairs with two-level subset-subcategory hierarchy annotations to facilitate a more comprehensive evaluation of the baselines. We conducted zero-shot and fine-tuning evaluations on prominent multi-modal image-captioning and visual language models, such as CLIP and BLIP. Our analysis offers critical insights for MMIR in the scientific domain, including the impact of pre-training and fine-tuning settings and the influence of the visual and textual encoders. All our data and checkpoints are publicly available at https://github.com/Wusiwei0410/SciMMIR.


DREditor: An Time-efficient Approach for Building a Domain-specific Dense Retrieval Model

arXiv.org Artificial Intelligence

Deploying dense retrieval models efficiently is becoming increasingly important across various industries. This is especially true for enterprise search services, where customizing search engines to meet the time demands of different enterprises in different domains is crucial. Motivated by this, we develop a time-efficient approach called DREditor to edit the matching rule of an off-the-shelf dense retrieval model to suit a specific domain. This is achieved by directly calibrating the output embeddings of the model using an efficient and effective linear mapping. This mapping is powered by an edit operator that is obtained by solving a specially constructed least squares problem. Compared to implicit rule modification via long-time finetuning, our experimental results show that DREditor provides significant advantages on different domain-specific datasets, dataset sources, retrieval models, and computing devices. It consistently enhances time efficiency by 100-300 times while maintaining comparable or even superior retrieval performance. In a broader context, we take the first step to introduce a novel embedding calibration approach for the retrieval task, filling the technical blank in the current field of embedding calibration. This approach also paves the way for building domain-specific dense retrieval models efficiently and inexpensively.


Comparing Human-Centered Language Modeling: Is it Better to Model Groups, Individual Traits, or Both?

arXiv.org Artificial Intelligence

Natural language processing has made progress in incorporating human context into its models, but whether it is more effective to use group-wise attributes (e.g., over-45-year-olds) or model individuals remains open. Group attributes are technically easier but coarse: not all 45-year-olds write the same way. In contrast, modeling individuals captures the complexity of each person's identity. It allows for a more personalized representation, but we may have to model an infinite number of users and require data that may be impossible to get. We compare modeling human context via group attributes, individual users, and combined approaches. Combining group and individual features significantly benefits user-level regression tasks like age estimation or personality assessment from a user's documents. Modeling individual users significantly improves the performance of single document-level classification tasks like stance and topic detection. We also find that individual-user modeling does well even without user's historical data.


An Empirical Study on Compliance with Ranking Transparency in the Software Documentation of EU Online Platforms

arXiv.org Artificial Intelligence

Compliance with the European Union's Platform-to-Business (P2B) Regulation is challenging for online platforms, and assessing their compliance can be difficult for public authorities. This is partly due to the lack of automated tools for assessing the information (e.g., software documentation) platforms provide concerning ranking transparency. Our study tackles this issue in two ways. First, we empirically evaluate the compliance of six major platforms (Amazon, Bing, Booking, Google, Tripadvisor, and Yahoo), revealing substantial differences in their documentation. Second, we introduce and test automated compliance assessment tools based on ChatGPT and information retrieval technology. These tools are evaluated against human judgments, showing promising results as reliable proxies for compliance assessments. Our findings could help enhance regulatory compliance and align with the United Nations Sustainable Development Goal 10.3, which seeks to reduce inequality, including business disparities, on these platforms.


LearnedWMP: Workload Memory Prediction Using Distribution of Query Templates

arXiv.org Artificial Intelligence

In a modern DBMS, working memory is frequently the limiting factor when processing in-memory analytic query operations such as joins, sorting, and aggregation. Existing resource estimation approaches for a DBMS estimate the resource consumption of a query by computing an estimate of each individual database operator in the query execution plan. Such an approach is slow and error-prone as it relies upon simplifying assumptions, such as uniformity and independence of the underlying data. Additionally, the existing approach focuses on individual queries separately and does not factor in other queries in the workload that may be executed concurrently. In this research, we are interested in query performance optimization under concurrent execution of a batch of queries (a workload). Specifically, we focus on predicting the memory demand for a workload rather than providing separate estimates for each query within it. We introduce the problem of workload memory prediction and formalize it as a distribution regression problem. We propose Learned Workload Memory Prediction (LearnedWMP) to improve and simplify estimating the working memory demands of workloads. Through a comprehensive experimental evaluation, we show that LearnedWMP reduces the memory estimation error of the state-of-the-practice method by up to 47.6%. Compared to an alternative single-query model, during training and inferencing, the LearnedWMP model and its variants were 3x to 10x faster. Moreover, LearnedWMP-based models were at least 50% smaller in most cases. Overall, the results demonstrate the advantages of the LearnedWMP approach and its potential for a broader impact on query performance optimization.


Personality Trait Inference Via Mobile Phone Sensors: A Machine Learning Approach

arXiv.org Artificial Intelligence

This study provides evidence that personality can be reliably predicted from activity data collected through mobile phone sensors. Employing a set of well informed indicators calculable from accelerometer records and movement patterns, we were able to predict users' personality up to a 0.78 F1 score on a two class problem. Given the fast growing number of data collected from mobile phones, our novel personality indicators open the door to exciting avenues for future research in social sciences. Our results reveal distinct behavioral patterns that proved to be differentially predictive of big five personality traits. They potentially enable cost effective, questionnaire free investigation of personality related questions at an unprecedented scale. We show how a combination of rich behavioral data obtained with smartphone sensing and the use of machine learning techniques can help to advance personality research and can inform both practitioners and researchers about the different behavioral patterns of personality. These findings have practical implications for organizations harnessing mobile sensor data for personality assessment, guiding the refinement of more precise and efficient prediction models in the future.


Navigating the Thin Line: Examining User Behavior in Search to Detect Engagement and Backfire Effects

arXiv.org Artificial Intelligence

Opinionated users often seek information that aligns with their preexisting beliefs while dismissing contradictory evidence due to confirmation bias. This conduct hinders their ability to consider alternative stances when searching the web. Despite this, few studies have analyzed how the diversification of search results on disputed topics influences the search behavior of highly opinionated users. To this end, we present a preregistered user study (n = 257) investigating whether different levels (low and high) of bias metrics and search results presentation (with or without AI-predicted stances labels) can affect the stance diversity consumption and search behavior of opinionated users on three debated topics (i.e., atheism, intellectual property rights, and school uniforms). Our results show that exposing participants to (counter-attitudinally) biased search results increases their consumption of attitude-opposing content, but we also found that bias was associated with a trend toward overall fewer interactions within the search page. We also found that 19% of users interacted with queries and search pages but did not select any search results. When we removed these participants in a post-hoc analysis, we found that stance labels increased the diversity of stances consumed by users, particularly when the search results were biased. Our findings highlight the need for future research to explore distinct search scenario settings to gain insight into opinionated users' behavior.


An Information Retrieval and Extraction Tool for Covid-19 Related Papers

arXiv.org Artificial Intelligence

Background: The COVID-19 pandemic has caused severe impacts on health systems worldwide. Its critical nature and the increased interest of individuals and organizations to develop countermeasures to the problem has led to a surge of new studies in scientific journals. Objetive: We sought to develop a tool that incorporates, in a novel way, aspects of Information Retrieval (IR) and Extraction (IE) applied to the COVID-19 Open Research Dataset (CORD-19). The main focus of this paper is to provide researchers with a better search tool for COVID-19 related papers, helping them find reference papers and hightlight relevant entities in text. Method: We applied Latent Dirichlet Allocation (LDA) to model, based on research aspects, the topics of all English abstracts in CORD-19. Relevant named entities of each abstract were extracted and linked to the corresponding UMLS concept. Regular expressions and the K-Nearest Neighbors algorithm were used to rank relevant papers. Results: Our tool has shown the potential to assist researchers by automating a topic-based search of CORD-19 papers. Nonetheless, we identified that more fine-tuned topic modeling parameters and increased accuracy of the research aspect classifier model could lead to a more accurate and reliable tool. Conclusion: We emphasize the need of new automated tools to help researchers find relevant COVID-19 documents, in addition to automatically extracting useful information contained in them. Our work suggests that combining different algorithms and models could lead to new ways of browsing COVID-19 paper data.


Efficient approximation of Earth Mover's Distance Based on Nearest Neighbor Search

arXiv.org Artificial Intelligence

Earth Mover's Distance (EMD) is an important similarity measure between two distributions, used in computer vision and many other application domains. However, its exact calculation is computationally and memory intensive, which hinders its scalability and applicability for large-scale problems. Various approximate EMD algorithms have been proposed to reduce computational costs, but they suffer lower accuracy and may require additional memory usage or manual parameter tuning. In this paper, we present a novel approach, NNS-EMD, to approximate EMD using Nearest Neighbor Search (NNS), in order to achieve high accuracy, low time complexity, and high memory efficiency. The NNS operation reduces the number of data points compared in each NNS iteration and offers opportunities for parallel processing. We further accelerate NNS-EMD via vectorization on GPU, which is especially beneficial for large datasets. We compare NNS-EMD with both the exact EMD and state-of-the-art approximate EMD algorithms on image classification and retrieval tasks. We also apply NNS-EMD to calculate transport mapping and realize color transfer between images. NNS-EMD can be 44x to 135x faster than the exact EMD implementation, and achieves superior accuracy, speedup, and memory efficiency over existing approximate EMD methods.