A Study into Investigating Temporal Robustness of LLMs
Wallat, Jonas, Abdallah, Abdelrahman, Jatowt, Adam, Anand, Avishek
Large Language Models (LLMs) encapsulate a surprising amount of factual world knowledge. However, their performance on temporal questions and historical knowledge is limited because they often cannot understand temporal scope and orientation, or neglect the temporal aspect altogether. In this study, we aim to measure precisely how robust LLMs are at question answering, based on their ability to process temporal information and perform tasks requiring temporal reasoning and temporal factual knowledge. Specifically, we design eight time-sensitive robustness tests for factual information to check the sensitivity of six popular LLMs in the zero-shot setting. Overall, we find that LLMs lack temporal robustness and are especially sensitive to temporal reformulations and to temporal references at different granularities. We show how a selection of these eight tests can be applied automatically to judge a model's temporal robustness for user questions on the fly. Finally, we apply the findings of this study to improve temporal QA performance by up to 55 percent.
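The granularity test can be pictured concretely. Below is a minimal sketch, assuming a hypothetical `ask_llm` call and a made-up probe template, of how one such robustness test might rephrase a time-sensitive question at year, month, and day granularity and flag a model whose answers disagree; it illustrates the idea, not the paper's exact eight tests.

```python
from datetime import date

def granularity_variants(template: str, d: date) -> list[str]:
    """Render the same question at year, month, and day granularity."""
    return [
        template.format(when=f"in {d.year}"),
        template.format(when=f"in {d.strftime('%B %Y')}"),
        template.format(when=f"on {d.strftime('%B %d, %Y')}"),
    ]

def is_temporally_robust(template: str, d: date, ask_llm) -> bool:
    """A model passes if all granularity reformulations yield one answer."""
    answers = {ask_llm(q).strip().lower() for q in granularity_variants(template, d)}
    return len(answers) == 1

# Illustrative probe (template and date are made up):
# is_temporally_robust("Who was the UK prime minister {when}?",
#                      date(2019, 8, 1), ask_llm)
```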
Extending Dense Passage Retrieval with Temporal Information
Abdallah, Abdelrahman, Piryani, Bhawna, Wallat, Jonas, Anand, Avishek, Jatowt, Adam
Temporal awareness is crucial in many information retrieval tasks, particularly in scenarios where the relevance of documents depends on their alignment with the query's temporal context. Traditional retrieval methods such as BM25 and Dense Passage Retrieval (DPR) excel at capturing lexical and semantic relevance but fall short on time-sensitive queries. To bridge this gap, we introduce a temporal retrieval model that integrates explicit temporal signals by incorporating query timestamps and document dates into the representation space. Our approach ensures that retrieved passages are not only topically relevant but also temporally aligned with user intent. We evaluate our approach on two large-scale benchmark datasets, ArchivalQA and ChroniclingAmericaQA, achieving substantial performance gains over standard retrieval baselines. In particular, our model improves Top-1 retrieval accuracy by 6.63% and NDCG@10 by 3.79% on ArchivalQA, while yielding a 9.56% boost in Top-1 retrieval accuracy and 4.68% in NDCG@10 on ChroniclingAmericaQA. Additionally, we introduce a time-sensitive negative sampling strategy, which refines the model's ability to distinguish between temporally relevant and irrelevant documents during training. Our findings highlight the importance of explicitly modeling time in retrieval systems and set a new standard for handling temporally grounded queries.
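As a rough illustration of the core idea, the sketch below prepends the query timestamp and document date to the text before bi-encoder encoding, so that inner-product scores can reflect temporal alignment. The `encode_query`/`encode_passage` callables and the bracketed date format are assumptions for illustration, not the paper's trained model.

```python
import numpy as np

def temporal_text(text: str, when: str) -> str:
    """Prefix an explicit date signal so the encoder can condition on it."""
    return f"[{when}] {text}"

def rank_passages(query, q_time, passages, encode_query, encode_passage):
    """Rank (text, date) passages by inner product with the query vector.

    encode_query/encode_passage: any DPR-style text -> 1-D vector encoders.
    """
    q_vec = encode_query(temporal_text(query, q_time))
    p_vecs = np.stack([encode_passage(temporal_text(t, d)) for t, d in passages])
    return np.argsort(-(p_vecs @ q_vec))  # best-first passage indices
```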
Correctness is not Faithfulness in RAG Attributions
Wallat, Jonas, Heuss, Maria, de Rijke, Maarten, Anand, Avishek
Retrieving relevant context is a common approach to reduce hallucinations and enhance answer reliability. Explicitly citing source documents allows users to verify generated responses and increases trust. Prior work largely evaluates citation correctness, i.e., whether cited documents support the corresponding statements. But citation correctness alone is insufficient: to establish trust in attributed answers, we must examine both citation correctness and citation faithfulness. In this work, we first disentangle the notions of citation correctness and faithfulness, which have been applied inconsistently in previous studies. Faithfulness ensures that the model's reliance on cited documents is genuine, reflecting actual reference use rather than superficial alignment with prior beliefs, which we call post-rationalization. We design an experiment that reveals the prevalent issue of post-rationalization, which undermines reliable attribution and may result in misplaced trust. Our findings suggest that current attributed answers often lack citation faithfulness (up to 57 percent of the citations), highlighting the need to evaluate both correctness and faithfulness for trustworthy attribution in language models.
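One way to picture a faithfulness test is a counterfactual ablation: if removing a cited document leaves the answer unchanged, the citation likely reflects prior beliefs rather than genuine reference use. The sketch below assumes a hypothetical `generate(question, docs)` RAG call and is in the spirit of the paper's experiment, not its exact protocol.

```python
def citation_is_faithful(question, docs, cited_idx, generate) -> bool:
    """Test whether the attributed answer actually depends on its citation."""
    full_answer = generate(question, docs)
    ablated_docs = [d for i, d in enumerate(docs) if i != cited_idx]
    ablated_answer = generate(question, ablated_docs)
    # If the answer survives removal of the document it cites, the model
    # likely answered from parametric memory and attributed post hoc.
    return full_answer.strip().lower() != ablated_answer.strip().lower()
```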
Temporal Blind Spots in Large Language Models
Wallat, Jonas, Jatowt, Adam, Anand, Avishek
Large language models (LLMs) have recently gained significant attention due to their unparalleled ability to perform various natural language processing tasks. These models, benefiting from their advanced natural language understanding capabilities, have demonstrated impressive zero-shot performance. However, the pre-training data utilized in LLMs is often confined to a specific corpus, resulting in inherent limitations of freshness and temporal scope. Consequently, this raises concerns about the effectiveness of LLMs for tasks involving temporal intents. In this study, we aim to investigate the underlying limitations of general-purpose LLMs when deployed for tasks that require temporal understanding. We pay particular attention to handling factual temporal knowledge through three popular temporal QA datasets. Specifically, we observe low performance on detailed questions about the past and, surprisingly, on rather new information. In manual and automatic testing, we find multiple temporal errors and characterize the conditions under which QA performance deteriorates. Our analysis contributes to understanding LLM limitations and offers valuable insights into developing future models that can better cater to the demands of temporally oriented tasks. The code is available at https://github.com/jwallat/temporalblindspots.
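A simple way to surface such blind spots is to bucket QA accuracy by the year a question refers to, exposing weak performance on detailed past events and on information newer than the pre-training cutoff. The sketch below is a hypothetical diagnostic in that spirit; the field names `year`, `gold`, and `pred` are illustrative.

```python
from collections import defaultdict

def accuracy_by_year(examples):
    """Exact-match QA accuracy per referenced year (illustrative fields)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        totals[ex["year"]] += 1
        hits[ex["year"]] += int(ex["pred"].strip().lower()
                                == ex["gold"].strip().lower())
    return {year: hits[year] / totals[year] for year in sorted(totals)}
```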
GeneMask: Fast Pretraining of Gene Sequences to Enable Few-Shot Learning
Roy, Soumyadeep, Wallat, Jonas, Sundaram, Sowmya S, Nejdl, Wolfgang, Ganguly, Niloy
Large-scale language models such as DNABert and LOGO aim to learn optimal gene representations and are trained on the entire Human Reference Genome. However, standard tokenization schemes involve a simple sliding window of tokens, such as k-mers, that does not leverage any gene-based semantics and may thus lead to (trivial) masking of easily predictable sequences and, subsequently, inefficient Masked Language Modeling (MLM) training. Therefore, we propose a novel masking algorithm, GeneMask, for MLM training of gene sequences: we randomly identify positions in a gene sequence as mask centers and locally select the span around each mask center with the highest Normalized Pointwise Mutual Information (NPMI) to mask. We observe that in the absence of human-understandable semantics in the genomics domain (in contrast, semantic units like words and phrases are inherently available in NLP), GeneMask-based models substantially outperform the SOTA models (DNABert and LOGO) on four benchmark gene sequence classification datasets in five few-shot settings (10 to 1000-shot). More significantly, the GeneMask-based DNABert model is trained for less than one-tenth of the number of epochs of the original SOTA model. We also observe a strong correlation between top-ranked PMI tokens and conserved DNA sequence motifs, which may indicate the incorporation of latent genomic information. The code (including trained models) and datasets are publicly available at https://github.com/roysoumya/GeneMask.
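The masking step can be sketched compactly: precompute NPMI for adjacent token pairs from corpus statistics, pick random mask centers, and mask the nearby pair with the highest NPMI. The bigram statistics and window size below are toy simplifications of my own; real GeneMask operates on k-mer vocabularies at corpus scale.

```python
import math
import random
from collections import Counter

def build_npmi(corpus_tokens):
    """Precompute NPMI for adjacent token pairs from corpus counts."""
    toks = Counter(corpus_tokens)
    pairs = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    n_t, n_p = sum(toks.values()), sum(pairs.values())
    npmi = {}
    for (x, y), c in pairs.items():
        p_xy = c / n_p
        pmi = math.log(p_xy / ((toks[x] / n_t) * (toks[y] / n_t)))
        npmi[(x, y)] = pmi / -math.log(p_xy)  # normalize to [-1, 1]
    return npmi

def select_mask_spans(tokens, npmi, n_centers=2, window=2):
    """Pick random centers; mask the nearby pair with the highest NPMI."""
    spans = []
    for center in random.sample(range(len(tokens) - 1), n_centers):
        lo, hi = max(0, center - window), min(len(tokens) - 1, center + window)
        cands = [i for i in range(lo, hi) if (tokens[i], tokens[i + 1]) in npmi]
        if cands:
            best = max(cands, key=lambda i: npmi[(tokens[i], tokens[i + 1])])
            spans.append((best, best + 2))  # end-exclusive span to mask
    return spans
```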
The Effect of Masking Strategies on Knowledge Retention by Language Models
Wallat, Jonas, Zhang, Tianyi, Anand, Avishek
Language models retain a significant amount of world knowledge from their pre-training stage. This allows knowledgeable models to be applied to knowledge-intensive tasks prevalent in information retrieval, such as ranking or question answering. Understanding how and which factual information is acquired by our models is necessary to build responsible models. However, limited work has been done to understand the effect of pre-training tasks on the amount of knowledge captured and forgotten by language models during pre-training. Building a better understanding of knowledge acquisition is the goal of this paper. Therefore, we utilize a selection of pre-training tasks to infuse knowledge into our model. In the following steps, we test the model's knowledge retention by measuring its ability to answer factual questions. Our experiments show that masking entities and principled masking of correlated spans based on pointwise mutual information lead to more factual knowledge being retained than masking random tokens. Our findings demonstrate that, like the ability to perform a task, the (factual) knowledge acquired during training on a task is forgotten when the model is subsequently trained to perform another task (catastrophic forgetting), and we show how to prevent this phenomenon. To foster reproducibility, the code and the data used in this paper are openly available.
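For concreteness, the sketch below contrasts standard random-token masking with whole-entity masking, one of the strategies compared here. The entity spans are assumed to come from any off-the-shelf NER tagger; the `[MASK]` placement mirrors BERT-style MLM and is an illustration, not the paper's training pipeline.

```python
import random

MASK = "[MASK]"

def random_token_mask(tokens, p=0.15):
    """Standard MLM baseline: mask each token independently."""
    return [MASK if random.random() < p else t for t in tokens]

def entity_mask(tokens, entity_spans, p=0.5):
    """Mask whole (start, end) entity spans with probability p each."""
    out = list(tokens)
    for start, end in entity_spans:
        if random.random() < p:
            out[start:end] = [MASK] * (end - start)
    return out
```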
A Review of the Role of Causality in Developing Trustworthy AI Systems
Ganguly, Niloy, Fazlija, Dren, Badar, Maryam, Fisichella, Marco, Sikdar, Sandipan, Schrader, Johanna, Wallat, Jonas, Rudra, Koustav, Koubarakis, Manolis, Patro, Gourab K., Amri, Wadhah Zai El, Nejdl, Wolfgang
Current AI systems are often brittle and unable to adapt to new domains, can treat individuals or subgroups unfairly, and have limited ability to explain their actions or recommendations [197, 235], reducing the trust of human users [118]. In response, a new area of research, trustworthy AI, has recently received much attention from policymakers and other regulatory organizations. The resulting guidelines (e.g., [184, 186, 187]), introduced to increase trust in AI systems, make developing trustworthy AI not only a technical (research) and social endeavor but also an organizational and legal obligation. In this paper, we set out to demonstrate, through an extensive survey, that causal modeling and reasoning is an emerging and very useful tool for enabling current AI systems to become trustworthy. Causality is the science of reasoning about causes and effects. Cause-and-effect relationships are central to how we make sense of the world around us, how we act upon it, and how we respond to changes in our environment. In AI, research in causality was pioneered by Turing Award winner Judea Pearl, notably in his seminal 1995 paper [194]. Since then, many researchers have contributed to the development of a solid mathematical basis for causality; see, for example, the books [79, 196, 201], the survey [90], and the seminal papers [197, 235].