Model-free reinforcement learning with noisy actions for automated experimental control in optics
Richtmann, Lea, Schmiesing, Viktoria-S., Wilken, Dennis, Heine, Jan, Tranter, Aaron, Anand, Avishek, Osborne, Tobias J., Heurs, Michèle
Experimental control requires substantial manual effort and non-trivial decisions for precise adjustments. Here, we study automatic experimental alignment for coupling laser light into an optical fiber using reinforcement learning (RL). We face several real-world challenges, such as time-consuming training, partial observability, and noisy actions due to imprecision in the mirror steering motors. We show that we can overcome these challenges: to save time, we use a virtual testbed to tune our environment for dealing with partial observability, and we use relatively sample-efficient model-free RL algorithms such as Soft Actor-Critic (SAC) and Truncated Quantile Critics (TQC). Furthermore, by training entirely on the experiment, the agent learns directly to handle the noise present. In our extensive experimentation, we show that we are able to achieve 90% coupling efficiency, showcasing the effectiveness of our proposed approaches. We reach this efficiency, which is comparable to that of a human expert, without additional feedback loops, despite the motors' inaccuracies. Our result is an example of the readiness of RL for real-world tasks. We consider RL a promising tool for reducing the workload in labs.
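For readers unfamiliar with the setup, the following is a minimal sketch, not the authors' implementation, of how such a noisy alignment task could be framed as a Gymnasium environment and trained with SAC (or TQC from sb3-contrib). The toy coupling model, noise level, and reward are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code): a toy fiber-coupling environment with
# noisy motor actions, trained with SAC; the coupling model, noise level, and
# reward are illustrative assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC
# from sb3_contrib import TQC  # drop-in alternative to SAC


class ToyFiberCouplingEnv(gym.Env):
    """Four motorized mirror axes steer the beam; reward is a synthetic coupling efficiency."""

    def __init__(self, noise_std=0.05, max_steps=100):
        super().__init__()
        self.action_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(5,), dtype=np.float32)
        self.noise_std = noise_std
        self.max_steps = max_steps

    def _coupling(self):
        # Toy stand-in for the photodiode signal: Gaussian in the misalignment.
        return float(np.exp(-np.sum(self.misalignment ** 2)))

    def _obs(self):
        # Partial observability: only motor positions and the current power reading.
        return np.concatenate([self.misalignment, [self._coupling()]]).astype(np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.misalignment = self.np_random.uniform(-1.5, 1.5, size=4).astype(np.float32)
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        # Noisy actuation: the executed motor step deviates from the commanded one.
        executed = action + self.np_random.normal(0.0, self.noise_std, size=4)
        self.misalignment = (self.misalignment - 0.1 * executed).astype(np.float32)
        self.steps += 1
        reward = self._coupling()
        terminated = reward > 0.9
        truncated = self.steps >= self.max_steps
        return self._obs(), reward, terminated, truncated, {}


agent = SAC("MlpPolicy", ToyFiberCouplingEnv(), verbose=0)
agent.learn(total_timesteps=10_000)
```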
QuanTemp: A real-world open-domain benchmark for fact-checking numerical claims
V, Venktesh, Anand, Abhijit, Anand, Avishek, Setty, Vinay
Automated fact-checking has gained immense interest as a way to tackle the growing misinformation in the digital era. Existing systems primarily focus on synthetic claims based on Wikipedia, though noteworthy progress has also been made on real-world claims. In this work, we release QuanTemp, a diverse, multi-domain dataset focused exclusively on numerical claims, encompassing temporal, statistical, and diverse aspects, with fine-grained metadata and an evidence collection without leakage. This addresses the challenge of verifying real-world numerical claims, which are complex and often lack precise information, a problem not addressed by existing works that mainly focus on synthetic claims. We evaluate and quantify the limitations of existing solutions for the task of verifying numerical claims, including claim decomposition-based methods and models with numerical understanding; our best baseline achieves a macro-F1 of 58.32. This demonstrates that QuanTemp serves as a challenging evaluation set for numerical claim verification.
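As a small, hypothetical illustration of the reported metric (not the benchmark's official evaluation scripts), macro-F1 over verdict classes can be computed with scikit-learn; the three-way label set used here is an assumption.

```python
# Hypothetical sketch of scoring a numerical-claim verifier with macro-F1
# (not QuanTemp's official evaluation code; the label set is an assumption).
from sklearn.metrics import classification_report, f1_score

LABELS = ["True", "False", "Conflicting"]   # assumed verdict classes

gold = ["True", "False", "Conflicting", "False", "True"]
pred = ["True", "False", "False", "False", "Conflicting"]

macro_f1 = f1_score(gold, pred, labels=LABELS, average="macro")
print(f"macro-F1: {macro_f1:.4f}")
print(classification_report(gold, pred, labels=LABELS, zero_division=0))
```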
Temporal Blind Spots in Large Language Models
Wallat, Jonas, Jatowt, Adam, Anand, Avishek
Large language models (LLMs) have recently gained significant attention due to their unparalleled ability to perform various natural language processing tasks. These models, benefiting from their advanced natural language understanding capabilities, have demonstrated impressive zero-shot performance. However, the pre-training data utilized in LLMs is often confined to a specific corpus, resulting in inherent freshness and temporal scope limitations. Consequently, this raises concerns regarding the effectiveness of LLMs for tasks involving temporal intents. In this study, we investigate the underlying limitations of general-purpose LLMs when deployed for tasks that require temporal understanding. We pay particular attention to the handling of factual temporal knowledge through three popular temporal QA datasets. Specifically, we observe low performance on detailed questions about the past and, surprisingly, also on rather recent information. In manual and automatic testing, we find multiple temporal errors and characterize the conditions under which QA performance deteriorates. Our analysis contributes to understanding LLM limitations and offers valuable insights into developing future models that can better cater to the demands of temporally oriented tasks. The code is available at https://github.com/jwallat/temporalblindspots.
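A small sketch of the kind of analysis described, bucketing QA accuracy by the year a question refers to; the LLM call is left as a stub so the example is not tied to any particular model API, and the example schema is an assumption.

```python
# Sketch: measure exact-match accuracy of an LLM on temporal questions, bucketed
# by the year the question refers to. `ask_llm` is a placeholder, not a real API.
from collections import defaultdict

def ask_llm(question: str) -> str:
    # Placeholder: replace with a call to the model under test.
    raise NotImplementedError

def exact_match(prediction: str, gold: str) -> bool:
    return prediction.strip().lower() == gold.strip().lower()

def accuracy_by_year(examples):
    """examples: iterable of dicts with 'question', 'answer', 'year' keys (assumed schema)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        totals[ex["year"]] += 1
        if exact_match(ask_llm(ex["question"]), ex["answer"]):
            hits[ex["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}
```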
In-Context Ability Transfer for Question Decomposition in Complex QA
V, Venktesh, Bhattacharya, Sourangshu, Anand, Avishek
Answering complex questions is a challenging task that requires question decomposition and multi-step reasoning to arrive at a solution. While existing supervised and unsupervised approaches are specialized to a certain task and involve training, recently proposed prompt-based approaches offer generalizable solutions for tackling a wide variety of complex question-answering (QA) tasks. However, existing prompt-based approaches that are effective for complex QA tasks involve expensive hand annotations from experts in the form of rationales and do not generalize to newer complex QA scenarios and tasks. We propose ICAT (In-Context Ability Transfer), which induces reasoning capabilities in LLMs without any LLM fine-tuning or manual annotation of in-context samples. We transfer the ability to decompose complex questions into simpler questions, or to generate step-by-step rationales, to LLMs by carefully selecting from available data sources of related tasks. We also propose an automated uncertainty-aware exemplar selection approach for choosing examples from these transfer data sources. Finally, we conduct large-scale experiments on a variety of complex QA tasks involving numerical reasoning, compositional complex QA, and heterogeneous complex QA, all of which require decomposed reasoning. We show that ICAT convincingly outperforms existing prompt-based solutions without involving any model training, showcasing the benefits of re-using existing abilities.
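One plausible reading of uncertainty-aware exemplar selection is sketched below: candidate exemplars from related tasks are ranked by similarity to the test question minus an uncertainty penalty. The embedder, the label-distribution scorer, and the weighting are placeholders, not the paper's exact method.

```python
# Sketch of uncertainty-aware exemplar selection for in-context prompting.
# The embedder and uncertainty scorer are placeholders; this is not the ICAT method verbatim.
import math
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder sentence encoder (any off-the-shelf embedding model).
    raise NotImplementedError

def label_distribution(candidate: str) -> list:
    # Placeholder: probabilities a scorer assigns to the candidate exemplar being valid/invalid.
    raise NotImplementedError

def entropy(probs):
    return -sum(p * math.log(p + 1e-12) for p in probs)

def select_exemplars(question: str, candidates: list, k: int = 4, alpha: float = 0.5):
    """Rank candidate exemplars by similarity to the question minus an uncertainty penalty."""
    q = embed(question)
    scored = []
    for cand in candidates:
        c = embed(cand)
        sim = float(np.dot(q, c) / (np.linalg.norm(q) * np.linalg.norm(c)))
        scored.append((sim - alpha * entropy(label_distribution(cand)), cand))
    return [cand for _, cand in sorted(scored, reverse=True)[:k]]
```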
DINE: Dimensional Interpretability of Node Embeddings
Piaggesi, Simone, Khosla, Megha, Panisson, André, Anand, Avishek
Graphs are ubiquitous due to their flexibility in representing social and technological systems as networks of interacting elements. Graph representation learning methods, such as node embeddings, are powerful approaches for mapping nodes into a latent vector space, allowing their use in various graph tasks. Despite their success, only a few studies have focused on explaining node embeddings locally. Moreover, global explanations of node embeddings remain unexplored, limiting their interpretability and debugging potential. We address this gap by developing human-understandable explanations for the dimensions of node embeddings. To this end, we first develop new metrics that measure the global interpretability of embedding vectors based on the marginal contribution of the embedding dimensions to predicting graph structure. We say that an embedding dimension is more interpretable if it can be faithfully mapped to an understandable sub-structure in the input graph, such as community structure. Having observed that standard node embeddings have low interpretability, we then introduce DINE (Dimension-based Interpretable Node Embedding), a novel approach that can retrofit existing node embeddings, making them more interpretable without sacrificing their task performance. We conduct extensive experiments on synthetic and real-world graphs and show that we can simultaneously learn highly interpretable node embeddings with effective performance in link prediction.
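A rough sketch of the marginal-contribution idea described above: train a link predictor on edge features built from node embeddings, then zero out one dimension at a time and record the drop in AUC. The data, predictor, and edge-feature construction here are synthetic placeholders, not the DINE implementation.

```python
# Sketch: per-dimension interpretability proxy for node embeddings, measured as the
# drop in link-prediction AUC when that dimension is zeroed out (synthetic data; not DINE itself).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_nodes, dim, n_pairs = 200, 16, 1000
emb = rng.normal(size=(n_nodes, dim))                    # stand-in node embeddings
pairs = rng.integers(0, n_nodes, size=(n_pairs, 2))      # random node pairs
labels = rng.integers(0, 2, size=n_pairs)                # stand-in edge / non-edge labels

def edge_features(embeddings):
    return embeddings[pairs[:, 0]] * embeddings[pairs[:, 1]]   # Hadamard product

clf = LogisticRegression(max_iter=1000).fit(edge_features(emb), labels)
base_auc = roc_auc_score(labels, clf.predict_proba(edge_features(emb))[:, 1])

marginal_contribution = []
for d in range(dim):
    masked = emb.copy()
    masked[:, d] = 0.0                                   # remove one dimension
    auc = roc_auc_score(labels, clf.predict_proba(edge_features(masked))[:, 1])
    marginal_contribution.append(base_auc - auc)         # larger drop = dimension matters more

print(np.round(marginal_contribution, 4))
```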
Context Aware Query Rewriting for Text Rankers using LLM
Anand, Abhijit, V, Venktesh, Setty, Vinay, Anand, Avishek
Query rewriting refers to an established family of approaches applied to underspecified and ambiguous queries to overcome the vocabulary mismatch problem in document ranking. Queries are typically rewritten at query-processing time to improve query modelling for the downstream ranker. With the advent of large language models (LLMs), there have been initial investigations into using generative approaches to produce pseudo-documents that tackle this inherent vocabulary gap. In this work, we analyze the utility of LLMs for improved query rewriting in text ranking tasks. We find two inherent limitations of using LLMs as query rewriters: concept drift when using only queries as prompts, and large inference costs during query processing. We adopt a simple, yet surprisingly effective, approach called context-aware query rewriting (CAR) to leverage the benefits of LLMs for query understanding. First, we rewrite ambiguous training queries by context-aware prompting of LLMs, where we use only relevant documents as context. Unlike existing approaches, we use LLM-based query rewriting only during the training phase. A ranker is then fine-tuned on the rewritten queries instead of the original queries. In our extensive experiments, we find that fine-tuning a ranker on rewritten queries offers a significant improvement of up to 33% on the passage ranking task and up to 28% on the document ranking task compared to the baseline of using the original queries.
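A hedged sketch of the training-time rewriting step described above: prompt an LLM with the ambiguous training query plus a relevant document, and fine-tune the ranker on the rewrites. The prompt wording, the LLM stub, and the qrels format are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of context-aware query rewriting (CAR) at training time.
# `call_llm` is a placeholder; the prompt wording is an assumption, not the paper's template.
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # replace with any LLM completion call

def rewrite_query(query: str, relevant_doc: str) -> str:
    prompt = (
        "The following query is ambiguous or underspecified.\n"
        f"Query: {query}\n"
        f"Relevant document: {relevant_doc}\n"
        "Rewrite the query so that it clearly expresses the information need "
        "covered by the document. Return only the rewritten query."
    )
    return call_llm(prompt)

def build_training_set(qrels):
    """qrels: iterable of (query, relevant_doc, label) triples (assumed format).
    Rewriting happens only at training time; inference still uses the original query."""
    return [(rewrite_query(q, doc), doc, label) for q, doc, label in qrels]
```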
Query Understanding in the Age of Large Language Models
Anand, Avishek, V, Venktesh, Anand, Abhijit, Setty, Vinay
Querying, conversing with, and controlling search and information-seeking interfaces using natural language are fast becoming ubiquitous with the rise and adoption of large language models (LLMs). In this position paper, we describe a generic framework for interactive query rewriting using LLMs. Our proposal aims to unfold new opportunities for improved and transparent intent understanding while building high-performance retrieval systems with LLMs. A key aspect of our framework is the ability of the rewriter to fully specify, in natural language, the machine intent inferred by the search engine, so that it can be further refined, controlled, and edited before the final retrieval phase. The ability to present, interact with, and reason over the underlying machine intent in natural language has profound implications for transparency and ranking performance, and marks a departure from the traditional way in which supervised signals are collected for understanding intents. We detail the concept, backed by initial experiments, along with open questions for this interactive query understanding framework.
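To make the proposed interaction loop concrete, here is a schematic, not an implementation from the paper, in which an LLM verbalizes the machine intent, the user optionally edits it, and retrieval runs over the confirmed intent; all components are placeholders.

```python
# Schematic of the interactive query-understanding loop described above.
# All function bodies are placeholders; this is an illustration, not the paper's system.
def verbalize_intent(query: str) -> str:
    raise NotImplementedError  # LLM turns the query into a natural-language intent description

def retrieve(intent: str) -> list:
    raise NotImplementedError  # any retrieval backend

def interactive_search(query: str) -> list:
    intent = verbalize_intent(query)
    print(f"Interpreted intent: {intent}")
    edited = input("Edit the intent (or press Enter to accept): ").strip()
    final_intent = edited or intent   # the user-refined intent can also serve as a supervision signal
    return retrieve(final_intent)
```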
The Effect of Masking Strategies on Knowledge Retention by Language Models
Wallat, Jonas, Zhang, Tianyi, Anand, Avishek
Language models retain a significant amount of world knowledge from their pre-training stage. This allows knowledgeable models to be applied to knowledge-intensive tasks prevalent in information retrieval, such as ranking or question answering. Understanding how and which factual information is acquired by our models is necessary to build responsible models. However, limited work has been done to understand the effect of pre-training tasks on the amount of knowledge captured and forgotten by language models during pre-training. Building a better understanding of knowledge acquisition is the goal of this paper. We therefore utilize a selection of pre-training tasks to infuse knowledge into our model and then test the model's knowledge retention by measuring its ability to answer factual questions. Our experiments show that masking entities and principled masking of correlated spans based on pointwise mutual information lead to more factual knowledge being retained than masking random tokens. Our findings further demonstrate that, much like the ability to perform a task, the (factual) knowledge acquired from being trained on that task is forgotten when the model is subsequently trained to perform another task (catastrophic forgetting), and we show how to prevent this phenomenon. To foster reproducibility, the code as well as the data used in this paper are openly available.
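A small sketch of PMI-based span masking as described above: estimate pointwise mutual information for adjacent token pairs from corpus counts and preferentially mask high-PMI bigrams. The whitespace tokenization, toy corpus, and PMI threshold are simplifications, not the paper's exact recipe.

```python
# Sketch: masking correlated spans chosen by pointwise mutual information (PMI).
# Tokenization and the PMI threshold are simplifications, not the paper's exact recipe.
import math
from collections import Counter

corpus = [
    "new york is a large city",
    "she moved to new york last year",
    "the city of new york never sleeps",
]

unigrams, bigrams, n_tokens = Counter(), Counter(), 0
for sentence in corpus:
    tokens = sentence.split()
    n_tokens += len(tokens)
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def pmi(w1, w2):
    p_xy = bigrams[(w1, w2)] / max(sum(bigrams.values()), 1)
    p_x, p_y = unigrams[w1] / n_tokens, unigrams[w2] / n_tokens
    return math.log(p_xy / (p_x * p_y)) if p_xy > 0 else float("-inf")

def mask_high_pmi_spans(sentence, threshold=1.0, mask_token="[MASK]"):
    tokens, out, i = sentence.split(), [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and pmi(tokens[i], tokens[i + 1]) > threshold:
            out.extend([mask_token, mask_token])   # mask the whole correlated span
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

print(mask_high_pmi_spans("she loves new york"))  # -> "she loves [MASK] [MASK]"
```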
FaxPlainAC: A Fact-Checking Tool Based on EXPLAINable Models with HumAn Correction in the Loop
Zhang, Zijian, Rudra, Koustav, Anand, Avishek
Fact-checking on the Web has become the main mechanism through which we assess the credibility of news and other information. Existing fact-checkers verify the authenticity of a piece of information (supporting or refuting the claim) based on secondary sources of information. However, existing approaches do not consider the problem of model updates arising from training data that constantly grows through user feedback. It is therefore important to conduct user studies to correct the model's inference biases and to improve the model in a life-long learning manner based on user feedback. In this paper, we present FaxPlainAC, a tool that gathers user feedback on the output of explainable fact-checking models. FaxPlainAC outputs both the model decision, i.e., whether the input fact is true or not, and the supporting/refuting evidence considered by the model. Additionally, FaxPlainAC accepts user feedback on both the prediction and the explanation. Developed in Python, FaxPlainAC is designed as a modular and easily deployable tool. It can be integrated with other downstream tasks, allowing for human annotation gathering for fact-checking and life-long learning.
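As an illustration of the kind of feedback such a tool gathers (an assumed schema, not FaxPlainAC's actual data model), a record might couple the model's verdict and evidence with the user's corrections so that both can feed later life-long learning updates.

```python
# Illustrative feedback record for human-in-the-loop fact-checking
# (an assumed schema, not FaxPlainAC's actual data model).
from dataclasses import dataclass, field, asdict
import json

@dataclass
class FactCheckFeedback:
    claim: str
    model_verdict: bool                 # True = supported, False = refuted
    model_evidence: list                # evidence sentences the model relied on
    user_verdict: bool = None           # user's correction of the verdict, if any
    user_evidence: list = field(default_factory=list)  # evidence the user marks as (in)correct

def append_feedback(record: FactCheckFeedback, path: str = "feedback.jsonl") -> None:
    """Persist feedback for later life-long learning updates."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record)) + "\n")
```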
Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks
Funke, Thorben, Khosla, Megha, Anand, Avishek
With the ever-increasing popularity and applications of graph neural networks (GNNs), several proposals have been made to interpret and understand the decisions of a GNN model. Explanations for a GNN model differ in principle from those in other input settings: it is important to attribute the decision both to input features and to other related instances connected by the graph structure. We find that previous explanation-generation approaches, which maximize the mutual information between the label distribution produced by the GNN model and the explanation, are restrictive. Specifically, existing approaches do not enforce explanations to be predictive, sparse, or robust to input perturbations. In this paper, we lay down some of the fundamental principles that an explanation method for GNNs should follow and introduce a metric, fidelity, as a measure of an explanation's effectiveness. We propose a novel approach, Zorro, based on principles from rate-distortion theory, which uses a simple combinatorial procedure to optimize for fidelity. Extensive experiments on real and synthetic datasets reveal that Zorro produces sparser, more stable, and more faithful explanations than existing GNN explanation approaches.
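A simplified sketch of the fidelity idea described above: fidelity is taken as the fraction of perturbed inputs on which the model's prediction, computed with only the selected features kept and the rest randomized, agrees with the original prediction, and a greedy loop adds whichever feature raises fidelity most. The predictor and perturbation scheme are placeholders, not Zorro's exact algorithm.

```python
# Simplified sketch in the spirit of Zorro's fidelity-driven selection
# (the predictor and perturbation scheme are placeholders, not the paper's exact algorithm).
import numpy as np

def fidelity(predict, X, selected, target_class, n_samples=50, seed=0):
    """Fraction of perturbations on which the prediction is unchanged when only the
    selected feature columns are kept and all other entries are replaced by noise."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(X.shape, dtype=bool)
    if selected:
        mask[:, sorted(selected)] = True
    agree = 0
    for _ in range(n_samples):
        perturbed = np.where(mask, X, rng.normal(size=X.shape))
        agree += int(predict(perturbed) == target_class)
    return agree / n_samples

def greedy_explanation(predict, X, target_fidelity=0.9):
    """Greedily add the feature whose inclusion increases fidelity the most.
    `predict` maps a feature matrix to a class label (placeholder GNN readout)."""
    target_class = predict(X)
    selected, remaining = set(), set(range(X.shape[1]))
    while remaining and fidelity(predict, X, selected, target_class) < target_fidelity:
        best = max(remaining, key=lambda f: fidelity(predict, X, selected | {f}, target_class))
        selected.add(best)
        remaining.remove(best)
    return selected
```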