Gunjal, Anisha
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
Gunjal, Anisha, Durrett, Greg
Automatic factuality verification of large language model (LLM) generations is increasingly used to combat hallucinations. A major point of tension in the literature is the granularity of this fact-checking: larger chunks of text are hard to fact-check, but more atomic facts like propositions may lack the context needed to interpret them correctly. In this work, we assess the role of context in these atomic facts. We argue that fully atomic facts are not the right representation, and define two criteria for molecular facts: decontextuality, or how well they can stand alone, and minimality, or how little extra information is added to achieve decontextuality. We quantify the impact of decontextualization on minimality, then present a baseline methodology for generating molecular facts automatically, aiming to add the right amount of information. We compare against various methods of decontextualization and find that molecular facts balance minimality with fact verification accuracy in ambiguous settings.
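As a concrete illustration of the molecularization step, the sketch below prompts a chat LLM to rewrite an atomic claim with just enough surrounding context to stand alone. The prompt wording, the molecularize helper, and the model name are illustrative assumptions, not the paper's actual pipeline; any chat-completion client would work.

```python
from openai import OpenAI  # assumes the openai SDK; any chat LLM client works

client = OpenAI()

# Hypothetical prompt: ask for decontextuality (the claim stands alone) while
# enforcing minimality (add only what is needed to disambiguate).
PROMPT = (
    "Rewrite the claim so it can be verified without the surrounding text. "
    "Add only the minimum context needed to disambiguate it (e.g., which "
    "person or event is meant); do not add unrelated details.\n\n"
    "Context: {context}\nClaim: {claim}\nRewritten claim:"
)

def molecularize(claim: str, context: str, model: str = "gpt-4o") -> str:
    """Turn an atomic claim into a molecular fact with minimal added context."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": PROMPT.format(context=context, claim=claim)}],
    )
    return resp.choices[0].message.content.strip()

# e.g. molecularize("He won the award in 2019.",
#                   "The article profiles physicist John Goodenough ...")
```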
Detecting and Preventing Hallucinations in Large Vision Language Models
Gunjal, Anisha, Yin, Jihan, Bas, Erhan
Instruction-tuned Large Vision Language Models (LVLMs) have significantly advanced in generalizing across a diverse set of multi-modal tasks, especially for Visual Question Answering (VQA). However, generating detailed responses that are visually grounded is still a challenging task for these models. We find that even a current state-of-the-art LVLM (InstructBLIP) still produces a staggering 30 percent of hallucinatory text in the form of non-existent objects, unfaithful descriptions, and inaccurate relationships. To address this, we introduce M-HalDetect, a (M)ultimodal (Hal)lucination (Detect)ion Dataset that can be used to train and benchmark models for hallucination detection and prevention. M-HalDetect consists of 16k fine-grained annotations on VQA examples, making it the first comprehensive multi-modal hallucination detection dataset for detailed image descriptions. Unlike previous work that only considers object hallucination, we additionally annotate both entity descriptions and relationships that are unfaithful. To demonstrate the potential of this dataset for hallucination prevention, we optimize InstructBLIP through our novel Fine-grained Direct Preference Optimization (FDPO). We also train fine-grained multi-modal reward models from InstructBLIP and evaluate their effectiveness with best-of-n rejection sampling. We perform human evaluation on both FDPO and rejection sampling, and find that they reduce hallucination rates in InstructBLIP by 41% and 55%, respectively. We also find that our reward model generalizes to other multi-modal models, reducing hallucinations in LLaVA and mPLUG-OWL by 15% and 57% respectively, and correlates strongly with human-evaluated accuracy scores.
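The best-of-n rejection sampling used to evaluate the reward models can be sketched in a few lines: draw n candidate responses and keep the one the reward model scores highest. The generate and reward callables below are stand-ins (in the paper these would be InstructBLIP decoding and a fine-grained reward model trained on M-HalDetect); this is a minimal sketch under those assumptions, not the authors' code.

```python
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int = 8) -> str:
    """Rejection sampling: sample n candidates, return the highest-reward one."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    scores = [reward(prompt, c) for c in candidates]
    best = max(range(n), key=lambda i: scores[i])
    return candidates[best]
```

With a fine-grained (segment-level) reward model, the per-segment scores would first need to be aggregated into a single response score, for example by averaging or taking the minimum segment score, before ranking candidates.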
Drafting Event Schemas using Language Models
Gunjal, Anisha, Durrett, Greg
Past work has studied event prediction and event language modeling, sometimes mediated through structured representations of knowledge in the form of event schemas. Such schemas can lead to explainable predictions and forecasting of unseen events given incomplete information. In this work, we look at the process of creating such schemas to describe complex events. We use large language models (LLMs) to draft schemas directly in natural language, which can be further refined by human curators as necessary. Our focus is on whether we can achieve sufficient diversity and recall of key events and whether we can produce the schemas in a sufficiently descriptive style. We show that large language models are able to achieve moderate recall against schemas taken from two different datasets, with even better results when multiple prompts and multiple samples are combined. Moreover, we show that textual entailment methods can be used both for matching schemas to instances of events and for evaluating overlap between gold and predicted schemas. Our method paves the way for easier distillation of event knowledge from large language models into schemas.
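For the entailment-based comparison of gold and predicted schemas, an off-the-shelf NLI model is enough to sketch the idea: a gold event counts as recalled if some predicted event entails it. The choice of roberta-large-mnli, the 0.5 threshold, and the recall definition below are illustrative assumptions; the paper's exact entailment setup may differ.

```python
import torch
from typing import List
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def entails(premise: str, hypothesis: str, threshold: float = 0.5) -> bool:
    """True if the NLI model judges the premise to entail the hypothesis."""
    inputs = tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(nli(**inputs).logits, dim=-1)[0]
    # roberta-large-mnli label order: contradiction, neutral, entailment
    return probs[2].item() >= threshold

def schema_recall(gold_events: List[str], predicted_events: List[str]) -> float:
    """Fraction of gold schema events entailed by at least one predicted event."""
    hits = sum(any(entails(p, g) for p in predicted_events)
               for g in gold_events)
    return hits / len(gold_events)
```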