Ishii, Etsuko
LLM Internal States Reveal Hallucination Risk Faced With a Query
Ji, Ziwei, Chen, Delong, Ishii, Etsuko, Cahyawijaya, Samuel, Bang, Yejin, Wilie, Bryan, Fung, Pascale
The hallucination problem of Large Language Models (LLMs) significantly limits their reliability and trustworthiness. Humans have a self-awareness process that allows us to recognize what we don't know when faced with queries. Inspired by this, our paper investigates whether LLMs can estimate their own hallucination risk before response generation. We analyze the internal mechanisms of LLMs broadly, both in terms of training data sources and across 15 diverse Natural Language Generation (NLG) tasks spanning over 700 datasets. Our empirical analysis reveals two key insights: (1) LLM internal states indicate whether the model has seen the query in training data or not; and (2) LLM internal states indicate whether the model is likely to hallucinate in response to the query. Our study explores the particular neurons, activation layers, and tokens that play a crucial role in the LLM perception of uncertainty and hallucination risk. Using a probing estimator, we leverage LLM self-assessment, achieving an average hallucination estimation accuracy of 84.32% at run time.
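The probing idea can be pictured with a small sketch. The following is a minimal illustration and not the authors' code: it takes the last-token hidden state of a query from an off-the-shelf causal LM and fits a simple classifier on toy hallucination labels. The model choice ("gpt2"), the layer and token selection, and the labelled queries are all assumptions.

```python
# Minimal sketch (not the authors' code): probe hidden states of a causal LM
# to predict a binary hallucination-risk label for each query.
# Assumptions: "gpt2" stands in for the LLM studied; the layer/token choices and
# the labelled (query, risk) pairs are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def query_features(query: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the last token at a chosen layer, used as the probe input."""
    inputs = tokenizer(query, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.hidden_states[layer][0, -1]  # shape: (hidden_size,)

# Hypothetical labels: 1 = the model hallucinated on this query, 0 = it did not.
queries = ["Who wrote Hamlet?", "What did Ada Lovelace tweet in 2015?"]
labels = [0, 1]

X = torch.stack([query_features(q) for q in queries]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print(probe.predict(X))  # per-query hallucination-risk prediction
```

In practice such a probe would be trained on many labelled queries and compared across layers and token positions; the two-example fit above only shows the data flow.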
Belief Revision: The Adaptability of Large Language Models Reasoning
Wilie, Bryan, Cahyawijaya, Samuel, Ishii, Etsuko, He, Junxian, Fung, Pascale
The capability to reason from text is crucial for real-world NLP applications. Real-world scenarios often involve incomplete or evolving data, and individuals update their beliefs and understandings accordingly. However, most existing evaluations assume that language models (LMs) operate with consistent information. We introduce Belief-R, a new dataset designed to test LMs' belief revision ability when presented with new evidence. Inspired by how humans suppress prior inferences, this task assesses LMs within the newly proposed delta reasoning (ΔR) framework. Belief-R features sequences of premises designed to simulate scenarios where additional information could necessitate revising prior conclusions drawn by LMs. We evaluate around 30 LMs across diverse prompting strategies and find that LMs generally struggle to appropriately revise their beliefs in response to new information. Further, models adept at updating often underperform in scenarios without necessary updates, highlighting a critical trade-off. These insights underscore the importance of improving LMs' adaptiveness to changing information, a step toward more reliable AI systems.
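As a rough illustration of the kind of check the ΔR setting asks for (this is not the Belief-R benchmark or its prompts), one can give a model a premise set, add a defeating premise, and compare the two conclusions. The model name, prompt wording, and answer matching below are placeholders.

```python
# Minimal sketch (not the Belief-R protocol): check whether an instruction-tuned
# LM revises its conclusion after a defeating premise is added.
# The model choice, prompts, and answer handling are hypothetical assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

def conclude(premises):
    prompt = ("Premises:\n" + "\n".join(f"- {p}" for p in premises) +
              "\nQuestion: Does Tweety fly? Answer yes, no, or cannot tell.\nAnswer:")
    out = generator(prompt, max_new_tokens=10, do_sample=False)
    return out[0]["generated_text"][len(prompt):].strip().lower()

base = ["Tweety is a bird.", "Birds normally fly."]
updated = base + ["Tweety is a penguin."]

# A model with good belief revision should move from "yes" toward "no"/"cannot tell",
# while leaving its answer unchanged when no defeating premise is added.
print(conclude(base), "->", conclude(updated))
```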
Contrastive Learning for Inference in Dialogue
Ishii, Etsuko, Xu, Yan, Wilie, Bryan, Ji, Ziwei, Lovenia, Holy, Chung, Willy, Fung, Pascale
Inferences, especially those derived from inductive processes, are a crucial component of conversation, complementing the information implicitly or explicitly conveyed by a speaker. While recent large language models show remarkable advances in inference tasks, their performance in inductive reasoning, where not all information is present in the context, lags far behind that in deductive reasoning. In this paper, we analyze the behavior of the models based on the task difficulty defined by the semantic information gap -- which distinguishes inductive and deductive reasoning (Johnson-Laird, 1988, 1993). Our analysis reveals that the disparity in information between dialogue contexts and desired inferences poses a significant challenge to the inductive inference process. To mitigate this information gap, we investigate a contrastive learning approach by feeding negative samples. Our experiments suggest that negative samples help models understand what is wrong and improve the quality of their generated inferences.
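To give a concrete picture of learning from negative samples, a small sketch might contrast a dialogue context with a plausible and an implausible inference. This is an illustrative contrastive objective, not the paper's exact formulation; the encoder, temperature, and example texts are assumptions.

```python
# Minimal sketch (not the paper's exact objective): an InfoNCE-style contrastive
# loss that pushes a dialogue context toward its gold inference and away from a
# negative one. The encoder choice and example texts are hypothetical placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]  # [CLS] embeddings

context = ["A: I forgot my umbrella. B: You can borrow mine."]
positive = ["B is willing to help A."]   # plausible inference
negative = ["B refuses to talk to A."]   # implausible inference (negative sample)

c, p, n = embed(context), embed(positive), embed(negative)
logits = torch.cat([F.cosine_similarity(c, p), F.cosine_similarity(c, n)]) / 0.07
loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([0]))  # gold candidate is index 0
loss.backward()  # gradients would be used to fine-tune the encoder
print(float(loss))
```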
Towards Mitigating Hallucination in Large Language Models via Self-Reflection
Ji, Ziwei, Yu, Tiezheng, Xu, Yan, Lee, Nayeon, Ishii, Etsuko, Fung, Pascale
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks, including question answering (QA). However, their practical deployment still faces challenges, notably the issue of "hallucination", where models generate plausible-sounding but unfaithful or nonsensical information. This issue becomes particularly critical in the medical domain due to the uncommon professional concepts and potential social risks involved. This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets. Our investigation centers on the identification and comprehension of common problematic answers, with a specific emphasis on hallucination. To tackle this challenge, we present an interactive self-reflection methodology that incorporates knowledge acquisition and answer generation. Through this feedback process, our approach steadily enhances the factuality, consistency, and entailment of the generated answers. Consequently, we harness the interactivity and multitasking ability of LLMs to produce progressively more precise and accurate answers. Experimental results on both automatic and human evaluation demonstrate the superiority of our approach in hallucination reduction compared to baselines.
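The interactive self-reflection idea can be sketched as a generate-check-revise loop. The code below is an illustration under assumptions (model choice, prompt wording, and stopping rule are all placeholders), not the authors' pipeline.

```python
# Minimal sketch (an illustration, not the authors' pipeline): alternate knowledge
# acquisition, answer generation, and self-checking until the model judges its
# answer consistent with the acquired knowledge.
from transformers import pipeline

llm = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")  # placeholder model

def ask(prompt: str) -> str:
    out = llm(prompt, max_new_tokens=128, do_sample=False)
    return out[0]["generated_text"][len(prompt):].strip()

def self_reflective_answer(question: str, max_rounds: int = 3) -> str:
    knowledge = ask(f"List key background facts needed to answer: {question}\nFacts:")
    answer = ask(f"Facts: {knowledge}\nQuestion: {question}\nAnswer:")
    for _ in range(max_rounds):
        verdict = ask(
            f"Facts: {knowledge}\nAnswer: {answer}\n"
            "Is the answer fully supported by the facts? Reply SUPPORTED or UNSUPPORTED:"
        )
        if "UNSUPPORTED" not in verdict.upper():
            break  # the model considers its answer grounded; stop refining
        answer = ask(
            f"Facts: {knowledge}\nPrevious answer: {answer}\n"
            f"Revise the answer so it is consistent with the facts.\nQuestion: {question}\nAnswer:"
        )
    return answer

print(self_reflective_answer("What are common side effects of ibuprofen?"))
```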
Survey of Hallucination in Natural Language Generation
Ji, Ziwei, Lee, Nayeon, Frieske, Rita, Yu, Tiezheng, Su, Dan, Xu, Yan, Ishii, Etsuko, Bang, Yejin, Dai, Wenliang, Madotto, Andrea, Fung, Pascale
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, and in turn to improvements in downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep learning-based generation is prone to hallucinate unintended text, which degrades system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented on measuring and mitigating hallucinated text, but these have never been reviewed in a comprehensive manner before. In this survey, we therefore provide a broad overview of the research progress and challenges of the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions; and (2) an overview of task-specific research progress on hallucination in the following downstream tasks: abstractive summarization, dialogue generation, generative question answering, data-to-text generation, machine translation, and visual-language generation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated text in NLG.
VScript: Controllable Script Generation with Visual Presentation
Ji, Ziwei, Xu, Yan, Cheng, I-Tsun, Cahyawijaya, Samuel, Frieske, Rita, Ishii, Etsuko, Zeng, Min, Madotto, Andrea, Fung, Pascale
In order to offer a customized script tool and inspire professional scriptwriters, we present VScript. It is a controllable pipeline that generates complete scripts, including dialogues and scene descriptions, and presents them visually using video retrieval. With an interactive interface, our system allows users to select genres and input starting words that control the theme and development of the generated script. We adopt a hierarchical structure, which first generates the plot, then the script and its visual presentation. A novel approach is also introduced to plot-guided dialogue generation by treating it as an inverse dialogue summarization. The experiment results show that our approach outperforms the baselines on both automatic and human evaluations, especially in genre control.
Greenformer: Factorization Toolkit for Efficient Deep Neural Networks
Cahyawijaya, Samuel, Winata, Genta Indra, Lovenia, Holy, Wilie, Bryan, Dai, Wenliang, Ishii, Etsuko, Fung, Pascale
While recent advances in deep neural networks (DNNs) bring remarkable success, the computational cost also increases considerably. In this paper, we introduce Greenformer, a toolkit to accelerate the computation of neural networks through matrix factorization while maintaining performance. Greenformer can be easily applied with a single line of code to any DNN model. Our experimental results show that Greenformer is effective for a wide range of scenarios.
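As a rough sketch of the kind of matrix factorization such a toolkit automates (this is not the Greenformer API), one can replace a dense linear layer with a low-rank pair obtained via truncated SVD; the rank and layer below are arbitrary choices.

```python
# Minimal sketch (not the Greenformer toolkit itself): approximate a dense Linear
# layer with a low-rank factorization obtained via truncated SVD.
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Approximate W (out x in) as (out x rank) @ (rank x in) using truncated SVD."""
    W = layer.weight.data                                  # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = (torch.diag(S[:rank]) @ Vh[:rank]).contiguous()   # (rank, in)
    second.weight.data = U[:, :rank].contiguous()                         # (out, rank)
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

dense = nn.Linear(1024, 1024)
low_rank = factorize_linear(dense, rank=64)

x = torch.randn(2, 1024)
print((dense(x) - low_rank(x)).abs().max())  # approximation error depends on the chosen rank
params = lambda m: sum(p.numel() for p in m.parameters())
print(params(dense), "->", params(low_rank))  # far fewer parameters after factorization
```

The trade-off is the usual one for low-rank methods: a smaller rank saves more compute and memory but approximates the original weights less closely.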
Assessing Political Prudence of Open-domain Chatbots
Bang, Yejin, Lee, Nayeon, Ishii, Etsuko, Madotto, Andrea, Fung, Pascale
Politically sensitive topics are still a challenge for open-domain chatbots. However, dealing with politically sensitive content in a responsible, non-partisan, and safe way is integral for these chatbots. Currently, the main approach to handling political sensitivity is to simply change the topic when it is detected. This is safe but evasive and results in a chatbot that is less engaging. In this work, as a first step towards a politically safe chatbot, we propose a group of metrics for assessing their political prudence. We then conduct a political prudence analysis of various chatbots and discuss their behavior from multiple angles through our automatic metric and human evaluation metrics. The testsets and codebase are released to facilitate future research.
CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue System
Ishii, Etsuko, Xu, Yan, Winata, Genta Indra, Lin, Zhaojiang, Madotto, Andrea, Liu, Zihan, Xu, Peng, Fung, Pascale
Information-seeking dialogue systems, including knowledge identification and response generation, aim to respond to users with fluent, coherent, and informative responses based on users' needs, which remains challenging. To tackle this challenge, we utilize data augmentation methods and several training techniques with pre-trained language models to learn a general pattern of the task and thus achieve promising performance. In the DialDoc21 competition, our system achieved a 74.95 F1 score and a 60.74 Exact Match score in subtask 1, and a 37.72 SacreBLEU score in subtask 2. An empirical analysis is provided to explain the effectiveness of our approaches.
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters
Xu, Yan, Ishii, Etsuko, Liu, Zihan, Winata, Genta Indra, Su, Dan, Madotto, Andrea, Fung, Pascale
To diversify and enrich generated dialogue responses, knowledge-grounded dialogue has been investigated in recent years. Despite the success of existing methods, they mainly follow the paradigm of retrieving relevant sentences from a large corpus and augmenting the dialogues with explicit extra information, which is time- and resource-consuming. In this paper, we propose KnowExpert, an end-to-end framework that bypasses the retrieval process by injecting prior knowledge into pre-trained language models with lightweight adapters. To the best of our knowledge, this is the first attempt to tackle this task relying solely on a generation-based approach. Experimental results show that KnowExpert performs comparably with the retrieval-based baselines, demonstrating the potential of our proposed direction.
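A lightweight adapter of the kind mentioned above can be sketched as a small residual bottleneck attached to a frozen pre-trained LM. This is not the KnowExpert implementation; the base model, adapter size, and insertion point below are hypothetical choices.

```python
# Minimal sketch (not the KnowExpert implementation): a bottleneck adapter placed
# after a transformer block's MLP in a frozen pre-trained LM, so that knowledge
# can be "injected" by training only the adapter weights.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

class Adapter(nn.Module):
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states):
        # Residual bottleneck: only these small matrices would be trained.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

model = GPT2LMHeadModel.from_pretrained("gpt2")   # placeholder base LM
for param in model.parameters():
    param.requires_grad = False                   # keep the pre-trained LM frozen

adapter = Adapter(model.config.n_embd)
block = model.transformer.h[-1].mlp               # wrap the MLP of the last block
original_forward = block.forward
block.forward = lambda x: adapter(original_forward(x))  # adapter on top of the MLP output

logits = model(torch.tensor([[50256]])).logits    # forward pass now goes through the adapter
print(logits.shape)
print("trainable adapter parameters:", sum(p.numel() for p in adapter.parameters()))
```

Because only the adapter parameters are trainable, swapping in differently trained adapters is cheap, which is the intuition behind storing knowledge in lightweight modules rather than retrieving it at inference time.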