AITopics | Chen, Hung-Ting

Collaborating Authors

Chen, Hung-Ting

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Open-World Evaluation for Retrieving Diverse Perspectives

Chen, Hung-Ting, Choi, Eunsol

arXiv.org Artificial IntelligenceSep-26-2024

We study retrieving a set of documents that covers various perspectives on a complex and contentious question (e.g., will ChatGPT do more harm than good?). We curate a Benchmark for Retrieval Diversity for Subjective questions (BERDS), where each example consists of a question and diverse perspectives associated with the question, sourced from survey questions and debate websites. On this data, retrievers paired with a corpus are evaluated to surface a document set that contains diverse perspectives. Our framing diverges from most retrieval tasks in that document relevancy cannot be decided by simple string matches to references. Instead, we build a language model based automatic evaluator that decides whether each retrieved document contains a perspective. This allows us to evaluate the performance of three different types of corpus (Wikipedia, web snapshot, and corpus constructed on the fly with retrieved pages from the search engine) paired with retrievers. Retrieving diverse documents remains challenging, with the outputs from existing retrievers covering all perspectives on only 33.74% of the examples. We further study the impact of query expansion and diversity-focused reranking approaches and analyze retriever sycophancy. Together, we lay the foundation for future studies in retrieval diversity handling complex queries.

information retrieval, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2409.1811

Country:

Asia (1.00)
Africa (0.68)
Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Information Technology (0.69)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

CaLMQA: Exploring culturally specific long-form question answering across 23 languages

Arora, Shane, Karpinska, Marzena, Chen, Hung-Ting, Bhattacharjee, Ipsita, Iyyer, Mohit, Choi, Eunsol

arXiv.org Artificial IntelligenceJul-3-2024

Large language models (LLMs) are used for long-form question answering (LFQA), which requires them to generate paragraph-length answers to complex questions. While LFQA has been well-studied in English, this research has not been extended to other languages. To bridge this gap, we introduce CaLMQA, a collection of 1.5K complex culturally specific questions spanning 23 languages and 51 culturally agnostic questions translated from English into 22 other languages. We define culturally specific questions as those uniquely or more likely to be asked by people from cultures associated with the question's language. We collect naturally-occurring questions from community web forums and hire native speakers to write questions to cover under-resourced, rarely-studied languages such as Fijian and Kirundi. Our dataset contains diverse, complex questions that reflect cultural topics (e.g. traditions, laws, news) and the language usage of native speakers. We automatically evaluate a suite of open- and closed-source models on CaLMQA by detecting incorrect language and token repetitions in answers, and observe that the quality of LLM-generated answers degrades significantly for some low-resource languages. Lastly, we perform human evaluation on a subset of models and languages. Manual evaluation reveals that model performance is significantly worse for culturally specific questions than for culturally agnostic questions. Our findings highlight the need for further research in non-English LFQA and provide an evaluation framework.

annotator, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2406.17761

Country:

Europe (1.00)
Africa (1.00)
North America > United States > Massachusetts (0.14)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.97)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Add feedback

Continually Improving Extractive QA via Human Feedback

Gao, Ge, Chen, Hung-Ting, Artzi, Yoav, Choi, Eunsol

arXiv.org Artificial IntelligenceNov-3-2023

We study continually improving an extractive question answering (QA) system via human user feedback. We design and deploy an iterative approach, where information-seeking users ask questions, receive model-predicted answers, and provide feedback. We conduct experiments involving thousands of user interactions under diverse setups to broaden the understanding of learning from feedback over time. Our experiments show effective improvement from user feedback of extractive QA models over time across different data regimes, including significant potential for domain adaptation.

machine learning, natural language, question answering, (18 more...)

arXiv.org Artificial Intelligence

2305.12473

Country:

North America > United States > Illinois (0.15)
North America > United States > Gulf of Mexico > Central GOM (0.15)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports > Baseball (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.35)

Add feedback

Understanding Retrieval Augmentation for Long-Form Question Answering

Chen, Hung-Ting, Xu, Fangyuan, Arora, Shane, Choi, Eunsol

arXiv.org Artificial IntelligenceOct-18-2023

We present a study of retrieval-augmented language models (LMs) on long-form question answering. We analyze how retrieval augmentation impacts different LMs, by comparing answers generated from models while using the same evidence documents, and how differing quality of retrieval document set impacts the answers generated from the same LM. We study various attributes of generated answers (e.g., fluency, length, variance) with an emphasis on the attribution of generated long-form answers to in-context evidence documents. We collect human annotations of answer attribution and evaluate methods for automatically judging attribution. Our study provides new insights on how retrieval augmentation impacts long, knowledge-rich text generation of LMs. We further identify attribution patterns for long text generation and analyze the main culprits of attribution errors. Together, our analysis reveals how retrieval augmentation impacts long knowledge-rich text generation and provide directions for future work.

artificial intelligence, natural language, retrieval augmentation

arXiv.org Artificial Intelligence

2310.1215

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.60)

Add feedback

Predict and Use Latent Patterns for Short-Text Conversation

Chen, Hung-Ting, Chao, Yu-Chieh, Chao, Ta-Hsuan, Ma, Wei-Yun

arXiv.org Artificial IntelligenceOct-26-2020

Many neural network models nowadays have achieved promising performances in Chit-chat settings. The majority of them rely on an encoder for understanding the post and a decoder for generating the response. Without given assigned semantics, the models lack the fine-grained control over responses as the semantic mapping between posts and responses is hidden on the fly within the end-to-end manners. Some previous works utilize sampled latent words as a controllable semantic form to drive the generated response around the work, but few works attempt to use more complex semantic forms to guide the generation. In this paper, we propose to use more detailed semantic forms, including latent responses and part-of-speech sequences sampled from the corresponding distributions, as the controllable semantics to guide the generation. Our experimental results show that the richer semantics are not only able to provide informative and diverse responses, but also increase the overall performance of response quality, including fluency and coherence.

artificial intelligence, neural network, sequence, (21 more...)

arXiv.org Artificial Intelligence

2010.13982

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback