Deng, Naihao


You Are What You Annotate: Towards Better Models through Annotator Representations

arXiv.org Artificial Intelligence

Annotator disagreement is ubiquitous in natural language processing (NLP) tasks. There are multiple reasons for such disagreement, including the subjectivity of the task, difficult cases, unclear guidelines, and so on. Rather than simply aggregating labels to obtain data annotations, we instead directly model the diverse perspectives of the annotators and explicitly account for annotators' idiosyncrasies in the modeling process by creating representations for each annotator (annotator embeddings) and for their annotations (annotation embeddings). In addition, we propose TID-8, The Inherent Disagreement - 8 dataset, a benchmark consisting of eight existing language understanding datasets that have inherent annotator disagreement. We test our approach on TID-8 and show that it helps models learn significantly better from disagreement on six of the datasets in TID-8, while increasing model size by less than 1% in parameters. By capturing the unique tendencies and subjectivity of individual annotators through embeddings, our representations prime AI models to be inclusive of diverse viewpoints.
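
As a concrete illustration of the annotator-embedding idea, here is a minimal, hypothetical PyTorch sketch (the class, argument names, and fusion-by-addition choice are ours, not the paper's actual architecture): a learned per-annotator vector is added to the text encoder's sentence representation before classification, so the embedding table is essentially the only new parameter.

```python
import torch.nn as nn

class AnnotatorAwareClassifier(nn.Module):
    """Sketch: condition a text classifier on a learned annotator embedding."""

    def __init__(self, encoder, hidden_dim, num_annotators, num_labels):
        super().__init__()
        self.encoder = encoder  # assumed: returns a [batch, hidden_dim] sentence vector
        self.annotator_emb = nn.Embedding(num_annotators, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def forward(self, input_ids, attention_mask, annotator_ids):
        text_repr = self.encoder(input_ids, attention_mask)  # [batch, hidden_dim]
        # Adding the annotator vector lets the decision boundary shift per
        # annotator; the table adds only num_annotators * hidden_dim extra
        # parameters, keeping the size overhead small.
        fused = text_repr + self.annotator_emb(annotator_ids)
        return self.classifier(fused)
```

An annotation-embedding variant would look much the same, except the table would be indexed by annotation behavior rather than annotator identity.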


Query Rewriting for Effective Misinformation Discovery

arXiv.org Artificial Intelligence

We propose a novel system to help fact-checkers formulate search queries for known misinformation claims and search effectively across multiple social media platforms. We introduce an adaptable rewriting strategy, in which editing actions for queries containing claims (e.g., swapping a word with its synonym, or changing the verb tense to present simple) are automatically learned through offline reinforcement learning. Our model uses a decision transformer to learn a sequence of editing actions that maximizes query retrieval metrics such as mean average precision. We conduct a series of experiments showing that our query rewriting system improves query effectiveness by up to 42% (relative), while producing editing action sequences that are human-interpretable.
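
To make the optimization target concrete, here is a small, self-contained Python sketch of mean average precision (our own helper functions, not the paper's code), the kind of retrieval metric the decision transformer is rewarded for improving:

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: mean precision@k over the ranks of hits."""
    relevant = set(relevant_ids)
    hits, precisions = 0, []
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(all_ranked, all_relevant):
    """MAP: per-query average precision, averaged over all queries."""
    aps = [average_precision(r, rel) for r, rel in zip(all_ranked, all_relevant)]
    return sum(aps) / len(aps) if aps else 0.0
```

A rewritten query that moves relevant posts higher in the ranking raises this score, which is exactly the kind of signal the offline RL setup can exploit as a reward.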


EASE: An Easily-Customized Annotation System Powered by Efficiency Enhancement Mechanisms

arXiv.org Artificial Intelligence

The performance of current supervised AI systems is tightly connected to the availability of annotated datasets. Annotations are usually collected through annotation tools, which are often designed for specific tasks and are difficult to customize. Moreover, existing annotation tools with an active learning mechanism often support only limited use cases. To address these limitations, we present EASE, an Easily-Customized Annotation System Powered by Efficiency Enhancement Mechanisms. EASE provides modular annotation units for building customized annotation interfaces, and also provides multiple back-end options that suggest annotations using (1) multi-task active learning; (2) demographic-feature-based active learning; and (3) a prompt system that can query the APIs of large language models. We conduct multiple experiments and user studies to evaluate our system's flexibility and effectiveness. Our results show that our system can meet the diverse needs of NLP researchers and significantly accelerate the annotation process.
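
As a rough sketch of how a suggestion back-end of this kind can work, the snippet below implements plain uncertainty sampling (the function names and the scikit-learn-style `predict_proba` interface are assumptions, not EASE's actual API): the items the current model is least certain about are surfaced to annotators first.

```python
import numpy as np

def suggest_batch(model, unlabeled_texts, batch_size=10):
    """Uncertainty sampling: rank unlabeled items by predictive entropy."""
    probs = model.predict_proba(unlabeled_texts)             # [n_items, n_labels]
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # The highest-entropy items are the ones a new label helps most;
    # pair each with the model's current best guess as a pre-annotation.
    most_uncertain = np.argsort(-entropy)[:batch_size]
    return [(unlabeled_texts[i], int(probs[i].argmax())) for i in most_uncertain]
```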


A PhD Student's Perspective on Research in NLP in the Era of Very Large Language Models

arXiv.org Artificial Intelligence

Recent progress in large language models has enabled the deployment of many generative NLP applications. At the same time, it has also led to a misleading public discourse that "it's all been solved." Not surprisingly, this has in turn made many NLP researchers -- especially those at the beginning of their careers -- wonder what NLP research area they should focus on. This document is a compilation of NLP research directions that are rich for exploration, reflecting the views of a diverse group of PhD students in an academic research lab. While we identify many research areas, many others exist; we do not cover areas that are currently addressed by LLMs but where LLMs lag behind in performance, or those focused on LLM development. We welcome suggestions for other research directions to include: https://bit.ly/nlp-era-llm


Prefix-to-SQL: Text-to-SQL Generation from Incomplete User Questions

arXiv.org Artificial Intelligence

Existing text-to-SQL research considers only complete questions as input, but lay users may struggle to formulate a complete question. To build a smarter natural language interface to database systems (NLIDB) that also handles incomplete questions, we propose a new task, prefix-to-SQL, which takes a question prefix from the user as input and predicts the intended SQL. We construct a new benchmark called PAGSAS that contains 124K user question prefixes and the intended SQL for 5 sub-tasks: Advising, GeoQuery, Scholar, ATIS, and Spider. Additionally, we propose a new metric, SAVE, to measure how much effort users can save. Experimental results show that PAGSAS is challenging even for strong baseline models such as T5. As we observe that the difficulty of prefix-to-SQL is related to the number of omitted tokens, we incorporate curriculum learning, feeding examples with an increasing number of omitted tokens. This improves scores on various sub-tasks, by as much as 9% in recall on the GeoQuery sub-task of PAGSAS.
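
The curriculum idea is simple enough to state in code. Below is a minimal, hypothetical sketch (field and function names are ours, not the paper's implementation) that orders prefix-to-SQL training examples by how many tokens were omitted from the full question and yields progressively harder training pools:

```python
def omitted_tokens(example):
    """Number of tokens dropped from the full question to form the prefix."""
    return len(example["question"].split()) - len(example["prefix"].split())

def curriculum_stages(examples, num_stages=4):
    """Yield progressively harder training pools, fewest omitted tokens first."""
    ordered = sorted(examples, key=omitted_tokens)
    stage_size = max(1, len(ordered) // num_stages)
    for stage in range(1, num_stages + 1):
        # Each stage keeps everything seen so far and adds the next slice
        # of harder examples (prefixes with more tokens omitted).
        yield ordered[: stage * stage_size]
```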