AITopics | Wu, Yongkang

Collaborating Authors

Wu, Yongkang

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

Li, Xiaoxi, Jin, Jiajie, Zhou, Yujia, Wu, Yongkang, Li, Zhonghua, Ye, Qi, Dou, Zhicheng

arXiv.org Artificial Intelligence, Dec-16-2024

Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimization of retrieval and generation. To address these issues, we propose RetroLLM, a unified framework that integrates retrieval and generation into a single, cohesive process, enabling LLMs to directly generate fine-grained evidence from the corpus with constrained decoding. Moreover, to mitigate false pruning in the process of constrained evidence generation, we introduce (1) hierarchical FM-Index constraints, which generate corpus-constrained clues to identify a subset of relevant documents before evidence generation, reducing irrelevant decoding space; and (2) a forward-looking constrained decoding strategy, which considers the relevance of future sequences to improve evidence accuracy. Extensive experiments on five open-domain QA datasets demonstrate RetroLLM's superior performance across both in-domain and out-of-domain tasks. The code is available at https://github.com/sunnynexus/RetroLLM.
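The idea of corpus-constrained evidence generation mentioned in the abstract can be illustrated with a minimal sketch. This is not the RetroLLM implementation: it swaps the FM-Index for a naive scan over tokenized documents, replaces the LLM with a hypothetical `toy_score` function, and omits the hierarchical clue stage and the forward-looking strategy. All names (`allowed_next_tokens`, `constrained_greedy_decode`, `toy_score`) are illustrative assumptions.

```python
from typing import Callable, List, Set

def allowed_next_tokens(corpus: List[List[str]], prefix: List[str]) -> Set[str]:
    """Return every token that extends `prefix` to a contiguous span of some document.

    A real system would answer this query with an FM-Index; the naive scan
    here is only a stand-in for illustration.
    """
    allowed: Set[str] = set()
    n = len(prefix)
    for doc in corpus:
        for start in range(len(doc) - n):
            if doc[start:start + n] == prefix:
                allowed.add(doc[start + n])
    return allowed

def constrained_greedy_decode(
    corpus: List[List[str]],
    score: Callable[[List[str], str], float],
    max_len: int,
) -> List[str]:
    """Greedily build an evidence span, choosing only corpus-permitted tokens."""
    evidence: List[str] = []
    for _ in range(max_len):
        candidates = allowed_next_tokens(corpus, evidence)
        if not candidates:
            break  # the span reached the end of every matching document
        # sorted() makes tie-breaking deterministic (alphabetical order)
        evidence.append(max(sorted(candidates), key=lambda tok: score(evidence, tok)))
    return evidence

# Hypothetical toy corpus and scorer (assumptions, not from the paper).
corpus = [
    "the fm index supports fast substring search".split(),
    "constrained decoding keeps generation inside the corpus".split(),
]
query_terms = {"constrained", "decoding", "corpus"}

def toy_score(prefix: List[str], token: str) -> float:
    """Stand-in for a language model score: favor query-related tokens."""
    return 1.0 if token in query_terms else 0.0

span = constrained_greedy_decode(corpus, toy_score, max_len=3)
```

Because every step is restricted to tokens that continue an existing corpus span, the output is always a verbatim excerpt of some document. A purely greedy choice like this one can commit early to a prefix that leads away from the relevant passage, which is the kind of false pruning the abstract says its two additions are meant to mitigate.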

large language model, machine learning, natural language

2412.11919

Country:

Europe (1.00)
Asia (0.93)
North America > United States > California (0.28)

Genre:

Personal > Honors (1.00)
Research Report > New Finding (0.67)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Media > Music (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Cheng, Yiruo, Mao, Kelong, Zhao, Ziliang, Dong, Guanting, Qian, Hongjin, Wu, Yongkang, Sakai, Tetsuya, Wen, Ji-Rong, Dou, Zhicheng

arXiv.org Artificial Intelligence, Oct-30-2024

Retrieval-Augmented Generation (RAG) has become a powerful paradigm for enhancing large language models (LLMs) through external knowledge retrieval. Despite its widespread attention, existing academic research predominantly focuses on single-turn RAG, leaving a significant gap in addressing the complexities of multi-turn conversations found in real-world applications. To bridge this gap, we introduce CORAL, a large-scale benchmark designed to assess RAG systems in realistic multi-turn conversational settings. CORAL includes diverse information-seeking conversations automatically derived from Wikipedia and tackles key challenges such as open-domain coverage, knowledge intensity, free-form responses, and topic shifts. It supports three core tasks of conversational RAG: passage retrieval, response generation, and citation labeling. We propose a unified framework to standardize various conversational RAG methods and conduct a comprehensive evaluation of these methods on CORAL, demonstrating substantial opportunities for improving existing approaches.

large language model, machine learning, qwen2

2410.23090

Country:

North America > United States > Maryland (0.14)
Asia > Middle East > UAE (0.14)
Europe > Austria > Vienna (0.14)

Genre:

Personal (0.93)
Research Report (0.64)

Industry:

Leisure & Entertainment (0.94)
Transportation > Ground > Road (0.68)
Automobiles & Trucks (0.68)
Transportation > Electric Vehicle (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions

Zhang, Zhebin, Zhang, Xinyu, Ren, Yuanhang, Shi, Saijiang, Han, Meng, Wu, Yongkang, Lai, Ruofei, Cao, Zhao

arXiv.org Artificial Intelligence, Nov-30-2023

Retrieval-Augmented Generation (RAG), by incorporating external knowledge with parametric memory of language models, has become the state-of-the-art architecture for open-domain QA tasks. However, common knowledge bases are inherently constrained by limited coverage and noisy information, making retrieval-based approaches inadequate to answer implicit reasoning questions. In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning. We leverage large language models (LLMs) for deriving such knowledge via a novel prompting method based on inductive reasoning patterns. On top of this, we implement two versions of IAG named IAG-GPT and IAG-Student, respectively. IAG-GPT directly utilizes the knowledge generated by GPT-3 for answer prediction, while IAG-Student gets rid of dependencies on GPT service at inference time by incorporating a student inductor model. The inductor is firstly trained via knowledge distillation and further optimized by back-propagating the generator feedback via differentiable beam scores. Experimental results show that IAG outperforms RAG baselines as well as ChatGPT on two Open-Domain QA tasks. Notably, our best models have won the first place in the official leaderboards of CSQA2.0 (since Nov 1, 2022) and StrategyQA (since Jan 8, 2023).

large language model, machine learning, natural language

2311.18397

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Zero-Shot Continuous Prompt Transfer: Generalizing Task Semantics Across Language Models

Wu, Zijun, Wu, Yongkang, Mou, Lili

arXiv.org Artificial Intelligence, Oct-2-2023

Recently in natural language processing (NLP), there has been a paradigm shift from full language model finetuning to the optimization of a small subset of prompt tokens (Shin et al., 2020; Lester et al., 2021; Li and Liang, 2021; Zhong et al., 2021). As language models have dramatically increased in size and may contain billions of parameters (Brown et al., 2020), the strategy of freezing language models while optimizing the learnable prompt parameters becomes the most affordable and efficient alternative for downstream tasks. This technique, referred to as prompt tuning, has gained substantial recognition for its efficacy across a range of language models (Shin et al., 2020; Lester et al., 2021; Li and Liang, 2021; Zhong et al., 2021). Various prompt tuning methods have been explored, which can be generally categorized into discrete and continuous cases. Discrete prompt tuning, such as AutoPrompt (Shin et al., 2020), primarily focuses on the selection and optimization of a predetermined set of tokens within a language model's vocabulary.

large language model, machine learning, natural language

2310.01691

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)

Unsupervised Chunking with Hierarchical RNN

Wu, Zijun, Deshmukh, Anup Anand, Wu, Yongkang, Lin, Jimmy, Mou, Lili

arXiv.org Artificial Intelligence, Sep-9-2023

In Natural Language Processing (NLP), predicting linguistic structures, such as parsing and chunking, has mostly relied on manual annotations of syntactic structures. This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner. We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions. Our approach involves a two-stage training process: pretraining with an unsupervised parser and finetuning on downstream NLP tasks. Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points. Further, finetuning with downstream tasks results in an additional performance improvement. Interestingly, we observe that the emergence of the chunking structure is transient during the neural model's downstream-task training. This study contributes to the advancement of unsupervised syntactic structure discovery and opens avenues for further research in linguistic theory.
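For readers unfamiliar with the phrase F1 metric mentioned in the abstract, the following is a minimal sketch of how such a score can be computed over chunk boundaries. It assumes unlabeled chunks encoded as B/I tags ("B" begins a chunk, "I" continues it) and counts a predicted chunk as correct only when both boundaries match a gold chunk. The function names and the example tags are hypothetical, and the paper's exact evaluation script may differ.

```python
from typing import List, Set, Tuple

def chunk_spans(tags: List[str]) -> Set[Tuple[int, int]]:
    """Convert B/I tags into (start, end) spans with an exclusive end index."""
    spans: Set[Tuple[int, int]] = set()
    start = 0
    for i in range(1, len(tags) + 1):
        # A chunk closes at the end of the sentence or where a new "B" begins.
        if i == len(tags) or tags[i] == "B":
            spans.add((start, i))
            start = i
    return spans

def phrase_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """Harmonic mean of span precision and recall (exact boundary match)."""
    gold, pred = chunk_spans(gold_tags), chunk_spans(pred_tags)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical five-word sentence: gold has three chunks, prediction has two.
gold = ["B", "I", "B", "B", "I"]
pred = ["B", "I", "B", "I", "I"]
score = phrase_f1(gold, pred)  # one shared chunk: P = 1/2, R = 1/3, F1 = 0.4
```

Under this definition, a gain of 6 percentage points corresponds to an increase of 0.06 in the value returned by `phrase_f1` when averaged over a test set.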

artificial intelligence, natural language, unsupervised chunking

2309.04919

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
