Peng, Ningxin
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
Cao, Zhiwei, Cao, Qian, Lu, Yu, Peng, Ningxin, Huang, Luyang, Cheng, Shanbo, Su, Jinsong
The growing popularity of Large Language Models (LLMs) has sparked interest in context compression for LLMs. However, the performance of previous methods degrades dramatically as compression ratios increase, sometimes even falling to the closed-book level. This decline can be attributed to the loss of key information during the compression process. Our preliminary study supports this hypothesis, emphasizing the significance of retaining key information to maintain model performance under high compression ratios. As a result, we introduce the Query-Guided Compressor (QGC), which leverages queries to guide the context compression process, effectively preserving key information within the compressed context. Additionally, we employ a dynamic compression strategy. We validate the effectiveness of the proposed QGC on question answering, covering the NaturalQuestions, TriviaQA, and HotpotQA datasets. Experimental results show that QGC consistently performs well even at high compression ratios, while also offering significant benefits in inference cost and throughput.
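A minimal sketch of the general idea of query-guided context compression, not the authors' QGC implementation: each context token is scored against a pooled query representation, and only the most query-relevant tokens are kept under a compression budget. The function name, mean pooling, cosine scoring, and fixed keep ratio are all illustrative assumptions.

```python
# Illustrative query-guided compression: keep the context tokens most relevant
# to the query. This is a sketch of the idea, not the paper's architecture.
import torch
import torch.nn.functional as F


def query_guided_compress(
    context_emb: torch.Tensor,   # (ctx_len, dim) token embeddings of the context
    query_emb: torch.Tensor,     # (q_len, dim) token embeddings of the query
    keep_ratio: float = 0.25,    # fraction of context tokens to retain
) -> torch.Tensor:
    """Return indices of the retained context tokens, in their original order."""
    # Pool the query into a single vector (mean pooling as a simple stand-in).
    query_vec = query_emb.mean(dim=0)

    # Relevance of each context token to the query via cosine similarity.
    scores = F.cosine_similarity(context_emb, query_vec.unsqueeze(0), dim=-1)

    # Keep the top-k most query-relevant tokens; k acts as the compression
    # budget (fixed here by keep_ratio; a dynamic strategy could vary it).
    k = max(1, int(keep_ratio * context_emb.size(0)))
    return torch.topk(scores, k).indices.sort().values


if __name__ == "__main__":
    ctx = torch.randn(128, 64)
    qry = torch.randn(12, 64)
    print(query_guided_compress(ctx, qry, keep_ratio=0.25))
```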
Chain of Thought Explanation for Dialogue State Tracking
Xu, Lin, Peng, Ningxin, Zhou, Daquan, Ng, See-Kiong, Fu, Jinlan
Dialogue state tracking (DST) aims to record user queries and goals during a conversational interaction, achieved by maintaining a predefined set of slots and their corresponding values. Current approaches decide slot values opaquely, while humans usually adopt a more deliberate approach by collecting information from relevant dialogue turns and then reasoning out the appropriate values. In this work, we focus on the steps needed to figure out slot values by proposing a model named Chain-of-Thought-Explanation (CoTE) for the DST task. CoTE, which is built on the generative DST framework, is designed to create detailed explanations step by step after determining the slot values. This process leads to more accurate and reliable slot values. Moreover, to improve the reasoning ability of CoTE, we further construct more fluent and high-quality explanations with automatic paraphrasing, yielding the method CoTE-refined. Experimental results on three widely recognized DST benchmarks (MultiWOZ 2.2, WoZ 2.0, and M2M) demonstrate the remarkable effectiveness of CoTE. Furthermore, through a meticulous fine-grained analysis, we observe significant benefits of CoTE on samples characterized by longer dialogue turns, user responses, and reasoning steps.
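A hypothetical sketch of how a CoTE-style training target could be laid out for a generative DST model: the slot value is emitted first, followed by a step-by-step explanation that cites the dialogue turns the value was collected from. The function, field names, and separators are assumptions, not the paper's exact format.

```python
# Illustrative target construction for chain-of-thought explanations in DST.
from typing import List


def build_cote_target(slot: str, value: str, evidence_turns: List[int], steps: List[str]) -> str:
    """Compose a target string: slot value first, then the supporting explanation."""
    turns = ", ".join(str(t) for t in evidence_turns)
    explanation = " ".join(f"Step {i + 1}: {s}" for i, s in enumerate(steps))
    return f"{slot} = {value} | relevant turns: {turns} | explanation: {explanation}"


if __name__ == "__main__":
    print(build_cote_target(
        slot="hotel-area",
        value="centre",
        evidence_turns=[2, 4],
        steps=[
            "In turn 2 the user asks for a hotel near the museum.",
            "Turn 4 states the museum is in the city centre, so the area is 'centre'.",
        ],
    ))
```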
BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation
Kang, Liyan, Huang, Luyang, Peng, Ningxin, Zhu, Peihao, Sun, Zewei, Cheng, Shanbo, Wang, Mingxuan, Huang, Degen, Su, Jinsong
... context to understand the world. From the perspective of NMT, it is also much needed to make use of such information to approach human-level translation abilities. To facilitate Multimodal Machine Translation (MMT) research, a number of datasets have been proposed, including image-guided translation datasets (Elliott et al., 2016; Gella et al., 2019; Wang et al., 2022) and video-guided translation datasets (Sanabria et al., 2018; ...). The text inputs are often simple and sufficient for translation tasks (Wu et al., 2021). Take the widely used Multi30K as an example. Multi30K consists of only 30K image captions, while typical text translation systems are often trained with several million sentence pairs. We argue that studying the effects of visual contexts in machine translation requires a large-scale and diverse data set for training and a real-world and complex benchmark for testing.
Visual Information Matters for ASR Error Correction
Kumar, Vanya Bannihatti, Cheng, Shanbo, Peng, Ningxin, Zhang, Yuchen
Aiming to improve Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely developed due to their efficiency in using parallel text data. Previous works mainly focus on using text and/or speech data, which limits the performance gain when other modalities, such as visual information, are also critical for EC. The challenges are mainly twofold: first, previous work pays little attention to visual information, so it remains largely unexplored; second, the community lacks a high-quality benchmark in which visual information matters for EC models. Therefore, this paper provides 1) simple yet effective methods, namely gated fusion and image captions as prompts, to incorporate visual information to help EC; 2) a large-scale benchmark dataset, namely Visual-ASR-EC, where each item in the training data consists of visual, speech, and text information, and the test data are carefully selected by human annotators to ensure that even humans could make mistakes when visual information is missing. Experimental results show that using captions as prompts effectively exploits the visual information and surpasses state-of-the-art methods by up to 1.2% in Word Error Rate (WER), which also indicates that visual information is critical in our proposed Visual-ASR-EC dataset.
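A minimal sketch of one possible reading of "gated fusion" for combining text and visual features in an EC model, not the paper's exact architecture: a sigmoid gate computed from the concatenated text and visual features decides, per position, how much visual information to mix in. The class name, pooled single visual vector, and dimensions are assumptions.

```python
# Illustrative gated fusion of ASR-hypothesis text features with an image feature.
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, text_feats: torch.Tensor, visual_feat: torch.Tensor) -> torch.Tensor:
        # text_feats: (seq_len, dim) encoder states of the ASR hypothesis
        # visual_feat: (dim,) a pooled image feature, broadcast over positions
        visual = visual_feat.unsqueeze(0).expand_as(text_feats)
        g = torch.sigmoid(self.gate(torch.cat([text_feats, visual], dim=-1)))
        # Per-dimension gate interpolates between text and visual information.
        return g * text_feats + (1.0 - g) * visual


if __name__ == "__main__":
    fusion = GatedFusion(dim=64)
    fused = fusion(torch.randn(20, 64), torch.randn(64))
    print(fused.shape)  # torch.Size([20, 64])
```

The alternative described in the abstract, captions as prompts, would instead prepend an image caption to the ASR hypothesis as plain text, requiring no architectural change to the correction model.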