Huang, Luyang
G-DIG: Towards Gradient-based Diverse and High-quality Instruction Data Selection for Machine Translation
Pan, Xingyuan, Huang, Luyang, Kang, Liyan, Liu, Zhicheng, Lu, Yu, Cheng, Shanbo
Large Language Models (LLMs) have demonstrated remarkable abilities in general scenarios, and instruction finetuning empowers them to align with humans in various tasks. Nevertheless, the diversity and quality of the instruction data remain two main challenges for instruction finetuning. To this end, we propose a novel gradient-based method to automatically select high-quality and diverse instruction finetuning data for machine translation. Our key innovation is to analyze how individual training examples influence the model during training. Specifically, we select as high-quality those training examples that exert beneficial influences on the model, using the Influence Function together with a small high-quality seed dataset. Moreover, to enhance the diversity of the training data, we maximize the variety of influences they have on the model by clustering their gradients and resampling. Extensive experiments on WMT22 and FLORES translation tasks demonstrate the superiority of our methods, and in-depth analysis further validates their effectiveness and generalization.
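The two mechanisms in this abstract, influence-based quality scoring and gradient clustering for diversity, can be illustrated concretely. The sketch below is a simplified reading, not the paper's implementation: all names (`flat_grad`, `select_data`, `loss_fn`) are hypothetical, and a first-order gradient dot product stands in for the full Influence Function, which also involves the inverse Hessian.

```python
# Minimal sketch of gradient-based data selection in the spirit of G-DIG.
import torch
from sklearn.cluster import KMeans

def flat_grad(model, loss):
    """Flatten the gradient of `loss` w.r.t. all trainable parameters."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def select_data(model, loss_fn, candidates, seed_set, keep, n_clusters):
    # Direction that improves the small, trusted seed set.
    seed_grad = sum(flat_grad(model, loss_fn(model, ex)) for ex in seed_set)

    # Quality: first-order influence = alignment of a candidate's gradient
    # with the seed gradient (positive => training on it helps the seeds).
    grads, scores = [], []
    for ex in candidates:
        g = flat_grad(model, loss_fn(model, ex))
        grads.append(g)
        scores.append(torch.dot(g, seed_grad).item())
    good = [i for i, s in enumerate(scores) if s > 0]

    # Diversity: cluster the kept gradients and resample evenly across
    # clusters so no single gradient direction dominates the selection.
    X = torch.stack([grads[i] for i in good]).numpy()   # assumes CPU tensors
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    picked = []
    for c in range(n_clusters):
        members = [good[j] for j in range(len(good)) if labels[j] == c]
        picked.extend(members[: keep // n_clusters])
    return [candidates[i] for i in picked]
```

Resampling evenly across gradient clusters is what keeps the selected set diverse: examples with near-identical gradients push the model the same way, so keeping many of them adds little beyond keeping one.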
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
Cao, Zhiwei, Cao, Qian, Lu, Yu, Peng, Ningxin, Huang, Luyang, Cheng, Shanbo, Su, Jinsong
The growing popularity of Large Language Models (LLMs) has sparked interest in context compression. However, the performance of previous methods degrades dramatically as compression ratios increase, sometimes even falling to the closed-book level. This decline can be attributed to the loss of key information during compression. Our preliminary study supports this hypothesis, underscoring that retaining key information is essential to maintaining model performance under high compression ratios. We therefore introduce the Query-Guided Compressor (QGC), which leverages queries to guide the context compression process, effectively preserving key information within the compressed context; we additionally employ a dynamic compression strategy. We validate the effectiveness of QGC on question answering, including the NaturalQuestions, TriviaQA, and HotpotQA datasets. Experimental results show that QGC consistently performs well even at high compression ratios, while also offering significant benefits in inference cost and throughput.
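The guiding intuition, letting the query decide which parts of the context survive compression, can be sketched as follows. This is a deliberate simplification: QGC itself learns a compressor over hidden representations, whereas the hypothetical `compress_context` below merely prunes tokens by query attention to illustrate the idea.

```python
# Minimal sketch of query-guided context pruning in the spirit of QGC.
import torch

def compress_context(query_emb, ctx_emb, ctx_tokens, ratio):
    """Keep roughly `ratio` of the context tokens most relevant to the query.

    query_emb: (q, d) query token embeddings; ctx_emb: (n, d) context
    token embeddings; ctx_tokens: the n context tokens themselves.
    """
    d = ctx_emb.shape[-1]
    # Scaled dot-product attention from query tokens to context tokens.
    attn = torch.softmax(query_emb @ ctx_emb.T / d ** 0.5, dim=-1)  # (q, n)
    # A context token's relevance = the strongest attention it receives.
    relevance = attn.max(dim=0).values                              # (n,)
    k = max(1, int(ratio * len(ctx_tokens)))
    keep = torch.topk(relevance, k).indices.sort().values  # keep original order
    return [ctx_tokens[int(i)] for i in keep]
```

A dynamic compression strategy, as the abstract mentions, would vary `ratio` per passage rather than fix it globally, spending more of the token budget on context the query deems relevant.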
Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation
Zhu, Yaoming, Sun, Zewei, Cheng, Shanbo, Huang, Luyang, Wu, Liwei, Wang, Mingxuan
Multimodal machine translation (MMT) aims to improve translation quality by incorporating information from other modalities, such as vision. Previous MMT systems mainly focus on better access and use of visual information, and tend to validate their methods on image-related datasets. These studies face two challenges: first, they can only utilize triple data (bilingual texts paired with images), which is scarce; second, current benchmarks are relatively restricted and do not correspond to realistic scenarios. Therefore, this paper establishes both new methods and a new dataset for MMT. First, we propose a framework, 2/3-Triplet, with two new approaches to enhance MMT by utilizing large-scale non-triple data: monolingual image-text data and parallel text-only data. Second, we construct an English-Chinese e-commercial multimodal translation dataset (including training and testing), named EMMT, whose test set is carefully selected so that some words are ambiguous and would be mistranslated without the help of images. Experiments show that our method is more suitable for real-world scenarios and can significantly improve translation performance by using more non-triple data. In addition, our model also rivals various SOTA models on conventional multimodal translation benchmarks.
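One way to picture training on a mix of triple and non-triple data is a dispatch step that applies whatever supervision a batch's modalities allow. The sketch below is only an illustration of that idea under assumed interfaces; the `model` methods and batch keys are hypothetical placeholders, not the paper's API.

```python
# Minimal sketch of mixed triple / non-triple training in the spirit
# of 2/3-Triplet: every batch contributes the loss its modalities permit.
def training_step(model, batch):
    src = batch.get("src")      # source-language text
    tgt = batch.get("tgt")      # target-language text
    img = batch.get("image")    # image features, if any
    if src is not None and tgt is not None and img is not None:
        # Full triple: multimodal translation loss.
        return model.translation_loss(src, tgt, image=img)
    if src is not None and tgt is not None:
        # Parallel text-only pair: plain translation loss.
        return model.translation_loss(src, tgt, image=None)
    if src is not None and img is not None:
        # Monolingual image-text pair: image-text alignment loss.
        return model.alignment_loss(src, img)
    raise ValueError("batch must pair text with a translation or an image")
```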
BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation
Kang, Liyan, Huang, Luyang, Peng, Ningxin, Zhu, Peihao, Sun, Zewei, Cheng, Shanbo, Wang, Mingxuan, Huang, Degen, Su, Jinsong
…context to understand the world. From the perspective of NMT, it is also much needed to make use of such information to approach human-level translation abilities. To facilitate Multimodal Machine Translation (MMT) research, a number of datasets have been proposed, including image-guided translation datasets (Elliott et al., 2016; Gella et al., 2019; Wang et al., 2022) and video-guided translation datasets (Sanabria et al., 2018; …). The text inputs are often simple and sufficient for translation tasks (Wu et al., 2021). Take the widely used Multi30K as an example. Multi30K consists of only 30K image captions, while typical text translation systems are often trained with several million sentence pairs. We argue that studying the effects of visual contexts in machine translation requires a large-scale and diverse dataset for training and a real-world and complex benchmark for testing.
An Entity-Driven Framework for Abstractive Summarization
Sharma, Eva, Huang, Luyang, Hu, Zhe, Wang, Lu
Abstractive summarization systems aim to produce more coherent and concise summaries than their extractive counterparts. Popular neural models have achieved impressive results for single-document summarization, yet their outputs are often incoherent and unfaithful to the input. In this paper, we introduce SENECA, a novel System for ENtity-drivEn Coherent Abstractive summarization that leverages entity information to generate informative and coherent abstracts. Our framework takes a two-step approach: (1) an entity-aware content selection module first identifies salient sentences from the input; then (2) an abstract generation module conducts cross-sentence information compression and abstraction to produce the final summary, trained with rewards that promote coherence, conciseness, and clarity. The two components are further connected using reinforcement learning. Automatic evaluation shows that our model significantly outperforms previous state-of-the-art results on ROUGE and on our proposed coherence measures on the New York Times and CNN/Daily Mail datasets. Human judges further rate our system's summaries as more informative and coherent than those from popular summarization models.
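Connecting the two modules with reinforcement learning typically means training the abstractor with a policy-gradient objective, where a summary-level reward scores the sampled output. A generic REINFORCE-with-baseline loss of the kind such systems optimize is sketched below; the reward here is a stand-in for SENECA's coherence, conciseness, and clarity rewards, which are not reproduced.

```python
# Minimal sketch of a policy-gradient loss for a reward-trained abstractor.
import torch

def reinforce_loss(log_probs, reward, baseline=0.0):
    """REINFORCE with a baseline.

    log_probs: (T,) log-probabilities of the sampled summary tokens.
    reward:    scalar score of the sampled summary (e.g., a coherence score).
    """
    advantage = reward - baseline            # center the reward to cut variance
    return -advantage * log_probs.sum()      # minimize => reinforce high-reward samples
```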