AITopics | Liu, Wanlong

Collaborating Authors

Liu, Wanlong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models

Liu, Wanlong, Xiao, Yichen, Zeng, Dingyi, Zhao, Hongyang, Chen, Wenyu, Zhang, Malu

arXiv.org Artificial IntelligenceJan-30-2025

Post-Training Quantization (PTQ) is pivotal for deploying large language models (LLMs) within resource-limited settings by significantly reducing resource demands. However, existing PTQ strategies underperform at low bit levels < 3 bits due to the significant difference between the quantized and original weights. To enhance the quantization performance at low bit widths, we introduce a Mixed-precision Graph Neural PTQ (MG-PTQ) approach, employing a graph neural network (GNN) module to capture dependencies among weights and adaptively assign quantization bit-widths. Through the information propagation of the GNN module, our method more effectively captures dependencies among target weights, leading to a more accurate assessment of weight importance and optimized allocation of quantization strategies. Extensive experiments on the WikiText2 and C4 datasets demonstrate that our MG-PTQ method outperforms previous state-of-the-art PTQ method GPTQ, setting new benchmarks for quantization performance under low-bit conditions.

large language model, machine learning, quantization, (18 more...)

arXiv.org Artificial Intelligence

2501.18154

Country:

Asia (0.47)
Europe > Austria > Vienna (0.15)

Genre: Research Report (0.40)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions

Liu, Wanlong, Chen, Junying, Ji, Ke, Zhou, Li, Chen, Wenyu, Wang, Benyou

arXiv.org Artificial IntelligenceDec-31-2024

Retrieval-Augmented Generation (RAG) has emerged as a key paradigm for enhancing large language models (LLMs) by incorporating external knowledge. However, current RAG methods face two limitations: (1) they only cover limited RAG scenarios. (2) They suffer from limited task diversity due to the lack of a general RAG dataset. To address these limitations, we propose RAG-Instruct, a general method for synthesizing diverse and high-quality RAG instruction data based on any source corpus. Our approach leverages (1) five RAG paradigms, which encompass diverse query-document relationships, and (2) instruction simulation, which enhances instruction diversity and quality by utilizing the strengths of existing instruction datasets. Using this method, we construct a 40K instruction dataset from Wikipedia, comprehensively covering diverse RAG scenarios and tasks. Experiments demonstrate that RAG-Instruct effectively enhances LLMs' RAG capabilities, achieving strong zero-shot performance and significantly outperforming various RAG baselines across a diverse set of tasks. RAG-Instruct is publicly available at https://github.com/FreedomIntelligence/RAG-Instruct.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.00353

Country: Asia > China (0.46)

Genre: Research Report (0.64)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Chen, Junying, Cai, Zhenyang, Ji, Ke, Wang, Xidong, Liu, Wanlong, Wang, Rongsheng, Hou, Jianye, Wang, Benyou

arXiv.org Artificial IntelligenceDec-25-2024

The breakthrough of OpenAI o1 highlights the potential of enhancing reasoning to improve LLM. Yet, most research in reasoning has focused on mathematical tasks, leaving domains like medicine underexplored. The medical domain, though distinct from mathematics, also demands robust reasoning to provide reliable answers, given the high standards of healthcare. However, verifying medical reasoning is challenging, unlike those in mathematics. To address this, we propose verifiable medical problems with a medical verifier to check the correctness of model outputs. This verifiable nature enables advancements in medical reasoning through a two-stage approach: (1) using the verifier to guide the search for a complex reasoning trajectory for fine-tuning LLMs, (2) applying reinforcement learning (RL) with verifier-based rewards to enhance complex reasoning further. Finally, we introduce HuatuoGPT-o1, a medical LLM capable of complex reasoning, which outperforms general and medical-specific baselines using only 40K verifiable problems. Experiments show complex reasoning improves medical problem-solving and benefits more from RL. We hope our approach inspires advancements in reasoning across medical and other specialized domains.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.18925

Country:

Europe > Austria (0.29)
Asia > China (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction

Liu, Wanlong, Zhou, Li, Zeng, Dingyi, Xiao, Yichen, Cheng, Shaohuan, Zhang, Chen, Lee, Grandee, Zhang, Malu, Chen, Wenyu

arXiv.org Artificial IntelligenceJun-16-2024

Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a document simultaneouslyThe proposed DEEIA model employs a multi-event prompt mechanism, comprising DE and EIA modules. The DE module is designed to improve the correlation between prompts and their corresponding event contexts, whereas the EIA module provides event-specific information to improve contextual understanding. Extensive experiments show that our method achieves new state-of-the-art performance on four public datasets (RAMS, WikiEvents, MLEE, and ACE05), while significantly saving the inference time compared to the baselines. Further analyses demonstrate the effectiveness of the proposed modules.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.01884

Country:

North America > United States (0.90)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.27)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.76)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying

Wang, Guanghui, Liu, Dexi, Nie, Jian-Yun, Wan, Qizhi, Hu, Rong, Liu, Xiping, Liu, Wanlong, Liu, Jiaming

arXiv.org Artificial IntelligenceJun-15-2024

Recent advancements in event argument extraction (EAE) involve incorporating useful auxiliary information into models during training and inference, such as retrieved instances and event templates. These methods face two challenges: (1) the retrieval results may be irrelevant and (2) templates are developed independently for each event without considering their possible relationship. In this work, we propose DEGAP to address these challenges through a simple yet effective components: dual prefixes, i.e. learnable prompt vectors, where the instance-oriented prefix and template-oriented prefix are trained to learn information from different event instances and templates. Additionally, we propose an event-guided adaptive gating mechanism, which can adaptively leverage possible connections between different events and thus capture relevant information from the prefix. Finally, these event-guided prefixes provide relevant information as cues to EAE model without retrieval. Extensive experiments demonstrate that our method achieves new state-of-the-art performance on four datasets (ACE05, RAMS, WIKIEVENTS, and MLEE). Further analysis shows the impact of different components.

information, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2405.13325

Country:

North America > Canada (0.29)
North America > United States (0.28)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.25)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge

Zhou, Li, Karidi, Taelin, Garneau, Nicolas, Cao, Yong, Liu, Wanlong, Chen, Wenyu, Hershcovich, Daniel

arXiv.org Artificial IntelligenceApr-10-2024

Recent studies have highlighted the presence of cultural biases in Large Language Models (LLMs), yet often lack a robust methodology to dissect these phenomena comprehensively. Our work aims to bridge this gap by delving into the Food domain, a universally relevant yet culturally diverse aspect of human life. We introduce FmLAMA, a multilingual dataset centered on food-related cultural facts and variations in food practices. We analyze LLMs across various architectures and configurations, evaluating their performance in both monolingual and multilingual settings. By leveraging templates in six different languages, we investigate how LLMs interact with language-specific and cultural knowledge. Our findings reveal that (1) LLMs demonstrate a pronounced bias towards food knowledge prevalent in the United States; (2) Incorporating relevant cultural context significantly improves LLMs' ability to access cultural knowledge; (3) The efficacy of LLMs in capturing cultural nuances is highly dependent on the interplay between the probing language, the specific model architecture, and the cultural context in question. This research underscores the complexity of integrating cultural understanding into LLMs and emphasizes the importance of culturally diverse datasets to mitigate biases and enhance model performance across different cultural domains.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2404.06833

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

DPGNN: Dual-Perception Graph Neural Network for Representation Learning

Zhou, Li, Chen, Wenyu, Zeng, Dingyi, Cheng, Shaohuan, Liu, Wanlong, Zhang, Malu, Qu, Hong

arXiv.org Artificial IntelligenceJan-23-2024

Graph neural networks (GNNs) have drawn increasing attention in recent years and achieved remarkable performance in many graph-based tasks, especially in semi-supervised learning on graphs. However, most existing GNNs are based on the message-passing paradigm to iteratively aggregate neighborhood information in a single topology space. Despite their success, the expressive power of GNNs is limited by some drawbacks, such as inflexibility of message source expansion, negligence of node-level message output discrepancy, and restriction of single message space. To address these drawbacks, we present a novel message-passing paradigm, based on the properties of multi-step message source, node-specific message output, and multi-space message interaction. To verify its validity, we instantiate the new message-passing paradigm as a Dual-Perception Graph Neural Network (DPGNN), which applies a node-to-step attention mechanism to aggregate node-specific multi-step neighborhood information adaptively. Our proposed DPGNN can capture the structural neighborhood information and the feature-related information simultaneously for graph representation learning. Experimental results on six benchmark datasets with different topological structures demonstrate that our method outperforms the latest state-of-the-art models, which proves the superiority and versatility of our method. To our knowledge, we are the first to consider node-specific message passing in the GNNs.

artificial intelligence, machine learning, message-passing paradigm, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.knosys.2023.110377

2110.07869

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

MLPs Compass: What is learned when MLPs are combined with PLMs?

Zhou, Li, Chen, Wenyu, Cao, Yong, Zeng, Dingyi, Liu, Wanlong, Qu, Hong

arXiv.org Artificial IntelligenceJan-3-2024

While Transformer-based pre-trained language models and their variants exhibit strong semantic representation capabilities, the question of comprehending the information gain derived from the additional components of PLMs remains an open question in this field. Motivated by recent efforts that prove Multilayer-Perceptrons (MLPs) modules achieving robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. Specifically, we design a simple yet effective probing framework containing MLPs components based on BERT structure and conduct extensive experiments encompassing 10 probing tasks spanning three distinct linguistic levels. The experimental results demonstrate that MLPs can indeed enhance the comprehension of linguistic structure by PLMs. Our research provides interpretable and valuable insights into crafting variations of PLMs utilizing MLPs for tasks that emphasize diverse linguistic structures.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2401.01667

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Enhancing Document-level Event Argument Extraction with Contextual Clues and Role Relevance

Liu, Wanlong, Cheng, Shaohuan, Zeng, Dingyi, Qu, Hong

arXiv.org Artificial IntelligenceOct-19-2023

Document-level event argument extraction poses new challenges of long input and cross-sentence inference compared to its sentence-level counterpart. However, most prior works focus on capturing the relations between candidate arguments and the event trigger in each event, ignoring two crucial points: a) non-argument contextual clue information; b) the relevance among argument roles. In this paper, we propose a SCPRG (Span-trigger-based Contextual Pooling and latent Role Guidance) model, which contains two novel and effective modules for the above problem. The Span-Trigger-based Contextual Pooling(STCP) adaptively selects and aggregates the information of non-argument clue words based on the context attention weights of specific argument-trigger pairs from pre-trained model. The Role-based Latent Information Guidance (RLIG) module constructs latent role representations, makes them interact through role-interactive encoding to capture semantic relevance, and merges them into candidate arguments. Both STCP and RLIG introduce no more than 1% new parameters compared with the base model and can be easily applied to other event extraction models, which are compact and transplantable. Experiments on two public datasets show that our SCPRG outperforms previous state-of-the-art methods, with 1.13 F1 and 2.64 F1 improvements on RAMS and WikiEvents respectively. Further analyses illustrate the interpretability of our model.

extraction, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2310.05991

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.26)
Asia > China > Sichuan Province (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback