AITopics | Xu, Guoqiang

Xu, Guoqiang

DINER: Debiasing Aspect-based Sentiment Analysis with Multi-variable Causal Inference

Wu, Jialong, Zhang, Linhai, Zhou, Deyu, Xu, Guoqiang

arXiv.org Artificial Intelligence, Jun-6-2024

Though notable progress has been made, neural aspect-based sentiment analysis (ABSA) models are prone to learning spurious correlations from annotation biases, resulting in poor robustness to adversarial data transformations. Among debiasing solutions, causal inference-based methods have attracted much research attention; they can be broadly categorized into causal intervention methods and counterfactual reasoning methods. However, most existing debiasing methods focus on single-variable causal inference, which is not suitable for ABSA with two input variables (the target aspect and the review). In this paper, we propose a novel framework based on multi-variable causal inference for debiasing ABSA. In this framework, different types of biases are tackled with different causal intervention methods. For the review branch, the bias is modeled as indirect confounding from context, and backdoor adjustment intervention is employed for debiasing. For the aspect branch, the bias is described as a direct correlation with labels, and counterfactual reasoning is adopted for debiasing. Extensive experiments demonstrate the effectiveness of the proposed method compared to various baselines on two widely used real-world aspect robustness test sets.
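The backdoor adjustment mentioned in the abstract can be illustrated with the textbook formula P(Y | do(X=x)) = sum over z of P(Y | X=x, Z=z) P(Z=z). The sketch below is a minimal discrete example, not the authors' implementation; the function name `backdoor_adjust`, the confounder values `c0`/`c1`, and all probabilities are hypothetical and chosen only for illustration.

```python
def backdoor_adjust(p_y_given_x_z, p_z, x):
    """Textbook backdoor adjustment for a discrete confounder Z.

    p_y_given_x_z maps (x, z) to P(Y=1 | X=x, Z=z).
    p_z maps z to P(Z=z).
    Returns P(Y=1 | do(X=x)) = sum_z P(Y=1 | X=x, Z=z) * P(Z=z).
    """
    return sum(p_y_given_x_z[(x, z)] * p_z[z] for z in p_z)


# Hypothetical numbers: Z stands in for a context confounder.
p_z = {"c0": 0.25, "c1": 0.75}
p_y_given_x_z = {
    (1, "c0"): 0.8, (1, "c1"): 0.4,
    (0, "c0"): 0.6, (0, "c1"): 0.2,
}

effect_x1 = backdoor_adjust(p_y_given_x_z, p_z, 1)
effect_x0 = backdoor_adjust(p_y_given_x_z, p_z, 0)
```

Weighting each stratum by P(Z=z) rather than P(Z=z | X=x) is what removes the confounder's influence on the estimate.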

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.01166

Country:

Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models

Zhang, Linhai, Wu, Jialong, Zhou, Deyu, Xu, Guoqiang

arXiv.org Artificial Intelligence, Jun-6-2024

Though Large Language Models (LLMs) have demonstrated powerful few-shot learning capabilities through prompting methods, supervised training is still necessary for complex reasoning tasks. Because of their extensive parameters and memory consumption, both Parameter-Efficient Fine-Tuning (PEFT) methods and Memory-Efficient Fine-Tuning methods have been proposed for LLMs. Nevertheless, the issue of large annotated data consumption, which Data-Efficient Fine-Tuning aims to address, remains unexplored. One obvious way is to combine a PEFT method with active learning. However, the experimental results show that such a combination is not trivial and yields inferior results. Probe experiments suggest that this observation might be explained by two main reasons: an uncertainty gap and poor model calibration. Therefore, in this paper, we propose a novel approach to effectively integrate uncertainty-based active learning and LoRA. Specifically, for the uncertainty gap, we introduce a dynamic uncertainty measurement that combines the uncertainty of the base model and the uncertainty of the full model during the iteration of active learning. For poor model calibration, we incorporate a regularization method during LoRA training to keep the model from being over-confident, and the Monte-Carlo dropout mechanism is employed to enhance the uncertainty estimation. Experimental results show that the proposed approach outperforms existing baseline models on three complex reasoning tasks.
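As a rough illustration of Monte-Carlo dropout uncertainty estimation in general (not the STAR method itself), the sketch below averages softmax outputs over several stochastic forward passes of a toy linear classifier and reports the predictive entropy. The function name `mc_dropout_entropy`, the linear model, and all parameter values are assumptions made for this example.

```python
import numpy as np


def mc_dropout_entropy(x, weights, n_passes=50, drop_prob=0.1, seed=0):
    """Predictive entropy of a toy linear classifier under MC dropout.

    x: 1-D feature vector; weights: (n_features, n_classes) matrix.
    Dropout is applied to the input features on every forward pass.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_passes):
        # Sample a fresh dropout mask and rescale (inverted dropout).
        mask = rng.random(x.shape) >= drop_prob
        dropped = x * mask / (1.0 - drop_prob)
        logits = dropped @ weights
        # Numerically stable softmax.
        exp = np.exp(logits - logits.max())
        probs.append(exp / exp.sum())
    # Average the class distributions, then take the entropy of the mean.
    mean_probs = np.mean(probs, axis=0)
    return float(-np.sum(mean_probs * np.log(mean_probs + 1e-12)))
```

In an uncertainty-based active learning loop, unlabeled examples with the highest entropy would typically be sent for annotation first.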

computational linguistic, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2403.01165

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Fine-grainedly Synthesize Streaming Data Based On Large Language Models With Graph Structure Understanding For Data Sparsity

Zhang, Xin, Zhang, Linhai, Zhou, Deyu, Xu, Guoqiang

arXiv.org Artificial Intelligence, Mar-10-2024

Due to the sparsity of user data, sentiment analysis on user reviews in e-commerce platforms often suffers from poor performance, especially when faced with extremely sparse user data or long-tail labels. Recently, the emergence of LLMs has introduced new solutions to such problems by leveraging graph structures to generate supplementary user profiles. However, previous approaches have not fully utilized the graph understanding capabilities of LLMs and have struggled to adapt to complex streaming data environments. In this work, we propose a fine-grained streaming data synthesis framework that categorizes sparse users into three categories: Mid-tail, Long-tail, and Extreme. Specifically, we design LLMs to comprehensively understand three key graph elements in streaming data, including Local-global Graph Understanding, Second-Order Relationship Extraction, and Product Attribute Understanding, which enables the generation of high-quality synthetic data to effectively address sparsity across different categories. Experimental results on three real datasets demonstrate significant performance improvements, with synthesized data contributing to MSE reductions of 45.85%, 3.16%, and 62.21%, respectively.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.06139

Country:

Asia (1.00)
Europe (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.55)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

Zhou, Ruiwen, Yang, Yingxuan, Wen, Muning, Wen, Ying, Wang, Wenhao, Xi, Chunling, Xu, Guoqiang, Yu, Yong, Zhang, Weinan

arXiv.org Artificial Intelligence, Mar-10-2024

Numerous large language model (LLM) agents have been built for different tasks like web navigation and online shopping due to LLMs' broad knowledge and text-understanding ability. Many of these works utilize in-context examples to achieve generalization without the need for fine-tuning, while few have considered the problem of how to select and effectively utilize these examples. Recently, methods based on trajectory-level retrieval with task meta-data and using trajectories as in-context examples have been proposed to improve the agent's overall performance in some sequential decision making tasks. However, these methods can be problematic because plausible examples are retrieved without task-specific state transition dynamics and because the input is long, with plenty of irrelevant context. In this paper, we propose a novel framework (TRAD) to address these issues. TRAD first conducts Thought Retrieval, achieving step-level demonstration selection via thought matching, leading to more helpful demonstrations and less irrelevant input noise. Then, TRAD introduces Aligned Decision, complementing retrieved demonstration steps with their previous or subsequent steps, which enables tolerance for imperfect thought and provides a choice for balance between more context and less noise. Extensive experiments on ALFWorld and Mind2Web benchmarks show that TRAD not only outperforms state-of-the-art models but also effectively helps in reducing noise and promoting generalization. Furthermore, TRAD has been deployed in real-world scenarios of a global business insurance company and improves the success rate of robotic process automation.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2403.06221

Country:

North America > United States (0.16)
Asia > China (0.15)

Genre: Research Report (1.00)

Industry:

Banking & Finance (0.74)
Information Technology > Services > e-Commerce Services (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Curriculum-Meta Learning for Order-Robust Continual Relation Extraction

Wu, Tongtong, Li, Xuekai, Li, Yuan-Fang, Haffari, Reza, Qi, Guilin, Zhu, Yujin, Xu, Guoqiang

arXiv.org Artificial Intelligence, Jan-8-2021

Continual relation extraction is an important task that focuses on extracting new facts incrementally from unstructured text. Given the sequential arrival order of the relations, this task is prone to two serious challenges, namely catastrophic forgetting and order-sensitivity. We propose a novel curriculum-meta learning method to tackle the above two challenges in continual relation extraction. We combine meta learning and curriculum learning to quickly adapt model parameters to a new task and to reduce interference of previously seen tasks on the current task. We design a novel relation representation learning method through the distribution of domain and range types of relations. Such representations are utilized to quantify the difficulty of tasks for the construction of curricula. Moreover, we also present novel difficulty-based metrics to quantitatively measure the extent of order-sensitivity of a given model, suggesting new ways to evaluate model robustness. Our comprehensive experiments on three benchmark datasets show that our proposed method outperforms the state-of-the-art techniques. The code is available at the anonymous GitHub repository: https://github.com/wutong8023/AAAI_CML.

deep learning, neural network, relation, (18 more...)

arXiv.org Artificial Intelligence

2101.01926

Country: Asia > China (0.28)

Genre: Research Report > Promising Solution (0.66)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge

Zheng, Ziqi, Cao, Chenjie, Chen, Xingwei, Xu, Guoqiang

arXiv.org Artificial Intelligence, May-2-2018

Continuous dimensional emotion, modelled by arousal and valence, can depict complex changes of emotions. In this paper, we present our work on arousal and valence prediction for the One-Minute-Gradual (OMG) Emotion Challenge. Multimodal representations are first extracted from videos using a variety of acoustic, video and textual models, and a support vector machine (SVM) is then used to fuse the multimodal signals and make final predictions. Our solution achieves Concordance Correlation Coefficient (CCC) scores of 0.397 and 0.520 on arousal and valence respectively for the validation dataset, which outperforms the baseline systems, whose best CCC scores are 0.15 and 0.23 on arousal and valence, by a large margin.
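For readers unfamiliar with the metric reported above, the Concordance Correlation Coefficient can be computed with the standard formula CCC = 2 cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2). The sketch below is a generic implementation using population statistics, not the challenge's official scoring script; the function name is an assumption.

```python
import numpy as np


def concordance_correlation_coefficient(y_true, y_pred):
    """Concordance Correlation Coefficient between two 1-D sequences.

    Uses population (biased) variance and covariance, so the result is
    1.0 only when predictions match the targets exactly.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    var_true, var_pred = y_true.var(), y_pred.var()
    covariance = np.mean((y_true - mean_true) * (y_pred - mean_pred))
    # The squared mean difference penalizes systematic offsets.
    return 2.0 * covariance / (var_true + var_pred + (mean_true - mean_pred) ** 2)
```

Unlike Pearson correlation, CCC drops below 1 when predictions are perfectly correlated with the targets but shifted or scaled, which is why it is commonly used for arousal and valence regression.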

deep learning, neural network, representation, (20 more...)

arXiv.org Artificial Intelligence

1805.0106

Country:

Europe (0.14)
Asia > China (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)