AITopics | Wu, Linjuan

Wu, Linjuan

AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification

Zhang, Xuan, Shen, Yongliang, Zheng, Zhe, Wu, Linjuan, Zhang, Wenqi, Yan, Yuchen, Peng, Qiuying, Wang, Jun, Lu, Weiming

arXiv.org Artificial Intelligence, Mar-3-2025

Large language models (LLMs) have demonstrated remarkable capabilities in tool learning. In real-world scenarios, user queries are often ambiguous and incomplete, requiring effective clarification. However, existing interactive clarification approaches face two critical limitations: reliance on manually constructed datasets and a lack of error correction mechanisms during multi-turn clarification. We present AskToAct, which addresses these challenges by exploiting the structural mapping between queries and their tool invocation solutions. Our key insight is that tool parameters naturally represent explicit user intents. By systematically removing key parameters from queries while retaining them as ground truth, we enable automated construction of high-quality training data. We further enhance model robustness by fine-tuning on error-correction augmented data using a selective masking mechanism, enabling dynamic error detection during clarification interactions. Comprehensive experiments demonstrate that AskToAct significantly outperforms existing approaches, achieving above 79% accuracy in recovering critical unspecified intents and enhancing clarification efficiency by an average of 48.34% while maintaining high accuracy in tool invocation. Our framework exhibits robust performance across varying complexity levels and successfully generalizes to entirely unseen APIs without additional training, achieving performance comparable to GPT-4 with substantially fewer computational resources.
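
The data construction idea in the abstract above (remove a key parameter from a query, keep it as ground truth) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_clarification_samples`, the `mentions` mapping from parameter names to query phrases, and the `book_flight` example are all assumptions made for illustration.

```python
from itertools import combinations


def build_clarification_samples(query, tool_call, mentions, max_removed=1):
    """Create underspecified queries paired with the parameters removed.

    `mentions` maps a parameter name to the exact phrase in `query` that
    expresses it. This mapping is an assumption of the sketch; the abstract
    does not say how parameters are located in the query.
    """
    samples = []
    names = sorted(mentions)
    for k in range(1, max_removed + 1):
        for removed in combinations(names, k):
            text = query
            for name in removed:
                # Drop the phrase that states this parameter.
                text = text.replace(mentions[name], "")
            text = " ".join(text.split())  # collapse leftover whitespace
            samples.append({
                "query": text,
                "tool": tool_call["name"],
                # The removed values are kept as ground truth.
                "missing": {n: tool_call["arguments"][n] for n in removed},
            })
    return samples


# Hypothetical tool call and query, used only to exercise the function.
example_call = {
    "name": "book_flight",
    "arguments": {"destination": "Paris", "date": "2025-03-10"},
}
example_mentions = {"destination": "to Paris", "date": "on 2025-03-10"}
samples = build_clarification_samples(
    "Book a flight to Paris on 2025-03-10", example_call, example_mentions
)
```

Each sample pairs an incomplete query with the values a clarifying question would need to recover, which is the kind of supervision the abstract describes building automatically.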

information, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.0194

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind

Hou, Guiyang, Zhang, Wenqi, Shen, Yongliang, Wu, Linjuan, Lu, Weiming

arXiv.org Artificial Intelligence, Jul-1-2024

Theory of Mind (ToM), the cognitive ability to reason about the mental states of ourselves and others, is the foundation of social interaction. Although ToM comes naturally to humans, it poses a significant challenge to even the most advanced Large Language Models (LLMs). Due to the complex logical chains in ToM reasoning, especially in higher-order ToM questions, simply utilizing reasoning methods like Chain of Thought (CoT) will not improve the ToM capabilities of LLMs. We present TimeToM, which constructs a temporal space and uses it as the foundation to improve the ToM capabilities of LLMs in multiple scenarios. Specifically, within the temporal space, we construct a Temporal Belief State Chain (TBSC) for each character and, inspired by the cognition perspective of the social world model, we divide the TBSC into self-world beliefs and social world beliefs, aligning with first-order ToM (first-order beliefs) and higher-order ToM (higher-order beliefs) questions, respectively. Moreover, we design a novel tool-belief solver that, by considering belief communication between characters in temporal space, can transform a character's higher-order beliefs into another character's first-order beliefs during the belief communication period. Experimental results indicate that TimeToM can dramatically improve the reasoning performance of LLMs on ToM questions while taking a big step towards coherent and robust ToM reasoning.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2407.01455

Country: Asia > China (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives

Zhang, Wenqi, Shen, Yongliang, Wu, Linjuan, Peng, Qiuying, Wang, Jun, Zhuang, Yueting, Lu, Weiming

arXiv.org Artificial Intelligence, Jan-3-2024

The reflection capacity of Large Language Models (LLMs) has garnered extensive attention. Post-hoc prompting strategies, e.g., Reflexion and Self-Refine, refine an LLM's response based on self-evaluated or external feedback. However, recent research indicates that, without external feedback, an LLM's intrinsic reflection is unstable. Our investigation unveils that the key bottleneck is the quality of the self-evaluated feedback. We find that LLMs often exhibit overconfidence or high randomness when self-evaluating, offering stubborn or inconsistent feedback, which causes poor reflection. To remedy this, we advocate Self-Contrast: it adaptively explores diverse solving perspectives tailored to the request, contrasts the differences, and summarizes these discrepancies into a checklist that can be used to re-examine and eliminate discrepancies. Our method endows LLMs with diverse perspectives to alleviate stubborn biases. Moreover, their discrepancies indicate potential errors or inherent uncertainties that LLMs often overlook. Reflecting upon these can catalyze more accurate and stable reflection. Experiments conducted on a series of reasoning and translation tasks with different LLMs underscore the effectiveness and generality of our strategy.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2401.02009

Country:

Asia (0.14)
North America > Canada (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension

Wu, Linjuan, Wu, Shaojuan, Zhang, Xiaowang, Xiong, Deyi, Chen, Shizhan, Zhuang, Zhiqiang, Feng, Zhiyong

arXiv.org Artificial Intelligence, Jan-14-2023

Multilingual pre-trained models are able to zero-shot transfer knowledge from rich-resource to low-resource languages in machine reading comprehension (MRC). However, inherent linguistic discrepancies between languages can make answer spans predicted by zero-shot transfer violate syntactic constraints of the target language. In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to disassociate semantics from syntax in representations learned by multilingual pre-trained models. To explicitly transfer only semantic knowledge to the target language, we propose two groups of losses tailored for semantic and syntactic encoding and disentanglement. Experimental results on three multilingual MRC datasets (i.e., XQuAD, MLQA, and TyDi QA) demonstrate the effectiveness of our proposed approach over models based on mBERT and XLM-100. Code is available at: https://github.com/wulinjuan/SSDM_MRC.

artificial intelligence, learning disentangled semantic representation, zero-shot cross-lingual transfer, (2 more...)

arXiv.org Artificial Intelligence

doi: 10.48550/arXiv.2204.00996

2204.00996

Genre: Research Report (0.40)

Industry: Education > Assessment & Standards > Student Performance (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.80)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

Modeling Global Semantics for Question Answering over Knowledge Bases

Wu, Peiyun, Wu, Yunjie, Wu, Linjuan, Zhang, Xiaowang, Feng, Zhiyong

arXiv.org Artificial Intelligence, Jan-5-2021

Semantic parsing, as an important approach to question answering over knowledge bases (KBQA), transforms a question into the complete query graph for further generating the correct logical query. Existing semantic parsing approaches mainly focus on relation matching while paying less attention to the underlying internal structure of questions (e.g., the dependencies and relations between all entities in a question) to select the query graph. In this paper, we present a relational graph convolutional network (RGCN)-based model, gRGCN, for semantic parsing in KBQA.

Introduction excerpt: However, the state-of-the-art semantic parsing approaches utilize relational semantics of query graphs while paying little attention to the structure semantics of a question. The structure semantics is an important part of the whole semantics of questions (e.g., Figure 1), especially in complex questions where the complexity of a question often relies on its complicated structure. As a result, existing works that only consider relational semantics cannot always perform better on complex questions. So it is necessary to pay more attention to the structure semantics of questions together with relational semantics when semantic parsing in KBQA. However, to model multirelational ...
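
For readers unfamiliar with the relational graph convolutional network named in the abstract, here is a minimal sketch of one generic R-GCN propagation step in plain Python. It shows the standard layer (a self-loop term plus normalized, relation-specific messages, followed by ReLU), not the paper's gRGCN model. The function name `rgcn_layer`, the relation names, and the toy weights are assumptions made for illustration.

```python
def rgcn_layer(features, edges, rel_weights, self_weight):
    """One generic R-GCN propagation step with ReLU.

    features:    list of node feature vectors (lists of floats)
    edges:       list of (source, relation, target) triples
    rel_weights: dict mapping relation -> weight matrix (out_dim x in_dim)
    self_weight: weight matrix applied to each node's own features
    """
    def matvec(matrix, vector):
        return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

    # Start from the self-loop term W_0 h_i for every node.
    out = [matvec(self_weight, h) for h in features]

    # Count incoming edges per (target, relation) for normalization.
    counts = {}
    for _, rel, dst in edges:
        counts[(dst, rel)] = counts.get((dst, rel), 0) + 1

    # Add each neighbor's message, transformed by its relation's weights.
    for src, rel, dst in edges:
        message = matvec(rel_weights[rel], features[src])
        norm = counts[(dst, rel)]
        out[dst] = [o + m / norm for o, m in zip(out[dst], message)]

    # ReLU non-linearity.
    return [[max(0.0, x) for x in row] for row in out]


# Toy graph with hypothetical relation names and hand-picked weights.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_edges = [(0, "capital_of", 2), (1, "capital_of", 2), (2, "located_in", 0)]
toy_rel_weights = {
    "capital_of": [[1.0, 0.0], [0.0, 1.0]],
    "located_in": [[2.0, 0.0], [0.0, 2.0]],
}
toy_self_weight = [[1.0, 0.0], [0.0, 1.0]]
updated = rgcn_layer(toy_features, toy_edges, toy_rel_weights, toy_self_weight)
```

Using a separate weight matrix per relation is what lets this kind of layer distinguish edge types in a multirelational graph such as a query graph.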

artificial intelligence, information retrieval query processing, natural language, (17 more...)

arXiv.org Artificial Intelligence

2101.0151

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.82)
