Xu, Weidi
Learning Autonomous Code Integration for Math Language Models
Wang, Haozhe, Li, Long, Qu, Chao, Zhu, Fengming, Xu, Weidi, Chu, Wei, Lin, Fangzhen
Recent advances in mathematical problem-solving with language models (LMs) integrate chain-of-thought (CoT) reasoning and code execution to harness their complementary strengths. However, existing hybrid frameworks exhibit a critical limitation: they depend on externally dictated instructions or rigid code-integration templates, lacking metacognitive awareness -- the capacity to dynamically evaluate intrinsic capabilities and autonomously determine when and how to integrate tools. This rigidity motivates our study of autonomous code integration, enabling models to adapt tool-usage strategies as their reasoning abilities evolve during training. While reinforcement learning (RL) shows promise for boosting LLM reasoning at scale (e.g., DeepSeek-R1), we demonstrate its inefficiency in learning autonomous code integration due to inadequate exploration of the vast combinatorial space of CoT-code interleaving patterns. To address this challenge, we propose a novel Expectation-Maximization (EM) framework that synergizes structured exploration (E-step) with off-policy RL optimization (M-step), creating a self-reinforcing cycle between metacognitive tool-use decisions and evolving capabilities. Experiments show that our method achieves superior results through improved exploration. Notably, our 7B model improves by over 11% on MATH500 and 9.4% on AIME without o1-like CoT.
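Below is a minimal, self-contained sketch of the E-step/M-step loop described in the abstract. The toy environment, the single Bernoulli tool-use parameter theta, and the closed-form M-step are invented for illustration; the paper's actual method trains an LLM policy over interleaved CoT-and-code trajectories with off-policy RL updates.

```python
import random

# Toy sketch of the E-step / M-step cycle from the abstract.
# Everything here (task, reward model, policy parameterization) is illustrative.

random.seed(0)

# Policy: probability of invoking the code interpreter for a problem.
theta = 0.5

def solve(problem_difficulty, use_code):
    """Synthetic environment: code helps on hard problems, costs a little overhead."""
    p_success = 0.9 if use_code else 1.0 - problem_difficulty
    if use_code:
        p_success -= 0.05  # small cost for tool calls
    return random.random() < p_success

for em_round in range(20):
    # E-step: structured exploration -- roll out both branches (CoT-only and
    # CoT+code) and keep successful trajectories, weighted by the current policy.
    weighted = []  # (use_code, weight)
    for _ in range(200):
        difficulty = random.random()
        for use_code in (False, True):
            prior = theta if use_code else 1.0 - theta
            if solve(difficulty, use_code):
                weighted.append((use_code, prior))
    total = sum(w for _, w in weighted) or 1.0
    # M-step: fit theta to the reward-weighted decisions (a closed-form weighted
    # MLE stand-in for the paper's off-policy RL update).
    theta = sum(w for use_code, w in weighted if use_code) / total

print(f"learned p(use code) ~= {theta:.2f}")
```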
Denoising Time Cycle Modeling for Recommendation
Xie, Sicong, Li, Qunwei, Xu, Weidi, Shen, Kaiming, Chen, Shaohu, Zhong, Wenliang
Recently, modeling the temporal patterns of user-item interactions has attracted much attention in recommender systems. We argue that existing methods ignore the variety of temporal patterns in user behaviors. We define the subset of user behaviors that are irrelevant to the target item as noise, which limits target-related time cycle modeling and hurts recommendation performance. In this paper, we propose Denoising Time Cycle Modeling (DiCycle), a novel approach that denoises user behaviors by selecting the subset of behaviors that are highly related to the target item. DiCycle is able to explicitly model diverse time cycle patterns for recommendation. Extensive experiments are conducted on both public benchmarks and a real-world dataset, demonstrating the superior performance of DiCycle over state-of-the-art recommendation methods.
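A rough illustration of the two ideas named in the abstract -- filtering out target-irrelevant behaviors ("denoising") and describing the remaining ones by their time-cycle phase. The similarity score, threshold gate, and daily period are stand-ins for DiCycle's learned components.

```python
import math

def cycle_features(timestamp, period_seconds):
    """Encode a timestamp as a point on a cycle (e.g. a 24-hour period)."""
    phase = 2 * math.pi * (timestamp % period_seconds) / period_seconds
    return (math.sin(phase), math.cos(phase))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def denoised_cycle_summary(history, target_item_vec, threshold=0.5):
    """history: list of (item_embedding, timestamp).
    Keep only behaviors similar to the target item, then average their
    daily-cycle features -- a hand-rolled stand-in for learned gating/attention."""
    day = 24 * 3600
    kept = [(vec, ts) for vec, ts in history if dot(vec, target_item_vec) > threshold]
    if not kept:
        return (0.0, 0.0)
    feats = [cycle_features(ts, day) for _, ts in kept]
    return tuple(sum(f[i] for f in feats) / len(feats) for i in range(2))

# usage sketch: only the first behavior is related to the target item and is kept
history = [((1.0, 0.0), 1_700_000_000), ((0.0, 1.0), 1_700_040_000)]
print(denoised_cycle_summary(history, target_item_vec=(1.0, 0.0)))
```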
LogicMP: A Neuro-symbolic Approach for Encoding First-order Logic Constraints
Xu, Weidi, Wang, Jingwei, Xie, Lele, He, Jianshan, Zhou, Hongting, Wang, Taifeng, Wan, Xiaopei, Chen, Jingdong, Qu, Chao, Chu, Wei
Integrating first-order logic constraints (FOLCs) with neural networks is a crucial but challenging problem, since it involves modeling intricate correlations to satisfy the constraints. This paper proposes a novel neural layer, LogicMP, which performs mean-field variational inference over a Markov logic network (MLN). It can be plugged into any off-the-shelf neural network to encode FOLCs while retaining modularity and efficiency. By exploiting the structure and symmetries in MLNs, we theoretically demonstrate that our well-designed, efficient mean-field iterations effectively mitigate the difficulty of MLN inference, reducing it from sequential calculation to a series of parallel tensor operations. Empirical results on three kinds of tasks over graphs, images, and text show that LogicMP outperforms advanced competitors in both performance and efficiency.
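The toy sketch below shows what "mean-field iterations as parallel tensor operations" can look like for a single first-order constraint, the classic Smokes/Friends example: forall x, y, Friends(x, y) AND Smokes(x) -> Smokes(y). The constraint, weight, and update form are illustrative; LogicMP's actual layer handles general FOLCs and exploits MLN symmetries more broadly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, w, iters = 5, 2.0, 10

friends = (rng.random((n, n)) < 0.3).astype(float)  # observed Friends(x, y) relation
logits = rng.normal(size=n)                          # neural-network scores for Smokes(x)
q = 1 / (1 + np.exp(-logits))                        # initial marginals q(Smokes(x) = 1)

for _ in range(iters):
    # Expected clause violation if Smokes(y) = 0: sum over x of
    # Friends(x, y) * q(Smokes(x) = 1), computed for all y in one tensor op.
    incoming = friends.T @ q
    # Mean-field update: combine neural evidence with the weighted clause messages.
    q = 1 / (1 + np.exp(-(logits + w * incoming)))

print(np.round(q, 3))
```

Each iteration touches every grounded clause at once through a single matrix product, which is the flavour of parallelism the abstract contrasts with sequential MLN inference.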
Question Directed Graph Attention Network for Numerical Reasoning over Text
Chen, Kunlong, Xu, Weidi, Cheng, Xingyi, Zou, Xiaochuan, Zhang, Yuyu, Song, Le, Wang, Taifeng, Qi, Yuan, Chu, Wei
Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning. Although NumNet achieves superior performance over other numerically-aware models (Hu et al., 2019a; Andor et al., 2019; Geva et al., 2020; Chen et al., 2020), we argue that NumNet is insufficient for sophisticated numerical reasoning, since it lacks two critical ingredients: 1. Number Type and Entity Mention. The number comparison graph in NumNet is not able to identify different number types, and lacks the information of entities mentioned in the document that connect the number nodes.
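A schematic of one "question directed" graph attention step, where edge attention is modulated by a question vector so the same context graph is traversed differently per question. The scoring function, single edge type, and dimensions are simplifications of the paper's heterogeneous graph with typed number and entity nodes.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_nodes = 8, 6

H = rng.normal(size=(n_nodes, d))              # node states (number + entity nodes)
q = rng.normal(size=d)                         # pooled question representation
adj = rng.random((n_nodes, n_nodes)) < 0.4     # edges, flattened to a single type
W = rng.normal(size=(d, d)) / np.sqrt(d)

# Question-directed scores: each edge (i, j) is scored by how well the node
# pair interacts with the question vector.
scores = (H @ W) @ (H * q).T                   # (n_nodes, n_nodes)
scores = np.where(adj, scores, -1e9)           # only attend along existing edges
alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
alpha /= alpha.sum(axis=1, keepdims=True)

H_next = np.tanh(alpha @ H @ W)                # one step of multi-step reasoning
print(H_next.shape)
```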
Variational Autoencoder for Semi-Supervised Text Classification
Xu, Weidi (Peking University) | Sun, Haoze (Peking University) | Deng, Chao (Peking University) | Tan, Ying (Peking University)
Although the semi-supervised variational autoencoder (SemiVAE) works well for image classification, it fails on text classification when a vanilla LSTM is used as its decoder. From the perspective of reinforcement learning, it is verified that the decoder's capability to distinguish between different categorical labels is essential. Therefore, the Semi-supervised Sequential Variational Autoencoder (SSVAE) is proposed, which increases this capability by feeding the label into its decoder RNN at each time-step. Two specific decoder structures are investigated, and both are verified to be effective. In addition, to reduce the computational complexity of training, a novel optimization method is proposed that estimates the gradient of the unlabeled objective function by sampling, together with two variance reduction techniques. Experimental results on the Large Movie Review dataset (IMDB) and the AG's News corpus show that the proposed approach significantly improves classification accuracy compared with purely supervised classifiers, and achieves competitive performance against previous advanced methods. State-of-the-art results can be obtained by integrating other pretraining-based methods.
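A small PyTorch sketch of the decoder idea from the abstract: the label is concatenated to the word embedding at every time-step, so the decoder must rely on the label to reconstruct the sequence. The class name, dimensions, and hyperparameters are illustrative; the full SSVAE also includes an encoder, a classifier, and the variational objective.

```python
import torch
import torch.nn as nn

class LabelConditionedDecoder(nn.Module):
    """Decoder RNN that receives the class label at every decoding time-step."""

    def __init__(self, vocab_size=1000, emb_dim=64, label_dim=2, latent_dim=32, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.init_h = nn.Linear(latent_dim, hidden)
        # the label vector is appended to every input, not just the first step
        self.rnn = nn.LSTM(emb_dim + label_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, label_onehot, z):
        # tokens: (B, T) word ids, label_onehot: (B, label_dim), z: (B, latent_dim)
        B, T = tokens.shape
        x = self.embed(tokens)                            # (B, T, emb_dim)
        lab = label_onehot.unsqueeze(1).expand(B, T, -1)  # repeat the label per step
        h0 = torch.tanh(self.init_h(z)).unsqueeze(0)      # init hidden state from z
        c0 = torch.zeros_like(h0)
        out, _ = self.rnn(torch.cat([x, lab], dim=-1), (h0, c0))
        return self.out(out)                              # logits over the vocabulary

# usage sketch
dec = LabelConditionedDecoder()
logits = dec(torch.randint(0, 1000, (4, 7)),
             torch.eye(2)[torch.tensor([0, 1, 0, 1])],
             torch.randn(4, 32))
print(logits.shape)  # torch.Size([4, 7, 1000])
```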