AITopics | Wang, Chenyang

Collaborating Authors

Wang, Chenyang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LREF: A Novel LLM-based Relevance Framework for E-commerce

Tang, Tian, Tian, Zhixing, Zhu, Zhenyu, Wang, Chenyang, Hu, Haiqing, Tang, Guoyu, Liu, Lin, Xu, Sulong

arXiv.org Artificial IntelligenceMar-12-2025

Query and product relevance prediction is a critical component for ensuring a smooth user experience in e-commerce search. Traditional studies mainly focus on BERT-based models to assess the semantic relevance between queries and products. However, the discriminative paradigm and limited knowledge capacity of these approaches restrict their ability to comprehend the relevance between queries and products fully. With the rapid advancement of Large Language Models (LLMs), recent research has begun to explore their application to industrial search systems, as LLMs provide extensive world knowledge and flexible optimization for reasoning processes. Nonetheless, directly leveraging LLMs for relevance prediction tasks introduces new challenges, including a high demand for data quality, the necessity for meticulous optimization of reasoning processes, and an optimistic bias that can result in over-recall. To overcome the above problems, this paper proposes a novel framework called the LLM-based RElevance Framework (LREF) aimed at enhancing e-commerce search relevance. The framework comprises three main stages: supervised fine-tuning (SFT) with Data Selection, Multiple Chain of Thought (Multi-CoT) tuning, and Direct Preference Optimization (DPO) for de-biasing. We evaluate the performance of the framework through a series of offline experiments on large-scale real-world datasets, as well as online A/B testing. The results indicate significant improvements in both offline and online metrics. Ultimately, the model was deployed in a well-known e-commerce application, yielding substantial commercial benefits.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3701716.3715246

2503.09223

Country:

Oceania > Australia (0.16)
Asia > China (0.16)
North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

HuixiangDou2: A Robustly Optimized GraphRAG Approach

Kong, Huanjun, Wang, Zhefan, Wang, Chenyang, Ma, Zhe, Dong, Nanqing

arXiv.org Artificial IntelligenceMar-9-2025

Large Language Models (LLMs) perform well on familiar queries but struggle with specialized or emerging topics. Graph-based Retrieval-Augmented Generation (GraphRAG) addresses this by structuring domain knowledge as a graph for dynamic retrieval. However, existing pipelines involve complex engineering workflows, making it difficult to isolate the impact of individual components. Evaluating retrieval effectiveness is also challenging due to dataset overlap with LLM pretraining data. In this work, we introduce HuixiangDou2, a robustly optimized GraphRAG framework. Specifically, we leverage the effectiveness of dual-level retrieval and optimize its performance in a 32k context for maximum precision, and compare logic-based retrieval and dual-level retrieval to enhance overall functionality. Our implementation includes comparative experiments on a test set, where Qwen2.5-7B-Instruct initially underperformed. With our approach, the score improved significantly from 60 to 74.5, as illustrated in the Figure. Experiments on domain-specific datasets reveal that dual-level retrieval enhances fuzzy matching, while logic-form retrieval improves structured reasoning. Furthermore, we propose a multi-stage verification mechanism to improve retrieval robustness without increasing computational cost. Empirical results show significant accuracy gains over baselines, highlighting the importance of adaptive retrieval. To support research and adoption, we release HuixiangDou2 as an open-source resource https://github.com/tpoisonooo/huixiangdou2.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.06474

Country: Asia (0.28)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Consistency of Responses and Continuations Generated by Large Language Models on Social Media

Fan, Wenlu, Zhu, Yuqi, Wang, Chenyang, Wang, Bin, Xu, Wentao

arXiv.org Artificial IntelligenceJan-15-2025

Large Language Models (LLMs) demonstrate remarkable capabilities in text generation, yet their emotional consistency and semantic coherence in social media contexts remain insufficiently understood. This study investigates how LLMs handle emotional content and maintain semantic relationships through continuation and response tasks using two open-source models: Gemma and Llama. By analyzing climate change discussions from Twitter and Reddit, we examine emotional transitions, intensity patterns, and semantic similarity between human-authored and LLM-generated content. Our findings reveal that while both models maintain high semantic coherence, they exhibit distinct emotional patterns: Gemma shows a tendency toward negative emotion amplification, particularly anger, while maintaining certain positive emotions like optimism. Llama demonstrates superior emotional preservation across a broader spectrum of affects. Both models systematically generate responses with attenuated emotional intensity compared to human-authored content and show a bias toward positive emotions in response tasks. Additionally, both models maintain strong semantic similarity with original texts, though performance varies between continuation and response tasks. These findings provide insights into LLMs' emotional and semantic processing capabilities, with implications for their deployment in social media contexts and human-AI interaction design.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2501.08102

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media (0.69)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

EmoPro: A Prompt Selection Strategy for Emotional Expression in LM-based Speech Synthesis

Wang, Haoyu, Qiang, Chunyu, Wang, Tianrui, Gong, Cheng, Liu, Qiuyu, Jiang, Yu, Wang, Xiaobao, Wang, Chenyang, Zhang, Chen

arXiv.org Artificial IntelligenceSep-27-2024

Recent advancements in speech synthesis models, trained on extensive datasets, have demonstrated remarkable zero-shot capabilities. These models can control content, timbre, and emotion in generated speech based on prompt inputs. Despite these advancements, the choice of prompts significantly impacts the output quality, yet most existing selection schemes do not adequately address the control of emotional intensity. To address this question, this paper proposes a two-stage prompt selection strategy EmoPro, which is specifically designed for emotionally controllable speech synthesis. This strategy focuses on selecting highly expressive and high-quality prompts by evaluating them from four perspectives: emotional expression strength, speech quality, text-emotion consistency, and model generation performance. Experimental results show that prompts selected using the proposed method result in more emotionally expressive and engaging synthesized speech compared to those obtained through baseline. Audio samples and codes will be available at https://whyrrrrun.github.io/EmoPro/.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2409.18512

Country: Asia > China (0.29)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Co-designing a Child-Robot Relational Norm Intervention to Regulate Children's Handwriting Posture

Wang, Chenyang, Tozadore, Daniel Carnieto, Bruno, Barbara, Dillenbourg, Pierre

arXiv.org Artificial IntelligenceJun-13-2024

Persuasive social robots employ their social influence to modulate children's behaviours in child-robot interaction. In this work, we introduce the Child-Robot Relational Norm Intervention (CRNI) model, leveraging the passive role of social robots and children's reluctance to inconvenience others to influence children's behaviours. Unlike traditional persuasive strategies that employ robots in active roles, CRNI utilizes an indirect approach by generating a disturbance for the robot in response to improper child behaviours, thereby motivating behaviour change through the avoidance of norm violations. The feasibility of CRNI is explored with a focus on improving children's handwriting posture. To this end, as a preliminary work, we conducted two participatory design workshops with 12 children and 1 teacher to identify effective disturbances that can promote posture correction.

artificial intelligence, disturbance, robot, (13 more...)

arXiv.org Artificial Intelligence

2406.07721

Country:

Europe (1.00)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.46)

Add feedback

Estimating the Number of Components in Finite Mixture Models via Variational Approximation

Wang, Chenyang, Yang, Yun

arXiv.org Machine LearningApr-25-2024

This work introduces a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes, inspired by the large-sample properties of the Evidence Lower Bound (ELBO) derived from mean-field (MF) variational approximation. Specifically, we establish matching upper and lower bounds for the ELBO without assuming conjugate priors, suggesting the consistency of model selection for FMMs based on maximizing the ELBO. As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior (benefited from model singularity) of the posterior distribution, which tends to eliminate the extra components under model misspecification where the number of mixture components is over-specified. This stable behavior also leads to the $n^{-1/2}$ convergence rate for parameter estimation, up to a logarithmic factor, under this model overspecification. Empirical experiments are conducted to validate our theoretical findings and compare with other state-of-the-art methods for selecting the number of components in FMMs.

artificial intelligence, machine learning, mixture model, (19 more...)

arXiv.org Machine Learning

2404.16746

Country:

North America > United States > Illinois (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Adaptive Stochastic Nonlinear Model Predictive Control with Look-ahead Deep Reinforcement Learning for Autonomous Vehicle Motion Control

Zarrouki, Baha, Wang, Chenyang, Betz, Johannes

arXiv.org Artificial IntelligenceNov-7-2023

In this paper, we present a Deep Reinforcement Learning (RL)-driven Adaptive Stochastic Nonlinear Model Predictive Control (SNMPC) to optimize uncertainty handling, constraints robustification, feasibility, and closed-loop performance. To this end, we conceive an RL agent to proactively anticipate upcoming control tasks and to dynamically determine the most suitable combination of key SNMPC parameters - foremost the robustification factor $\kappa$ and the Uncertainty Propagation Horizon (UPH) $T_u$. We analyze the trained RL agent's decision-making process and highlight its ability to learn context-dependent optimal parameters. One key finding is that adapting the constraints robustification factor with the learned policy reduces conservatism and improves closed-loop performance while adapting UPH renders previously infeasible SNMPC problems feasible when faced with severe disturbances. We showcase the enhanced robustness and feasibility of our Adaptive SNMPC (aSNMPC) through the real-time motion control task of an autonomous passenger vehicle to follow an optimal race line when confronted with significant time-variant disturbances. Experimental findings demonstrate that our look-ahead RL-driven aSNMPC outperforms its Static SNMPC (sSNMPC) counterpart in minimizing the lateral deviation both with accurate and inaccurate disturbance assumptions and even when driving in previously unexplored environments.

artificial intelligence, control, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2311.04303

Genre: Research Report (0.69)

Industry: Energy > Oil & Gas > Upstream (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

A Stochastic Nonlinear Model Predictive Control with an Uncertainty Propagation Horizon for Autonomous Vehicle Motion Control

Zarrouki, Baha, Wang, Chenyang, Betz, Johannes

arXiv.org Artificial IntelligenceOct-28-2023

Employing Stochastic Nonlinear Model Predictive Control (SNMPC) for real-time applications is challenging due to the complex task of propagating uncertainties through nonlinear systems. This difficulty becomes more pronounced in high-dimensional systems with extended prediction horizons, such as autonomous vehicles. To enhance closed-loop performance in and feasibility in SNMPCs, we introduce the concept of the Uncertainty Propagation Horizon (UPH). The UPH limits the time for uncertainty propagation through system dynamics, preventing trajectory divergence, optimizing feedback loop advantages, and reducing computational overhead. Our SNMPC approach utilizes Polynomial Chaos Expansion (PCE) to propagate uncertainties and incorporates nonlinear hard constraints on state expectations and nonlinear probabilistic constraints. We transform the probabilistic constraints into deterministic constraints by estimating the nonlinear constraints' expectation and variance. We then showcase our algorithm's effectiveness in real-time control of a high-dimensional, highly nonlinear system-the trajectory following of an autonomous passenger vehicle, modeled with a dynamic nonlinear single-track model. Experimental results demonstrate our approach's robust capability to follow an optimal racetrack trajectory at speeds of up to 37.5m/s while dealing with state estimation disturbances, achieving a minimum solving frequency of 97Hz. Additionally, our experiments illustrate that limiting the UPH renders previously infeasible SNMPC problems feasible, even when incorrect uncertainty assumptions or strong disturbances are present.

artificial intelligence, constraint, optimization problem, (14 more...)

arXiv.org Artificial Intelligence

2310.18753

Country: Europe > Germany > Bavaria (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Oil & Gas > Upstream (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

Learning with Noisy Labels via Sparse Regularization

Zhou, Xiong, Liu, Xianming, Wang, Chenyang, Zhai, Deming, Jiang, Junjun, Ji, Xiangyang

arXiv.org Machine LearningJul-31-2021

Learning with noisy labels is an important and challenging task for training accurate deep neural networks. Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels. Robust loss functions that satisfy the symmetric condition were tailored to remedy this problem, which however encounter the underfitting effect. In this paper, we theoretically prove that \textbf{any loss can be made robust to noisy labels} by restricting the network output to the set of permutations over a fixed vector. When the fixed vector is one-hot, we only need to constrain the output to be one-hot, which however produces zero gradients almost everywhere and thus makes gradient-based optimization difficult. In this work, we introduce the sparse regularization strategy to approximate the one-hot constraint, which is composed of network output sharpening operation that enforces the output distribution of a network to be sharp and the $\ell_p$-norm ($p\le 1$) regularization that promotes the network output to be sparse. This simple approach guarantees the robustness of arbitrary loss functions while not hindering the fitting ability. Experimental results demonstrate that our method can significantly improve the performance of commonly-used loss functions in the presence of noisy labels and class imbalance, and outperform the state-of-the-art methods. The code is available at https://github.com/hitcszx/lnl_sr.

deep learning, label noise, neural network, (18 more...)

arXiv.org Machine Learning

2108.00192

Country:

North America > United States > California (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

Ma, Weizhi, Zhang, Min, Cao, Yue, Woojeong, null, Jin, null, Wang, Chenyang, Liu, Yiqun, Ma, Shaoping, Ren, Xiang

arXiv.org Artificial IntelligenceMar-8-2019

Explainability and effectiveness are two key aspects for building recommender systems. Prior efforts mostly focus on incorporating side information to achieve better recommendation performance. However, these methods have some weaknesses: (1) prediction of neural network-based embedding methods are hard to explain and debug; (2) symbolic, graph-based approaches (e.g., meta path-based models) require manual efforts and domain knowledge to define patterns and rules, and ignore the item association types (e.g. substitutable and complementary). In this paper, we propose a novel joint learning framework to integrate \textit{induction of explainable rules from knowledge graph} with \textit{construction of a rule-guided neural recommendation model}. The framework encourages two modules to complement each other in generating effective and explainable recommendation: 1) inductive rules, mined from item-centric knowledge graphs, summarize common multi-hop relational patterns for inferring different item associations and provide human-readable explanation for model prediction; 2) recommendation module can be augmented by induced rules and thus have better generalization ability dealing with the cold-start issue. Extensive experiments\footnote{Code and data can be found at: \url{https://github.com/THUIR/RuleRec}} show that our proposed method has achieved significant improvements in item recommendation over baselines on real-world datasets. Our model demonstrates robust performance over "noisy" item knowledge graphs, generated by linking item names to related entities.

artificial intelligence, neural network, recommendation, (18 more...)

arXiv.org Artificial Intelligence

1903.03714

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Add feedback