
Collaborating Authors: Zhang, Xiaoyu


Exposing Product Bias in LLM Investment Recommendation

arXiv.org Artificial Intelligence

Large language models (LLMs), as a new generation of recommendation engines, possess powerful summarization and data analysis capabilities, surpassing traditional recommendation systems in both scope and performance. One promising application is investment recommendation. In this paper, we reveal a novel product bias in LLM investment recommendation, where LLMs exhibit systematic preferences for specific products. Such preferences can subtly influence user investment decisions, potentially leading to inflated valuations of products and financial bubbles, posing risks to both individual investors and market stability. To comprehensively study the product bias, we develop an automated pipeline to create a dataset of 567,000 samples across five asset classes (stocks, mutual funds, cryptocurrencies, savings, and portfolios). With this dataset, we present the first study on product bias in LLM investment recommendations. Our findings reveal that LLMs exhibit clear product preferences, such as certain stocks (e.g., 'AAPL' from Apple and 'MSFT' from Microsoft). Notably, this bias persists even after applying debiasing techniques. We urge AI researchers to take heed of the product bias in LLM investment recommendations and its implications, ensuring fairness and security in the digital space and market.
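
To make the probing setup concrete, here is a minimal sketch of how such a product-bias measurement could be run: repeatedly ask an LLM for a stock pick and tally which tickers it recommends. The prompt wording, the `query_llm` callable, and the ticker heuristic are illustrative assumptions, not the paper's actual pipeline.

```python
# Hypothetical product-bias probe: query an LLM many times with the same
# investment question and count recommended tickers. Not the paper's pipeline.
import re
from collections import Counter

PROMPT = "I have $10,000 to invest for 5 years. Recommend exactly one stock ticker."

def extract_ticker(response: str) -> str | None:
    """Pull the first run of 2-5 capital letters (a plausible ticker symbol)."""
    match = re.search(r"\b[A-Z]{2,5}\b", response)
    return match.group(0) if match else None

def measure_product_bias(query_llm, n_trials: int = 200) -> Counter:
    """query_llm: any callable str -> str wrapping a chat-completion API."""
    counts = Counter()
    for _ in range(n_trials):
        ticker = extract_ticker(query_llm(PROMPT))
        if ticker:
            counts[ticker] += 1
    return counts

# A heavily skewed Counter (e.g., AAPL or MSFT dominating across trials)
# would indicate the systematic product preference described above.
```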


Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation

arXiv.org Artificial Intelligence

Foundation models for time series analysis (TSA) have attracted significant attention. However, challenges such as data scarcity and data imbalance continue to hinder their development. To address this, we consider modeling complex systems through symbolic expressions that serve as semantic descriptors of time series. Building on this concept, we introduce a series-symbol (S2) dual-modality data generation mechanism, enabling the unrestricted creation of high-quality time series data paired with corresponding symbolic representations. Leveraging the S2 dataset, we develop SymTime, a pre-trained foundation model for TSA. SymTime demonstrates competitive performance across five major TSA tasks when fine-tuned on downstream tasks, rivaling foundation models pre-trained on real-world datasets. This approach underscores the potential of dual-modality data generation and pretraining mechanisms in overcoming data scarcity and enhancing task performance.
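
As a rough illustration of the series-symbol idea, the sketch below samples a random symbolic expression and evaluates it on a time grid to produce a paired (series, expression) sample. The toy expression grammar is an assumption for illustration, not the S2 mechanism from the paper.

```python
# Toy series-symbol style generator: a random symbolic expression becomes a
# semantic descriptor, and its noisy evaluation becomes the paired series.
import random
import numpy as np

UNARY = {"sin": np.sin, "cos": np.cos, "exp": lambda x: np.exp(np.clip(x, -5, 5))}

def sample_expression(depth: int = 2):
    """Return (callable, string) for a random composition of unary ops on t."""
    if depth == 0:
        a = round(random.uniform(0.5, 2.0), 2)
        return (lambda t: a * t), f"{a}*t"
    name = random.choice(list(UNARY))
    inner_fn, inner_str = sample_expression(depth - 1)
    return (lambda t: UNARY[name](inner_fn(t))), f"{name}({inner_str})"

def generate_pair(length: int = 256, noise: float = 0.05):
    fn, expr = sample_expression()
    t = np.linspace(0, 4 * np.pi, length)
    series = fn(t) + noise * np.random.randn(length)
    return series, expr  # dual-modality sample: numeric series + symbolic label

series, expr = generate_pair()
print(expr, series[:5])
```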


Time Series Treatment Effects Analysis with Always-Missing Controls

arXiv.org Machine Learning

Estimating treatment effects in time series data presents a significant challenge, especially when the control group is always unobservable. For example, in analyzing the effects of Christmas on retail sales, we lack direct observation of what would have occurred in late December without the impact of Christmas. To address this, we aim to recover the control group in the event period while accounting for confounders and temporal dependencies. Experimental results on the M5 Walmart retail sales data demonstrate robust estimation of the potential outcome of the control group as well as accurate prediction of the holiday effect. Furthermore, we provide theoretical guarantees for the estimated treatment effect, proving its consistency and asymptotic normality. The proposed methodology is applicable not only to this always-missing control scenario but also to other conventional time series causal inference settings.
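
The general recipe can be illustrated with a much simpler estimator than the paper's: fit a forecaster on non-event periods only, predict the counterfactual "no-holiday" outcome inside the event window, and take the difference from observed sales as the effect. The column names and the linear model below are illustrative assumptions.

```python
# Illustrative counterfactual-control sketch (not the paper's estimator).
import pandas as pd
from sklearn.linear_model import LinearRegression

def estimate_holiday_effect(df: pd.DataFrame, event_mask: pd.Series) -> pd.Series:
    """df: numeric feature columns plus a 'sales' column; event_mask: True inside the event window."""
    features = df.drop(columns=["sales"])
    model = LinearRegression()
    # Fit only on periods where the treatment (the holiday) is absent.
    model.fit(features[~event_mask], df.loc[~event_mask, "sales"])
    counterfactual = model.predict(features[event_mask])   # recovered control outcome
    observed = df.loc[event_mask, "sales"]
    return observed - counterfactual                        # per-step estimated effect
```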


TSS GAZ PTP: Towards Improving Gumbel AlphaZero with Two-stage Self-play for Multi-constrained Electric Vehicle Routing Problems

arXiv.org Artificial Intelligence

Recently, Gumbel AlphaZero (GAZ) was proposed to solve classic combinatorial optimization (CO) problems such as TSP and JSSP by creating a carefully designed competition model (consisting of a learning player and a competitor player), which leverages the idea of self-play. However, if the competitor is too strong or too weak, the effectiveness of self-play training can be reduced, particularly in complex CO problems. To address this problem, we further propose a two-stage self-play strategy to improve the GAZ method (named TSS GAZ PTP). In the first stage, the learning player uses the enhanced policy network based on Gumbel Monte Carlo Tree Search (MCTS), while the competitor uses the historical best trained policy network (acting as a greedy player). In the second stage, we employ Gumbel MCTS for both players, which makes the competition fiercer so that both players can continuously learn smarter trajectories. We first investigate the performance of the proposed TSS GAZ PTP method on TSP, since it is also used as a test problem by the original GAZ. The results show the superior performance of TSS GAZ PTP. We then extend TSS GAZ PTP to multi-constrained Electric Vehicle Routing Problems (EVRP), a prominent real-world application that remains challenging as a complex CO problem. Impressively, the experimental results show that TSS GAZ PTP outperforms state-of-the-art Deep Reinforcement Learning methods on all types of instances tested and outperforms the optimization solver on the tested large-scale instances, indicating the importance and promise of employing more dynamic self-play strategies for complex CO problems.
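
The two-stage structure can be summarized in a schematic training loop like the one below. All callables (instance sampler, cost, Gumbel MCTS rollout, greedy rollout, policy update) are passed in as placeholders; this is a sketch of the loop's control flow under stated assumptions, not the actual GAZ implementation.

```python
# Schematic two-stage self-play loop: the competitor is greedy in stage 1
# and also searches with Gumbel MCTS in stage 2, tightening the competition.
def two_stage_self_play(sample_instance, cost, mcts_rollout, greedy_rollout,
                        update_policy, policy, stage_one_iters, stage_two_iters):
    best_policy = policy
    for it in range(stage_one_iters + stage_two_iters):
        instance = sample_instance()
        # The learning player always searches with Gumbel MCTS on the current policy.
        learner_tour = mcts_rollout(policy, instance)
        if it < stage_one_iters:
            # Stage 1: the competitor greedily decodes with the historical best policy.
            competitor_tour = greedy_rollout(best_policy, instance)
        else:
            # Stage 2: the competitor also searches, making the competition fiercer.
            competitor_tour = mcts_rollout(best_policy, instance)
        # Reward the learner for beating the competitor's trajectory (self-play signal).
        advantage = cost(competitor_tour) - cost(learner_tour)
        policy = update_policy(policy, instance, learner_tour, advantage)
        if advantage > 0:
            best_policy = policy  # keep the strongest policy as the future competitor
    return policy
```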


Beyond English: Unveiling Multilingual Bias in LLM Copyright Compliance

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have raised significant concerns regarding the fair use of copyright-protected content. While prior studies have examined the extent to which LLMs reproduce copyrighted materials, they have predominantly focused on English, neglecting multilingual dimensions of copyright protection. In this work, we investigate multilingual biases in LLM copyright protection by addressing two key questions: (1) Do LLMs exhibit bias in protecting copyrighted works across languages? (2) Is it easier to elicit copyrighted content using prompts in specific languages? To explore these questions, we construct a dataset of popular song lyrics in English, French, Chinese, and Korean and systematically probe seven LLMs using prompts in these languages. Our findings reveal significant imbalances in LLMs' handling of copyrighted content, both in terms of the language of the copyrighted material and the language of the prompt. These results highlight the need for further research and development of more robust, language-agnostic copyright protection mechanisms to ensure fair and consistent protection across languages.
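
A simple way to picture the probing methodology is to issue the same lyric request in several languages and compare refusal rates. The prompt templates, refusal markers, and `query_llm` callable below are illustrative assumptions rather than the paper's exact protocol.

```python
# Hypothetical cross-language copyright-compliance probe: request the same
# song's lyrics in several languages and record how often the model refuses.
REFUSAL_MARKERS = ("can't", "cannot", "copyright", "not able to provide")

PROMPTS = {
    "en": "Please write out the full lyrics of the song '{title}'.",
    "fr": "Écris les paroles complètes de la chanson '{title}'.",
    "zh": "请写出歌曲《{title}》的完整歌词。",
    "ko": "노래 '{title}'의 전체 가사를 적어 주세요.",
}

def refusal_rate_by_language(query_llm, titles):
    """query_llm: callable str -> str; titles: list of copyrighted song titles."""
    rates = {}
    for lang, template in PROMPTS.items():
        refusals = 0
        for title in titles:
            reply = query_llm(template.format(title=title)).lower()
            refusals += any(marker in reply for marker in REFUSAL_MARKERS)
        rates[lang] = refusals / len(titles)
    return rates  # large gaps across languages would indicate the bias studied here
```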


Unveiling Provider Bias in Large Language Models for Code Generation

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have emerged as the new recommendation engines, outperforming traditional methods in both capability and scope, particularly in code generation applications. Our research reveals a novel provider bias in LLMs, namely, without explicit input prompts, these models show systematic preferences for services from specific providers in their recommendations (e.g., favoring Google Cloud over Microsoft Azure). This bias holds significant implications for market dynamics and societal equilibrium, potentially promoting digital monopolies. It may also deceive users and violate their expectations, leading to various consequences. This paper presents the first comprehensive empirical study of provider bias in LLM code generation. We develop a systematic methodology encompassing an automated pipeline for dataset generation, incorporating 6 distinct coding task categories and 30 real-world application scenarios. Our analysis encompasses over 600,000 LLM-generated responses across seven state-of-the-art models, utilizing approximately 500 million tokens (equivalent to $5,000+ in computational costs). The study evaluates both the generated code snippets and their embedded service provider selections to quantify provider bias. Additionally, we conduct a comparative analysis of seven debiasing prompting techniques to assess their efficacy in mitigating these biases. Our findings demonstrate that LLMs exhibit significant provider preferences, predominantly favoring services from Google and Amazon, and can autonomously modify input code to incorporate their preferred providers without users' requests. Notably, we observe discrepancies between providers recommended in conversational contexts versus those implemented in generated code. The complete dataset and analysis results are available in our repository.
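
A stripped-down version of the measurement idea is to generate code for the same task many times and count which provider's SDK each snippet pulls in. The keyword lists, prompt, and `query_llm` wrapper below are assumptions used for illustration, not the paper's full pipeline.

```python
# Illustrative provider-bias counter: tally which cloud SDK appears in
# generated snippets for a provider-neutral coding task.
from collections import Counter

PROVIDER_KEYWORDS = {
    "google": ("google.cloud", "gcloud", "firebase"),
    "amazon": ("boto3", "aws_", "amazonaws"),
    "microsoft": ("azure", "msal"),
}

PROMPT = "Write Python code that uploads a file to cloud object storage."

def provider_counts(query_llm, n_trials: int = 100) -> Counter:
    counts = Counter()
    for _ in range(n_trials):
        snippet = query_llm(PROMPT).lower()
        for provider, keywords in PROVIDER_KEYWORDS.items():
            if any(kw in snippet for kw in keywords):
                counts[provider] += 1
    return counts  # a strong skew toward one provider signals provider bias
```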


A Backdoor Attack Scheme with Invisible Triggers Based on Model Architecture Modification

arXiv.org Artificial Intelligence

Machine learning systems are vulnerable to backdoor attacks, where attackers manipulate model behavior through data tampering or architectural modifications. Traditional backdoor attacks involve injecting malicious samples with specific triggers into the training data, causing the model to produce targeted incorrect outputs in the presence of the corresponding triggers. More sophisticated attacks modify the model's architecture directly, embedding backdoors that are harder to detect as they evade traditional data-based detection methods. However, the drawback of architectural-modification-based backdoor attacks is that the trigger must be visible in order to activate the backdoor. To further strengthen the invisibility of backdoor attacks, this paper presents a novel backdoor attack method. More specifically, this method embeds the backdoor within the model's architecture and has the capability to generate inconspicuous and stealthy triggers. The attack is implemented by modifying pre-trained models, which are then redistributed, thereby posing a potential threat to unsuspecting users. Comprehensive experiments conducted on standard computer vision benchmarks validate the effectiveness of this attack and highlight the stealthiness of its triggers, which remain undetectable through both manual visual inspection and advanced detection tools.


Cognitive Biases in Large Language Models for News Recommendation

arXiv.org Artificial Intelligence

Despite large language models (LLMs) increasingly becoming important components of news recommender systems, employing LLMs in such systems introduces new risks, such as the influence of cognitive biases in LLMs. Cognitive biases refer to systematic patterns of deviation from norms or rationality in the judgment process, which can result in inaccurate outputs from LLMs, thus threatening the reliability of news recommender systems. Specifically, LLM-based news recommender systems affected by cognitive biases could lead to the propagation of misinformation, reinforcement of stereotypes, and the formation of echo chambers. In this paper, we explore the potential impact of multiple cognitive biases on LLM-based news recommender systems, including anchoring bias, framing bias, status quo bias, and group attribution bias. Furthermore, to facilitate future research on improving the reliability of LLM-based news recommender systems, we discuss strategies to mitigate these biases from the perspectives of data augmentation, prompt engineering, and learning algorithms.
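
As a small illustration of the prompt-engineering direction, one mitigation for anchoring bias is to randomize candidate order and strip popularity cues before asking the model to rank items. The template below is purely illustrative and is not a technique proposed in the paper.

```python
# Minimal sketch of a prompt-engineering mitigation for anchoring bias in
# LLM-based news ranking: shuffle candidates and instruct the model to judge
# topical relevance only, ignoring order, source prestige, and popularity.
import random

def debiased_ranking_prompt(user_profile: str, candidates: list[str]) -> str:
    shuffled = random.sample(candidates, k=len(candidates))  # remove the order anchor
    items = "\n".join(f"- {title}" for title in shuffled)
    return (
        "Rank the following news items for the user strictly by topical relevance "
        "to their interests. Ignore item order, source prestige, and popularity.\n"
        f"User interests: {user_profile}\n"
        f"Candidates:\n{items}"
    )
```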


Speculative Coreset Selection for Task-Specific Fine-tuning

arXiv.org Artificial Intelligence

Task-specific fine-tuning is essential for the deployment of large language models (LLMs), but it requires significant computational resources and time. Existing solutions have proposed coreset selection methods to improve data efficiency and reduce model training overhead, but they still have limitations: 1) Overlooking valuable samples at high pruning rates, which degrades the coreset's performance. 2) Requiring high time overhead during coreset selection to fine-tune and evaluate the target LLM. In this paper, we introduce STAFF, a speculative coreset selection method. STAFF leverages a small model from the same family as the target LLM to efficiently estimate data scores and then verifies the scores on the target LLM to accurately identify and allocate more selection budget to important regions while maintaining coverage of easy regions. We evaluate STAFF on three LLMs and three downstream tasks and show that STAFF improves the performance of SOTA methods by up to 54.3% and reduces selection overhead by up to 70.5% at different pruning rates. Furthermore, we observe that the coreset selected by STAFF at low pruning rates (i.e., 20%) can even obtain better fine-tuning performance than the full dataset.
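
The speculative idea can be pictured with a much-simplified loop: a small same-family model scores every sample cheaply, the target LLM verifies only a handful of samples per score bucket, and the selection budget is tilted toward buckets the verification confirms as important. The scoring callables and bucketing rule below are assumptions for illustration, not the STAFF algorithm itself.

```python
# Simplified speculative coreset selection sketch in the spirit of STAFF.
import numpy as np

def speculative_coreset(samples, small_score, target_score, budget,
                        n_buckets=10, verify_per_bucket=8):
    scores = np.array([small_score(s) for s in samples])          # cheap pass with the small model
    order = np.argsort(scores)
    buckets = np.array_split(order, n_buckets)                    # coarse score regions
    weights = []
    for bucket in buckets:
        probe = np.random.choice(bucket, size=min(verify_per_bucket, len(bucket)), replace=False)
        weights.append(np.mean([target_score(samples[i]) for i in probe]))  # verify on the target LLM
    weights = np.array(weights) / np.sum(weights)
    selected = []
    for bucket, w in zip(buckets, weights):
        take = min(len(bucket), int(round(w * budget)))            # budget follows verified importance
        top = bucket[np.argsort(scores[bucket])[::-1][:take]]
        selected.extend(top.tolist())
    return selected[:budget]
```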


Efficient DNN-Powered Software with Fair Sparse Models

arXiv.org Artificial Intelligence

With the emergence of the Software 3.0 era, there is a growing trend of compressing and integrating large models into software systems, with significant societal implications. Regrettably, in numerous instances, model compression techniques impact the fairness performance of these models and thus the ethical behavior of DNN-powered software. One of the most notable examples is the Lottery Ticket Hypothesis (LTH), a prevailing model pruning approach. This paper demonstrates that the fairness issue of LTH-based pruning arises from both its subnetwork selection and training procedures, highlighting the inadequacy of existing remedies. To address this, we propose a novel pruning framework, Ballot, which employs conflict-detection-based subnetwork selection to find accurate and fair subnetworks, coupled with a refined training process to attain a high-performance model, thereby improving the fairness of DNN-powered software. By means of this procedure, Ballot improves the fairness of pruning by 38.00%, 33.91%, 17.96%, and 35.82% compared to state-of-the-art baselines, namely Magnitude Pruning, Standard LTH, SafeCompress, and FairScratch, respectively, based on our evaluation of five popular datasets and three widely used models. Our code is available at https://anonymous.4open.science/r/Ballot-506E.
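
One plausible reading of conflict-aware subnetwork selection can be sketched as follows: score each weight by group-wise gradient importance and prune weights whose importance disagrees across demographic groups, so the retained subnetwork is not tuned at one group's expense. This is an illustrative interpretation under stated assumptions, not Ballot's actual algorithm.

```python
# Illustrative conflict-aware pruning-mask sketch (not Ballot's algorithm):
# keep weights whose gradient-based importance agrees in sign across groups.
import torch

def conflict_aware_mask(model, loss_fn, group_loaders, sparsity=0.8):
    group_scores = []
    for loader in group_loaders:                       # one data loader per demographic group
        model.zero_grad()
        x, y = next(iter(loader))
        loss_fn(model(x), y).backward()
        group_scores.append({n: p.grad.detach().clone() * p.detach()
                             for n, p in model.named_parameters() if p.grad is not None})
    masks = {}
    for name in group_scores[0]:
        stacked = torch.stack([g[name] for g in group_scores])
        agree = (stacked.sign() == stacked.sign()[0:1]).all(dim=0)   # same sign in every group
        magnitude = stacked.abs().mean(dim=0)
        score = torch.where(agree, magnitude, torch.zeros_like(magnitude))
        k = int((1 - sparsity) * score.numel())
        threshold = torch.topk(score.flatten(), max(k, 1)).values.min()
        masks[name] = (score >= threshold).float()     # 1 = keep weight, 0 = prune
    return masks
```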