Cui, Peng
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Zhang, Xingxuan, Wang, Haoran, Li, Jiansheng, Xue, Yuan, Guan, Shikai, Xu, Renzhe, Zou, Hao, Yu, Han, Cui, Peng
Large language models (LLMs) like GPT-4 and LLaMA-3 rely on the powerful in-context learning (ICL) capability of the Transformer architecture to learn on the fly from limited examples. While ICL underpins many LLM applications, its full potential remains hindered by a limited understanding of its generalization boundaries and vulnerabilities. We present a systematic investigation of transformers' generalization capability with ICL relative to training data coverage by defining a task-centric framework along three dimensions: inter-problem, intra-problem, and intra-task generalization. Through extensive simulation and real-world experiments, encompassing tasks such as function fitting, API calling, and translation, we find that transformers lack inter-problem generalization with ICL but excel in intra-task and intra-problem generalization. Training on a greater variety of mixed tasks significantly enhances the generalization of ICL to unseen tasks and even to known simple tasks. These findings suggest designing training data to maximize the diversity of tasks covered and to combine different tasks whenever possible, rather than focusing solely on the task targeted at test time.
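As a toy illustration of the task-centric framing (assumptions for illustration, not the paper's protocol): the Python sketch below treats a "problem" as a function family and a "task" as one parameterization of it, and builds in-context function-fitting prompts for hypothetical intra-task, intra-problem, and inter-problem test splits.

import numpy as np

rng = np.random.default_rng(0)

# Toy "problems" are function families; a "task" is a specific parameterization.
problems = {
    "linear":    lambda a: (lambda x: a * x),
    "quadratic": lambda a: (lambda x: a * x ** 2),
    "sine":      lambda a: (lambda x: np.sin(a * x)),
}

def make_icl_prompt(f, n_demos: int = 8):
    """Build an in-context function-fitting prompt: demo (x, f(x)) pairs plus a query x."""
    xs = rng.uniform(-1, 1, size=n_demos + 1)
    demos = [(float(x), float(f(x))) for x in xs[:-1]]
    return demos, float(xs[-1])

# Hypothetical splits for the three generalization dimensions:
train_task = problems["linear"](a=2.0)
intra_task_prompt    = make_icl_prompt(train_task)                  # same task, new inputs
intra_problem_prompt = make_icl_prompt(problems["linear"](a=-0.7))  # same family, unseen parameters
inter_problem_prompt = make_icl_prompt(problems["sine"](a=3.0))     # unseen family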
Sample Weight Averaging for Stable Prediction
Yu, Han, He, Yue, Xu, Renzhe, Li, Dongbai, Zhang, Jiayin, Zou, Wenchao, Cui, Peng
The challenge of Out-of-Distribution (OOD) generalization poses a foundational concern for the application of machine learning algorithms to risk-sensitive areas. Inspired by traditional importance weighting and propensity weighting methods, prior approaches employ an independence-based sample reweighting procedure. They aim at decorrelating covariates to counteract the bias introduced by spurious correlations between unstable variables and the outcome, thus enhancing generalization and achieving stable prediction under covariate shift. Nonetheless, these methods are prone to inflated variance, primarily because the reweighting process uses training samples less effectively. Existing remedies require either environment labels or substantially higher time costs, along with additional assumptions and supervised information. To mitigate this issue, we propose SAmple Weight Averaging (SAWA), a simple yet efficacious strategy that can be universally integrated into various sample reweighting algorithms to decrease the variance and coefficient estimation error, thus boosting covariate-shift generalization and achieving stable prediction across different environments. We prove its rationality and benefits theoretically. Experiments on synthetic and real-world datasets consistently underscore its superiority under covariate shift.
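A minimal sketch of the averaging idea, assuming K sample-weight vectors from independent runs of some reweighting algorithm are already available; the data and the stand-in weights below are hypothetical, and this is not the paper's implementation of SAWA.

import numpy as np

def average_sample_weights(weight_sets):
    """Average K sample-weight vectors (K x n) into one vector, renormalized to mean 1."""
    w = np.mean(np.asarray(weight_sets), axis=0)
    return w / w.mean()

def weighted_least_squares(X, y, w):
    """Closed-form weighted least squares: scale rows by sqrt(w) and solve."""
    sw = np.sqrt(w)[:, None]
    beta, *_ = np.linalg.lstsq(X * sw, y * sw.ravel(), rcond=None)
    return beta

# Toy usage with stand-in weight vectors from 5 hypothetical reweighting runs.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + 0.1 * rng.normal(size=n)
weight_sets = [rng.gamma(shape=5.0, scale=0.2, size=n) for _ in range(5)]  # stand-ins for learned weights
w_avg = average_sample_weights(weight_sets)
beta_hat = weighted_least_squares(X, y, w_avg)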
Grammar Control in Dialogue Response Generation for Language Learning Chatbots
Glandorf, Dominik, Cui, Peng, Meurers, Detmar, Sachan, Mrinmaya
Chatbots based on large language models offer cheap conversation practice opportunities for language learners. However, they are hard to control for linguistic forms that correspond to learners' current needs, such as grammar. We control grammar in chatbot conversation practice by grounding a dialogue response generation model in a pedagogical repository of grammar skills. We also explore how this control helps learners to produce specific grammar. We comprehensively evaluate prompting, fine-tuning, and decoding strategies for grammar-controlled dialogue response generation. Strategic decoding with Llama3 outperforms GPT-3.5 when minor losses in response quality are tolerated. Our simulation predicts that grammar-controlled responses support grammar acquisition adapted to learner proficiency. Existing language learning chatbots and research on second language acquisition can benefit from these affordances. Code is available on GitHub.
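To make the decoding-time control concrete, here is a toy re-ranking sketch: sampled candidate responses get a bonus if they realize a target grammar skill, detected by a placeholder pattern matcher. The skill patterns, candidates, and scores are hypothetical; the paper's decoding strategies are more sophisticated.

import re

# Placeholder detectors for grammar skills; a real system would use a trained classifier
# or a pedagogical grammar repository rather than regexes.
SKILL_PATTERNS = {
    "present_perfect": re.compile(r"\b(has|have)\s+\w+ed\b", re.IGNORECASE),
    "conditional_2":   re.compile(r"\bif\b.*\bwould\b", re.IGNORECASE),
}

def rerank_by_grammar(candidates, lm_scores, target_skill, bonus=5.0):
    """Re-rank candidate responses: add a bonus to candidates containing the target skill."""
    pattern = SKILL_PATTERNS[target_skill]
    scored = [
        (lm + (bonus if pattern.search(text) else 0.0), text)
        for text, lm in zip(candidates, lm_scores)
    ]
    return max(scored)[1]

# Hypothetical candidates sampled from a dialogue model, with their log-probabilities.
candidates = ["I have visited Rome twice.", "I visited Rome last year."]
best = rerank_by_grammar(candidates, lm_scores=[-12.3, -10.1], target_skill="present_perfect")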
Investigating the Zone of Proximal Development of Language Models for In-Context Learning
Cui, Peng, Sachan, Mrinmaya
In this paper, we introduce a learning analytics framework to analyze the in-context learning (ICL) behavior of large language models (LLMs) through the lens of the Zone of Proximal Development (ZPD), an established theory in educational psychology. ZPD delineates the space between what a learner is capable of doing unsupported and what the learner cannot do even with support. We adapt this concept to ICL, measuring the ZPD of LLMs based on model performance on individual examples with and without ICL. Furthermore, we propose an item response theory (IRT) model to predict the distribution of zones for LLMs. Our findings reveal a series of intricate and multifaceted behaviors of ICL, providing new insights into understanding and leveraging this technique. Finally, we demonstrate how our framework can enhance LLMs in both inference and fine-tuning scenarios: (1) By predicting a model's zone of proximal development, we selectively apply ICL to queries that are most likely to benefit from demonstrations, achieving a better balance between inference cost and performance; (2) We propose a human-like curriculum for fine-tuning, which prioritizes examples within the model's ZPD. The curriculum results in improved performance, and we explain its effectiveness through an analysis of the training dynamics of LLMs.
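A minimal sketch of how per-example zones could be assigned from paired evaluations with and without ICL; the three zone labels below are one reading of the ZPD adaptation described in the abstract, not necessarily the paper's exact definitions.

def assign_zone(correct_without_icl: bool, correct_with_icl: bool) -> str:
    """Assign an example to a zone based on model correctness with and without ICL."""
    if correct_without_icl:
        return "already_mastered"       # solvable unsupported
    if correct_with_icl:
        return "zone_of_proximal_dev"   # solvable only with demonstrations
    return "beyond_zpd"                 # unsolvable even with support

# Hypothetical paired results for four examples: (correct without ICL, correct with ICL)
results = [(True, True), (False, True), (False, False), (True, False)]
zones = [assign_zone(a, b) for a, b in results]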
Error Slice Discovery via Manifold Compactness
Yu, Han, Liu, Jiashuo, Zou, Hao, Xu, Renzhe, He, Yue, Zhang, Xingxuan, Cui, Peng
Despite the great performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e., error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret, which is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence that does not rely on extra information like predefined slice labels. Current evaluation of slice coherence requires access to predefined slices formulated from metadata such as attributes or subclasses. Its validity thus heavily relies on the quality and abundance of metadata, and some possible patterns may be missed. Besides, current algorithms cannot directly incorporate the constraint of coherence into their optimization objective due to the absence of an explicit coherence metric, which could potentially hinder their effectiveness. In this paper, we propose manifold compactness, a coherence metric that requires no extra information, by incorporating the geometry of the data into its design, and experiments on typical datasets empirically validate the rationality of the metric. We then develop Manifold Compactness based error Slice Discovery (MCSD), a novel algorithm that directly treats risk and coherence as the optimization objective and is flexible enough to be applied to models for various tasks. Extensive experiments on benchmarks and case studies on other typical datasets demonstrate the superiority of MCSD.
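The sketch below illustrates one geometry-based compactness score consistent with the abstract: the average distance to the k nearest neighbors within a candidate slice's embedding cloud (lower means more compact). It is an assumption for illustration; MCSD's actual metric and optimization may differ.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def manifold_compactness_score(slice_embeddings: np.ndarray, k: int = 5) -> float:
    """Average distance to the k nearest neighbors inside the slice (lower = more compact)."""
    nn = NearestNeighbors(n_neighbors=min(k + 1, len(slice_embeddings)))
    nn.fit(slice_embeddings)
    dists, _ = nn.kneighbors(slice_embeddings)
    return float(dists[:, 1:].mean())   # drop the zero self-distance in column 0

# Hypothetical usage: embeddings of the samples assigned to a candidate error slice.
rng = np.random.default_rng(0)
candidate_slice = rng.normal(size=(100, 32))
score = manifold_compactness_score(candidate_slice)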
How to Select Datapoints for Efficient Human Evaluation of NLG Models?
Zouhar, Vilém, Cui, Peng, Sachan, Mrinmaya
Human evaluation is the gold standard for evaluating text generation models. It is also expensive, and to fit budgetary constraints, a random subset of the test data is often chosen in practice. The randomly selected data may not accurately represent test performance, making this approach economically inefficient for model comparison. Thus, in this work, we develop a suite of selectors to get the most informative datapoints for human evaluation while taking the evaluation costs into account. We show that selectors based on variance in automated metric scores, diversity in model outputs, or Item Response Theory outperform random selection. We further develop an approach to distill these selectors to the scenario where the model outputs are not yet available. In particular, we introduce source-based estimators, which predict item usefulness for human evaluation just based on the source texts. We demonstrate the efficacy of our selectors in two common NLG tasks, machine translation and summarization.
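For concreteness, here is a toy version of the variance-based selector mentioned in the abstract: given an items-by-systems matrix of automated metric scores, pick the items on which systems disagree most. The array shapes and the budget are illustrative.

import numpy as np

def select_by_metric_variance(metric_scores: np.ndarray, budget: int) -> np.ndarray:
    """Return indices of the `budget` items with the highest variance of automated
    metric scores across systems (metric_scores has shape n_items x n_systems)."""
    item_variance = metric_scores.var(axis=1)
    return np.argsort(item_variance)[::-1][:budget]

# Hypothetical scores for 1000 items produced by an automatic metric for 6 systems.
rng = np.random.default_rng(0)
scores = rng.uniform(0.0, 1.0, size=(1000, 6))
chosen = select_by_metric_variance(scores, budget=100)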
Exploring Aleatoric Uncertainty in Object Detection via Vision Foundation Models
Cui, Peng, He, Guande, Zhang, Dan, Deng, Zhijie, Dong, Yinpeng, Zhu, Jun
Datasets collected from the open world unavoidably suffer from various forms of randomness or noisiness, leading to the ubiquity of aleatoric (data) uncertainty. Quantifying such uncertainty is particularly pivotal for object detection, where images contain multi-scale objects with occlusion, obscurity, and even noisy annotations, in contrast to the centric, similar-scale objects in classification images. This paper suggests modeling and exploiting the uncertainty inherent in object detection data with vision foundation models and develops a data-centric, reliable training paradigm. Technically, we propose to estimate the data uncertainty of each object instance based on the feature space of vision foundation models, which are trained on ultra-large-scale datasets and exhibit universal data representations. In particular, we assume a mixture-of-Gaussian structure of the object features and devise Mahalanobis distance-based measures to quantify the data uncertainty. Furthermore, we suggest two crucial and practical usages of the estimated uncertainty: 1) an uncertainty-aware sample filter that discards noisy and redundant instances to avoid over-fitting, and 2) a sample-adaptive regularizer that balances easy and hard samples for adaptive training. The estimated aleatoric uncertainty serves as an extra level of annotation for the dataset, so it can be utilized in a plug-and-play manner with any model. Extensive empirical studies verify the effectiveness of the proposed aleatoric uncertainty measure on various advanced detection models and challenging benchmarks.
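The sketch below shows one way a Mahalanobis distance-based uncertainty measure over foundation-model features could look, assuming class-conditional Gaussians with a shared covariance; the paper's mixture-of-Gaussian formulation and instance handling are richer than this toy version.

import numpy as np

def fit_class_gaussians(features: np.ndarray, labels: np.ndarray):
    """Fit per-class means and a shared covariance on instance features."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.concatenate([features[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return means, np.linalg.inv(cov)

def aleatoric_uncertainty(x: np.ndarray, means: dict, precision: np.ndarray) -> float:
    """Uncertainty proxy: Mahalanobis distance to the closest class mean."""
    dists = [float((x - m) @ precision @ (x - m)) for m in means.values()]
    return min(dists) ** 0.5

# Hypothetical foundation-model features for labeled object instances.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 16))
labels = rng.integers(0, 3, size=500)
means, precision = fit_class_gaussians(feats, labels)
u = aleatoric_uncertainty(feats[0], means, precision)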
LDACP: Long-Delayed Ad Conversions Prediction Model for Bidding Strategy
Cui, Peng, Yang, Yiming, Jin, Fusheng, Tang, Siyuan, Wang, Yunli, Yang, Fukang, Jia, Yalong, Cai, Qingpeng, Pan, Fei, Li, Changcheng, Jiang, Peng
In online advertising, once an ad campaign is deployed, the automated bidding system dynamically adjusts the bidding strategy to optimize Cost Per Action (CPA) based on the number of ad conversions. For ads with a long conversion delay, relying solely on the real-time tracked conversion number as a signal for the bidding strategy can significantly overestimate the current CPA, leading to overly conservative bidding strategies. It is therefore crucial to predict the number of long-delayed conversions. Nonetheless, predicting ad conversion numbers with traditional regression methods is challenging due to their wide range. Previous regression works have addressed this challenge by transforming regression problems into bucket classification problems, achieving success in various scenarios. However, specific challenges arise when predicting the number of ad conversions: 1) the integer nature of ad conversion numbers exacerbates the discontinuity issue in one-hot hard labels; 2) the long-tail distribution of ad conversion numbers complicates tail data prediction. In this paper, we propose the Long-Delayed Ad Conversions Prediction model for bidding strategy (LDACP), which consists of two sub-modules. To alleviate the discontinuity of one-hot hard labels, the Bucket Classification Module with label Smoothing (BCMS) converts one-hot hard labels into non-normalized soft labels and then fits these soft labels by minimizing a classification loss and a regression loss. To address the challenge of predicting tail data, the Value Regression Module with Proxy labels (VRMP) uses the prediction bias of aggregated pCTCVR as proxy labels. Finally, a Mixture of Experts (MoE) structure integrates the predictions from BCMS and VRMP to obtain the final predicted ad conversion number.
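The toy function below illustrates the general idea of converting a one-hot bucket label into a non-normalized soft label by smoothing mass onto neighboring buckets; the bucket edges and the Gaussian kernel are placeholders, not the BCMS specification.

import numpy as np

def soft_bucket_label(conversions: int, bucket_edges: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Map an integer conversion count to a non-normalized soft label over buckets,
    smoothing the one-hot target with a Gaussian kernel over bucket distance."""
    true_bucket = int(np.searchsorted(bucket_edges, conversions, side="right"))
    bucket_ids = np.arange(len(bucket_edges) + 1)
    return np.exp(-0.5 * ((bucket_ids - true_bucket) / sigma) ** 2)

# Hypothetical bucket boundaries, wider in the long tail of conversion counts.
edges = np.array([0, 1, 2, 5, 10, 20, 50, 100, 500])
label = soft_bucket_label(conversions=7, bucket_edges=edges)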
Topology-Aware Dynamic Reweighting for Distribution Shifts on Graph
Zheng, Weihuang, Liu, Jiashuo, Li, Jiaxing, Wu, Jiayun, Cui, Peng, Kong, Youyong
Graph Neural Networks (GNNs) have been widely used in node classification tasks such as advertising recommendation [15] and social network anomaly detection [34]. However, these GNN models typically assume that the training and test graph data are drawn from the same distribution, which does not always hold in practice. In real-world graph data, sample selection bias [8, 12] as well as graph construction techniques [27, 43] often bring distribution shifts between training nodes and test nodes. For instance, in the WebKB [26] datasets, web pages (nodes) and categories (labels) are heavily affected by the university they originate from, leading to distribution shifts among nodes drawn from different universities. Therefore, to enhance the practical validity of GNNs, it is of paramount importance to deal with distribution shifts on graph data. To address the distribution shift problem in node classification, recent works [18, 36, 32, 37, 23] borrow the idea of invariant learning from the out-of-distribution (OOD) generalization literature and adapt it to graph-structured data. Invariant learning [1, 19] stems from the causal inference literature and has become one of the key approaches to solving OOD problems on graphs. The core concept is to identify invariant features with stable prediction mechanisms across different environments, thereby mitigating performance degradation under distribution shifts. Most works in this line directly apply existing invariant learning algorithms, primarily to graph-level classification tasks [18, 32, 23, 41] and, to a lesser extent, to node classification tasks [36, 38].
Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift
Wu, Jiayun, Liu, Jiashuo, Cui, Peng, Wu, Zhiwei Steven
We establish a new model-agnostic optimization framework for out-of-distribution generalization via multicalibration, a criterion that ensures a predictor is calibrated across a family of overlapping groups. Multicalibration has been shown to be associated with the robustness of statistical inference under covariate shift. We further establish a link between multicalibration and robustness for prediction tasks both under and beyond covariate shift. We accomplish this by extending multicalibration to incorporate grouping functions that consider covariates and labels jointly. This leads to an equivalence between the extended multicalibration and invariance, an objective for robust learning in the presence of concept shift. We show that the grouping function class has a linear structure spanned by density ratios, resulting in a unifying framework for robust learning obtained by designing specific grouping functions. We propose MC-Pseudolabel, a post-processing algorithm that achieves both extended multicalibration and out-of-distribution generalization. The algorithm, with lightweight hyperparameters and optimization through a series of supervised regression steps, achieves superior performance on real-world datasets with distribution shift.
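As a schematic picture only: the sketch below shows a post-processing loop built from supervised regression steps, loosely in the spirit of the abstract's description of MC-Pseudolabel. The grouping features, pseudolabel construction, and stopping rule are placeholders rather than the paper's algorithm.

import numpy as np
from sklearn.linear_model import LinearRegression

def mc_pseudolabel_sketch(X, y, grouping_feats, n_rounds=5):
    """Schematic post-processing loop: alternate (1) a supervised regression of the label
    on the current predictions plus grouping features to form pseudolabels, and
    (2) refitting the predictor on those pseudolabels."""
    model = LinearRegression().fit(X, y)
    for _ in range(n_rounds):
        preds = model.predict(X)
        # (1) supervised regression of y on [current prediction, grouping features]
        aux = LinearRegression().fit(np.column_stack([preds, grouping_feats]), y)
        pseudolabels = aux.predict(np.column_stack([preds, grouping_feats]))
        # (2) refit the predictor against the pseudolabels
        model = LinearRegression().fit(X, pseudolabels)
    return model

# Hypothetical data with one grouping feature (e.g., an estimated density ratio).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=300)
grouping_feats = rng.uniform(0.5, 2.0, size=(300, 1))
post_model = mc_pseudolabel_sketch(X, y, grouping_feats)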