Cui, Peng
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Zhang, Xingxuan, Xu, Renzhe, Yu, Han, Zou, Hao, Cui, Peng
Recently, flat minima have been shown to be effective for improving generalization, and sharpness-aware minimization (SAM) achieves state-of-the-art performance. Yet the definition of flatness discussed in SAM and its follow-ups is limited to zeroth-order flatness (i.e., the worst-case loss within a perturbation radius). We show that zeroth-order flatness can be insufficient to discriminate minima with low generalization error from those with high generalization error, both when there is a single minimum and when there are multiple minima within the given perturbation radius. Thus we present first-order flatness, a stronger measure of flatness focusing on the maximal gradient norm within a perturbation radius, which bounds both the maximal eigenvalue of the Hessian at local minima and the regularization function of SAM. We also present a novel training procedure named Gradient norm Aware Minimization (GAM) to seek minima with uniformly small curvature across all directions. Experimental results show that GAM improves the generalization of models trained with current optimizers such as SGD and AdamW on various datasets and networks. Furthermore, we show that GAM can help SAM find flatter minima and achieve better generalization.
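As a rough illustration of the quantity involved, the sketch below estimates first-order flatness at the current parameters by sampling perturbations inside the radius-rho ball and recording the largest gradient norm found. It is a diagnostic approximation written against PyTorch, not the GAM training procedure itself; the sampling scheme and helper name are assumptions.

```python
# Illustrative sketch only: estimates first-order flatness by random sampling
# within the rho-ball. This is NOT the authors' GAM optimizer.
import torch

def estimate_first_order_flatness(model, loss_fn, x, y, rho=0.05, n_samples=8):
    params = [p for p in model.parameters() if p.requires_grad]
    worst = 0.0
    for _ in range(n_samples):
        # Draw a random direction and scale it to lie on the rho-ball.
        direction = [torch.randn_like(p) for p in params]
        norm = torch.sqrt(sum(d.pow(2).sum() for d in direction)) + 1e-12
        with torch.no_grad():
            for p, d in zip(params, direction):
                p.add_(rho * d / norm)
        # Gradient norm at the perturbed point.
        loss = loss_fn(model(x), y)
        grads = torch.autograd.grad(loss, params)
        gnorm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()
        worst = max(worst, gnorm)
        # Undo the perturbation before trying the next direction.
        with torch.no_grad():
            for p, d in zip(params, direction):
                p.sub_(rho * d / norm)
    # First-order flatness scales the maximal gradient norm by the radius rho.
    return rho * worst
```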
Adaptive and Personalized Exercise Generation for Online Language Learning
Cui, Peng, Sachan, Mrinmaya
Adaptive learning aims to provide customized educational activities (e.g., exercises) to address individual learning needs. However, manually constructing and delivering such activities is a laborious process. Thus, in this paper, we study a novel task of adaptive and personalized exercise generation for online language learning. To this end, we combine a knowledge tracing model, which estimates each student's evolving knowledge state from their learning history, with a controlled text generation model that generates exercise sentences based on the student's current estimated knowledge state and instructor requirements for desired properties (e.g., domain knowledge and difficulty). We train and evaluate our model on real-world learner interaction data from Duolingo and demonstrate that LMs guided by student states can generate superior exercises. We then discuss the potential use of our model in educational applications through various simulations. These simulations show that our model can adapt to students' individual abilities and improve their learning efficiency by personalizing learning sequences.
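A minimal sketch of the two-stage pipeline described above, assuming a toy exponential-moving-average knowledge tracer and a control-prefix interface to the exercise generator; all class, field, and skill names are illustrative, not the paper's implementation.

```python
# Toy illustration of "knowledge tracing -> controlled generation"; the tracer,
# control format, and skill names are hypothetical placeholders.
from dataclasses import dataclass, field

@dataclass
class StudentState:
    mastery: dict = field(default_factory=dict)  # skill -> estimated mastery in [0, 1]

def update_knowledge_state(state, skill, correct, lr=0.2):
    # Toy tracer: move the mastery estimate toward the observed outcome.
    prev = state.mastery.get(skill, 0.5)
    state.mastery[skill] = (1 - lr) * prev + lr * (1.0 if correct else 0.0)
    return state

def build_generation_control(state, target_skills, difficulty):
    # Condition the generator on the least-mastered target skills and the
    # instructor-specified difficulty, e.g. as a control prefix for an LM.
    weak = sorted(target_skills, key=lambda s: state.mastery.get(s, 0.5))[:3]
    return f"<difficulty={difficulty}> <skills={','.join(weak)}>"

state = StudentState()
for skill, correct in [("past_tense", False), ("plurals", True), ("past_tense", False)]:
    update_knowledge_state(state, skill, correct)
print(build_generation_control(state, ["past_tense", "plurals", "articles"], "medium"))
```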
Rethinking the Evaluation Protocol of Domain Generalization
Yu, Han, Zhang, Xingxuan, Xu, Renzhe, Liu, Jiashuo, He, Yue, Cui, Peng
Domain generalization aims to solve the challenge of Out-of-Distribution (OOD) generalization by leveraging common knowledge learned from multiple training domains to generalize to unseen test domains. To accurately evaluate the OOD generalization ability, it is necessary to ensure that test data information is unavailable. However, the current domain generalization protocol may still have potential test data information leakage. This paper examines the potential risks of test data information leakage in two aspects of the current protocol: pretraining on ImageNet and oracle model selection. We propose that training from scratch and using multiple test domains would result in a more precise evaluation of OOD generalization ability. We also rerun the algorithms with the modified protocol and introduce a new leaderboard to encourage future research in domain generalization with a fairer comparison.
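A small sketch of the modified protocol under the stated assumptions: models are trained from scratch, model selection uses only validation data drawn from the training domains, and the reported score averages over several held-out test domains. `train_from_scratch` and `evaluate` are hypothetical callables, not code from the paper or leaderboard.

```python
# Illustrative protocol sketch; the helper callables are placeholders.
import random

def leave_multiple_domains_out(domains, n_test=2, seed=0):
    rng = random.Random(seed)
    test_domains = rng.sample(domains, n_test)
    train_domains = [d for d in domains if d not in test_domains]
    return train_domains, test_domains

def run_protocol(domains, candidate_hparams, train_from_scratch, evaluate):
    train_domains, test_domains = leave_multiple_domains_out(domains)
    best_model, best_val = None, float("-inf")
    for hparams in candidate_hparams:
        # Model selection uses only validation data from the training domains,
        # so no test-domain information leaks into the chosen model.
        model, val_acc = train_from_scratch(train_domains, hparams)
        if val_acc > best_val:
            best_model, best_val = model, val_acc
    # OOD generalization is reported as the average over multiple unseen domains.
    return sum(evaluate(best_model, d) for d in test_domains) / len(test_domains)
```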
Meta Adaptive Task Sampling for Few-Domain Generalization
Shen, Zheyan, Yu, Han, Cui, Peng, Liu, Jiashuo, Zhang, Xingxuan, Zhou, Linjun, Liu, Furui
To ensure out-of-distribution (OOD) generalization performance, traditional domain generalization (DG) methods resort to training on data from multiple sources with different underlying distributions. The success of these DG methods largely depends on the availability of diverse training distributions. However, obtaining enough heterogeneous data usually requires great effort due to high expense, privacy issues, or data scarcity. Thus an interesting yet seldom investigated problem arises: how to improve OOD generalization performance when the perceived heterogeneity is limited. In this paper, we instantiate a new framework called few-domain generalization (FDG), which aims to learn a generalizable model from very few domains of novel tasks with the knowledge acquired from previous learning experiences on base tasks. Moreover, we propose a Meta Adaptive Task Sampling (MATS) procedure to differentiate base tasks according to their semantic and domain-shift similarity to the novel task. Empirically, we show that the newly introduced FDG framework can substantially improve OOD generalization performance on the novel task, and that further combining MATS with episodic training can outperform several state-of-the-art DG baselines on widely used benchmarks such as PACS and DomainNet.
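One way to picture similarity-aware task sampling, assuming each base task and the novel task are summarized by a semantic feature vector and a domain-shift feature vector: score base tasks by a weighted combination of the two similarities and sample episodes in proportion to the score. This is an illustrative sketch, not the MATS procedure.

```python
# Illustrative similarity-weighted task sampling; feature vectors, cosine
# similarity, and the softmax weighting are all assumptions.
import numpy as np

def sample_base_tasks(base_feats, base_shift_feats, novel_feat, novel_shift_feat,
                      n_tasks=4, alpha=0.5, rng=None):
    rng = rng or np.random.default_rng(0)

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Score each base task by semantic similarity and domain-shift similarity
    # to the novel task.
    semantic = np.array([cosine(f, novel_feat) for f in base_feats])
    shift = np.array([cosine(f, novel_shift_feat) for f in base_shift_feats])
    score = alpha * semantic + (1 - alpha) * shift

    # Softmax over scores gives a sampling distribution over base tasks.
    probs = np.exp(score - score.max())
    probs /= probs.sum()
    return rng.choice(len(base_feats), size=n_tasks, replace=False, p=probs)
```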
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
Zhou, Wangchunshu, Jiang, Yuchen Eleanor, Cui, Peng, Wang, Tiannan, Xiao, Zhenxin, Hou, Yifan, Cotterell, Ryan, Sachan, Mrinmaya
The fixed-size context of the Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the Long Short-Term Memory mechanism of an LSTM. At each timestep, RecurrentGPT generates a paragraph of text and updates its language-based long-term and short-term memory, stored on the hard drive and in the prompt, respectively. This recurrence mechanism enables RecurrentGPT to generate texts of arbitrary length without forgetting. Since human users can easily observe and edit the natural language memories, RecurrentGPT is interpretable and enables interactive generation of long text. RecurrentGPT is an initial step towards next-generation computer-assisted writing systems beyond local editing suggestions. In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as interactive fiction that directly interacts with consumers. We call this usage of generative models ``AI As Contents'' (AIAC), which we believe is the next form of conventional AIGC. We further demonstrate the possibility of using RecurrentGPT to create personalized interactive fiction that directly interacts with readers instead of writers. More broadly, RecurrentGPT demonstrates the utility of borrowing ideas from popular model designs in cognitive science and deep learning for prompting LLMs. Our code is available at https://github.com/aiwaves-cn/RecurrentGPT and an online demo is available at https://www.aiwaves.org/recurrentgpt.
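A compact sketch of the recurrence loop described above, assuming a `call_llm` function wrapping an LLM API and a `retrieve` function over the on-disk long-term memory; the prompt format and file layout are invented for illustration and differ from the released code.

```python
# Illustrative recurrence loop; `call_llm`, `retrieve`, and the prompt/format
# conventions are placeholders, not the RecurrentGPT implementation.
import json
from pathlib import Path

LONG_TERM_PATH = Path("long_term_memory.jsonl")

def append_long_term(paragraph):
    # Long-term memory lives on disk and only grows.
    with LONG_TERM_PATH.open("a") as f:
        f.write(json.dumps({"text": paragraph}) + "\n")

def recurrent_generate(call_llm, retrieve, n_steps, short_term="The story begins."):
    story = []
    for _ in range(n_steps):
        relevant = retrieve(short_term)  # fetch related earlier paragraphs from disk
        prompt = (
            f"Short-term memory:\n{short_term}\n\n"
            f"Relevant long-term memory:\n{relevant}\n\n"
            "Write the next paragraph, then an updated short-term memory, "
            "separated by the line '---'."
        )
        paragraph, _, new_short_term = call_llm(prompt).partition("---")
        story.append(paragraph.strip())
        append_long_term(paragraph.strip())
        # Short-term memory lives in the prompt and is rewritten every step.
        short_term = new_short_term.strip() or short_term
    return "\n\n".join(story)
```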
Exploring and Exploiting Data Heterogeneity in Recommendation
Wang, Zimu, Liu, Jiashuo, Zou, Hao, Zhang, Xingxuan, He, Yue, Liang, Dongxu, Cui, Peng
Massive amounts of data are the foundation of data-driven recommendation models. As an inherent nature of big data, data heterogeneity widely exists in real-world recommendation systems. It reflects the differences in properties among sub-populations. Ignoring the heterogeneity in recommendation data can limit the performance of recommendation models, hurt sub-populational robustness, and leave the models misled by biases. However, data heterogeneity has not attracted substantial attention in the recommendation community. This inspires us to adequately explore and exploit heterogeneity to solve the above problems and assist data analysis. In this work, we focus on exploring two representative categories of heterogeneity in recommendation data, namely heterogeneity in the prediction mechanism and in the covariate distribution, and propose an algorithm that explores the heterogeneity through a bilevel clustering method. Furthermore, the uncovered heterogeneity is exploited for two purposes in recommendation scenarios: prediction with multiple sub-models and supporting debiasing. Extensive experiments on real-world data validate the existence of heterogeneity in recommendation data and the effectiveness of exploring and exploiting data heterogeneity in recommendation.
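A toy reconstruction of a two-level clustering loop in this spirit: samples are first grouped by covariates, then reassigned to whichever per-cluster sub-model predicts them best, and the sub-models are refit. The choice of KMeans and ridge regression is an assumption, not the paper's algorithm.

```python
# Illustrative bilevel clustering sketch; model classes and the alternating
# scheme are assumptions rather than the paper's method.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

def bilevel_heterogeneity_clustering(X, y, n_clusters=3, n_iters=5, seed=0):
    # Level 1: covariate-distribution heterogeneity via clustering on X.
    assign = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(X)
    models = [Ridge() for _ in range(n_clusters)]
    for _ in range(n_iters):
        # Fit one sub-model per current cluster.
        for k in range(n_clusters):
            mask = assign == k
            if mask.any():
                models[k].fit(X[mask], y[mask])
        # Level 2: prediction-mechanism heterogeneity -- reassign each sample
        # to the sub-model that explains it best (smallest squared error).
        errors = np.stack([(y - m.predict(X)) ** 2 for m in models], axis=1)
        assign = errors.argmin(axis=1)
    return assign, models
```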
Predictive Heterogeneity: Measures and Applications
Liu, Jiashuo, Wu, Jiayun, Li, Bo, Cui, Peng
As an intrinsic and fundamental property of big data, data heterogeneity exists in a variety of real-world applications, such as precision medicine, autonomous driving, and finance. For machine learning algorithms, ignoring data heterogeneity can greatly hurt generalization performance and algorithmic fairness, since the prediction mechanisms of different sub-populations are likely to differ from each other. In this work, we focus on the data heterogeneity that affects the prediction of machine learning models and first propose the \emph{usable predictive heterogeneity}, which takes into account the model capacity and computational constraints. We prove that it can be reliably estimated from finite data with probably approximately correct (PAC) bounds. Additionally, we design a bi-level optimization algorithm to explore the usable predictive heterogeneity from data. Empirically, the explored heterogeneity provides insights for sub-population divisions in income prediction, crop yield prediction, and image classification tasks, and leveraging such heterogeneity benefits out-of-distribution generalization performance.
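The intuition behind the measure can be sketched as follows, under the assumption that we compare a pooled model against per-sub-population models from the same model class: the larger the loss gap, the more predictive heterogeneity the candidate split exposes. The paper's information-theoretic definition and bi-level search are not reproduced here.

```python
# Illustrative loss-gap sketch of predictive heterogeneity for a given split;
# the model class (linear regression) and the gain definition are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

def heterogeneity_gain(X, y, split_labels):
    def mse(model, Xs, ys):
        return float(np.mean((ys - model.predict(Xs)) ** 2))

    # Loss of a single model fit on the pooled data.
    pooled = LinearRegression().fit(X, y)
    pooled_loss = mse(pooled, X, y)

    # Size-weighted loss of one model per sub-population.
    split_loss = 0.0
    for g in np.unique(split_labels):
        mask = split_labels == g
        sub = LinearRegression().fit(X[mask], y[mask])
        split_loss += mask.mean() * mse(sub, X[mask], y[mask])

    # Positive gain: the sub-populations are predicted better by separate models.
    return pooled_loss - split_loss
```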
Confidence-based Reliable Learning under Dual Noises
Cui, Peng, Yue, Yang, Deng, Zhijie, Zhu, Jun
Deep neural networks (DNNs) have achieved remarkable success in a variety of computer vision tasks, where massive labeled images are routinely required for model optimization. Yet, data collected from the open world are unavoidably polluted by noise, which may significantly undermine the efficacy of the learned models. Various attempts have been made to reliably train DNNs under data noise, but they separately account for either the noise in the labels or the noise in the images. A naive combination of the two lines of work would suffer from the limitations of both and miss the opportunity to handle the two kinds of noise in parallel. This work provides a first, unified framework for reliable learning under joint (image, label) noise. Technically, we develop a confidence-based sample filter to progressively filter out noisy data without the need to pre-specify the noise ratio. We then penalize the model uncertainty on the detected noisy data instead of letting the model continue over-fitting the misleading information in them. Experimental results on various challenging synthetic and real-world noisy datasets verify that the proposed method outperforms competing baselines in classification performance.
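A minimal sketch of a confidence-based objective in this spirit, assuming a fixed confidence threshold: samples the model is confident about contribute ordinary cross-entropy, while suspected noisy samples are only penalized for predictive entropy. The thresholding and weighting are placeholders rather than the authors' exact method.

```python
# Illustrative dual-noise loss; the fixed threshold and entropy weight are
# assumptions, not the paper's adaptive filtering scheme.
import torch
import torch.nn.functional as F

def dual_noise_loss(logits, targets, conf_threshold=0.7, uncertainty_weight=0.1):
    probs = F.softmax(logits, dim=1)
    confidence, _ = probs.max(dim=1)
    clean = confidence >= conf_threshold

    loss = logits.new_zeros(())
    if clean.any():
        # Standard cross-entropy on samples deemed clean.
        loss = loss + F.cross_entropy(logits[clean], targets[clean])
    if (~clean).any():
        # Penalize uncertainty on suspected noisy samples instead of fitting
        # their possibly corrupted labels.
        entropy = -(probs[~clean] * torch.log(probs[~clean] + 1e-12)).sum(dim=1).mean()
        loss = loss + uncertainty_weight * entropy
    return loss
```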
Model Agnostic Sample Reweighting for Out-of-Distribution Learning
Zhou, Xiao, Lin, Yong, Pi, Renjie, Zhang, Weizhong, Xu, Renzhe, Cui, Peng, Zhang, Tong
Distributionally robust optimization (DRO) and invariant risk minimization (IRM) are two popular methods proposed to improve the out-of-distribution (OOD) generalization performance of machine learning models. While effective for small models, it has been observed that these methods can be vulnerable to overfitting with large overparameterized models. This work proposes a principled method, \textbf{M}odel \textbf{A}gnostic sam\textbf{PL}e r\textbf{E}weighting (\textbf{MAPLE}), to effectively address the OOD problem, especially in overparameterized scenarios. Our key idea is to find an effective reweighting of the training samples so that standard empirical risk minimization training of a large model on the weighted training data leads to superior OOD generalization performance. The overfitting issue is addressed with a bilevel formulation that searches for the sample reweighting, in which the generalization complexity depends on the search space of sample weights instead of the model size. We present a theoretical analysis in the linear case to prove the insensitivity of MAPLE to model size, and empirically verify its superiority in surpassing state-of-the-art methods by a large margin. Code is available at \url{https://github.com/x-zho14/MAPLE}.
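A toy instantiation of the bilevel idea, assuming a small logistic-regression model and a held-out validation set: the outer loop proposes sample weights by random search and the inner loop refits by weighted ERM, keeping proposals that improve validation accuracy. This is an illustrative sketch, not MAPLE's algorithm or released code.

```python
# Illustrative bilevel reweighting sketch; the random-search outer loop and
# logistic-regression inner model are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def bilevel_reweighting(X_tr, y_tr, X_val, y_val, n_rounds=20, noise=0.3, seed=0):
    rng = np.random.default_rng(seed)
    w = np.ones(len(X_tr)) / len(X_tr)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr, sample_weight=w)
    best_val = model.score(X_val, y_val)
    for _ in range(n_rounds):
        # Outer: propose perturbed sample weights.
        cand = w * np.exp(noise * rng.standard_normal(len(w)))
        cand /= cand.sum()
        # Inner: standard (weighted) ERM with the proposed weights.
        cand_model = LogisticRegression(max_iter=1000).fit(
            X_tr, y_tr, sample_weight=cand)
        # Keep the proposal only if it improves the held-out score.
        val = cand_model.score(X_val, y_val)
        if val > best_val:
            w, model, best_val = cand, cand_model, val
    return model, w
```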
Product Ranking for Revenue Maximization with Multiple Purchases
Xu, Renzhe, Zhang, Xingxuan, Li, Bo, Zhang, Yafeng, Chen, Xiaolong, Cui, Peng
Product ranking is the core problem for revenue-maximizing online retailers. To design proper product ranking algorithms, various consumer choice models have been proposed to characterize consumers' behaviors when they are provided with a list of products. However, existing works assume that each consumer purchases at most one product or keeps viewing the product list after purchasing a product, which does not agree with common practice in real scenarios. In this paper, we assume that each consumer can purchase multiple products at will. To model consumers' willingness to view and purchase, we set a random attention span and purchase budget, which determine the maximal number of products that they view and purchase, respectively. Under this setting, we first design an optimal ranking policy for the case where the online retailer can precisely model consumers' behaviors. Based on this policy, we further develop the Multiple-Purchase-with-Budget UCB (MPB-UCB) algorithms with $\tilde{O}(\sqrt{T})$ regret that estimate consumers' behaviors and maximize revenue simultaneously in online settings. Experiments on both synthetic and semi-synthetic datasets demonstrate the effectiveness of the proposed algorithms.
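A rough sketch of an optimism-based ranking policy under this setting, assuming per-product UCB estimates of purchase rates and feedback limited to the prefix of the list the consumer actually viewed; the exact MPB-UCB estimators and regret analysis are not reproduced.

```python
# Illustrative UCB-style ranker; the bonus form and update rule are assumptions,
# not the paper's MPB-UCB algorithm.
import math

class UCBRanker:
    def __init__(self, n_products):
        self.views = [0] * n_products
        self.purchases = [0] * n_products
        self.t = 0

    def rank(self):
        # Rank products by optimistic purchase-rate estimates.
        self.t += 1
        def ucb(i):
            if self.views[i] == 0:
                return float("inf")  # show unexplored products early
            mean = self.purchases[i] / self.views[i]
            bonus = math.sqrt(2 * math.log(self.t) / self.views[i])
            return mean + bonus
        return sorted(range(len(self.views)), key=ucb, reverse=True)

    def update(self, viewed, purchased):
        # `viewed` is the prefix of the ranking seen before the consumer's
        # attention span ran out; `purchased` are the products they bought
        # within their budget.
        for i in viewed:
            self.views[i] += 1
        for i in purchased:
            self.purchases[i] += 1
```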