Cui, Peng
Accurate and Reliable Predictions with Mutual-Transport Ensemble
Liu, Han, Cui, Peng, Wang, Bingning, Zhu, Jun, Hu, Xiaolin
Table 3 presents the performance of various models in detecting misclassifications. Our method showed significant improvements over other single-model calibration techniques and over the DE method.

OOD Detection: A reliable classification model should exhibit higher predictive uncertainty and lower confidence when encountering test samples that differ significantly from the training data. We assessed the ability of different calibration methods to distinguish OOD samples by blending in-distribution test data with OOD data, evaluating two settings for models trained on CIFAR-10 and CIFAR-100: far OOD detection and near OOD detection (Fort et al., 2019; Hendrycks et al., 2019). Far OOD detection involved distinguishing CIFAR-10 from SVHN (Netzer et al., 2011) for models trained on CIFAR-10, and CIFAR-100 from SVHN for models trained on CIFAR-100. Near OOD detection required distinguishing CIFAR-10 from CIFAR-100, which have similar domains. The results, presented in Table 4, demonstrate significant improvements of our method over other single-model calibration methods, even surpassing the DE method, which is known for its effectiveness in OOD detection.
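As an illustration of the evaluation protocol described above (not of the Mutual-Transport Ensemble itself), OOD detection is typically scored by computing the AUROC of a per-sample confidence measure over a mix of in-distribution and OOD test samples. A minimal sketch, assuming a trained classifier `model` and PyTorch data loaders `id_loader` / `ood_loader`, and using the maximum softmax probability as the confidence score:

```python
# Illustrative sketch of the OOD-detection evaluation protocol: score each
# sample with a confidence measure, then compute the AUROC for separating
# in-distribution (e.g., CIFAR-10) from OOD (e.g., SVHN) test samples.
import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score

@torch.no_grad()
def max_softmax_confidence(model, loader, device="cpu"):
    """Maximum softmax probability as a simple per-sample confidence score."""
    scores = []
    model.eval()
    for x, _ in loader:
        probs = F.softmax(model(x.to(device)), dim=-1)
        scores.append(probs.max(dim=-1).values.cpu())
    return torch.cat(scores)

def ood_auroc(model, id_loader, ood_loader, device="cpu"):
    """AUROC for distinguishing in-distribution from OOD samples."""
    id_conf = max_softmax_confidence(model, id_loader, device)
    ood_conf = max_softmax_confidence(model, ood_loader, device)
    labels = torch.cat([torch.ones_like(id_conf), torch.zeros_like(ood_conf)])
    scores = torch.cat([id_conf, ood_conf])
    return roc_auc_score(labels.numpy(), scores.numpy())
```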
Stability Evaluation via Distributional Perturbation Analysis
Blanchet, Jose, Cui, Peng, Li, Jiajin, Liu, Jiashuo
The performance of learning models often deteriorates when deployed in out-of-sample environments. To ensure reliable deployment, we propose a stability evaluation criterion based on distributional perturbations. Conceptually, our stability evaluation criterion is defined as the minimal perturbation required on our observed dataset to induce a prescribed deterioration in risk evaluation. In this paper, we utilize the optimal transport (OT) discrepancy with moment constraints on the (sample, density) space to quantify this perturbation. Therefore, our stability evaluation criterion can address both data corruptions and sub-population shifts -- the two most common types of distribution shifts in real-world scenarios. To further realize practical benefits, we present a series of tractable convex formulations and computational methods tailored to different classes of loss functions. The key technical tool to achieve this is the strong duality theorem provided in this paper. Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications. These empirical studies showcase the criterion's ability not only to compare the stability of different learning models and features but also to provide valuable guidelines and strategies to further improve models.
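To make the criterion concrete in a deliberately simplified setting: if perturbations are restricted to reweighting the observed samples (sub-population shifts only) and the OT cost is replaced by a KL divergence from the empirical distribution, then the minimal perturbation that pushes the risk to a prescribed level is an exponential tilt of the per-sample losses. The sketch below illustrates only this simplification; it is not the paper's OT formulation with moment constraints, and `losses` / `target_risk` are placeholder inputs:

```python
# Simplified sketch of a stability criterion: restrict perturbations to
# reweightings of the observed samples and measure the perturbation by the
# KL divergence from uniform empirical weights. The least-perturbed
# reweighting whose weighted risk reaches `target_risk` is an exponential
# tilt of the per-sample losses; we find the tilt parameter by bisection.
import numpy as np

def stability_via_reweighting(losses, target_risk, eta_max=100.0, tol=1e-8):
    """Return (KL cost, weights) of the minimal reweighting whose weighted
    risk is at least `target_risk` (assumes eta_max is large enough)."""
    losses = np.asarray(losses, dtype=float)
    n = len(losses)
    if losses.mean() >= target_risk:        # already above the threshold
        return 0.0, np.full(n, 1.0 / n)
    if target_risk > losses.max():          # unreachable by reweighting alone
        return np.inf, None

    def tilted_weights(eta):
        w = np.exp(eta * (losses - losses.max()))   # numerically stabilized
        return w / w.sum()

    lo, hi = 0.0, eta_max
    while hi - lo > tol:                    # weighted risk increases with eta
        mid = 0.5 * (lo + hi)
        if tilted_weights(mid) @ losses < target_risk:
            lo = mid
        else:
            hi = mid
    w = tilted_weights(hi)
    kl = float(np.sum(w * np.log(np.clip(w * n, 1e-300, None))))
    return kl, w
```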
Debiased Collaborative Filtering with Kernel-Based Causal Balancing
Li, Haoxuan, Zheng, Chunyuan, Xiao, Yanghao, Wu, Peng, Geng, Zhi, Chen, Xu, Cui, Peng
Debiased collaborative filtering aims to learn an unbiased prediction model by removing different biases in observational datasets. One simple and effective approach to this problem is based on the propensity score, which adjusts the observational sample distribution to the target one by reweighting observed instances. Ideally, propensity scores should be learned with causal balancing constraints. However, existing methods usually ignore such constraints or implement them with unreasonable approximations, which may affect the accuracy of the learned propensity scores. To bridge this gap, in this paper we first analyze the gaps between the causal balancing requirements and existing methods, such as learning the propensity with cross-entropy loss or manually selecting balancing functions. Motivated by these gaps, we propose to approximate the balancing functions in a reproducing kernel Hilbert space and demonstrate that, based on the universal property and representer theorem of kernel functions, the causal balancing constraints can be better satisfied. Meanwhile, we propose an algorithm that adaptively balances the kernel function and theoretically analyze the generalization error bound of our methods. We conduct extensive experiments to demonstrate the effectiveness of our methods, and to promote this research direction we have released our project at https://github.com/haoxuanli-pku/ICLR24-Kernel-Balancing.
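The propensity-based reweighting mentioned above can be illustrated with the standard inverse-propensity-scoring (IPS) objective; the sketch below shows plain IPS reweighting rather than the kernel-based balancing proposed in the paper, and the tensors `preds`, `ratings`, `propensities` are placeholders aligned over observed user-item pairs:

```python
# Standard inverse-propensity-scoring (IPS) objective for debiased
# collaborative filtering: each observed (user, item) loss is reweighted by
# the inverse of its estimated propensity so that the observed sample
# approximates the target (fully observed) distribution.
import torch

def ips_loss(preds, ratings, propensities, clip=0.05):
    """IPS-weighted squared error over observed user-item pairs.

    `clip` bounds propensities away from zero to control variance.
    """
    p = propensities.clamp(min=clip)
    return torch.mean((preds - ratings) ** 2 / p)
```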
PPA-Game: Characterizing and Learning Competitive Dynamics Among Online Content Creators
Xu, Renzhe, Wang, Haotian, Zhang, Xingxuan, Li, Bo, Cui, Peng
We introduce the Proportional Payoff Allocation Game (PPA-Game) to model how agents, akin to content creators on platforms like YouTube and TikTok, compete for divisible resources and consumers' attention. Payoffs are allocated to agents based on heterogeneous weights, reflecting the diversity in content quality among creators. Our analysis reveals that although a pure Nash equilibrium (PNE) is not guaranteed in every scenario, it is commonly observed, with its absence being rare in our simulations. Beyond analyzing static payoffs, we further discuss the agents' online learning about resource payoffs by integrating a multi-player multi-armed bandit framework. We propose an online algorithm facilitating each agent's maximization of cumulative payoffs over $T$ rounds. Theoretically, we establish that the regret of any agent is bounded by $O(\log^{1 + \eta} T)$ for any $\eta > 0$. Empirical results further validate the effectiveness of our approach.
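One plausible reading of the proportional allocation rule named in the title is that each resource's payoff is split among the agents who select it, in proportion to their heterogeneous weights. The sketch below assumes that rule, which may differ in detail from the paper's exact payoff function:

```python
# Sketch of a proportional payoff allocation rule: a resource's payoff is
# divided among the agents that select it, proportionally to their weights.
from collections import defaultdict

def allocate_payoffs(choices, weights, resource_payoffs):
    """choices[i]: resource chosen by agent i; weights[i]: agent i's weight;
    resource_payoffs[k]: payoff generated by resource k."""
    mass = defaultdict(float)
    for agent, resource in enumerate(choices):
        mass[resource] += weights[agent]
    return [
        weights[agent] / mass[resource] * resource_payoffs[resource]
        for agent, resource in enumerate(choices)
    ]

# Example: agents with weights 2 and 1 both pick resource 0 (payoff 3.0),
# so they receive 2.0 and 1.0 respectively.
print(allocate_payoffs([0, 0], [2.0, 1.0], {0: 3.0}))
```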
A Survey on Evaluation of Out-of-Distribution Generalization
Yu, Han, Liu, Jiashuo, Zhang, Xingxuan, Wu, Jiayun, Cui, Peng
Machine learning models, while progressively advanced, rely heavily on the IID assumption, which is often violated in practice due to inevitable distribution shifts. This renders them susceptible to failure and untrustworthy for deployment in risk-sensitive applications. This significant problem has consequently spawned various lines of work dedicated to developing algorithms capable of Out-of-Distribution (OOD) generalization. Despite these efforts, much less attention has been paid to the evaluation of OOD generalization, which is also a complex and fundamental problem. Its goal is not only to assess whether a model's OOD generalization capability is strong, but also to evaluate where a model generalizes well or poorly. This entails characterizing the types of distribution shifts that a model can effectively address and identifying the safe and risky input regions given a model. This paper serves as the first effort to conduct a comprehensive review of OOD evaluation. We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization, according to the availability of test data. Additionally, we briefly discuss OOD evaluation in the context of pretrained models. In closing, we propose several promising directions for future research in OOD evaluation.
Enhancing Distributional Stability among Sub-populations
Liu, Jiashuo, Wu, Jiayun, Peng, Jie, Wu, Xiaoyu, Zheng, Yang, Li, Bo, Cui, Peng
Enhancing the stability of machine learning algorithms under distributional shifts is at the heart of the Out-of-Distribution (OOD) generalization problem. Derived from causal learning, recent works on invariant learning pursue strict invariance across multiple training environments. Although intuitively reasonable, strong assumptions on the availability and quality of environments are made to learn the strict invariance property. In this work, we introduce the notion of "distributional stability" to mitigate such limitations. It quantifies the stability of prediction mechanisms among sub-populations down to a prescribed scale. Based on this, we propose the learnability assumption and derive the generalization error bound under distribution shifts. Inspired by the theoretical analyses, we propose our novel stable risk minimization (SRM) algorithm to enhance the model's stability w.r.t. shifts in prediction mechanisms ($Y|X$-shifts). Experimental results are consistent with our intuition and validate the effectiveness of our algorithm. The code can be found at https://github.com/LJSthu/SRM.
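One standard way to probe stability among sub-populations at a prescribed scale, loosely related to the notion above but not the SRM algorithm itself, is the worst-case risk over all sub-populations containing at least an alpha-fraction of the data; this equals the mean of the largest ceil(alpha * n) per-sample losses. A minimal sketch, with `losses` as a placeholder array of per-sample losses:

```python
# Sketch of a sub-population stability probe (not the paper's SRM algorithm):
# the worst-case average loss over every sub-population containing at least
# an alpha-fraction of the data is the mean of the largest ceil(alpha * n)
# per-sample losses.
import numpy as np

def worst_subpopulation_risk(losses, alpha=0.2):
    """Worst-case average loss over sub-populations of size >= alpha * n."""
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending
    k = max(1, int(np.ceil(alpha * len(losses))))
    return losses[:k].mean()
```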
On the Out-Of-Distribution Generalization of Multimodal Large Language Models
Zhang, Xingxuan, Li, Jiansheng, Chu, Wenjing, Hai, Junjia, Xu, Renzhe, Yang, Yuqing, Guan, Shikai, Xu, Jiazheng, Cui, Peng
We investigate the generalization boundaries of current Multimodal Large Language Models (MLLMs) via comprehensive evaluation under out-of-distribution scenarios and domain-specific tasks. We evaluate their zero-shot generalization across synthetic images, real-world distributional shifts, and specialized datasets like medical and molecular imagery. Empirical results indicate that MLLMs struggle with generalization beyond common training domains, limiting their direct application without adaptation. To understand the cause of unreliable performance, we analyze three hypotheses: semantic misinterpretation, visual feature extraction insufficiency, and mapping deficiency. Results identify mapping deficiency as the primary hurdle. To address this problem, we show that in-context learning (ICL) can significantly enhance MLLMs' generalization, opening new avenues for overcoming generalization barriers. We further explore the robustness of ICL under distribution shifts and show its vulnerability to domain shifts, label shifts, and spurious correlation shifts between in-context examples and test data.
Emergence and Causality in Complex Systems: A Survey on Causal Emergence and Related Quantitative Studies
Yuan, Bing, Jiang, Zhang, Lyu, Aobo, Wu, Jiayun, Wang, Zhipeng, Yang, Mingzhe, Liu, Kaiwei, Mou, Muyun, Cui, Peng
Emergence and causality are two fundamental concepts for understanding complex systems, and they are interconnected. On one hand, emergence refers to the phenomenon where macroscopic properties cannot be attributed solely to the properties of individual components. On the other hand, causality can exhibit emergence, meaning that new causal laws may arise as we increase the level of abstraction. Causal emergence theory aims to bridge these two concepts and even employs measures of causality to quantify emergence. This paper provides a comprehensive review of recent advancements in quantitative theories and applications of causal emergence. Two key problems are addressed: quantifying causal emergence and identifying it in data. Addressing the latter requires the use of machine learning techniques, thus establishing a connection between causal emergence and artificial intelligence. We highlight that the architectures used for identifying causal emergence are shared by causal representation learning, causal model abstraction, and world-model-based reinforcement learning. Consequently, progress in any of these areas can benefit the others. Potential applications and future perspectives are also discussed in the final section of the review.
Agents: An Open-source Framework for Autonomous Language Agents
Zhou, Wangchunshu, Jiang, Yuchen Eleanor, Li, Long, Wu, Jialong, Wang, Tiannan, Qiu, Shi, Zhang, Jintian, Chen, Jing, Wu, Ruipu, Wang, Shuai, Zhu, Shiding, Chen, Jiyu, Zhang, Wentao, Tang, Xiangru, Zhang, Ningyu, Chen, Huajun, Cui, Peng, Sachan, Mrinmaya
Recent advances in large language models (LLMs) enable researchers and developers to build autonomous language agents that can automatically solve various tasks and interact with environments, humans, and other agents using natural language interfaces. We consider language agents a promising direction towards artificial general intelligence and release Agents, an open-source library with the goal of opening up these advances to a wider non-specialist audience. Agents is carefully engineered to support important features including planning, memory, tool usage, multi-agent communication, and fine-grained symbolic control. Agents is user-friendly in that it enables non-specialists to build, customize, test, tune, and deploy state-of-the-art autonomous language agents without much coding. The library is also research-friendly, as its modularized design makes it easily extensible for researchers. Agents is available at https://github.com/aiwaves-cn/agents.
Neural Eigenfunctions Are Structured Representation Learners
Deng, Zhijie, Shi, Jiaxin, Zhang, Hao, Cui, Peng, Lu, Cewu, Zhu, Jun
This paper introduces a structured, adaptive-length deep representation called Neural Eigenmap. Unlike prior spectral methods such as Laplacian Eigenmap, which operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF (Deng et al., 2022) to parametrically model eigenfunctions using a neural network. We show that, when the eigenfunction is derived from positive relations in a data augmentation setup, applying NeuralEF results in an objective function that resembles those of popular self-supervised learning methods, with an additional symmetry-breaking property that leads to structured representations where features are ordered by importance. We demonstrate using such representations as adaptive-length codes in image retrieval systems. By truncating features according to their importance, our method requires up to 16× shorter representation lengths than leading self-supervised learning methods to achieve similar retrieval performance. We further apply our method to graph data and report strong results on a node representation learning benchmark with more than one million nodes.

Automatically learning representations from unlabelled data is a long-standing challenge in machine learning. Often, the motivation is to map data to a vector space where geometric distance reflects semantic closeness. This enables, for example, retrieving semantically related information by finding nearest neighbors, or discovering concepts with clustering. One can also pass such representations as inputs to supervised learning procedures, which removes the need for feature engineering.
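The adaptive-length retrieval use case can be sketched directly: because the learned features are ordered by importance, database codes can be truncated to their leading dimensions before nearest-neighbour search. A minimal sketch, assuming precomputed arrays `query_codes` and `db_codes` of importance-ordered representations:

```python
# Sketch of adaptive-length retrieval with importance-ordered codes: truncate
# representations to their leading dimensions, then rank database items by
# cosine similarity to each query.
import numpy as np

def retrieve(query_codes, db_codes, keep_dims, top_k=10):
    """Return indices of the top_k nearest database items for each query,
    using only the first `keep_dims` (most important) feature dimensions."""
    q = query_codes[:, :keep_dims]
    d = db_codes[:, :keep_dims]
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    sims = q @ d.T                              # cosine similarities
    return np.argsort(-sims, axis=1)[:, :top_k]
```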