AITopics | Bai, Qinxun

Collaborating Authors

Bai, Qinxun

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Concurrent Learning with Aggregated States via Randomized Least Squares Value Iteration

Chen, Yan, Bai, Qinxun, Zhang, Yiteng, Dong, Shi, Dimakopoulou, Maria, Sun, Qi, Zhou, Zhengyuan

arXiv.org Artificial IntelligenceJan-30-2025

Designing learning agents that explore efficiently in a complex environment has been widely recognized as a fundamental challenge in reinforcement learning. While a number of works have demonstrated the effectiveness of techniques based on randomized value functions on a single agent, it remains unclear, from a theoretical point of view, whether injecting randomization can help a society of agents {\it concurently} explore an environment. The theoretical results %that we established in this work tender an affirmative answer to this question. We adapt the concurrent learning framework to \textit{randomized least-squares value iteration} (RLSVI) with \textit{aggregated state representation}. We demonstrate polynomial worst-case regret bounds in both finite- and infinite-horizon environments. In both setups the per-agent regret decreases at an optimal rate of $\Theta\left(\frac{1}{\sqrt{N}}\right)$, highlighting the advantage of concurent learning. Our algorithm exhibits significantly lower space complexity compared to \cite{russo2019worst} and \cite{agrawal2021improved}. We reduce the space complexity by a factor of $K$ while incurring only a $\sqrt{K}$ increase in the worst-case regret bound, compared to \citep{agrawal2021improved,russo2019worst}. Additionally, we conduct numerical experiments to demonstrate our theoretical findings.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2501.13394

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

SLIM: Sim-to-Real Legged Instructive Manipulation via Long-Horizon Visuomotor Learning

Zhang, Haichao, Yu, Haonan, Zhao, Le, Choi, Andrew, Bai, Qinxun, Yang, Break, Xu, Wei

arXiv.org Artificial IntelligenceJan-29-2025

We present a low-cost legged mobile manipulation system that solves long-horizon real-world tasks, trained by reinforcement learning purely in simulation. This system is made possible by 1) a hierarchical design of a high-level policy for visual-mobile manipulation following task instructions, and a low-level quadruped locomotion policy, 2) a teacher and student training pipeline for the high level, which trains a teacher to tackle long-horizon tasks using privileged task decomposition and target object information, and further trains a student for visual-mobile manipulation via RL guided by the teacher's behavior, and 3) a suite of techniques for minimizing the sim-to-real gap. In contrast to many previous works that use high-end equipments, our system demonstrates effective performance with more accessible hardware -- specifically, a Unitree Go1 quadruped, a WidowX-250S arm, and a single wrist-mounted RGB camera -- despite the increased challenges of sim-to-real transfer. Trained fully in simulation, a single policy autonomously solves long-horizon tasks involving search, move to, grasp, transport, and drop into, achieving nearly 80% real-world success. This performance is comparable to that of expert human teleoperation on the same tasks while the robot is more efficient, operating at about 1.5x the speed of the teleoperation. Finally, we perform extensive ablations on key techniques for efficient RL training and effective sim-to-real transfer, and demonstrate effective deployment across diverse indoor and outdoor scenes under various lighting conditions.

artificial intelligence, conference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.09905

Genre: Research Report (0.50)

Industry:

Education (0.86)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Enhancing Diversity in Bayesian Deep Learning via Hyperspherical Energy Minimization of CKA

Smerkous, David, Bai, Qinxun, Li, Fuxin

arXiv.org Artificial IntelligenceOct-31-2024

Particle-based Bayesian deep learning often requires a similarity metric to compare two networks. However, naive similarity metrics lack permutation invariance and are inappropriate for comparing networks. Centered Kernel Alignment (CKA) on feature kernels has been proposed to compare deep networks but has not been used as an optimization objective in Bayesian deep learning. In this paper, we explore the use of CKA in Bayesian deep learning to generate diverse ensembles and hypernetworks that output a network posterior. Noting that CKA projects kernels onto a unit hypersphere and that directly optimizing the CKA objective leads to diminishing gradients when two networks are very similar. We propose adopting the approach of hyperspherical energy (HE) on top of CKA kernels to address this drawback and improve training stability. Additionally, by leveraging CKA-based feature kernels, we derive feature repulsive terms applied to synthetically generated outlier examples. Experiments on both diverse ensembles and hypernetworks show that our approach significantly outperforms baselines in terms of uncertainty quantification in both synthetic and realistic outlier detection tasks.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.00259

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Large Legislative Models: Towards Efficient AI Policymaking in Economic Simulations

Gasztowtt, Henry, Smith, Benjamin, Zhu, Vincent, Bai, Qinxun, Zhang, Edwin

arXiv.org Artificial IntelligenceOct-10-2024

The improvement of economic policymaking presents an opportunity for broad societal benefit, a notion that has inspired research towards AI-driven policymaking tools. AI policymaking holds the potential to surpass human performance through the ability to process data quickly at scale. However, existing RL-based methods exhibit sample inefficiency, and are further limited by an inability to flexibly incorporate nuanced information into their decision-making processes. Thus, we propose a novel method in which we instead utilize pre-trained Large Language Models (LLMs), as sample-efficient policymakers in socially complex multi-agent reinforcement learning (MARL) scenarios. We demonstrate significant efficiency gains, outperforming existing methods across three environments. Our code is available at https://github.com/hegasz/large-legislative-models.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.08345

Genre: Research Report (1.00)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)
Government (1.00)
Banking & Finance > Economy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

Li, Jiachen, Zhang, Edwin, Yin, Ming, Bai, Qinxun, Wang, Yu-Xiang, Wang, William Yang

arXiv.org Artificial IntelligenceJul-22-2023

Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while constrained by the behavior policy to avoid a significant distributional shift. In this paper, we propose our closed-form policy improvement operators. We make a novel observation that the behavior constraint naturally motivates the use of first-order Taylor approximation, leading to a linear approximation of the policy objective. Additionally, as practical datasets are usually collected by heterogeneous policies, we model the behavior policies as a Gaussian Mixture and overcome the induced optimization difficulties by leveraging the LogSumExp's lower bound and Jensen's Inequality, giving rise to a closed-form policy improvement operator. We instantiate offline RL algorithms with our novel policy improvement operators and empirically demonstrate their effectiveness over state-of-the-art algorithms on the standard D4RL benchmark. Our code is available at https://cfpi-icml23.github.io/.

artificial intelligence, machine learning, offline reinforcement learning, (11 more...)

arXiv.org Artificial Intelligence

2211.15956

Country: North America > United States > California > Santa Barbara County > Santa Barbara (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Generative Particle Variational Inference via Estimation of Functional Gradients

Ratzlaff, Neale, Bai, Qinxun, Fuxin, Li, Xu, Wei

arXiv.org Machine LearningMar-1-2021

Recently, particle-based variational inference (ParVI) methods have gained interest because they directly minimize the Kullback-Leibler divergence and do not suffer from approximation errors from the evidence-based lower bound. However, many ParVI approaches do not allow arbitrary sampling from the posterior, and the few that do allow such sampling suffer from suboptimality. This work proposes a new method for learning to approximately sample from the posterior distribution. We construct a neural sampler that is trained with the functional gradient of the KL-divergence between the empirical sampling distribution and the target distribution, assuming the gradient resides within a reproducing kernel Hilbert space. Our generative ParVI (GPVI) approach maintains the asymptotic performance of ParVI methods while offering the flexibility of a generative sampler. Through carefully constructed experiments, we show that GPVI outperforms previous generative ParVI methods such as amortized SVGD, and is competitive with ParVI as well as gold-standard approaches like Hamiltonian Monte Carlo for fitting both exactly known and intractable target distributions.

artificial intelligence, generative particle variational inference, neural network, (13 more...)

arXiv.org Machine Learning

2103.01291

Country:

North America > United States > Oregon (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)

Add feedback

TopoReg: A Topological Regularizer for Classifiers

Chen, Chao, Ni, Xiuyan, Bai, Qinxun, Wang, Yusu

arXiv.org Machine LearningJun-27-2018

Regularization plays a crucial role in supervised learning. A successfully regularized model strikes a balance between a perfect description of the training data and the ability to generalize to unseen data. Most existing methods enforce a global regularization in a structure agnostic manner. In this paper, we initiate a new direction and propose to enforce the structural simplicity of the classification boundary by regularizing over its topological complexity. In particular, our measurement of topological complexity incorporates the importance of topological features (e.g., connected components, handles, and so on) in a meaningful manner, and provides a direct control over spurious topological structures. We incorporate the new measurement as a topological loss in training classifiers. We also propose an efficient algorithm to compute the gradient. Our method provides a novel way to topologically simplify the global structure of the model, without having to sacrifice too much of the flexibility of the model. We demonstrate the effectiveness of our new topological regularizer on a range of synthetic and real-world datasets.

classifier, health & medicine, oncology, (18 more...)

arXiv.org Machine Learning

1806.10714

Country: North America > United States (0.69)

Genre: Research Report (0.65)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Class Probability Estimation via Differential Geometric Regularization

Bai, Qinxun, Rosenberg, Steven, Wu, Zheng, Sclaroff, Stan

arXiv.org Machine LearningFeb-11-2016

We study the problem of supervised learning for both binary and multiclass classification from a unified geometric perspective. In particular, we propose a geometric regularization technique to find the submanifold corresponding to a robust estimator of the class probability $P(y|\pmb{x})$. The regularization term measures the volume of this submanifold, based on the intuition that overfitting produces rapid local oscillations and hence large volume of the estimator. This technique can be applied to regularize any classification function that satisfies two requirements: firstly, an estimator of the class probability can be obtained; secondly, first and second derivatives of the class probability estimator can be calculated. In experiments, we apply our regularization technique to standard loss functions for classification, our RBF-based implementation compares favorably to widely used regularization methods for both binary and multiclass classification.

artificial intelligence, class probability estimation, machine learning, (13 more...)

arXiv.org Machine Learning

1503.01436

Country: North America > United States > Massachusetts > Middlesex County (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback