AITopics

2504.04829

Country: Asia > China (0.46)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Machine LearningMar-23-2025

Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems

Lin, Zhidi, Li, Ying, Yin, Feng, Maroñas, Juan, Thiéry, Alexandre H.

Gaussian process state-space models (GPSSMs) have emerged as a powerful framework for modeling dynamical systems, offering interpretable uncertainty quantification and inherent regularization. However, existing GPSSMs face significant challenges in handling high-dimensional, non-stationary systems due to computational inefficiencies, limited scalability, and restrictive stationarity assumptions. In this paper, we propose an efficient transformed Gaussian process state-space model (ETGPSSM) to address these limitations. Our approach leverages a single shared Gaussian process (GP) combined with normalizing flows and Bayesian neural networks, enabling efficient modeling of complex, high-dimensional state transitions while preserving scalability. To address the lack of closed-form expressions for the implicit process in the transformed GP, we follow its generative process and introduce an efficient variational inference algorithm, aided by the ensemble Kalman filter (EnKF), to enable computationally tractable learning and inference. Extensive empirical evaluations on synthetic and real-world datasets demonstrate the superior performance of our ETGPSSM in system dynamics learning, high-dimensional state estimation, and time-series forecasting, outperforming existing GPSSMs and neural network-based methods in both accuracy and computational efficiency.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

2503.18309

Country: Asia > China (0.28)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Artificial IntelligenceNov-18-2024

Hybrid Data-Driven SSM for Interpretable and Label-Free mmWave Channel Prediction

Sun, Yiyong, He, Jiajun, Lin, Zhidi, Pu, Wenqiang, Yin, Feng, So, Hing Cheung

Accurate prediction of mmWave time-varying channels is essential for mitigating the issue of channel aging in complex scenarios owing to high user mobility. Existing channel prediction methods have limitations: classical model-based methods often struggle to track highly nonlinear channel dynamics due to limited expert knowledge, while emerging data-driven methods typically require substantial labeled data for effective training and often lack interpretability. To address these issues, this paper proposes a novel hybrid method that integrates a data-driven neural network into a conventional model-based workflow based on a state-space model (SSM), implicitly tracking complex channel dynamics from data without requiring precise expert knowledge. Additionally, a novel unsupervised learning strategy is developed to train the embedded neural network solely with unlabeled data. Theoretical analyses and ablation studies are conducted to interpret the enhanced benefits gained from the hybrid integration. Numerical simulations based on the 3GPP mmWave channel model corroborate the superior prediction accuracy of the proposed method, compared to state-of-the-art methods that are either purely model-based or data-driven. Furthermore, extensive experiments validate its robustness against various challenging factors, including among others severe channel variations and high noise levels.

artificial intelligence, label-free mmwave channel prediction, machine learning, (1 more...)

2411.11576

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.44)

arXiv.org Machine LearningApr-2-2024

Preventing Model Collapse in Gaussian Process Latent Variable Models

Li, Ying, Lin, Zhidi, Yin, Feng, Zhang, Michael Minyi

Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models, commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, which leads to a type of model collapse characterized primarily by vague latent representations that do not reflect the underlying structure of the data. This paper addresses these issues by, first, theoretically examining the impact of the projection variance on model collapse through the lens of a linear GPLVM. Second, we address the problem of model collapse due to inadequate kernel flexibility by integrating the spectral mixture (SM) kernel and a differentiable random Fourier feature (RFF) kernel approximation, which ensures computational scalability and efficiency through off-the-shelf automatic differentiation tools for learning the kernel hyperparameters, projection variance, and latent representations within the variational inference framework. The proposed GPLVM, named advisedRFLVM, is evaluated across diverse datasets and consistently outperforms various salient competing models, including state-of-the-art variational autoencoders (VAEs) and GPLVM variants, in terms of informative latent representations and missing data imputation.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2404.01697

Country:

Asia > China (0.28)
North America > United States > Virginia (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Artificial IntelligenceMar-8-2024

Inverse Design of Photonic Crystal Surface Emitting Lasers is a Sequence Modeling Problem

Zhang, Ceyao, Li, Renjie, Zhang, Cheng, Zhang, Zhaoyu, Yin, Feng

Photonic Crystal Surface Emitting Lasers (PCSEL)'s inverse design demands expert knowledge in physics, materials science, and quantum mechanics which is prohibitively labor-intensive. Advanced AI technologies, especially reinforcement learning (RL), have emerged as a powerful tool to augment and accelerate this inverse design process. By modeling the inverse design of PCSEL as a sequential decision-making problem, RL approaches can construct a satisfactory PCSEL structure from scratch. However, the data inefficiency resulting from online interactions with precise and expensive simulation environments impedes the broader applicability of RL approaches. Recently, sequential models, especially the Transformer architecture, have exhibited compelling performance in sequential decision-making problems due to their simplicity and scalability to large language models. In this paper, we introduce a novel framework named PCSEL Inverse Design Transformer (PiT) that abstracts the inverse design of PCSEL as a sequence modeling problem. The central part of our PiT is a Transformer-based structure that leverages the past trajectories and current states to predict the current actions. Compared with the traditional RL approaches, PiT can output the optimal actions and achieve target PCSEL designs by leveraging offline data and conditioning on the desired return. Results demonstrate that PiT achieves superior performance and data efficiency compared to baselines.

evolutionary algorithm, large language model, machine learning, (17 more...)

2403.05149

Country: Asia > China (0.30)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Artificial IntelligenceJan-11-2024

ProAgent: Building Proactive Cooperative Agents with Large Language Models

Zhang, Ceyao, Yang, Kaijie, Hu, Siyi, Wang, Zihao, Li, Guanghe, Sun, Yihang, Zhang, Cheng, Zhang, Zhaowei, Liu, Anji, Zhu, Song-Chun, Chang, Xiaojun, Zhang, Junge, Yin, Feng, Liang, Yitao, Yang, Yaodong

Building agents with adaptive behavior in cooperative tasks stands as a paramount goal in the realm of multi-agent systems. Current approaches to developing cooperative agents rely primarily on learning-based methods, whose policy generalization depends heavily on the diversity of teammates they interact with during the training phase. Such reliance, however, constrains the agents' capacity for strategic adaptation when cooperating with unfamiliar teammates, which becomes a significant challenge in zero-shot coordination scenarios. To address this challenge, we propose ProAgent, a novel framework that harnesses large language models (LLMs) to create proactive agents capable of dynamically adapting their behavior to enhance cooperation with teammates. ProAgent can analyze the present state, and infer the intentions of teammates from observations. It then updates its beliefs in alignment with the teammates' subsequent actual behaviors. Moreover, ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various of coordination scenarios. Experimental evaluations conducted within the Overcooked-AI environment unveil the remarkable performance superiority of ProAgent, outperforming five methods based on self-play and population-based training when cooperating with AI agents. Furthermore, in partnered with human proxy models, its performance exhibits an average improvement exceeding 10% compared to the current state-of-the-art method. For more information about our project, please visit~\url{https://pku-proagent.github.io}.

large language model, natural language, proagent, (18 more...)

2308.11339

Country: Asia > China (0.46)

Genre:

Research Report (0.84)
Press Release (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

arXiv.org Machine LearningJan-8-2024

Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference

Lin, Zhidi, Sun, Yiyong, Yin, Feng, Thiéry, Alexandre Hoang

Gaussian process state-space models (GPSSMs) are a versatile and principled family of nonlinear dynamical system models. However, existing variational learning and inference methods for GPSSMs often necessitate optimizing a substantial number of variational parameters, leading to inadequate performance and efficiency. To overcome this issue, we propose incorporating the ensemble Kalman filter (EnKF), a well-established model-based filtering technique, into the variational inference framework to approximate the posterior distribution of latent states. This utilization of EnKF can effectively exploit the dependencies between latent states and GP dynamics, while eliminating the need for parameterizing the variational distribution, thereby significantly reducing the number of variational parameters. Moreover, we show that our proposed algorithm allows straightforward evaluation of an approximated evidence lower bound (ELBO) in variational inference via simply summating multiple terms with readily available closed-form solutions. Leveraging automatic differentiation tools, we hence can maximize the ELBO and train the GPSSM efficiently. We also extend the proposed algorithm to accommodate an online setting and provide detailed algorithmic analyses and insights. Extensive evaluation on diverse real and synthetic datasets demonstrates the superiority of our EnKF-aided variational inference algorithms in terms of learning and inference performance compared to existing methods.

artificial intelligence, machine learning, variational distribution, (17 more...)

2312.0591

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceJan-7-2024

Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects

Cheng, Yuheng, Zhang, Ceyao, Zhang, Zhengwen, Meng, Xiangrui, Hong, Sirui, Li, Wenhao, Wang, Zihao, Wang, Zekai, Yin, Feng, Zhao, Junhua, He, Xiuqiang

Intelligent agents stand out as a potential path toward artificial general intelligence (AGI). Thus, researchers have dedicated significant effort to diverse implementations for them. Benefiting from recent progress in large language models (LLMs), LLM-based agents that use universal natural language as an interface exhibit robust generalization capabilities across various applications -- from serving as autonomous general-purpose task assistants to applications in coding, social, and economic domains, LLM-based agents offer extensive exploration opportunities. This paper surveys current research to provide an in-depth overview of LLM-based intelligent agents within single-agent and multi-agent systems. It covers their definitions, research frameworks, and foundational components such as their composition, cognitive and planning methods, tool utilization, and responses to environmental feedback. We also delve into the mechanisms of deploying LLM-based agents in multi-agent systems, including multi-role collaboration, message passing, and strategies to alleviate communication issues between agents. The discussions also shed light on popular datasets and application scenarios. We conclude by envisioning prospects for LLM-based agents, considering the evolving landscape of AI and natural language processing.

large language model, machine learning, natural language, (19 more...)

2401.03428

Country:

Asia (0.92)
North America > United States (0.67)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(8 more...)

arXiv.org Artificial IntelligenceDec-26-2023

Sparsity-Aware Distributed Learning for Gaussian Processes with Linear Multiple Kernel

Suwandi, Richard Cornelius, Lin, Zhidi, Yin, Feng, Wang, Zhiguo, Theodoridis, Sergios

Gaussian processes (GPs) stand as crucial tools in machine learning and signal processing, with their effectiveness hinging on kernel design and hyper-parameter optimization. This paper presents a novel GP linear multiple kernel (LMK) and a generic sparsity-aware distributed learning framework to optimize the hyper-parameters. The newly proposed grid spectral mixture (GSM) kernel is tailored for multi-dimensional data, effectively reducing the number of hyper-parameters while maintaining good approximation capabilities. We further demonstrate that the associated hyper-parameter optimization of this kernel yields sparse solutions. To exploit the inherent sparsity property of the solutions, we introduce the Sparse LInear Multiple Kernel Learning (SLIM-KL) framework. The framework incorporates a quantized alternating direction method of multipliers (ADMM) scheme for collaborative learning among multiple agents, where the local optimization problem is solved using a distributed successive convex approximation (DSCA) algorithm. SLIM-KL effectively manages large-scale hyper-parameter optimization for the proposed kernel, simultaneously ensuring data privacy and minimizing communication costs. Theoretical analysis establishes convergence guarantees for the learning framework, while experiments on diverse datasets demonstrate the superior prediction performance and efficiency of our proposed methods.

artificial intelligence, machine learning, optimization problem, (17 more...)

2309.08201

Country:

Asia > China (1.00)
North America (0.67)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningNov-28-2023

Attentional Graph Neural Networks for Robust Massive Network Localization

Yan, Wenzhong, Wang, Juntao, Yin, Feng, Zoubir, Abdelhak M.

Graph neural networks (GNNs) have gained significant popularity for classification tasks in machine learning, yet their applications to regression problems remain limited. Concurrently, attention mechanisms have emerged as powerful tools in sequential learning tasks. In this paper, we employ GNNs and attention mechanisms to address a classical but challenging nonlinear regression problem: network localization. We propose a novel GNN-based network localization method that achieves exceptional stability and accuracy in the presence of severe non-line-of-sight (NLOS) propagations, while eliminating the need for laborious offline calibration or NLOS identification. Extensive experimental results validate the effectiveness and high accuracy of our GNN-based localization model, particularly in challenging NLOS scenarios. However, the proposed GNN-based model exhibits limited flexibility, and its accuracy is highly sensitive to a specific hyperparameter that determines the graph structure. To address the limitations and extend the applicability of the GNN-based model to real scenarios, we introduce two attentional graph neural networks (AGNNs) that offer enhanced flexibility and the ability to automatically learn the optimal hyperparameter for each node. Experimental results confirm that the AGNN models are able to enhance localization accuracy, providing a promising solution for real-world applications. We also provide some analyses of the improved performance achieved by the AGNN models from the perspectives of dynamic attention and signal denoising characteristics.

artificial intelligence, machine learning, node, (14 more...)

2311.16856

Genre: Research Report > New Finding (0.48)

Industry: Telecommunications (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)