Goto

Collaborating Authors

 Liu, Yanfang


A Training-Free Conditional Diffusion Model for Learning Stochastic Dynamical Systems

arXiv.org Artificial Intelligence

This study introduces a training-free conditional diffusion model for learning unknown stochastic differential equations (SDEs) using data. The proposed approach addresses key challenges in computational efficiency and accuracy for modeling SDEs by utilizing a score-based diffusion model to approximate their stochastic flow map. Unlike the existing methods, this technique is based on an analytically derived closed-form exact score function, which can be efficiently estimated by Monte Carlo method using the trajectory data, and eliminates the need for neural network training to learn the score function. By generating labeled data through solving the corresponding reverse ordinary differential equation, the approach enables supervised learning of the flow map. Extensive numerical experiments across various SDE types, including linear, nonlinear, and multi-dimensional systems, demonstrate the versatility and effectiveness of the method. The learned models exhibit significant improvements in predicting both short-term and long-term behaviors of unknown stochastic systems, often surpassing baseline methods like GANs in estimating drift and diffusion coefficients.


Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking

arXiv.org Artificial Intelligence

In the pursuit of autonomous spacecraft proximity maneuvers and docking(PMD), we introduce a novel Bayesian actor-critic reinforcement learning algorithm to learn a control policy with the stability guarantee. The PMD task is formulated as a Markov decision process that reflects the relative dynamic model, the docking cone and the cost function. Drawing from the principles of Lyapunov theory, we frame the temporal difference learning as a constrained Gaussian process regression problem. This innovative approach allows the state-value function to be expressed as a Lyapunov function, leveraging the Gaussian process and deep kernel learning. We develop a novel Bayesian quadrature policy optimization procedure to analytically compute the policy gradient while integrating Lyapunov-based stability constraints. This integration is pivotal in satisfying the rigorous safety demands of spaceflight missions. The proposed algorithm has been experimentally evaluated on a spacecraft air-bearing testbed and shows impressive and promising performance.


Diffusion-Model-Assisted Supervised Learning of Generative Models for Density Estimation

arXiv.org Artificial Intelligence

We present a supervised learning framework of training generative models for density estimation. Generative models, including generative adversarial networks, normalizing flows, variational auto-encoders, are usually considered as unsupervised learning models, because labeled data are usually unavailable for training. Despite the success of the generative models, there are several issues with the unsupervised training, e.g., requirement of reversible architectures, vanishing gradients, and training instability. To enable supervised learning in generative models, we utilize the score-based diffusion model to generate labeled data. Unlike existing diffusion models that train neural networks to learn the score function, we develop a training-free score estimation method. This approach uses mini-batch-based Monte Carlo estimators to directly approximate the score function at any spatial-temporal location in solving an ordinary differential equation (ODE), corresponding to the reverse-time stochastic differential equation (SDE). This approach can offer both high accuracy and substantial time savings in neural network training. Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in the supervised manner. Compared with existing normalizing flow models, our method does not require to use reversible neural networks and avoids the computation of the Jacobian matrix. Compared with existing diffusion models, our method does not need to solve the reverse-time SDE to generate new samples. As a result, the sampling efficiency is significantly improved. We demonstrate the performance of our method by applying it to a set of 2D datasets as well as real data from the UCI repository.


Contrastive Prompt Learning-based Code Search based on Interaction Matrix

arXiv.org Artificial Intelligence

Code search aims to retrieve the code snippet that highly matches the given query described in natural language. Recently, many code pre-training approaches have demonstrated impressive performance on code search. However, existing code search methods still suffer from two performance constraints: inadequate semantic representation and the semantic gap between natural language (NL) and programming language (PL). In this paper, we propose CPLCS, a contrastive prompt learning-based code search method based on the cross-modal interaction mechanism. CPLCS comprises:(1) PL-NL contrastive learning, which learns the semantic matching relationship between PL and NL representations; (2) a prompt learning design for a dual-encoder structure that can alleviate the problem of inadequate semantic representation; (3) a cross-modal interaction mechanism to enhance the fine-grained mapping between NL and PL. We conduct extensive experiments to evaluate the effectiveness of our approach on a real-world dataset across six programming languages. The experiment results demonstrate the efficacy of our approach in improving semantic representation quality and mapping ability between PL and NL.


A Multilayer Framework for Online Metric Learning

arXiv.org Artificial Intelligence

Online metric learning has been widely applied in classification and retrieval. It can automatically learn a suitable metric from data by restricting similar instances to be separated from dissimilar instances with a given margin. However, the existing online metric learning algorithms have limited performance in real-world classifications, especially when data distributions are complex. To this end, this paper proposes a multilayer framework for online metric learning to capture the nonlinear similarities among instances. Different from the traditional online metric learning, which can only learn one metric space, the proposed Multi-Layer Online Metric Learning (MLOML) takes an online metric learning algorithm as a metric layer and learns multiple hierarchical metric spaces, where each metric layer follows a nonlinear layers for the complicated data distribution. Moreover, the forward propagation (FP) strategy and backward propagation (BP) strategy are employed to train the hierarchical metric layers. To build a metric layer of the proposed MLOML, a new Mahalanobis-based Online Metric Learning (MOML) algorithm is presented based on the passive-aggressive strategy and one-pass triplet construction strategy. Furthermore, in a progressively and nonlinearly learning way, MLOML has a stronger learning ability than traditional online metric learning in the case of limited available training data. To make the learning process more explainable and theoretically guaranteed, theoretical analysis is provided. The proposed MLOML enjoys several nice properties, indeed learns a metric progressively, and performs better on the benchmark datasets. Extensive experiments with different settings have been conducted to verify these properties of the proposed MLOML.


Seeking the Yield Barrier: High-Dimensional SRAM Evaluation Through Optimal Manifold

arXiv.org Artificial Intelligence

Being able to efficiently obtain an accurate estimate of the failure probability of SRAM components has become a central issue as model circuits shrink their scale to submicrometer with advanced technology nodes. In this work, we revisit the classic norm minimization method. We then generalize it with infinite components and derive the novel optimal manifold concept, which bridges the surrogate-based and importance sampling (IS) yield estimation methods. We then derive a sub-optimal manifold, optimal hypersphere, which leads to an efficient sampling method being aware of the failure boundary called onion sampling. Finally, we use a neural coupling flow (which learns from samples like a surrogate model) as the IS proposal distribution. These combinations give rise to a novel yield estimation method, named Optimal Manifold Important Sampling (OPTIMIS), which keeps the advantages of the surrogate and IS methods to deliver state-of-the-art performance with robustness and consistency, with up to 3.5x in efficiency and 3x in accuracy over the best of SOTA methods in High-dimensional SRAM evaluation.


Characteristic of partition-circuit matroid through approximation number

arXiv.org Artificial Intelligence

Rough set theory is a useful tool to deal with uncertain, granular and incomplete knowledge in information systems. And it is based on equivalence relations or partitions. Matroid theory is a structure that generalizes linear independence in vector spaces, and has a variety of applications in many fields. In this paper, we propose a new type of matroids, namely, partition-circuit matroids, which are induced by partitions. Firstly, a partition satisfies circuit axioms in matroid theory, then it can induce a matroid which is called a partition-circuit matroid. A partition and an equivalence relation on the same universe are one-to-one corresponding, then some characteristics of partition-circuit matroids are studied through rough sets. Secondly, similar to the upper approximation number which is proposed by Wang and Zhu, we define the lower approximation number. Some characteristics of partition-circuit matroids and the dual matroids of them are investigated through the lower approximation number and the upper approximation number.


Parametric matroid of rough set

arXiv.org Artificial Intelligence

Rough set is mainly concerned with the approximations of objects through an equivalence relation on a universe. Matroid is a combinatorial generalization of linear independence in vector spaces. In this paper, we define a parametric set family, with any subset of a universe as its parameter, to connect rough sets and matroids. On the one hand, for a universe and an equivalence relation on the universe, a parametric set family is defined through the lower approximation operator. This parametric set family is proved to satisfy the independent set axiom of matroids, therefore it can generate a matroid, called a parametric matroid of the rough set. Three equivalent representations of the parametric set family are obtained. Moreover, the parametric matroid of the rough set is proved to be the direct sum of a partition-circuit matroid and a free matroid. On the other hand, since partition-circuit matroids were well studied through the lower approximation number, we use it to investigate the parametric matroid of the rough set. Several characteristics of the parametric matroid of the rough set, such as independent sets, bases, circuits, the rank function and the closure operator, are expressed by the lower approximation number.