AITopics

2312.13473

Country:

Europe > San Marino > Fiorentino > Fiorentino (0.04)
Europe > Italy (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.62)

Goplerud, Max, Papaspiliopoulos, Omiros, Zanella, Giacomo

Partially factorized variational inference for high-dimensional mixed models

arXiv.org Machine LearningDec-20-2023

While generalized linear mixed models (GLMMs) are a fundamental tool in applied statistics, many specifications -- such as those involving categorical factors with many levels or interaction terms -- can be computationally challenging to estimate due to the need to compute or approximate high-dimensional integrals. Variational inference (VI) methods are a popular way to perform such computations, especially in the Bayesian context. However, naive VI methods can provide unreliable uncertainty quantification. We show that this is indeed the case in the GLMM context, proving that standard VI (i.e. mean-field) dramatically underestimates posterior uncertainty in high-dimensions. We then show how appropriately relaxing the mean-field assumption leads to VI methods whose uncertainty quantification does not deteriorate in high-dimensions, and whose total computational cost scales linearly with the number of parameters and observations. Our theoretical and numerical results focus on GLMMs with Gaussian or binomial likelihoods, and rely on connections to random graph theory to obtain sharp high-dimensional asymptotic analysis. We also provide generic results, which are of independent interest, relating the accuracy of variational inference to the convergence rate of the corresponding coordinate ascent variational inference (CAVI) algorithm for Gaussian targets. Our proposed partially-factorized VI (PF-VI) methodology for GLMMs is implemented in the R package vglmer, see https://github.com/mgoplerud/vglmer . Numerical results with simulated and real data examples illustrate the favourable computation cost versus accuracy trade-off of PF-VI.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2312.13148

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningDec-20-2023

Data-driven Piecewise Affine Decision Rules for Stochastic Programming with Covariate Information

Zhang, Yiyang, Liu, Junyi, Zhao, Xiaobo

Focusing on stochastic programming (SP) with covariate information, this paper proposes an empirical risk minimization (ERM) method embedded within a nonconvex piecewise affine decision rule (PADR), which aims to learn the direct mapping from features to optimal decisions. We establish the nonasymptotic consistency result of our PADR-based ERM model for unconstrained problems and asymptotic consistency result for constrained ones. To solve the nonconvex and nondifferentiable ERM problem, we develop an enhanced stochastic majorization-minimization algorithm and establish the asymptotic convergence to (composite strong) directional stationarity along with complexity analysis. We show that the proposed PADR-based ERM method applies to a broad class of nonconvex SP problems with theoretical consistency guarantees and computational tractability. Our numerical study demonstrates the superior performance of PADR-based ERM methods compared to state-of-the-art approaches under various settings, with significantly lower costs, less computation time, and robustness to feature dimensions and nonlinearity of the underlying dependency.

artificial intelligence, data-driven padr, machine learning, (19 more...)

2304.13646

Country:

North America > United States > New York (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.84)

arXiv.org Artificial IntelligenceDec-19-2023

Gaussian process learning of nonlinear dynamics

Ye, Dongwei, Guo, Mengwu

One of the pivotal tasks in scientific machine learning is to represent underlying dynamical systems from time series data. Many methods for such dynamics learning explicitly require the derivatives of state data, which are not directly available and can be approximated conventionally by finite differences. However, the discrete approximations of time derivatives may result in a poor estimation when state data are scarce and/or corrupted by noise, thus compromising the predictiveness of the learned dynamical models. To overcome this technical hurdle, we propose a new method that learns nonlinear dynamics through a Bayesian inference of characterizing model parameters. This method leverages a Gaussian process representation of states, and constructs a likelihood function using the correlation between state data and their derivatives, yet prevents explicit evaluations of time derivatives. Through a Bayesian scheme, a probabilistic estimate of the model parameters is given by the posterior distribution, and thus a quantification is facilitated for uncertainties from noisy state data and the learning process. Specifically, we will discuss the applicability of the proposed method to two typical scenarios for dynamical systems: parameter identification and estimation with an affine structure of the system, and nonlinear parametric approximation without prior knowledge.

approximation, gaussian process, initial condition, (16 more...)

2312.12193

Country:

Oceania > New Zealand (0.04)
Europe > Netherlands (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
(2 more...)

Hur, YoonHaeng, Khoo, Yuehaw

Matching via Distance Profiles

arXiv.org Machine LearningDec-19-2023

In this paper, we introduce and study matching methods based on distance profiles. For the matching of point clouds, the proposed method is easily implementable by solving a linear program, circumventing the computational obstacles of quadratic matching. Also, we propose and analyze a flexible way to execute location-to-location matching using distance profiles. Moreover, we provide a statistical estimation error analysis in the context of location-to-location matching using empirical process theory. Furthermore, we apply our method to a certain model and show its noise stability by characterizing conditions on the noise level for the matching to be successful. Lastly, we demonstrate the performance of the proposed method and compare it with some existing methods using synthetic and real data.

artificial intelligence, distance profile, machine learning, (15 more...)

2312.12641

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)

arXiv.org Artificial IntelligenceDec-18-2023

FedCompetitors: Harmonious Collaboration in Federated Learning with Competing Participants

Tan, Shanli, Cheng, Hao, Wu, Xiaohu, Yu, Han, He, Tiantian, Ong, Yew-Soon, Wang, Chongjun, Tao, Xiaofeng

Federated learning (FL) provides a privacy-preserving approach for collaborative training of machine learning models. Given the potential data heterogeneity, it is crucial to select appropriate collaborators for each FL participant (FL-PT) based on data complementarity. Recent studies have addressed this challenge. Similarly, it is imperative to consider the inter-individual relationships among FL-PTs where some FL-PTs engage in competition. Although FL literature has acknowledged the significance of this scenario, practical methods for establishing FL ecosystems remain largely unexplored. In this paper, we extend a principle from the balance theory, namely ``the friend of my enemy is my enemy'', to ensure the absence of conflicting interests within an FL ecosystem. The extended principle and the resulting problem are formulated via graph theory and integer linear programming. A polynomial-time algorithm is proposed to determine the collaborators of each FL-PT. The solution guarantees high scalability, allowing even competing FL-PTs to smoothly join the ecosystem without conflict of interest. The proposed framework jointly considers competition and data heterogeneity. Extensive experiments on real-world and synthetic data demonstrate its efficacy compared to five alternative approaches, and its ability to establish efficient collaboration networks among FL-PTs.

fl-pt, node, time complexity, (13 more...)

2312.11391

Country:

Asia > Singapore (0.05)
Asia > China > Beijing > Beijing (0.04)
North America > United States (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
Information Technology > Data Science > Data Mining > Big Data (0.34)

Santos, Augusto, Rente, Diogo, Seabra, Rui, Moura, José M. F.

Learning the Causal Structure of Networked Dynamical Systems under Latent Nodes and Structured Noise

arXiv.org Artificial IntelligenceDec-18-2023

This paper considers learning the hidden causal network of a linear networked dynamical system (NDS) from the time series data at some of its nodes -- partial observability. The dynamics of the NDS are driven by colored noise that generates spurious associations across pairs of nodes, rendering the problem much harder. To address the challenge of noise correlation and partial observability, we assign to each pair of nodes a feature vector computed from the time series data of observed nodes. The feature embedding is engineered to yield structural consistency: there exists an affine hyperplane that consistently partitions the set of features, separating the feature vectors corresponding to connected pairs of nodes from those corresponding to disconnected pairs. The causal inference problem is thus addressed via clustering the designed features. We demonstrate with simple baseline supervised methods the competitive performance of the proposed causal inference mechanism under broad connectivity regimes and noise correlation levels, including a real world network. Further, we devise novel technical guarantees of structural consistency for linear NDS under the considered regime.

matrix, observability, osc, (15 more...)

2312.05974

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(5 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Jongeneel, Wouter, Li, Mengmeng, Kuhn, Daniel

A Large Deviations Perspective on Policy Gradient Algorithms

arXiv.org Machine LearningDec-14-2023

Motivated by policy gradient methods in the context of reinforcement learning, we derive the first large deviation rate function for the iterates generated by stochastic gradient descent for possibly non-convex objectives satisfying a Polyak-Lojasiewicz condition. Leveraging the contraction principle from large deviations theory, we illustrate the potential of this result by showing how convergence properties of policy gradient with a softmax parametrization and an entropy regularized objective can be naturally extended to a wide spectrum of other policy parametrizations.

artificial intelligence, exp, machine learning, (15 more...)

2311.07411

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Tang, Bo, Khalil, Elias B.

CaVE: A Cone-Aligned Approach for Fast Predict-then-optimize with Binary Linear Programs

arXiv.org Artificial IntelligenceDec-12-2023

The end-to-end predict-then-optimize framework, also known as decision-focused learning, has gained popularity for its ability to integrate optimization into the training procedure of machine learning models that predict the unknown cost (objective function) coefficients of optimization problems from contextual instance information. Naturally, most of the problems of interest in this space can be cast as integer linear programs. In this work, we focus on binary linear programs (BLPs) and propose a new end-to-end training method for predict-then-optimize. Our method, Cone-aligned Vector Estimation (CaVE), aligns the predicted cost vectors with the cone corresponding to the true optimal solution of a training instance. When the predicted cost vector lies inside the cone, the optimal solution to the linear relaxation of the binary problem is optimal w.r.t. to the true cost vector. Not only does this alignment produce decision-aware learning models, but it also dramatically reduces training time as it circumvents the need to solve BLPs to compute a loss function with its gradients. Experiments across multiple datasets show that our method exhibits a favorable trade-off between training time and solution quality, particularly with large-scale optimization problems such as vehicle routing, a hard BLP that has yet to benefit from predict-then-optimize methods in the literature due to its difficulty.

optimal subcone, optimization problem, projection, (13 more...)

2312.07718

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Industry: Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceDec-11-2023

System-level Safety Guard: Safe Tracking Control through Uncertain Neural Network Dynamics Models

Li, Xiao, Li, Yutong, Girard, Anouck, Kolmanovsky, Ilya

The Neural Network (NN), as a black-box function approximator, has been considered in many control and robotics applications. However, difficulties in verifying the overall system safety in the presence of uncertainties hinder the modular deployment of NN in safety-critical systems. In this paper, we leverage the NNs as predictive models for trajectory tracking of unknown dynamical systems. We consider controller design in the presence of both intrinsic uncertainty and uncertainties from other system modules. In this setting, we formulate the constrained trajectory tracking problem and show that it can be solved using Mixed-integer Linear Programming (MILP). The proposed MILP-based solution enjoys a provable safety guarantee for the overall system, and the approach is empirically demonstrated in robot navigation and obstacle avoidance through simulations. The demonstration videos are available at https://xiaolisean.github.io/publication/2023-11-01-L4DC2024.

constraint, inequality, module, (15 more...)

2312.0681

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Research Report (0.40)

Industry:

Automobiles & Trucks (0.46)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.86)