AITopics | Optimization

Convex Non-negative Matrix Factorization Through Quantum Annealing

Zaiou, Ahmed, Matei, Basarab, Bennani, Younès, Hibti, Mohamed

arXiv.org Machine Learning, Mar-28-2022

In this paper we provide a quantum version of the Convex Non-negative Matrix Factorization algorithm (Convex-NMF) using the D-Wave quantum annealer. More precisely, we use the D-Wave 2000Q to find a low-rank approximation of a fixed real-valued matrix X by the product of two non-negative matrix factors W and G such that the Frobenius norm of the difference X-XWG is minimized. We solve this optimization problem in two steps. In the first step, we transform the global real-valued optimization problem in W and G into two quadratic unconstrained binary optimization (QUBO) problems, one in W and one in G. In the second step, we alternate between the two QUBO problems to find the global solution. Running these QUBO problems on the D-Wave 2000Q requires an embedding into its Chimera graph, and this embedding is limited by the number of qubits available. We study the maximum number of real values that our approach can handle on the D-Wave 2000Q, based on the number of qubits used to represent each real variable. We also test our approach on the D-Wave 2000Q with several randomly generated data sets to show that it is faster than the classical approach and that it obtains the best results.
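The qubit budget mentioned in this abstract depends on how many binary variables encode each real entry of W and G. The following is a minimal sketch, assuming an unsigned fixed-point encoding; the abstract does not state which encoding the authors use, and `decode_fixed_point` and `logical_variables` are hypothetical helper names introduced only for illustration.

```python
def decode_fixed_point(bits, frac_bits):
    """Decode 0/1 values (most significant bit first) into a non-negative real.

    Assumed encoding (not taken from the paper): an unsigned binary
    integer scaled by 2**-frac_bits.
    """
    value = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        value = 2 * value + b
    return value / (2 ** frac_bits)


def logical_variables(rows, cols, bits_per_entry):
    """Number of binary variables needed to encode a rows x cols real matrix."""
    return rows * cols * bits_per_entry


if __name__ == "__main__":
    # 101 in binary is 5; with two fractional bits that is 5 / 4.
    print(decode_fixed_point([1, 0, 1], frac_bits=2))
    # A 4 x 2 factor matrix with 3 bits per entry.
    print(logical_variables(4, 2, 3))
```

The logical count is only a lower bound on hardware usage: embedding a QUBO into a sparse hardware graph typically maps each logical variable to a chain of several physical qubits.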

artificial intelligence, machine learning, optimization problem, (15 more...)

2203.15634

Country: Europe > France (0.05)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

FedLGA: Towards System-Heterogeneity of Federated Learning via Local Gradient Approximation

Li, Xingyu, Qu, Zhe, Tang, Bo, Lu, Zhuo

arXiv.org Artificial Intelligence, Mar-28-2022

Federated Learning (FL) is a decentralized machine learning architecture that leverages a large number of remote devices to learn a joint model with distributed training data. However, system-heterogeneity is a major obstacle to robust distributed learning performance in an FL network, and it arises from two aspects: i) device-heterogeneity, due to the diverse computational capacity among devices; ii) data-heterogeneity, due to the non-identically distributed data across the network. Prior studies addressing the heterogeneous FL issue, e.g., FedProx, lack formalization, and it remains an open problem. This work first formalizes the system-heterogeneous FL problem and proposes a new algorithm, called FedLGA, which addresses it by bridging the divergence of local model updates via gradient approximation. To achieve this, FedLGA provides an alternated Hessian estimation method, which requires only extra linear complexity on the aggregator. Theoretically, we show that with a device-heterogeneous ratio $\rho$, FedLGA achieves convergence rates on non-i.i.d. distributed FL training data for non-convex optimization problems of $\mathcal{O} \left( \frac{(1+\rho)}{\sqrt{ENT}} + \frac{1}{T} \right)$ and $\mathcal{O} \left( \frac{(1+\rho)\sqrt{E}}{\sqrt{TK}} + \frac{1}{T} \right)$ for full and partial device participation respectively, where $E$ is the number of local learning epochs, $T$ is the total number of communication rounds, $N$ is the total number of devices and $K$ is the number of devices selected in one communication round under the partial participation scheme. The results of comprehensive experiments on multiple datasets show that FedLGA outperforms current FL methods under system-heterogeneity.
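To get a feel for how the two convergence rates in this abstract scale, the sketch below evaluates the expressions inside the big-O terms. It drops the hidden constants, so the numbers are only meaningful relative to one another; the function names `rate_full` and `rate_partial` are hypothetical.

```python
import math


def rate_full(rho, E, N, T):
    """Expression inside the O(.) for full device participation."""
    return (1 + rho) / math.sqrt(E * N * T) + 1 / T


def rate_partial(rho, E, K, T):
    """Expression inside the O(.) for partial device participation."""
    return (1 + rho) * math.sqrt(E) / math.sqrt(T * K) + 1 / T


if __name__ == "__main__":
    # Illustrative settings only; none of these values come from the paper.
    for rho in (0.0, 0.5, 1.0):
        print(rho, rate_full(rho, E=5, N=100, T=1000),
              rate_partial(rho, E=5, K=10, T=1000))
```

In both expressions the heterogeneity ratio $\rho$ enters only through the factor $(1+\rho)$ on the leading term, so under these formulas a larger $\rho$ scales that term linearly and leaves the $1/T$ term unchanged.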

fedlga, remote device, tex class file, (13 more...)

doi: 10.1109/TCYB.2023.3247365

2112.11989

Country:

North America > United States > Florida > Hillsborough County > Tampa (0.14)
North America > United States > New York (0.04)
North America > United States > Mississippi > Mississippi County > Mississippi State (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.93)
Education > Educational Setting (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Deep reinforcement learning for optimal well control in subsurface systems with uncertain geology

Nasir, Yusuf, Durlofsky, Louis J.

arXiv.org Artificial Intelligence, Mar-24-2022

A general control policy framework based on deep reinforcement learning (DRL) is introduced for closed-loop decision making in subsurface flow settings. Traditional closed-loop modeling workflows in this context involve the repeated application of data assimilation/history matching and robust optimization steps. Data assimilation can be particularly challenging in cases where both the geological style (scenario) and individual model realizations are uncertain. The closed-loop reservoir management (CLRM) problem is formulated here as a partially observable Markov decision process, with the associated optimization problem solved using a proximal policy optimization algorithm. This provides a control policy that instantaneously maps flow data observed at wells (as are available in practice) to optimal well pressure settings. The policy is represented by a temporal convolution and gated transformer blocks. Training is performed in a preprocessing step with an ensemble of prior geological models, which can be drawn from multiple geological scenarios. Example cases involving the production of oil via water injection, with both 2D and 3D geological models, are presented. The DRL-based methodology is shown to result in a net present value (NPV) increase of 15% (for the 2D cases) and 33% (3D cases) relative to robust optimization over prior models, and in an average improvement of 4% in NPV relative to traditional CLRM. The solutions from the control policy are found to be comparable to those from deterministic optimization, in which the geological model is assumed to be known, even when multiple geological scenarios are considered. The control policy approach results in a 76% decrease in computational cost relative to traditional CLRM with the algorithms and parameter settings considered in this work.
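The abstract names proximal policy optimization (PPO) as the solver but gives no further details. As a minimal sketch, the function below computes the standard PPO clipped surrogate term for a single sample; it may differ from the variant used by the authors, and the name `ppo_clipped_term` and the default `epsilon=0.2` are assumptions made for illustration.

```python
def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Standard PPO clipped surrogate for one (state, action) sample.

    ratio     -- new-policy probability divided by old-policy probability
    advantage -- advantage estimate for the sampled action
    epsilon   -- clipping range (assumed default, not from the abstract)
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum removes any incentive to push the ratio
    # outside [1 - epsilon, 1 + epsilon] in the favorable direction.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    print(ppo_clipped_term(1.5, 1.0))   # ratio clipped from above
    print(ppo_clipped_term(0.5, -1.0))  # ratio clipped from below
    print(ppo_clipped_term(1.0, 2.0))   # no clipping active
```

In training, this term is averaged over a batch and maximized with respect to the policy parameters.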

machine learning, optimization, reinforcement learning, (20 more...)

doi: 10.1016/j.jcp.2023.111945

2203.13375

Country: North America > United States (0.93)

Genre:

Workflow (0.86)
Research Report (0.82)

Industry:

Energy > Renewable (1.00)
Energy > Oil & Gas > Upstream (1.00)
Water & Waste Management > Water Management > Lifecycle > Disposal/Injection (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Bounds on Wasserstein distances between continuous distributions using independent samples

Papp, Tamás, Sherlock, Chris

arXiv.org Machine Learning, Mar-22-2022

The plug-in estimator of the Wasserstein distance is known to be conservative; however, its usefulness is severely limited when the distributions are similar, as its bias does not decay to zero with the true Wasserstein distance. We propose a linear combination of plug-in estimators for the squared 2-Wasserstein distance with a reduced bias that decays to zero with the true distance. The new estimator is provably conservative provided one distribution is appropriately overdispersed with respect to the other, and is unbiased when the distributions are equal. We apply it to approximately bound from above the 2-Wasserstein distance between the target and current distribution in Markov chain Monte Carlo, running multiple identically distributed chains which start, and remain, overdispersed with respect to the target. Our bound consistently outperforms the current state-of-the-art bound, which uses coupling, improving mixing time bounds by up to an order of magnitude.
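The bias of the plug-in estimator is easy to see in one dimension, where the squared 2-Wasserstein distance between two empirical measures with equal sample sizes is the mean squared difference of the sorted samples. The sketch below covers only that 1D special case; it does not reproduce the proposed linear combination, whose coefficients are not given in the abstract, and `plug_in_w2_squared_1d` is a hypothetical name.

```python
import random


def plug_in_w2_squared_1d(xs, ys):
    """Plug-in squared 2-Wasserstein distance between two 1D samples.

    Valid for equal-size samples: the optimal coupling in 1D matches
    order statistics, so sorting both samples is sufficient.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted)) / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two independent samples from the SAME distribution: the true
    # distance is zero, yet the plug-in estimate is strictly positive.
    a = [rng.gauss(0.0, 1.0) for _ in range(200)]
    b = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(plug_in_w2_squared_1d(a, b))
```

The positive value obtained for identically distributed samples is the bias that, according to the abstract, does not decay to zero with the true distance.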

artificial intelligence, machine learning, optimization problem, (17 more...)

2203.11627

Country:

Europe > Austria > Vienna (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.45)

Robust Pivoting: Exploiting Frictional Stability Using Bilevel Optimization

Shirai, Yuki, Jha, Devesh K., Raghunathan, Arvind, Romeres, Diego

arXiv.org Artificial Intelligence, Mar-21-2022

Generalizable manipulation requires that robots be able to interact with novel objects and environments. This requirement makes manipulation extremely challenging, as a robot has to reason about complex frictional interaction with uncertainty in the physical properties of the object. In this paper, we study robust optimization for control of pivoting manipulation in the presence of uncertainties. We present insights about how friction can be exploited to compensate for inaccuracies in the estimates of the physical properties during manipulation. In particular, we derive analytical expressions for the stability margin provided by friction during pivoting manipulation. This margin is then used in a bilevel trajectory optimization algorithm to design a controller that maximizes the margin, providing robustness against uncertainty in the physical properties of the object. We demonstrate our proposed method using a 6 DoF manipulator for manipulating several different objects.

artificial intelligence, optimization, optimization problem, (15 more...)

doi: 10.1109/ICRA46639.2022.9811812

2203.11412

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Preference Exploration for Efficient Bayesian Optimization with Multiple Outcomes

Lin, Zhiyuan Jerry, Astudillo, Raul, Frazier, Peter I., Bakshy, Eytan

arXiv.org Machine Learning, Mar-21-2022

We consider Bayesian optimization of expensive-to-evaluate experiments that generate vector-valued outcomes over which a decision-maker (DM) has preferences. These preferences are encoded by a utility function that is not known in closed form but can be estimated by asking the DM to express preferences over pairs of outcome vectors. To address this problem, we develop Bayesian optimization with preference exploration, a novel framework that alternates between interactive real-time preference learning with the DM via pairwise comparisons between outcomes, and Bayesian optimization with a learned compositional model of DM utility and outcomes. Within this framework, we propose preference exploration strategies specifically designed for this task, and demonstrate their performance via extensive simulation studies.

experiment, optimization, query, (13 more...)

2203.11382

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning latent causal relationships in multiple time series

Dmochowski, Jacek P.

arXiv.org Machine Learning, Mar-20-2022

Identifying the causal structure of systems with multiple dynamic elements is critical to several scientific disciplines. The conventional approach is to conduct statistical tests of causality, for example with Granger Causality, between observed signals that are selected a priori. Here it is posited that, in many systems, the causal relations are embedded in a latent space that is expressed in the observed data as a linear mixture. A technique for blindly identifying the latent sources is presented: the observations are projected into pairs of components -- driving and driven -- to maximize the strength of causality between the pairs. This leads to an optimization problem with closed form expressions for the objective function and gradient that can be solved with off-the-shelf techniques. After demonstrating proof-of-concept on synthetic data with known latent structure, the technique is applied to recordings from the human brain and historical cryptocurrency prices. In both cases, the approach recovers multiple strong causal relationships that are not evident in the observed data. The proposed technique is unsupervised and can be readily applied to any multiple time series to shed light on the causal relationships underlying the data.

causality, matrix, strength, (13 more...)

2203.10679

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > e-Commerce > Financial Technology (0.89)

Convergence Error Analysis of Reflected Gradient Langevin Dynamics for Globally Optimizing Non-Convex Constrained Problems

Sato, Kanji, Takeda, Akiko, Kawai, Reiichiro, Suzuki, Taiji

arXiv.org Machine Learning, Mar-18-2022

Non-convex optimization problems have various important applications, yet many algorithms have been proven only to converge to stationary points. Meanwhile, gradient Langevin dynamics (GLD) and its variants have attracted increasing attention as a framework that provides theoretical convergence guarantees for a global solution in non-convex settings. Studies on GLD initially treated unconstrained convex problems and were very recently extended to non-convex problems with convex constraints by Lamperski (2021). In this work, we handle non-convex problems with certain non-convex feasible regions. This work analyzes reflected gradient Langevin dynamics (RGLD), a global optimization algorithm for smoothly constrained problems, including non-convex constrained ones, and derives a convergence rate to a solution with $\epsilon$-sampling error. The convergence rate is faster than the one given by Lamperski (2021) for convex constrained cases. Our proofs exploit the Poisson equation to make effective use of the reflection and thereby obtain the faster convergence rate.
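The idea of a Langevin step followed by reflection at the boundary can be illustrated in one dimension. The sketch below is a toy under stated assumptions: the feasible set is a plain interval rather than the general smooth domains analyzed in the paper, the discretization is a simple Euler step, and all names (`reflect_into_interval`, `reflected_langevin`) and parameter values are hypothetical.

```python
import math
import random


def reflect_into_interval(x, lo, hi):
    """Fold x back into [lo, hi] by mirror reflection at the endpoints."""
    if not lo < hi:
        raise ValueError("need lo < hi")
    while x < lo or x > hi:
        x = 2 * lo - x if x < lo else 2 * hi - x
    return x


def reflected_langevin(grad, x0, lo, hi, step, beta, n_steps, rng):
    """Toy 1D reflected gradient Langevin iteration on [lo, hi].

    Each step takes a gradient move plus Gaussian noise with variance
    2 * step / beta, then reflects the result back into the interval.
    """
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = math.sqrt(2.0 * step / beta) * rng.gauss(0.0, 1.0)
        x = reflect_into_interval(x - step * grad(x) + noise, lo, hi)
        path.append(x)
    return path


if __name__ == "__main__":
    # Non-convex objective f(x) = x**4 - x**2 on the interval [-1, 2].
    grad = lambda x: 4 * x ** 3 - 2 * x
    path = reflected_langevin(grad, x0=1.5, lo=-1.0, hi=2.0,
                              step=0.01, beta=5.0, n_steps=1000,
                              rng=random.Random(0))
    print(min(path), max(path))
```

Because reflection is applied after every step, each iterate stays feasible, which is the property that distinguishes this scheme from unconstrained GLD.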

artificial intelligence, convergence error analysis, machine learning, (13 more...)

2203.10215

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > Indiana (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Learning Distributionally Robust Models at Scale via Composite Optimization

Haddadpour, Farzin, Kamani, Mohammad Mahdi, Mahdavi, Mehrdad, Karbasi, Amin

arXiv.org Machine Learning, Mar-17-2022

To train machine learning models that are robust to distribution shifts in the data, distributionally robust optimization (DRO) has proven very effective. However, the existing approaches to learning a distributionally robust model either require solving complex optimization problems such as semidefinite programs or rely on first-order methods whose convergence scales linearly with the number of data samples, which hinders their scalability to large datasets. In this paper, we show how different variants of DRO are simply instances of a finite-sum composite optimization for which we provide scalable methods. We also provide empirical results that demonstrate the effectiveness of our proposed algorithm, relative to the prior art, for learning robust models from very large datasets.

algorithm, optimization, optimization problem, (14 more...)

2203.09607

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Differentiable DAG Sampling

Charpentier, Bertrand, Kibler, Simon, Günnemann, Stephan

arXiv.org Machine Learning, Mar-16-2022

We propose a new differentiable probabilistic model over DAGs (DP-DAG). DP-DAG allows fast and differentiable DAG sampling suited to continuous optimization. To this end, DP-DAG samples a DAG by successively (1) sampling a linear ordering of the nodes and (2) sampling edges consistent with the sampled linear ordering. We further propose VI-DP-DAG, a new method for DAG learning from observational data which combines DP-DAG with variational inference. Hence, VI-DP-DAG approximates the posterior probability over DAG edges given the observed data. VI-DP-DAG is guaranteed to output a valid DAG at any time during training and does not require any complex augmented Lagrangian optimization scheme, in contrast to existing differentiable DAG learning approaches. In our extensive experiments, we compare VI-DP-DAG to other differentiable DAG learning baselines on synthetic and real datasets. VI-DP-DAG significantly improves DAG structure and causal mechanism learning while training faster than competitors.

Directed Acyclic Graphs (DAGs) are important mathematical objects in many machine learning tasks. For example, a direct application of DAGs is to represent causal relationships in a system of variables. In this case, variables are represented as nodes and causal relationships are represented as directed edges. Hence, DAG learning has found many applications for causal discovery in biology, economics or planning (Pearl, 1988; Ramsey et al., 2017; Sachs et al., 2005; Zhang et al., 2013). However, DAG learning is a challenging problem for two reasons. First, while DAG learning with data from randomized and controlled experiments is the gold standard for causal discovery, experimental data might be hard or unethical to obtain in practice.
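The two-step sampling scheme in the abstract (first an ordering, then edges consistent with it) guarantees acyclicity by construction. The sketch below shows a non-differentiable version of that idea, assuming a uniformly random ordering and independent Bernoulli edges; the relaxations that make sampling differentiable in DP-DAG are not reproduced here, and `sample_dag` is a hypothetical name.

```python
import random


def sample_dag(n_nodes, edge_prob, rng):
    """Sample a DAG by drawing an ordering, then forward edges only.

    Returns (order, edges), where order is a permutation of the node
    labels and each edge (u, v) points from an earlier node in the
    ordering to a later one, so no directed cycle can exist.
    """
    order = list(range(n_nodes))
    rng.shuffle(order)  # step 1: linear ordering of the nodes
    edges = []
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            # step 2: keep each order-consistent edge with prob. edge_prob
            if rng.random() < edge_prob:
                edges.append((order[i], order[j]))
    return order, edges


if __name__ == "__main__":
    order, edges = sample_dag(5, 0.4, random.Random(0))
    print(order)
    print(edges)
```

Since every edge respects the sampled ordering, validity holds for every sample, which matches the abstract's point that no separate acyclicity constraint is needed.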

dag, dataset, vi-dp-dag, (15 more...)

2203.08509

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
