AITopics

2109.14856

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.89)
Health & Medicine > Diagnostic Medicine > Imaging (0.86)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
(2 more...)

Partitioning Cloud-based Microservices (via Deep Learning)

Yedida, Rahul, Krishna, Rahul, Kalia, Anup, Menzies, Tim, Xiao, Jin, Vukovic, Maja

Cloud-based software has many advantages. When services are divided into many independent components, they are easier to update. Also, during peak demand, it is easier to scale cloud services (just hire more CPUs). Hence, many organizations are partitioning their monolithic enterprise applications into cloud-based microservices. Recently there has been much work using machine learning to simplify this partitioning task. Despite much research, no single partitioning method can be recommended as generally useful. More specifically, those prior solutions are "brittle''; i.e. if they work well for one kind of goal in one dataset, then they can be sub-optimal if applied to many datasets and multiple goals. In order to find a generally useful partitioning method, we propose DEEPLY. This new algorithm extends the CO-GCN deep learning partition generator with (a) a novel loss function and (b) some hyper-parameter optimization. As shown by our experiments, DEEPLY generally outperforms prior work (including CO-GCN, and others) across multiple datasets and goals. To the best of our knowledge, this is the first report in SE of such stable hyper-parameter optimization. To aid reuse of this work, DEEPLY is available on-line at https://bit.ly/2WhfFlB.

algorithm, loss function, partition, (14 more...)

2109.14569

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Europe > Italy (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Nazari, Parvin, Khorram, Esmaile

Dynamic Regret Analysis for Online Meta-Learning

The online meta-learning framework has arisen as a powerful tool for the continual lifelong learning setting. The goal for an agent is to quickly learn new tasks by drawing on prior experience, while it faces with tasks one after another. This formulation involves two levels: outer level which learns meta-learners and inner level which learns task-specific models, with only a small amount of data from the current task. While existing methods provide static regret analysis for the online meta-learning framework, we establish performance in terms of dynamic regret which handles changing environments from a global prospective. We also build off of a generalized version of the adaptive gradient methods that covers both ADAM and ADAGRAD to learn meta-learners in the outer level. We carry out our analyses in a stochastic setting, and in expectation prove a logarithmic local dynamic regret which depends explicitly on the total number of iterations T and parameters of the learner. Apart from, we also indicate high probability bounds on the convergence rates of proposed algorithm with appropriate selection of parameters, which have not been argued before.

algorithm, probability, suppose assumption 1, (10 more...)

2109.14375

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Instructional Material (0.89)
Research Report (0.64)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)

LightSecAgg: Rethinking Secure Aggregation in Federated Learning

Yang, Chien-Sheng, So, Jinhyun, He, Chaoyang, Li, Songze, Yu, Qian, Avestimehr, Salman

Secure model aggregation is a key component of federated learning (FL) that aims at protecting the privacy of each user's individual model, while allowing their global aggregation. It can be applied to any aggregation-based approaches, including algorithms for training a global model, as well as personalized FL frameworks. Model aggregation needs to also be resilient to likely user dropouts in FL system, making its design substantially more complex. State-of-the-art secure aggregation protocols essentially rely on secret sharing of the random-seeds that are used for mask generations at the users, in order to enable the reconstruction and cancellation of those belonging to dropped users. The complexity of such approaches, however, grows substantially with the number of dropped users. We propose a new approach, named LightSecAgg, to overcome this bottleneck by turning the focus from "random-seed reconstruction of the dropped users" to "one-shot aggregate-mask reconstruction of the active users". More specifically, in LightSecAgg each user protects its local model by generating a single random mask. This mask is then encoded and shared to other users, in such a way that the aggregate-mask of any sufficiently large set of active users can be reconstructed directly at the server via encoded masks. We show that LightSecAgg achieves the same privacy and dropout-resiliency guarantees as the state-of-the-art protocols, while significantly reducing the overhead for resiliency to dropped users. Furthermore, our system optimization helps to hide the runtime cost of offline processing by parallelizing it with model training. We evaluate LightSecAgg via extensive experiments for training diverse models on various datasets in a realistic FL system, and demonstrate that LightSecAgg significantly reduces the total training time, achieving a performance gain of up to $12.7\times$ over baselines.

lightsecagg, protocol, server, (9 more...)

2109.14236

Country:

North America > United States > California (0.14)
North America > United States > Virginia (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization

Dou, Zehao, Li, Yuanzhi

Adam-type methods, the extension of adaptive gradient methods, have shown great performance in the training of both supervised and unsupervised machine learning models. In particular, Adam-type optimizers have been widely used empirically as the default tool for training generative adversarial networks (GANs). On the theory side, however, despite the existence of theoretical results showing the efficiency of Adam-type methods in minimization problems, the reason of their wonderful performance still remains absent in GAN's training. In existing works, the fast convergence has long been considered as one of the most important reasons and multiple works have been proposed to give a theoretical guarantee of the convergence to a critical point of min-max optimization algorithms under certain assumptions. In this paper, we firstly argue empirically that in GAN's training, Adam does not converge to a critical point even upon successful training: Only the generator is converging while the discriminator's gradient norm remains high throughout the training. We name this one-sided convergence. Then we bridge the gap between experiments and theory by showing that Adam-type algorithms provably converge to a one-sided first order stationary points in min-max optimization problems under the one-sided MVI condition. We also empirically verify that such one-sided MVI condition is satisfied for standard GANs after trained over standard data sets. To the best of our knowledge, this is the very first result which provides an empirical observation and a strict theoretical guarantee on the one-sided convergence of Adam-type algorithms in min-max optimization.

algorithm, convergence, mvi condition, (14 more...)

2109.14213

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Duy, Vo Nguyen Le, Takeuchi, Ichiro

Exact Statistical Inference for the Wasserstein Distance by Selective Inference

In this paper, we study statistical inference for the Wasserstein distance, which has attracted much attention and has been applied to various machine learning tasks. Several studies have been proposed in the literature, but almost all of them are based on asymptotic approximation and do not have finite-sample validity. In this study, we propose an exact (non-asymptotic) inference method for the Wasserstein distance inspired by the concept of conditional Selective Inference (SI). To our knowledge, this is the first method that can provide a valid confidence interval (CI) for the Wasserstein distance with finite-sample coverage guarantee, which can be applied not only to one-dimensional problems but also to multi-dimensional problems. We evaluate the performance of the proposed method on both synthetic and real-world datasets.

inference, selective ci, wasserstein distance, (13 more...)

2109.14206

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Cervera-Lierta, Alba, Krenn, Mario, Aspuru-Guzik, Alán

Design of quantum optical experiments with logic artificial intelligence

arXiv.org Artificial IntelligenceSep-27-2021

Logic artificial intelligence (AI) is a subfield of AI where variables can take two defined arguments, True or False, and are arranged in clauses that follow the rules of formal logic. Several problems that span from physical systems to mathematical conjectures can be encoded into these clauses and be solved by checking their satisfiability (SAT). Recently, SAT solvers have become a sophisticated and powerful computational tool capable, among other things, of solving long-standing mathematical conjectures. In this work, we propose the use of logic AI for the design of optical quantum experiments. We show how to map into a SAT problem the experimental preparation of an arbitrary quantum state and propose a logic-based algorithm, called Klaus, to find an interpretable representation of the photonic setup that generates it. We compare the performance of Klaus with the state-of-the-art algorithm for this purpose based on continuous optimization. We also combine both logic and numeric strategies to find that the use of logic AI improves significantly the resolution of this problem, paving the path to develop more formal-based approaches in the context of quantum physics experiments.

algorithm, graph, target state, (16 more...)

arXiv.org Artificial Intelligence

2109.13273

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Duarte, Guilherme, Finkelstein, Noam, Knox, Dean, Mummolo, Jonathan, Shpitser, Ilya

An Automated Approach to Causal Inference in Discrete Settings

arXiv.org Machine LearningSep-27-2021

When causal quantities cannot be point identified, researchers often pursue partial identification to quantify the range of possible values. However, the peculiarities of applied research conditions can make this analytically intractable. We present a general and automated approach to causal inference in discrete settings. We show causal questions with discrete data reduce to polynomial programming problems, and we present an algorithm to automatically bound causal effects using efficient dual relaxation and spatial branch-and-bound techniques. The user declares an estimand, states assumptions, and provides data (however incomplete or mismeasured). The algorithm then searches over admissible data-generating processes and outputs the most precise possible range consistent with available information -- i.e., sharp bounds -- including a point-identified solution if one exists. Because this search can be computationally intensive, our procedure reports and continually refines non-sharp ranges that are guaranteed to contain the truth at all times, even when the algorithm is not run to completion. Moreover, it offers an additional guarantee we refer to as $\epsilon$-sharpness, characterizing the worst-case looseness of the incomplete bounds. Analytically validated simulations show the algorithm accommodates classic obstacles, including confounding, selection, measurement error, noncompliance, and nonresponse.

assumption, confidence region, constraint, (16 more...)

2109.13471

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)

Sagan, April, Mitchell, John E.

Provable Low Rank Plus Sparse Matrix Separation Via Nonconvex Regularizers

arXiv.org Machine LearningSep-26-2021

This paper considers a large class of problems where we seek to recover a low rank matrix and/or sparse vector from some set of measurements. While methods based on convex relaxations suffer from a (possibly large) estimator bias, and other nonconvex methods require the rank or sparsity to be known a priori, we use nonconvex regularizers to minimize the rank and $l_0$ norm without the estimator bias from the convex relaxation. We present a novel analysis of the alternating proximal gradient descent algorithm applied to such problems, and bound the error between the iterates and the ground truth sparse and low rank matrices. The algorithm and error bound can be applied to sparse optimization, matrix completion, and robust principal component analysis as special cases of our results.

algorithm, matrix, matrix completion, (14 more...)

2109.12713

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)

Bertsimas, Dimitris, Cory-Wright, Ryan, Johnson, Nicholas A. G.

Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach

arXiv.org Machine LearningSep-26-2021

We study the Sparse Plus Low Rank decomposition problem (SLR), which is the problem of decomposing a corrupted data matrix $\mathbf{D}$ into a sparse matrix $\mathbf{Y}$ containing the perturbations plus a low rank matrix $\mathbf{X}$. SLR is a fundamental problem in Operations Research and Machine Learning arising in many applications such as data compression, latent semantic indexing, collaborative filtering and medical imaging. We introduce a novel formulation for SLR that directly models the underlying discreteness of the problem. For this formulation, we develop an alternating minimization heuristic to compute high quality solutions and a novel semidefinite relaxation that provides meaningful bounds for the solutions returned by our heuristic. We further develop a custom branch and bound routine that leverages our heuristic and convex relaxation that solves small instances of SLR to certifiable near-optimality. Our heuristic can scale to $n=10000$ in hours, our relaxation can scale to $n=200$ in hours, and our branch and bound algorithm can scale to $n=25$ in minutes. Our numerical results demonstrate that our approach outperforms existing state-of-the-art approaches in terms of the MSE of the low rank matrix and that of the sparse matrix.

algorithm 1, matrix reconstruction error, reconstruction error, (10 more...)

2109.12701

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre:

Overview (0.87)
Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Industry: Health & Medicine (0.54)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)