AITopics

2202.05812

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Communications > Networks (0.88)

#artificialintelligence Feb-10-2022, 05:40:25 GMT

Continuous-time Distributed Heavy-ball Algorithm for Distributed Convex Optimization over Undirected and Directed Graphs - Machine Intelligence Research

This paper proposes second-order distributed algorithms over multi-agent networks that solve convex optimization problems using the gradient tracking strategy and achieve accelerated convergence. Both undirected and unbalanced directed graphs are considered, extending existing algorithms that primarily focus on undirected or balanced directed graphs. Our algorithms also avoid the diminishing step-size strategy, and with it the slow convergence that strategy causes. Furthermore, exact convergence to the optimal solution is achieved even with the constant step size adopted in this paper. Finally, two numerical examples are presented to show the convergence performance of our algorithms.
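For readers unfamiliar with the heavy-ball idea, here is a minimal sketch of the classical single-agent, discrete-time heavy-ball (Polyak momentum) iteration with a constant step size. It is not the paper's continuous-time distributed algorithm; the function names, the step and momentum values, and the quadratic test objective are all assumptions chosen for illustration.

```python
def heavy_ball(grad, x0, step=0.1, momentum=0.5, iters=200):
    """Classical discrete-time heavy-ball iteration with a constant step size.

    Illustrative only: the listed paper studies a continuous-time,
    distributed variant with gradient tracking, which is not shown here.
    """
    x_prev = x0
    x = x0
    for _ in range(iters):
        # Gradient step plus a momentum term built from the previous displacement.
        x_next = x - step * grad(x) + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x


def grad_f(x):
    # Gradient of the assumed toy objective f(x) = 0.5 * (x - 3)**2.
    return x - 3.0


x_star = heavy_ball(grad_f, x0=0.0)
```

On this toy objective the iterate approaches the minimizer at 3 without any step-size decay, which is the behavior the abstract contrasts with diminishing step-size schemes.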

heavy-ball algorithm, machine intelligence research, undirected and directed graph, (3 more...)


Industry: Leisure & Entertainment > Sports > Tennis (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.73)

Nguyen, Van-Dinh, Chatzinotas, Symeon, Ottersten, Bjorn, Duong, Trung Q.

FedFog: Network-Aware Optimization of Federated Learning over Wireless Fog-Cloud Systems

arXiv.org Artificial Intelligence Feb-10-2022

Federated learning (FL) is capable of performing large distributed machine learning tasks across multiple edge users by periodically aggregating trained local parameters. To address key challenges of enabling FL over a wireless fog-cloud system (e.g., non-i.i.d. data, users' heterogeneity), we first propose an efficient FL algorithm based on Federated Averaging (called FedFog) to perform the local aggregation of gradient parameters at fog servers and the global training update at the cloud. Next, we employ FedFog in wireless fog-cloud systems by investigating a novel network-aware FL optimization problem that strikes a balance between the global loss and completion time. An iterative algorithm is then developed to obtain a precise measurement of the system performance, which helps design an efficient stopping criterion to output an appropriate number of global rounds. To mitigate the straggler effect, we propose a flexible user aggregation strategy that trains fast users first to obtain a certain level of accuracy before allowing slow users to join the global training updates. Extensive numerical results using several real-world FL tasks are provided to verify the theoretical convergence of FedFog. We also show that the proposed co-design of FL and communication is essential to substantially improve resource utilization while achieving comparable accuracy of the learning model.
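The two-tier aggregation described above can be sketched roughly as follows. This is a hedged illustration rather than the FedFog algorithm itself: it assumes sample-count-weighted averaging in the style of Federated Averaging at both tiers, and the names `weighted_average` and `fog_cloud_aggregate` and the toy data are hypothetical.

```python
def weighted_average(vectors, weights):
    """Weighted average of equal-length parameter vectors (lists of floats)."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


def fog_cloud_aggregate(fog_groups):
    """Aggregate user parameters at each fog server, then at the cloud.

    fog_groups: one list per fog server, each holding (params, num_samples)
    pairs for the users attached to that server. Weighting by sample count
    is an assumption borrowed from Federated Averaging.
    """
    fog_models, fog_sizes = [], []
    for group in fog_groups:
        params = [p for p, _ in group]
        sizes = [n for _, n in group]
        fog_models.append(weighted_average(params, sizes))  # local aggregation
        fog_sizes.append(sum(sizes))
    return weighted_average(fog_models, fog_sizes)  # global update at the cloud


# Hypothetical toy setup: two fog servers, three users in total.
groups = [
    [([1.0, 2.0], 10), ([3.0, 4.0], 30)],
    [([5.0, 6.0], 60)],
]
global_model = fog_cloud_aggregate(groups)
```

With sample-count weights, aggregating in two tiers gives the same result as one flat weighted average over all users, so in this sketch the fog layer changes where the averaging happens, not the value it produces.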

artificial intelligence, fedfog, machine learning, (18 more...)


2107.02755

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Shienman, Moshe, Indelman, Vadim

D2A-BSP: Distilled Data Association Belief Space Planning with Performance Guarantees Under Budget Constraints

arXiv.org Artificial Intelligence Feb-10-2022

Unresolved data association in ambiguous and perceptually aliased environments leads to multi-modal hypotheses on both the robot's and the environment's state. To avoid catastrophic results when operating in such ambiguous environments, it is crucial to reason about data association within Belief Space Planning (BSP). However, when all possible data associations are considered explicitly, the number of hypotheses grows exponentially with the planning horizon, and determining the optimal action sequence quickly becomes intractable. Moreover, under hard budget constraints where some non-negligible hypotheses must be pruned, achieving performance guarantees is crucial. In this work we present a computationally efficient novel approach that utilizes only a distilled subset of hypotheses to solve BSP problems while reasoning about data association. Furthermore, to provide performance guarantees, we derive error bounds with respect to the optimal solution. We then demonstrate our approach in an extremely aliased environment, where we manage to significantly reduce computation time without compromising on the quality of the solution.

data association, hypothesis, performance guarantee, (15 more...)


2202.04954

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)

Bietti, Alberto, Wei, Chen-Yu, Dudik, Miroslav, Langford, John, Wu, Zhiwei Steven

Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization

arXiv.org Machine Learning Feb-10-2022

Large-scale machine learning systems often involve data distributed across a collection of users. Federated optimization algorithms leverage this structure by communicating model updates to a central server, rather than entire datasets. In this paper, we study stochastic optimization algorithms for a personalized federated learning setting involving local and global models subject to user-level (joint) differential privacy. While learning a private global model induces a cost of privacy, local learning is perfectly private. We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy. We illustrate our theoretical results with experiments on synthetic and real-world datasets.

learning, personalization, privacy, (15 more...)

2202.05318

Country:

North America > United States > California (0.14)
North America > United States > New York (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Dorfman, Ron, Levy, Kfir Y.

Adapting to Mixing Time in Stochastic Optimization with Markovian Data

We consider stochastic optimization problems where data is drawn from a Markov chain. Existing methods for this setting crucially rely on knowing the mixing time of the chain, which in real-world applications is usually unknown. We propose the first optimization method that does not require the knowledge of the mixing time, yet obtains the optimal asymptotic convergence rate when applied to convex problems. We further show that our approach can be extended to: (i) finding stationary points in non-convex optimization with Markovian data, and (ii) obtaining better dependence on the mixing time in temporal difference (TD) learning; in both cases, our method is completely oblivious to the mixing time. Our method relies on a novel combination of multi-level Monte Carlo (MLMC) gradient estimation together with an adaptive learning method.

gradient, markov chain, optimization, (14 more...)

2202.04428

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.37)

Reproducibility in Optimization: Theoretical Framework and Limits

Ahn, Kwangjun, Jain, Prateek, Ji, Ziwei, Kale, Satyen, Netrapalli, Praneeth, Shamir, Gil I.

Machine-learned models are increasingly entering wider ranges of domains in our lives, driving a constantly increasing number of important systems. Large-scale systems can be trained in highly parallel and distributed training environments, with a large amount of randomness in training the models. While some systems may tolerate such randomness, which leads to models that differ from one another every time a model retrains, many applications require reproducible models, where slight changes in training do not lead to drastic differences in the model learned. Beyond practical deployments of machine-learned models, the reproducibility crisis in the machine learning academic world has also been well documented: see [Pineau et al., 2021] and the references therein for an excellent discussion of the reasons for irreproducibility (insufficient exploration of hyperparameters and experimental setups, lack of sufficient documentation, inaccessible code, and different computational hardware) and for mitigation recommendations. However, recent papers [Chen et al., 2020, D'Amour et al., 2020, Dusenberry et al., 2020, Snapp and Shamir, 2021, Summers and Dinneen, 2021, Yu et al., 2021] have also demonstrated that even when models [...]

Part of this work was done when Kwangjun Ahn and Ziwei Ji were interns at Google Research.

algorithm, gradient oracle, oracle, (14 more...)

2202.04598

Country:

Asia > India (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.34)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

d'Ascoli, Stéphane, Refinetti, Maria, Biroli, Giulio

Optimal learning rate schedules in high-dimensional non-convex optimization problems

Learning rate schedules are ubiquitously used to speed up and improve optimisation. Many different policies have been introduced on an empirical basis, and theoretical analyses have been developed for convex settings. However, in many realistic problems the loss landscape is high-dimensional and non-convex -- a case for which results are scarce. In this paper we present a first analytical study of the role of learning rate scheduling in this setting, focusing on Langevin optimization with a learning rate decaying as $\eta(t)=t^{-\beta}$. We begin by considering models where the loss is a Gaussian random function on the $N$-dimensional sphere ($N\rightarrow \infty$), featuring an extensive number of critical points. We find that to speed up optimization without getting stuck in saddles, one must choose a decay rate $\beta<1$, contrary to convex setups where $\beta=1$ is generally optimal. We then add to the problem a signal to be recovered. In this setting, the dynamics decompose into two phases: an exploration phase where the dynamics navigate through rough parts of the landscape, followed by a convergence phase where the signal is detected and the dynamics enter a convex basin. In this case, it is optimal to keep a large learning rate during the exploration phase to escape the non-convex region as quickly as possible, then use the convex criterion $\beta=1$ to converge rapidly to the solution. Finally, we demonstrate that our conclusions hold in a common regression task involving neural networks.
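To make the schedule $\eta(t)=t^{-\beta}$ concrete, the sketch below applies it in a discretized Langevin-style update on a one-dimensional convex toy loss. Everything apart from the schedule formula is an assumption made for illustration: the function names, the toy loss, the temperature parameter, and the Euler-type discretization. It does not reproduce the paper's high-dimensional, non-convex analysis or its conclusions about the best $\beta$.

```python
import math
import random


def power_law_lr(t, beta):
    """Learning rate eta(t) = t**(-beta) for step index t >= 1."""
    if t < 1:
        raise ValueError("t must be >= 1")
    return t ** (-beta)


def langevin_descent(grad, x0, beta, steps, temperature=0.0, seed=0):
    """Discretized Langevin-style update with a decaying learning rate.

    Each step applies x <- x - eta * grad(x) + sqrt(2 * eta * T) * xi,
    where xi is standard Gaussian noise. With temperature 0 this reduces
    to plain gradient descent.
    """
    rng = random.Random(seed)
    x = x0
    for t in range(1, steps + 1):
        eta = power_law_lr(t, beta)
        noise = math.sqrt(2.0 * eta * temperature) * rng.gauss(0.0, 1.0)
        x = x - eta * grad(x) + noise
    return x


def quad_grad(x):
    # Gradient of the assumed toy loss f(x) = 0.25 * x**2.
    return 0.5 * x


x_final = langevin_descent(quad_grad, x0=1.0, beta=1.0, steps=100)
```

Setting `temperature` above zero adds the stochastic term; the noise-free default keeps the run deterministic so the effect of the schedule alone is easy to inspect.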

arxiv preprint arxiv, equation, learning rate, (15 more...)

2202.04509

Country:

Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.05)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

Vanderschueren, Toon, Baesens, Bart, Verdonck, Tim, Verbeke, Wouter

A new perspective on classification: optimally allocating limited resources to uncertain tasks

A central problem in business concerns the optimal allocation of limited resources to a set of available tasks, where the payoff of these tasks is inherently uncertain. In credit card fraud detection, for instance, a bank can only assign a small subset of transactions to their fraud investigations team. Typically, such problems are solved using a classification framework, where the focus is on predicting task outcomes given a set of characteristics. Resources are then allocated to the tasks that are predicted to be the most likely to succeed. However, we argue that using classification to address task uncertainty is inherently suboptimal as it does not take into account the available capacity. Therefore, we first frame the problem as a type of assignment problem. Then, we present a novel solution using learning to rank by directly optimizing the assignment's expected profit given limited, stochastic capacity. This is achieved by optimizing a specific instance of the net discounted cumulative gain, a commonly used class of metrics in learning to rank. Empirically, we demonstrate that our new method achieves higher expected profit and expected precision compared to a classification approach for a wide variety of application areas and data sets. This illustrates the benefit of an integrated approach and of explicitly considering the available resources when learning a predictive model.
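As a rough illustration of the ranking metric family mentioned above, the sketch below computes a standard discounted cumulative gain truncated at a capacity `k`. It assumes the common linear-gain, base-2 logarithmic discount form and a fixed capacity; the paper's specific instance with stochastic capacity and profit-based gains is not reproduced, and the function names and toy data are hypothetical.

```python
import math


def rank_by_score(scores, gains):
    """Return the gains reordered by descending predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [gains[i] for i in order]


def dcg_at_k(gains_in_ranked_order, k):
    """Discounted cumulative gain over the top-k ranked items.

    Uses the common form sum(gain_i / log2(position_i + 1)) with
    1-based positions; k plays the role of a fixed capacity here.
    """
    return sum(
        g / math.log2(pos + 2)  # pos is 0-based, so position = pos + 1
        for pos, g in enumerate(gains_in_ranked_order[:k])
    )


# Hypothetical toy data: three tasks, capacity for two of them.
scores = [0.2, 0.9, 0.5]
gains = [1.0, 3.0, 2.0]
value = dcg_at_k(rank_by_score(scores, gains), k=2)
```

Only the top `k` tasks contribute, so a model is rewarded for placing high-payoff tasks within the available capacity rather than for classifying every task correctly.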

assignment problem, classification model, proceedings, (13 more...)

2202.04369

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
Asia > Macao (0.04)
Asia > China (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance > Credit (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

AAAI Conferences Feb-8-2022, 12:57:07 GMT

Leone

Datalog∃ is the extension of Datalog that allows existentially quantified variables in rule heads. This language is highly expressive and enables easy and powerful knowledge modeling, but the presence of existentially quantified variables makes reasoning over Datalog∃ undecidable in the general case. The results in this paper enable powerful, yet decidable and efficient reasoning (query answering) on top of Datalog∃ programs. On the theoretical side, we define the class of parsimonious Datalog∃ programs, and show that it allows decidable and efficiently computable reasoning. Unfortunately, we can demonstrate that recognizing parsimony is undecidable. However, we single out Shy, an easily recognizable fragment of parsimonious programs, that significantly extends both Datalog and Linear-Datalog∃, while preserving the same (data and combined) complexity of query answering over Datalog, despite the addition of existential quantifiers. On the practical side, we implement a bottom-up evaluation strategy for Shy programs inside the DLV system, enhancing the computation by a number of optimization techniques to result in DLV∃ -- a powerful system for answering conjunctive queries over Shy programs, which is profitably applicable to ontology-based query answering. Moreover, we carry out an experimental analysis, comparing DLV∃ against a number of state-of-the-art systems for ontology-based query answering. The results confirm the effectiveness of DLV∃, which outperforms all other systems in the benchmark domain.

datalog program, existentially, reasoning, (4 more...)


Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.62)