AITopics

2109.07743

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.83)
Telecommunications > Networks (0.72)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

arXiv.org Artificial IntelligenceSep-15-2021

Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD

Fan, Chen, Ram, Parikshit, Liu, Sijia

We propose a new computationally-efficient first-order algorithm for Model-Agnostic Meta-Learning (MAML). The key enabling technique is to interpret MAML as a bilevel optimization (BLO) problem and leverage the sign-based SGD(signSGD) as a lower-level optimizer of BLO. We show that MAML, through the lens of signSGD-oriented BLO, naturally yields an alternating optimization scheme that just requires first-order gradients of a learned meta-model. We term the resulting MAML algorithm Sign-MAML. Compared to the conventional first-order MAML (FO-MAML) algorithm, Sign-MAML is theoretically-grounded as it does not impose any assumption on the absence of second-order derivatives during meta training. In practice, we show that Sign-MAML outperforms FO-MAML in various few-shot image classification tasks, and compared to MAML, it achieves a much more graceful tradeoff between classification accuracy and computation efficiency.

arxiv preprint arxiv, fo-maml, sign-maml, (13 more...)

2109.07497

Country:

North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

arXiv.org Machine LearningSep-15-2021

Non-smooth Bayesian Optimization in Tuning Problems

Luo, Hengrui, Demmel, James W., Cho, Younghyun, Li, Xiaoye S., Liu, Yang

Building surrogate models is one common approach when we attempt to learn unknown black-box functions. Bayesian optimization provides a framework which allows us to build surrogate models based on sequential samples drawn from the function and find the optimum. Tuning algorithmic parameters to optimize the performance of large, complicated "black-box" application codes is a specific important application, which aims at finding the optima of black-box functions. Within the Bayesian optimization framework, the Gaussian process model produces smooth or continuous sample paths. However, the black-box function in the tuning problem is often non-smooth. This difficult tuning problem is worsened by the fact that we usually have limited sequential samples from the black-box function. Motivated by these issues encountered in tuning, we propose a novel additive Gaussian process model called clustered Gaussian process (cGP), where the additive components are induced by clustering. In the examples we studied, the performance can be improved by as much as 90% among repetitive experiments. By using this surrogate model, we want to capture the non-smoothness of the black-box function. In addition to an algorithm for constructing this model, we also apply the model to several artificial and real applications to evaluate it.

cgp, exprate 0, max 112, (14 more...)

2109.07563

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
North America > United States > Virginia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Energy (1.00)
Government > Regional Government (0.45)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

#artificialintelligenceSep-14-2021, 03:10:09 GMT

Global Big Data Conference

Getting the software right is important when developing machine learning models, such as recommendation or classification systems. But at eBay, optimizing the software to run on a particular piece of hardware using distillation and quantization techniques was absolutely essential to ensure scalability. "[I]n order to build a truly global marketplace that is driven by state of the art and powerful and scalable AI services," Kopru said, "you have to do a lot of optimizations after model training, and specifically for the target hardware." With 1.5 billion active listings from more than 19 million active sellers trying to reach 159 million active buyers, the ecommerce giant has a global reach that is matched by only a handful of firms. Machine learning and other AI techniques, such as natural language processing (NLP), play big roles in scaling eBay's operations to reach its massive audience. For instance, automatically generated descriptions of product listings is crucial for displaying information on the small screens of smart phones, Kopru said.

global big data conference, kopru, marketplace, (2 more...)

#artificialintelligence

Country: North America > United States > California > Santa Clara County > Mountain View (0.08)

Industry:

Information Technology > Services (1.00)
Consumer Products & Services (0.97)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

Journal of Artificial Intelligence ResearchSep-14-2021

A Semi-exact Algorithm for Quickly Computing A Maximum Weight Clique in Large Sparse Graphs

Cai, Shaowei, Lin, Jinkun, Wang, Yiyuan, Strash, Darren

This paper explores techniques to quickly solve the maximum weight clique problem (MWCP) in very large scale sparse graphs. Due to their size, and the hardness of MWCP, it is infeasible to solve many of these graphs with exact algorithms. Although recent heuristic algorithms make progress in solving MWCP in large graphs, they still need considerable time to get a high-quality solution. In this work, we focus on solving MWCP for large sparse graphs within a short time limit. We propose a new method for MWCP which interleaves clique finding with data reduction rules. We propose novel ideas to make this process efficient, and develop an algorithm called FastWClq. Experiments on a broad range of large sparse graphs show that FastWClq finds better solutions than state-of-the-art algorithms while the running time of FastWClq is much shorter than the competitors for most instances. Further, FastWClq proves the optimality of its solutions for roughly half of the graphs, all with at least 105 vertices, with an average time of 21 seconds.

algorithm, graph, vertex, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12327

AI Access Foundation

12327

Journal of Artificial Intelligence Research

Country:

Europe > Germany (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Information Technology (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Savas, Yagiz, Verginis, Christos K., Topcu, Ufuk

Deceptive Decision-Making Under Uncertainty

arXiv.org Artificial IntelligenceSep-14-2021

We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments. By modeling the agent's behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals while deceiving outside observers about its true goal. We propose a novel approach to model observer predictions based on the principle of maximum entropy and to efficiently generate deceptive strategies via linear programming. The proposed approach enables the agent to exhibit a variety of tunable deceptive behaviors while ensuring the satisfaction of probabilistic constraints on the behavior. We evaluate the performance of the proposed approach via comparative user studies and present a case study on the streets of Manhattan, New York, using real travel time distributions.

agent, observer, trajectory, (17 more...)

2109.0674

Country:

North America > United States > New York > New York County > Manhattan (0.24)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

arXiv.org Artificial IntelligenceSep-14-2021

From Alignment to Assignment: Frustratingly Simple Unsupervised Entity Alignment

Mao, Xin, Wang, Wenting, Wu, Yuanbin, Lan, Man

Cross-lingual entity alignment (EA) aims to find the equivalent entities between crosslingual KGs, which is a crucial step for integrating KGs. Recently, many GNN-based EA methods are proposed and show decent performance improvements on several public datasets. Meanwhile, existing GNN-based EA methods inevitably inherit poor interpretability and low efficiency from neural networks. Motivated by the isomorphic assumption of GNNbased methods, we successfully transform the cross-lingual EA problem into the assignment problem. Based on this finding, we propose a frustratingly Simple but Effective Unsupervised entity alignment method (SEU) without neural networks. Extensive experiments show that our proposed unsupervised method even beats advanced supervised methods across all public datasets and has high efficiency, interpretability, and stability.

assignment problem, ea method, matrix, (13 more...)

2109.02363

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Texas > Harris County > Houston (0.04)
(17 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.50)

Mebratu, Derssie, Hasabnis, Niranjan, Mercati, Pietro, Sharma, Gaurit, Najnin, Shamima

Automatic Tuning of Tensorflow's CPU Backend using Gradient-Free Optimization Algorithms

arXiv.org Artificial IntelligenceSep-13-2021

Modern deep learning (DL) applications are built using DL libraries and frameworks such as TensorFlow and PyTorch. These frameworks have complex parameters and tuning them to obtain good training and inference performance is challenging for typical users, such as DL developers and data scientists. Manual tuning requires deep knowledge of the user-controllable parameters of DL frameworks as well as the underlying hardware. It is a slow and tedious process, and it typically delivers sub-optimal solutions. In this paper, we treat the problem of tuning parameters of DL frameworks to improve training and inference performance as a black-box optimization problem. We then investigate applicability and effectiveness of Bayesian optimization (BO), genetic algorithm (GA), and Nelder-Mead simplex (NMS) to tune the parameters of TensorFlow's CPU backend. While prior work has already investigated the use of Nelder-Mead simplex for a similar problem, it does not provide insights into the applicability of other more popular algorithms. Towards that end, we provide a systematic comparative analysis of all three algorithms in tuning TensorFlow's CPU backend on a variety of DL models. Our findings reveal that Bayesian optimization performs the best on the majority of models. There are, however, cases where it does not deliver the best results.

algorithm, bayesian optimization, tensorflow, (12 more...)

2109.06266

Country:

North America > United States > Oregon > Washington County > Hillsboro (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningSep-13-2021

On Tilted Losses in Machine Learning: Theory and Applications

Li, Tian, Beirami, Ahmad, Sanjabi, Maziar, Smith, Virginia

Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -- tilted empirical risk minimization (TERM) -- which uses exponential tilting to flexibly tune the impact of individual losses. The resulting framework has several useful properties: We show that TERM can increase or decrease the influence of outliers, respectively, to enable fairness or robustness; has variance-reduction properties that can benefit generalization; and can be viewed as a smooth approximation to a superquantile method. Our work makes rigorous connections between TERM and related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO). We develop batch and stochastic first-order optimization methods for solving TERM, provide convergence guarantees for the solvers, and show that the framework can be efficiently solved relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. Despite the straightforward modification TERM makes to traditional ERM objectives, we find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches.

objective, section 7, tilted loss, (13 more...)

2109.06141

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Goyens, Florentin, Cartis, Coralia, Eftekhari, Armin

Nonlinear matrix recovery using optimization on the Grassmann manifold

arXiv.org Machine LearningSep-13-2021

We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure such as a union of subspaces, an algebraic variety or grouped in clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by a constrained non-convex optimization problem involving the Grassmann manifold. We propose two sets of algorithms, one arising from Riemannian optimization and the other as an alternating minimization scheme, both of which include first- and second-order variants. Both sets of algorithms have theoretical guarantees. In particular, for the alternating minimization, we establish global convergence and worst-case complexity bounds. Additionally, using the Kurdyka-Lojasiewicz property, we show that the alternating minimization converges to a unique limit point. We provide extensive numerical results for the recovery of union of subspaces and clustering under entry sampling and dense Gaussian sampling. Our methods are competitive with existing approaches and, in particular, high accuracy is achieved in the recovery using Riemannian second-order methods.

algorithm, kernel, subspace, (16 more...)

2109.06095

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)