AITopics

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We derive the minimax solutions for the case where the prediction and action spaces are the simplex (this setup is sometimes called the Brier game) and the $\ell_2$ ball (this setup is related to Gaussian density estimation). We show that in both cases the value of each sub-game is a quadratic function of a simple statistic of the state, with coefficients that can be efficiently computed using an explicit recurrence relation. The resulting deterministic minimax strategy and randomized maximin strategy are linear functions of the statistic.

artificial intelligence, machine learning, minimax strategy, (18 more...)

Country:

Europe (0.28)
North America > United States > California (0.14)

Industry: Leisure & Entertainment > Games (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)

Grinberg, Yuri, Precup, Doina, Gendreau, Michel

Optimizing Energy Production Using Policy Search and Predictive State Representations

We consider the challenging practical problem of optimizing the power production of a complex of hydroelectric power plants, which involves control over three continuous action variables, uncertainty in the amount of water inflows and a variety of constraints that need to be satisfied. We propose a policy-search-based approach coupled with predictive modelling to address this problem. This approach has some key advantages compared to other alternatives, such as dynamic programming: the policy representation and search algorithm can conveniently incorporate domain knowledge; the resulting policies are easy to interpret, and the algorithm is naturally parallelizable. Our algorithm obtains a policy which outperforms the solution found by dynamic programming both quantitatively and qualitatively.

artificial intelligence, constraint, optimization problem, (15 more...)

Country: North America > Canada > Quebec > Montreal (0.14)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Hydroelectric (0.72)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)

Bresler, Guy, Gamarnik, David, Shah, Devavrat

Structure learning of antiferromagnetic Ising models

In this paper we investigate the computational complexity of learning the graph structure underlying a discrete undirected graphical model from i.i.d. samples. Our first result is an unconditional computational lower bound of $\Omega (p^{d/2})$ for learning general graphical models on $p$ nodes of maximum degree $d$, for the class of statistical algorithms recently introduced by Feldman et al. The construction is related to the notoriously difficult learning parities with noise problem in computational learning theory. Our lower bound shows that the $\widetilde O(p^{d+2})$ runtime required by Bresler, Mossel, and Sly's exhaustive-search algorithm cannot be significantly improved without restricting the class of models. Aside from structural assumptions on the graph such as it being a tree, hypertree, tree-like, etc., most recent papers on structure learning assume that the model has the correlation decay property. Indeed, focusing on ferromagnetic Ising models, Bento and Montanari showed that all known low-complexity algorithms fail to learn simple graphs when the interaction strength exceeds a number related to the correlation decay threshold. Our second set of results gives a class of repelling (antiferromagnetic) models that have the \emph{opposite} behavior: very strong repelling allows efficient learning in time $\widetilde O(p^2)$. We provide an algorithm whose performance interpolates between $\widetilde O(p^2)$ and $\widetilde O(p^{d+2})$ depending on the strength of the repulsion.

algorithm, artificial intelligence, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

Qian, Jing, Saligrama, Venkatesh

Efficient Minimax Signal Detection on Graphs

Several problems such as network intrusion, community detection, and disease outbreak can be described by observations attributed to nodes or edges of a graph. In these applications the presence of intrusion, community or disease outbreak is characterized by novel observations on some unknown connected subgraph. These problems can be formulated in terms of optimization of suitable objectives on connected subgraphs, a problem which is generally computationally difficult. We overcome the combinatorics of connectivity by embedding connected subgraphs into linear matrix inequalities (LMI). Computationally efficient tests are then realized by optimizing convex objective functions subject to these LMI constraints. We prove, by means of a novel Euclidean embedding argument, that our tests are minimax optimal for the exponential family of distributions on 1-D and 2-D lattices. We show that the internal conductance of the connected subgraph family plays a fundamental role in characterizing detectability.

artificial intelligence, detection, subgraph, (16 more...)

Country: North America > United States (0.46)

Industry: Health & Medicine > Epidemiology (0.55)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.61)

Prasad, Adarsh, Jegelka, Stefanie, Batra, Dhruv

Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets

To cope with the high level of ambiguity faced in domains such as Computer Vision or Natural Language Processing, robust prediction methods often search for a diverse set of high-quality candidate solutions or proposals. In structured prediction problems, this becomes a daunting task, as the solution space (image labelings, sentence parses, etc.) is exponentially large. We study greedy algorithms for finding a diverse subset of solutions in structured-output spaces by drawing new connections between submodular functions over combinatorial item sets and High-Order Potentials (HOPs) studied for graphical models. Specifically, we show via examples that when marginal gains of submodular diversity functions allow structured representations, this enables efficient (sub-linear time) approximate maximization by reducing the greedy augmentation step to inference in a factor graph with appropriately constructed HOPs. We discuss benefits and tradeoffs, and show that our constructions lead to significantly better proposals.
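For orientation, the greedy augmentation step mentioned in the abstract can be illustrated on a tiny explicit item set. This is only a sketch of the generic greedy rule for a monotone submodular objective, not the paper's construction: the paper replaces the inner loop over items with inference in a factor graph, whereas the loop below enumerates every item. The names `greedy_max`, `coverage`, and the toy `covers` table are hypothetical.

```python
def greedy_max(items, f, k):
    """Pick up to k items, each time adding the one with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        base = f(chosen)
        best, best_gain = None, 0.0
        for x in items:
            if x in chosen:
                continue
            gain = f(chosen + [x]) - base  # marginal gain of adding x
            if best is None or gain > best_gain:
                best, best_gain = x, gain
        if best is None:  # no items left to add
            break
        chosen.append(best)
    return chosen


# Toy diversity objective (an assumption for illustration): each candidate
# solution "covers" some groups, and f counts the distinct groups covered.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}, "d": {5}}


def coverage(selection):
    covered = set()
    for name in selection:
        covered |= covers[name]
    return len(covered)


picked = greedy_max(list(covers), coverage, k=2)
print(picked, coverage(picked))
```

Enumerating all items is exactly what becomes infeasible when the item set is exponentially large, which is the situation the abstract addresses.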

machine learning, natural language, segmentation, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.34)

Suresh, Ananda Theertha, Orlitsky, Alon, Acharya, Jayadev, Jafarpour, Ashkan

Near-Optimal-Sample Estimators for Spherical Gaussian Mixtures

Many important distributions are high dimensional, and often they can be modeled as Gaussian mixtures. We derive the first sample-efficient polynomial-time estimator for high-dimensional spherical Gaussian mixtures. Based on intuitive spectral reasoning, it approximates mixtures of $k$ spherical Gaussians in $d$ dimensions to within $\ell_1$ distance $\epsilon$ using $\mathcal{O}({dk^9(\log^2 d)}/{\epsilon^4})$ samples and $\mathcal{O}_{k,\epsilon}(d^3\log^5 d)$ computation time. Conversely, we show that any estimator requires $\Omega\bigl({dk}/{\epsilon^2}\bigr)$ samples, hence the algorithm's sample complexity is nearly optimal in the dimension. The implied time-complexity factor $\mathcal{O}_{k,\epsilon}$ is exponential in $k$, but much smaller than previously known. We also construct a simple estimator for one-dimensional Gaussian mixtures that uses $\tilde{\mathcal{O}}(k/\epsilon^2)$ samples and $\tilde{\mathcal{O}}((k/\epsilon)^{3k+1})$ computation time.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Luo, Haipeng, Schapire, Robert E.

A Drifting-Games Analysis for Online Learning and Applications to Boosting

We provide a general mechanism to design online learning algorithms based on a minimax analysis within a drifting-games framework. Different online learning settings (Hedge, multi-armed bandit problems and online convex optimization) are studied by converting them into various kinds of drifting games. The original minimax analysis for drifting games is then used and generalized by applying a series of relaxations, starting from choosing a convex surrogate of the 0-1 loss function. With different choices of surrogates, we not only recover existing algorithms, but also propose new algorithms that are totally parameter-free and enjoy other useful properties. Moreover, our drifting-games framework naturally allows us to study high-probability bounds without resorting to any concentration results, and also a generalized notion of regret that measures how good the algorithm is compared to all but the top small fraction of candidates. Finally, we translate our new Hedge algorithm into a new adaptive boosting algorithm that is computationally faster, as shown in experiments, since it ignores a large number of examples on each round.
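As background for the Hedge setting named in the abstract, here is a minimal sketch of the classical exponential-weights Hedge update. It is not one of the paper's new algorithms: it needs a learning rate `eta`, which is precisely the kind of parameter the abstract says the new algorithms avoid. The function name `hedge` and the assumption that losses lie in [0, 1] are choices made for this illustration.

```python
import math


def hedge(loss_rounds, eta):
    """Classical Hedge: play weights proportional to exp(-eta * cumulative loss).

    loss_rounds: list of per-round loss vectors, one loss in [0, 1] per expert.
    Returns the learner's total expected loss and each expert's cumulative loss.
    """
    n = len(loss_rounds[0])
    cum_loss = [0.0] * n
    total = 0.0
    for losses in loss_rounds:
        shift = min(cum_loss)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (c - shift)) for c in cum_loss]
        z = sum(weights)
        probs = [w / z for w in weights]
        total += sum(p * l for p, l in zip(probs, losses))
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
    return total, cum_loss


rounds = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
eta = math.sqrt(8 * math.log(3) / len(rounds))  # standard tuning for a known horizon
learner_loss, expert_loss = hedge(rounds, eta)
print(learner_loss - min(expert_loss))  # regret against the best expert
```

With this tuning the regret is at most sqrt(T ln(n) / 2) for T rounds and n experts, but choosing `eta` requires knowing T in advance, which motivates parameter-free alternatives.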

artificial intelligence, data mining, machine learning, (19 more...)

Industry: Education > Educational Setting > Online (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.90)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.58)

Zaremba, Wojciech, Kurach, Karol, Fergus, Rob

Learning to Discover Efficient Mathematical Identities

In this paper we explore how machine learning techniques can be applied to the discovery of efficient mathematical identities. We introduce an attribute grammar framework for representing symbolic expressions. Given a grammar of math operators, we build trees that combine them in different ways, looking for compositions that are analytically equivalent to a target expression but of lower computational complexity. However, as the space of trees grows exponentially with the complexity of the target expression, brute-force search is impractical for all but the simplest of expressions. Consequently, we introduce two novel learning approaches that are able to learn from simpler expressions to guide the tree search. The first of these is a simple n-gram model, the other a recursive neural network. We show how these approaches enable us to derive complex identities beyond the reach of brute-force search or human derivation.

artificial intelligence, logic & formal reasoning, machine learning, (21 more...)

Country:

North America > United States (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)

Pan, Xinghao, Jegelka, Stefanie, Gonzalez, Joseph E., Bradley, Joseph K., Jordan, Michael I.

Parallel Double Greedy Submodular Maximization

Many machine learning problems can be reduced to the maximization of submodular functions. Although well understood in the serial setting, the parallel maximization of submodular functions remains an open area of research with recent results [1] only addressing monotone functions. The optimal algorithm for maximizing the more general class of non-monotone submodular functions was introduced by Buchbinder et al. [2] and follows a strongly serial double-greedy logic and program analysis. In this work, we propose two methods to parallelize the double-greedy algorithm. The first, coordination-free approach emphasizes speed at the cost of a weaker approximation guarantee. The second, concurrency control approach guarantees a tight 1/2-approximation, at the quantifiable cost of additional coordination and reduced parallelism. As a consequence we explore the tradeoff space between guaranteed performance and objective optimality. We implement and evaluate both algorithms on multi-core hardware and billion-edge graphs, demonstrating both the scalability and tradeoffs of each approach.
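To make the "strongly serial" structure concrete, the following is a minimal sketch of the serial double-greedy idea on a small graph-cut objective. It shows the deterministic variant, which carries a 1/3 approximation guarantee for non-negative submodular functions; the 1/2 guarantee quoted in the abstract belongs to the randomized variant, which adds each element with probability proportional to its clipped gain. Neither of the paper's two parallel methods is shown. The names `double_greedy` and `cut_value` and the example graph are hypothetical.

```python
def double_greedy(ground, f):
    """Deterministic double greedy for unconstrained submodular maximization.

    X grows from the empty set while Y shrinks from the full ground set;
    each element is either added to X or removed from Y, so X == Y at the end.
    """
    X = set()
    Y = set(ground)
    for e in ground:
        gain_add = f(X | {e}) - f(X)       # benefit of adding e to X
        gain_remove = f(Y - {e}) - f(Y)    # benefit of dropping e from Y
        if gain_add >= gain_remove:
            X.add(e)
        else:
            Y.remove(e)
    return X


def cut_value(edges):
    """Return f(S) = number of edges with exactly one endpoint in S."""
    def f(S):
        return sum(1 for u, v in edges if (u in S) != (v in S))
    return f


# Example graph (an assumption for illustration): a 4-cycle with one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
f = cut_value(edges)
solution = double_greedy([0, 1, 2, 3], f)
print(sorted(solution), f(solution))
```

Each decision depends on the sets X and Y left by all earlier decisions, which is the serial dependency that the two parallel approaches in the abstract either ignore (coordination-free) or manage explicitly (concurrency control).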

algorithm, artificial intelligence, machine learning, (16 more...)

Country: North America > United States > California (0.46)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.36)