AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Compressed Linear Algebra for Declarative Large-Scale Machine Learning

Communications of the ACMApr-25-2019, 23:26:54 GMT

Large-scale Machine Learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications. Hence, it is crucial for performance to fit the data into single-node or distributed main memory to enable fast matrix-vector operations. General-purpose compression struggles to achieve both good compression ratios and fast decompression for block-wise uncompressed operations. Therefore, we introduce Compressed Linear Algebra (CLA) for lossless matrix compression. CLA encodes matrices with lightweight, value-based compression techniques and executes linear algebra operations directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show good compression ratios and operations performance close to the uncompressed case, which enables fitting larger datasets into available memory. We thereby obtain significant end-to-end performance improvements. Large-scale ML leverages large data collections to find interesting patterns or build robust predictive models.7 Applications range from traditional regression, classification, and clustering to user recommendations and deep learning for unstructured data. The labeled data required to train these ML models is now abundant, thanks to feedback loops in data products and weak supervision techniques. Many ML systems exploit data-parallel frameworks such as Spark20 or Flink2 for parallel model training and scoring on commodity hardware. It remains challenging, however, to train ML models on massive labeled data sets in a cost-effective manner.

artificial intelligence, machine learning, opération, (18 more...)

Communications of the ACM

Country:

North America > United States > Massachusetts (0.28)
North America > United States > Maryland (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Solving zero-sum extensive-form games with arbitrary payoff uncertainty models

Leni, Juan, Levine, John, Quigley, John

arXiv.org Artificial IntelligenceApr-24-2019

Modeling strategic conflict from a game theoretical perspective involves dealing with epistemic uncertainty. Payoff uncertainty models are typically restricted to simple probability models due to computational restrictions. Recent breakthroughs Artificial Intelligence (AI) research applied to Poker have resulted in novel approximation approaches such as counterfactual regret minimization, that can successfully deal with large-scale imperfect games. By drawing from these ideas, this work addresses the problem of arbitrary continuous payoff distributions. We propose a method, Harsanyi-Counterfactual Regret Minimization, to solve two-player zero-sum extensive-form games with arbitrary payoff distribution models. Given a game $\Gamma$, using a Harsanyi transformation we generate a new game $\Gamma^\#$ to which we later apply Counterfactual Regret Minimization to obtain $\varepsilon$-Nash equilibria. We include numerical experiments showing how the method can be applied to a previously published problem.

artificial intelligence, extensive-form game, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1905.0385

Country:

North America > Canada > Alberta (0.28)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

Add feedback

A Unified Framework for Structured Graph Learning via Spectral Constraints

Kumar, Sandeep, Ying, Jiaxi, Cardoso, José Vinícius de M., Palomar, Daniel

arXiv.org Machine LearningApr-22-2019

Graph learning from data represents a canonical problem that has received substantial attention in the literature. However, insufficient work has been done in incorporating prior structural knowledge onto the learning of underlying graphical models from data. Learning a graph with a specific structure is essential for interpretability and identification of the relationships among data. Useful structured graphs include the multi-component graph, bipartite graph, connected graph, sparse graph, and regular graph. In general, structured graph learning is an NP-hard combinatorial problem, therefore, designing a general tractable optimization method is extremely challenging. In this paper, we introduce a unified graph learning framework lying at the integration of Gaussian graphical models and spectral graph theory. To impose a particular structure on a graph, we first show how to formulate the combinatorial constraints as an analytical property of the graph matrix. Then we develop an optimization framework that leverages graph learning with specific structures via spectral constraints on graph matrices. The proposed algorithms are provably convergent, computationally efficient, and practically amenable for numerous graph-based tasks. Extensive numerical experiments with both synthetic and real data sets illustrate the effectiveness of the proposed algorithms. The code for all the simulations is made available as an open source repository.

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Machine Learning

1904.09792

Country:

Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

On the Adaptivity of Stochastic Gradient-Based Optimization

Lei, Lihua, Jordan, Michael I.

arXiv.org Machine LearningApr-9-2019

Stochastic-gradient-based optimization has been a core enabling methodology in applications to large-scale problems in machine learning and related areas. Despite the progress, the gap between theory and practice remains significant, with theoreticians pursuing mathematical optimality at a cost of obtaining specialized procedures in different regimes (e.g., modulus of strong convexity, magnitude of target accuracy, signal-to-noise ratio), and with practitioners not readily able to know which regime is appropriate to their problem, and seeking broadly applicable algorithms that are reasonably close to optimality. To bridge these perspectives it is necessary to study algorithms that are adaptive to different regimes. We present the stochastically controlled stochastic gradient (SCSG) method for composite convex finite-sum optimization problems and show that SCSG is adaptive to both strong convexity and target accuracy. The adaptivity is achieved by batch variance reduction with adaptive batch sizes and a novel technique, which we referred to as \emph{geometrization}, which sets the length of each epoch as a geometric random variable. The algorithm achieves strictly better theoretical complexity than other existing adaptive algorithms, while the tuning parameters of the algorithm only depend on the smoothness parameter of the objective.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1904.0448

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.91)

Add feedback

Plant-wide fault and disturbance screening using combined transfer entropy and eigenvector centrality analysis

Streicher, Simon, Sandrock, Carl

arXiv.org Artificial IntelligenceApr-8-2019

Finding the source of a disturbance or fault in complex systems such as industrial chemical processing plants can be a difficult task and consume a significant number of engineering hours. In many cases, a systematic elimination procedure is considered to be the only feasible approach but can cause undesired process upsets. Practitioners desire robust alternative approaches. This paper presents an unsupervised, data-driven method for ranking process elements according to the magnitude and novelty of their influence. Partial bivariate transfer entropy estimation is used to infer a weighted directed graph of process elements. Eigenvector centrality is applied to rank network nodes according to their overall effect. As the ranking of process elements rely on emerging properties that depend on the aggregate of many connections, the results are robust to errors in the estimation of individual edge properties and the inclusion of indirect connections that do not represent the true causal structure of the process. A monitoring chart of continuously calculated process element importance scores over multiple overlapping time regions can assist with incipient fault detection. Ranking results combined with visual inspection of information transfer networks is also useful for root cause analysis of known faults and disturbances. A software implementation of the proposed method is available.

artificial intelligence, disturbance, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1904.04035

Country:

North America > United States > Tennessee (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Africa > South Africa > Gauteng > Pretoria (0.04)
Asia > Japan > Shikoku > Ehime Prefecture > Matsuyama (0.04)

Genre: Research Report (0.64)

Industry:

Materials > Chemicals (1.00)
Energy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.67)

Add feedback

Estimation of Monge Matrices

Hütter, Jan-Christian, Mao, Cheng, Rigollet, Philippe, Robeva, Elina

arXiv.org Machine LearningApr-5-2019

Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

1904.03136

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)

Add feedback

Online Topology Identification from Vector Autoregressive Time Series

Zaman, Bakht, Ramos, Luis Miguel Lopez, Romero, Daniel, Beferull-Lozano, Baltasar

arXiv.org Machine LearningApr-3-2019

Due to their capacity to condense the spatiotemporal structure of a data set in a format amenable for human interpretation, forecasting, and anomaly detection, causality graphs are routinely estimated in social sciences, natural sciences, and engineering. A popular approach to mathematically formalize causality is based on vector autoregressive (VAR) models, which constitutes an alternative to the well-known but usually intractable Granger causality. Relying on such a VAR causality notion, this paper develops two algorithms with complementary benefits to track time-varying causality graphs in an online fashion. Despite using data in a sequential fashion, both algorithms are shown to asymptotically attain the same average performance as a batch estimator with all data available at once. Moreover, their constant complexity per update renders these algorithms appealing for big-data scenarios. Theoretical and experimental performance analysis support the merits of the proposed algorithms. Remarkably, no probabilistic models or stationarity assumptions need to be introduced, which endows the developed algorithms with considerable generality

algorithm, artificial intelligence, health & medicine, (18 more...)

arXiv.org Machine Learning

1904.01864

Country:

Asia > Middle East > Israel (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States (0.14)
(2 more...)

Genre: Research Report (0.40)

Industry:

Health & Medicine (0.93)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Add feedback

A Stochastic Interpretation of Stochastic Mirror Descent: Risk-Sensitive Optimality

Azizan, Navid, Hassibi, Babak

arXiv.org Machine LearningApr-3-2019

Stochastic mirror descent (SMD) is a fairly new family of algorithms that has recently found a wide range of applications in optimization, machine learning, and control. It can be considered a generalization of the classical stochastic gradient algorithm (SGD), where instead of updating the weight vector along the negative direction of the stochastic gradient, the update is performed in a "mirror domain" defined by the gradient of a (strictly convex) potential function. This potential function, and the mirror domain it yields, provides considerable flexibility in the algorithm compared to SGD. While many properties of SMD have already been obtained in the literature, in this paper we exhibit a new interpretation of SMD, namely that it is a risk-sensitive optimal estimator when the unknown weight vector and additive noise are non-Gaussian and belong to the exponential family of distributions. The analysis also suggests a modified version of SMD, which we refer to as symmetric SMD (SSMD). The proofs rely on some simple properties of Bregman divergence, which allow us to extend results from quadratics and Gaussians to certain convex functions and exponential families in a rather seamless way.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1904.01855

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.54)

Add feedback

GraSPy: Graph Statistics in Python

Chung, Jaewon, Pedigo, Benjamin D., Bridgeford, Eric W., Varjavand, Bijan K., Vogelstein, Joshua T.

arXiv.org Machine LearningMar-29-2019

We introduce GraSPy, a Python library devoted to statistical inference, machine learning, and visualization of random graphs and graph populations. This package provides flexible and easy-to-use algorithms for analyzing and understanding graphs with a scikit-learn compliant API. GraSPy can be downloaded from Python Package Index (PyPi), and is released under the Apache 2.0 open-source license. The documentation and all releases are available at https://neurodata.io/graspy.

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Machine Learning

1904.05329

Country: North America > United States > Maryland > Baltimore (0.05)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)

Add feedback

Machine Learning Methods Economists Should Know About

Athey, Susan, Imbens, Guido

arXiv.org Machine LearningMar-24-2019

We discuss the relevance of the recent Machine Learning (ML) literature for economics and econometrics. First we discuss the differences in goals, methods and settings between the ML literature and the traditional econometrics and statistics literatures. Then we discuss some specific methods from the machine learning literature that we view as important for empirical researchers in economics. These include supervised learning methods for regression and classification, unsupervised learning methods, as well as matrix completion methods. Finally, we highlight newly developed methods at the intersection of ML and econometrics, methods that typically perform better than either off-the-shelf ML or more traditional econometric methods when applied to particular classes of problems, problems that include causal inference for average treatment effects, optimal policy estimation, and estimation of the counterfactual effect of price changes in consumer choice models.

artificial intelligence, inductive learning, machine learning, (20 more...)

arXiv.org Machine Learning

1903.10075

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(3 more...)

Add feedback