AITopics | Mathematical & Statistical Methods

A near-optimal stochastic gradient method for decentralized non-convex finite-sum optimization

arXiv.org Machine Learning, Sep-15-2020

This paper describes a near-optimal stochastic first-order gradient method for decentralized finite-sum minimization of smooth non-convex functions. Specifically, we propose GT-SARAH, which employs local SARAH-type variance reduction and global gradient tracking to address the stochastic and decentralized nature of the problem. Considering a total of $N$ cost functions, equally divided over a directed network of $n$ nodes, we show that GT-SARAH finds an $\epsilon$-accurate first-order stationary point in ${\mathcal{O}(N^{1/2}\epsilon^{-1})}$ gradient computations across all nodes, independent of the network topology, when ${n\leq\mathcal{O}(N^{1/2}(1-\lambda)^{3})}$, where ${(1-\lambda)}$ is the spectral gap of the network weight matrix. In this regime, GT-SARAH is thus, to the best of our knowledge, the first decentralized method that achieves the algorithmic lower bound for this class of problems. Moreover, GT-SARAH achieves a non-asymptotic linear speedup, in that the total number of gradient computations at each node is reduced by a factor of $1/n$ compared to the near-optimal algorithms for this problem class that process all data at a single node. We also establish the convergence rate of GT-SARAH in other regimes, in terms of the relative sizes of the number of nodes $n$, the total number of functions $N$, and the network spectral gap $(1-\lambda)$. Over an infinite time horizon, we establish the almost sure and mean-squared convergence of GT-SARAH to a first-order stationary point.
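To make the "SARAH-type variance reduction" in this abstract concrete, the following is a minimal single-node sketch of the recursive SARAH gradient estimator. It is not the paper's GT-SARAH: there is no network and no gradient tracking, and the objective is an assumed convex least-squares problem chosen only so the example runs quickly. The function name, step size, and loop lengths are illustrative assumptions.

```python
import numpy as np

def sarah_least_squares(A, b, step=0.01, outer_steps=30, inner_steps=50, seed=0):
    """Single-node SARAH on f(x) = (1/N) * sum_i 0.5 * (a_i @ x - b_i)**2.

    Illustrative only: GT-SARAH additionally runs gradient tracking over a
    network of nodes, which is omitted here.
    """
    rng = np.random.default_rng(seed)
    n_samples, dim = A.shape
    x = np.zeros(dim)

    def grad_i(x, i):
        # Gradient of the i-th component function.
        return (A[i] @ x - b[i]) * A[i]

    def full_grad(x):
        # Gradient of the full finite-sum objective.
        return A.T @ (A @ x - b) / n_samples

    for _ in range(outer_steps):
        # Each outer loop restarts the estimator from an exact full gradient.
        v = full_grad(x)
        x_prev, x = x, x - step * v
        for _ in range(inner_steps):
            i = rng.integers(n_samples)
            # Recursive SARAH update: correct the previous estimate with the
            # change in one sampled component gradient.
            v = grad_i(x, i) - grad_i(x_prev, i) + v
            x_prev, x = x, x - step * v
    return x, np.linalg.norm(full_grad(x))

# Assumed toy data: 50 component functions in 5 dimensions.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5))
b = A @ np.ones(5)
grad_norm_start = np.linalg.norm(A.T @ (A @ np.zeros(5) - b) / 50)
x_hat, grad_norm_end = sarah_least_squares(A, b)
```

Comparing `grad_norm_end` with `grad_norm_start` gives a rough check that the iterates approach a first-order stationary point on this toy problem.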

artificial intelligence, gt-sarah, machine learning, (18 more...)

arXiv.org Machine Learning

2008.07428

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)

A Review on Modern Computational Optimal Transport Methods with Applications in Biomedical Research

Zhang, Jingyi, Zhong, Wenxuan, Ma, Ping

arXiv.org Machine Learning, Sep-10-2020

Optimal transport has been one of the most exciting subjects in mathematics, starting from the 18th century. As a powerful tool to transport between two probability measures, optimal transport methods have been reinvigorated nowadays in a remarkable proliferation of modern data science applications. To meet the big data challenges, various computational tools have been developed in the recent decade to accelerate the computation for optimal transport methods. In this review, we present some cutting-edge computational optimal transport methods with a focus on the regularization-based methods and the projection-based methods. We discuss their real-world applications in biomedical research.
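As an illustration of what a regularization-based optimal transport method can look like, here is a minimal sketch of Sinkhorn iterations for entropy-regularized transport between two discrete measures. The abstract does not name a specific algorithm, so the choice of Sinkhorn, the function name, the regularization strength, and the toy point sets are all assumptions made for this example.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.5, n_iters=500):
    """Entropy-regularized transport plan between weight vectors a and b.

    a, b : nonnegative weights, each summing to 1
    cost : pairwise cost matrix of shape (len(a), len(b))
    reg  : entropic regularization strength (assumed value)
    """
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Assumed toy data: three source points and two target points on a line.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.6, 0.4])
cost = (x[:, None] - y[None, :]) ** 2   # squared-distance cost
plan = sinkhorn(a, b, cost)
```

The returned `plan` is a nonnegative matrix whose row sums match `a` and whose column sums approximately match `b`. A smaller `reg` moves the plan closer to the unregularized optimum but slows convergence and can underflow in `np.exp`.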

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2008.02995

Country:

North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > New York (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

SeqROCTM: A Matlab toolbox for the analysis of Sequence of Random Objects driven by Context Tree Models

Duarte, Aline, Hernández, Noslen

arXiv.org Artificial Intelligence, Sep-8-2020

In several research problems we face probabilistic sequences of inputs (e.g., a sequence of stimuli) from which an agent generates a corresponding sequence of responses, and it is of interest to model or discover some kind of relation between them. To model such a relation in the context of statistical learning in neuroscience, a new class of stochastic processes has been introduced [5], namely sequences of random objects driven by context tree models. In this paper we introduce a freely available Matlab toolbox (SeqROCTM) that implements three model selection methods to make inferences about the parameters of this kind of stochastic process.

artificial intelligence, machine learning, programming language, (15 more...)

arXiv.org Artificial Intelligence

2009.06371

Country: South America > Brazil > São Paulo (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.66)
(2 more...)

Amazon.com: Introduction to Algorithms, third edition eBook: Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., Stein, Clifford: Kindle Store

#artificialintelligence, Sep-5-2020, 00:34:26 GMT

Introduction to Algorithms, the 'bible' of the field, is a comprehensive textbook covering the full spectrum of modern algorithms: from the fastest algorithms and data structures to polynomial-time algorithms for seemingly intractable problems, from classical algorithms in graph theory to special algorithms for string matching, computational geometry, and number theory. The revised third edition notably adds a chapter on van Emde Boas trees, one of the most useful data structures, and on multithreaded algorithms, a topic of increasing importance.

algorithm, artificial intelligence, kindle store, (8 more...)

#artificialintelligence

Industry: Retail > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.78)

Categorical Stochastic Processes and Likelihood

Shiebler, Dan

arXiv.org Artificial Intelligence, Sep-2-2020

In this work we take a Category Theoretic perspective on the relationship between probabilistic modeling and function approximation. We begin by defining two extensions of function composition to stochastic process subordination: one based on the co-Kleisli category under the comonad (Omega x -) and one based on the parameterization of a category with a Lawvere theory. We show how these extensions relate to the category Stoch and other Markov Categories. Next, we apply the Para construction to extend stochastic processes to parameterized statistical models and we define a way to compose the likelihood functions of these models. We conclude with a demonstration of how the Maximum Likelihood Estimation procedure defines an identity-on-objects functor from the category of statistical models to the category of Learners. Code to accompany this paper can be found at https://github.com/dshieble/Categorical_Stochastic_Processes_and_Likelihood

artificial intelligence, category, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2005.04735

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Transfer learning for nonlinear dynamics and its application to fluid turbulence

Inubushi, Masanobu, Goto, Susumu

arXiv.org Machine Learning, Sep-2-2020

We introduce transfer learning for nonlinear dynamics, which enables efficient predictions of chaotic dynamics by utilizing a small amount of data. For the Lorenz chaos, by optimizing the transfer rate, we accomplish more accurate inference than the conventional method by an order of magnitude. Moreover, a surprisingly small amount of learning is enough to infer the energy dissipation rate of the Navier-Stokes turbulence because we can, thanks to the small-scale universality of turbulence, transfer a large amount of the knowledge learned from turbulence data at lower Reynolds number.

artificial intelligence, machine learning, target domain, (16 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.102.043301

2009.01407

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.61)

Data Science complete guide on Linear Algebra - DeepLearning

#artificialintelligence, Sep-1-2020, 22:02:57 GMT

Then, this course is for you. A common mistake data scientists make is applying tools without an intuition for how they work and behave. Having a solid foundation in mathematics will help you understand how each algorithm works, its limitations, and its underlying assumptions. It always pays to know the machinery under the hood, rather than being someone who is just behind the wheel with no knowledge of the car. Linear algebra is one of the areas that everyone agrees is a starting point on the learning curve of machine learning, data science, and artificial intelligence.

artificial intelligence, linear algebra, machine learning, (8 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Educational Setting > Online (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.77)

Tensor Clustering with Planted Structures: Statistical Optimality and Computational Limits

Luo, Yuetian, Zhang, Anru R.

arXiv.org Machine Learning, Aug-28-2020

This paper studies the statistical and computational limits of high-order clustering with planted structures. We focus on two clustering models, constant high-order clustering (CHC) and rank-one higher-order clustering (ROHC), and study the methods and theory for testing whether a cluster exists (detection) and identifying the support of the cluster (recovery). Specifically, we identify the sharp boundaries of the signal-to-noise ratio for which CHC and ROHC detection/recovery are statistically possible. We also develop the tight computational thresholds: when the signal-to-noise ratio is below these thresholds, we prove that polynomial-time algorithms cannot solve these problems under the computational hardness conjectures of hypergraphic planted clique (HPC) detection and hypergraphic planted dense subgraph (HPDS) recovery. We also propose polynomial-time tensor algorithms that achieve reliable detection and recovery when the signal-to-noise ratio is above these thresholds. Both sparsity and tensor structures yield the computational barriers in high-order tensor clustering. The interplay between them results in significant differences between high-order tensor clustering and matrix clustering in the literature in terms of statistical and computational phase transition diagrams, algorithmic approaches, hardness conjectures, and proof techniques. To the best of our knowledge, we are the first to give a thorough characterization of the statistical and computational trade-off for such a double computational-barrier problem. Finally, we provide evidence for the computational hardness conjectures of HPC detection and HPDS recovery.

data mining, machine learning, tensor clustering, (16 more...)

arXiv.org Machine Learning

2005.10743

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.92)

Technology:

Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.45)

Learning to rank via combining representations

Helm, Hayden S., Basu, Amitabh, Athreya, Avanti, Park, Youngser, Vogelstein, Joshua T., Winding, Michael, Zlatic, Marta, Cardona, Albert, Bourke, Patrick, Larson, Jonathan, White, Chris, Priebe, Carey E.

arXiv.org Machine Learning, Aug-25-2020

Learning to rank - producing a ranked list of items specific to a query and with respect to a set of supervisory items - is a problem of general interest. The setting we consider is one in which no analytic description of what constitutes a good ranking is available. Instead, we have a collection of representations and supervisory information consisting of a (target item, interesting items set) pair. We demonstrate - analytically, in simulation, and in real data examples - that learning to rank via combining representations using an integer linear program is effective when the supervision is as light as "these few items are similar to your item of interest." While this nomination task is of general interest, for specificity we present our methodology from the perspective of vertex nomination in graphs. The methodology described herein is model agnostic.

Introduction. Given a query, a collection of items, and supervisory information, producing a ranked list relative to the query is of general interest. In particular, learning to rank [1] and algorithms from related problem settings [2] have been used to improve popular search engines and recommender systems and, impressively, aid in the identification of human traffickers [3]. When learning to rank, for each training query researchers typically have access to (feature vector, ordinal) pairs that are used to learn an ordinal regressor via fitting a model under a set of probabilistic assumptions [4] or via deep learning techniques [5] that generalize to ranking items for never-before-seen queries.

artificial intelligence, machine learning, representation, (20 more...)

arXiv.org Machine Learning

2005.107

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Testing correlation of unlabeled random graphs

Wu, Yihong, Xu, Jiaming, Yu, Sophie H.

arXiv.org Machine Learning, Aug-23-2020

We study the problem of detecting the edge correlation between two random graphs with $n$ unlabeled nodes. This is formalized as a hypothesis testing problem, where under the null hypothesis, the two graphs are independently generated; under the alternative, the two graphs are edge-correlated under some latent node correspondence, but have the same marginal distributions as the null. For both Gaussian-weighted complete graphs and dense Erdős-Rényi graphs (with edge probability $n^{-o(1)}$), we determine the sharp threshold at which the optimal testing error probability exhibits a phase transition from zero to one as $n\to \infty$. For sparse Erdős-Rényi graphs with edge probability $n^{-\Omega(1)}$, we determine the threshold within a constant factor. The proof of the impossibility results is an application of the conditional second-moment method, where we bound the truncated second moment of the likelihood ratio by carefully conditioning on the typical behavior of the intersection graph (consisting of edges in both observed graphs) and taking into account the cycle structure of the induced random permutation on the edges. Notably, in the sparse regime, this is accomplished by leveraging the pseudoforest structure of subcritical Erdős-Rényi graphs and a careful enumeration of subpseudoforests that can be assembled from short orbits of the edge permutation.

artificial intelligence, graph, orbit, (18 more...)

arXiv.org Machine Learning

2008.10097

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.49)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.70)
