AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

Two-sample Hypothesis Testing for Inhomogeneous Random Graphs

Ghoshdastidar, Debarghya, Gutzeit, Maurilio, Carpentier, Alexandra, von Luxburg, Ulrike

arXiv.org Machine Learning | Aug-1-2017

The study of networks leads to a wide range of high dimensional inference problems. In most practical scenarios, one needs to draw inference from a small population of large networks. The present paper studies hypothesis testing of graphs in this high-dimensional regime. We consider the problem of testing between two populations of inhomogeneous random graphs defined on the same set of vertices. We propose tests based on estimates of the Frobenius and operator norms of the difference between the population adjacency matrices. We show that the tests are uniformly consistent in both the "large graph, small sample" and "small graph, large sample" regimes. We further derive lower bounds on the minimax separation rate for the associated testing problems, and show that the constructed tests are near optimal.
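As a rough illustration of the quantity behind the Frobenius-norm test, the sketch below computes a naive plug-in estimate: the squared Frobenius norm of the difference between the two sample-mean adjacency matrices. This is a simplification under stated assumptions, not the test statistic from the paper. The function names `mean_adjacency` and `frobenius_gap_sq` and the toy data are hypothetical, and no rejection threshold is included.

```python
def mean_adjacency(graphs):
    """Entrywise average of a list of n-by-n adjacency matrices (lists of lists)."""
    m = len(graphs)
    n = len(graphs[0])
    return [[sum(g[i][j] for g in graphs) / m for j in range(n)] for i in range(n)]


def frobenius_gap_sq(graphs_a, graphs_b):
    """Naive plug-in estimate of ||P - Q||_F^2 from two samples of graphs.

    Assumption: both samples are defined on the same vertex set, so the
    matrices have equal size and vertex i means the same thing in each.
    """
    mean_a = mean_adjacency(graphs_a)
    mean_b = mean_adjacency(graphs_b)
    n = len(mean_a)
    return sum((mean_a[i][j] - mean_b[i][j]) ** 2 for i in range(n) for j in range(n))


# Toy data on 3 shared vertices (illustrative only, not from the paper).
sample_a = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
]
sample_b = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
]
gap = frobenius_gap_sq(sample_a, sample_b)
```

A plug-in estimate like this is biased upward when the samples are small, which is one reason a practical test would need a more careful estimator and a calibrated threshold.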

artificial intelligence, scientific discovery, two-sample testing, (19 more...)

arXiv.org Machine Learning

1707.00833

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > Germany > Brandenburg > Potsdam (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)

Data Structures Related to Machine Learning Algorithms - DZone AI

#artificialintelligence | Jul-31-2017, 22:15:24 GMT

In either case, the better your knowledge of data structures and algorithms, the easier it will be when the time comes to write code. I don't think the data structures used in machine learning are significantly different from those used in other areas of software development. Because of the size and difficulty of many of the problems, however, a really solid handle on the basics is essential. Also, because machine learning is a very mathematical field, one should keep in mind how data structures can be used to solve mathematical problems and how they are mathematical objects in their own right. There are two ways to classify data structures: by their implementation and by their operation.
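To make the implementation/operation distinction concrete, here is a minimal sketch in which one operation-level structure (a last-in, first-out stack) is built on two different implementations, a dynamic array and a linked list. The class names `ArrayStack` and `LinkedStack` and the helper `drain` are hypothetical and are not taken from the article.

```python
class ArrayStack:
    """Stack backed by a dynamic array (a Python list)."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        return self._items.pop()


class LinkedStack:
    """Stack backed by a singly linked list of (value, next) pairs."""

    def __init__(self):
        self._head = None

    def push(self, value):
        # The new node points at the previous head.
        self._head = (value, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        value, self._head = self._head
        return value


def drain(stack, values):
    """Push all values, then pop them all; the result is in LIFO order."""
    for v in values:
        stack.push(v)
    return [stack.pop() for _ in values]
```

Both classes behave identically from the caller's point of view, so code written against `push` and `pop` does not need to know which implementation it was given.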

artificial intelligence, machine learning, programming language, (16 more...)

#artificialintelligence

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.49)

Centrality measures for graphons

Avella-Medina, Marco, Parise, Francesca, Schaub, Michael T., Segarra, Santiago

arXiv.org Machine Learning | Jul-28-2017

Graphs provide a natural mathematical abstraction for systems with pairwise interactions, and thus have become a prevalent tool for the representation of systems across various scientific domains. However, as the size of relational datasets continues to grow, traditional graph-based approaches are increasingly replaced by other modeling paradigms, which enable a more flexible treatment of such datasets. A promising framework in this context is provided by graphons, which have been formally introduced as the natural limiting objects for graphs of increasing sizes. However, while the theory of graphons is already well developed, some prominent tools in network analysis still have no counterpart within the realm of graphons. In particular, node centrality measures, which have been successfully employed in various applications to reveal important nodes in a network, have so far not been defined for graphons. In this work we introduce formal definitions of centrality measures for graphons and establish their connections to centrality measures defined on finite graphs. In particular, we build on the theory of linear integral operators to define degree, eigenvector, and Katz centrality functions for graphons. We further establish concentration inequalities showing that these centrality functions are natural limits of their analogous counterparts defined on sequences of random graphs of increasing size. We discuss several strategies for computing these centrality measures, and illustrate them through a set of numerical examples.
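As a small numerical illustration of a centrality function on a graphon, the sketch below approximates a degree function, assuming it is defined as the integral of $W(x, y)$ over $y \in [0, 1]$ (the usual graphon degree; the paper's exact normalization may differ). The function names and the toy graphon $W(x, y) = xy$ are hypothetical choices made so that the answer, $x/2$, can be checked by hand.

```python
def degree_centrality(graphon, x, grid_size=1000):
    """Approximate the integral of W(x, y) over y in [0, 1] by the midpoint rule."""
    step = 1.0 / grid_size
    return sum(graphon(x, (k + 0.5) * step) for k in range(grid_size)) * step


def product_graphon(x, y):
    """Toy graphon W(x, y) = x * y; its degree function is x / 2."""
    return x * y


d_half = degree_centrality(product_graphon, 0.5)
```

The midpoint rule is exact for integrands that are linear in `y`, as this toy graphon is; a less smooth graphon would need a finer grid or a better quadrature rule.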

artificial intelligence, centrality measure, machine learning, (18 more...)

arXiv.org Machine Learning

1707.09350

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.49)

Stochastic Alternating Direction Method of Multipliers with Variance Reduction for Nonconvex Optimization

Huang, Feihu, Chen, Songcan, Lu, Zhaosong

arXiv.org Machine Learning | Jul-26-2017

In this paper, we study the stochastic alternating direction method of multipliers (ADMM) for nonconvex optimization, and propose three classes of nonconvex stochastic ADMM with variance reduction, based on different variance-reduced stochastic gradients. Specifically, the first class, called the nonconvex stochastic variance reduced gradient ADMM (SVRG-ADMM), uses a multi-stage scheme to progressively reduce the variance of stochastic gradients. The second is the nonconvex stochastic average gradient ADMM (SAG-ADMM), which additionally uses the old gradients estimated in the previous iteration. The third, called SAGA-ADMM, is an extension of the SAG-ADMM method. Moreover, under some mild conditions, we establish an iteration complexity bound of $O(1/\epsilon)$ for the proposed methods to obtain an $\epsilon$-stationary solution of the nonconvex optimization problem. In particular, we provide a general framework to analyze the iteration complexity of these nonconvex stochastic ADMM methods with variance reduction. Finally, numerical experiments demonstrate the effectiveness of our methods.
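The multi-stage variance-reduction idea can be sketched with plain SVRG on a one-dimensional least-squares problem. This is only the gradient estimator, under simplifying assumptions: there is no ADMM splitting, no constraint, and the objective is convex, so it is not the paper's SVRG-ADMM. The function name `svrg_least_squares`, the step size, and the toy data are hypothetical.

```python
import random


def svrg_least_squares(a, b, step=0.02, epochs=30, inner_steps=50, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a[i] * w - b[i])**2 over a scalar w with SVRG."""
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    for _ in range(epochs):
        snapshot = w
        # Full gradient at the snapshot, computed once per stage.
        full_grad = sum(a[i] * (a[i] * snapshot - b[i]) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            grad_w = a[i] * (a[i] * w - b[i])
            grad_snapshot = a[i] * (a[i] * snapshot - b[i])
            # Variance-reduced gradient: unbiased for the full gradient at w,
            # with variance that shrinks as w and the snapshot approach the minimizer.
            w -= step * (grad_w - grad_snapshot + full_grad)
    return w


# Toy data (illustrative only).
a = [1.0, 2.0, 3.0, 4.0]
b = [2.1, 3.9, 6.2, 7.8]
w_hat = svrg_least_squares(a, b)
# Closed-form minimizer for comparison.
w_star = sum(x * y for x, y in zip(a, b)) / sum(x * x for x in a)
```

Each stage pays for one full gradient and then reuses it to correct many cheap single-sample gradients, which is the trade-off the multi-stage scheme is built around.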

artificial intelligence, machine learning, sequence, (14 more...)

arXiv.org Machine Learning

1610.02758

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

Submodular Variational Inference for Network Reconstruction

Chen, Lin, Crawford, Forrest W, Karbasi, Amin

arXiv.org Machine Learning | Jul-10-2017

In real-world and online social networks, individuals receive and transmit information in real time. Cascading information transmissions (e.g. phone calls, text messages, social media posts) may be understood as a realization of a diffusion process operating on the network, and its branching path can be represented by a directed tree. The process only traverses and thus reveals a limited portion of the edges. The network reconstruction/inference problem is to infer the unrevealed connections. Most existing approaches derive a likelihood and attempt to find the network topology maximizing the likelihood, a problem that is highly intractable. In this paper, we focus on the network reconstruction problem for a broad class of real-world diffusion processes, exemplified by a network diffusion scheme called respondent-driven sampling (RDS). We prove that under realistic and general models of network diffusion, the posterior distribution of an observed RDS realization is a Bayesian log-submodular model. We then propose VINE (Variational Inference for Network rEconstruction), a novel, accurate, and computationally efficient variational inference algorithm, for the network reconstruction problem under this model. Crucially, we do not assume any particular probabilistic model for the underlying network. VINE recovers any connected graph with high accuracy, as shown by our experimental results on real-life networks.

artificial intelligence, diffusion process, machine learning, (19 more...)

arXiv.org Machine Learning

1603.08616

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)
(2 more...)

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning

Yang, Jiyan, Chow, Yin-Lam, Ré, Christopher, Mahoney, Michael W.

arXiv.org Machine Learning | Jul-10-2017

In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., $\ell_2$ and $\ell_1$ regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. We prove that pwSGD inherits faster convergence rates that only depend on the lower dimension of the linear system, while maintaining low computational complexity. In particular, when solving $\ell_1$ regression of size $n$ by $d$, pwSGD returns an approximate solution with $\epsilon$ relative error in the objective value in $\mathcal{O}(\log n \cdot \text{nnz}(A) + \text{poly}(d)/\epsilon^2)$ time. This complexity is uniformly better than that of RLA methods in terms of both $\epsilon$ and $d$ when the problem is unconstrained. For $\ell_2$ regression, pwSGD returns an approximate solution with $\epsilon$ relative error in the objective value and the solution vector measured in prediction norm in $\mathcal{O}(\log n \cdot \text{nnz}(A) + \text{poly}(d) \log(1/\epsilon) /\epsilon)$ time. We also provide lower bounds on the coreset complexity for more general regression problems, indicating that new ideas will still be needed to extend similar RLA preconditioning ideas to weighted SGD algorithms for more general regression problems. Finally, the effectiveness of such algorithms is illustrated numerically on both synthetic and real datasets.
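The "SGD with weighted sampling" ingredient can be sketched in isolation. The example below is randomized Kaczmarz, which can be viewed as SGD for $\ell_2$ regression with rows sampled in proportion to their squared norms. It is a swapped-in, simpler technique: it omits the RLA preconditioning step and the sampling distribution that pwSGD constructs, and it assumes an unconstrained, consistent system. The function name and the toy system are hypothetical.

```python
import random


def weighted_sgd_least_squares(rows, targets, steps=200, seed=0):
    """Row-norm-weighted SGD (randomized Kaczmarz) for a consistent system A x = b."""
    rng = random.Random(seed)
    dim = len(rows[0])
    sq_norms = [sum(v * v for v in row) for row in rows]
    indices = list(range(len(rows)))
    x = [0.0] * dim
    for _ in range(steps):
        # Importance sampling: pick row i with probability ||a_i||^2 / ||A||_F^2.
        i = rng.choices(indices, weights=sq_norms, k=1)[0]
        residual = targets[i] - sum(r * xj for r, xj in zip(rows[i], x))
        # Step size 1 / ||a_i||^2 projects x onto the hyperplane a_i . x = b_i.
        scale = residual / sq_norms[i]
        x = [xj + scale * r for xj, r in zip(x, rows[i])]
    return x


# Toy overdetermined but consistent system with solution (1, 2); illustrative only.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [1.0, 2.0, 3.0]
x_hat = weighted_sgd_least_squares(rows, targets)
```

Sampling heavier rows more often and scaling the step by the inverse row norm keeps the expected update well conditioned, which is the same motivation behind pairing importance sampling with preconditioning.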

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1502.03571

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Facebook's advice to students interested in artificial intelligence

#artificialintelligence | Jul-8-2017, 01:21:19 GMT

That's the gist of the advice to students interested in AI from Facebook's Yann LeCun and Joaquin Quiñonero Candela, who run the company's Artificial Intelligence Lab and Applied Machine Learning group, respectively. Tech companies often advocate STEM (science, technology, engineering and math), but today's tips are particularly pointed. The pair specifically note that students should eat their vegetables: take Calc I, Calc II, Calc III, Linear Algebra, Probability and Statistics as early as possible. From this list, probability and statistics are perhaps the most interesting. From what I remember about high school, those two subjects are regularly dismissed as too-obvious strategies for skirting the informal AP Calculus preference of top colleges and universities (AP Statistics is often thought of as a cop-out by students).

artificial intelligence, social media, student interested, (6 more...)

#artificialintelligence

Industry: Information Technology > Services (0.62)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.80)

Graph Learning from Data under Structural and Laplacian Constraints

Egilmez, Hilmi E., Pavez, Eduardo, Ortega, Antonio

arXiv.org Machine Learning | Jul-5-2017

Graphs are generic mathematical structures consisting of sets of vertices and edges, which are used for modeling pairwise relations (edges) between a number of objects (vertices). In practice, this representation is often extended to weighted graphs, for which a set of scalar values (weights) is assigned to edges and potentially to vertices. Thus, weighted graphs offer general and flexible representations for modeling affinity relations between the objects of interest. Many practical problems can be represented using weighted graphs. For example, a broad class of combinatorial problems such as weighted matching, shortest-path and network-flow [2] are defined using weighted graphs. In signal/data-oriented problems, weighted graphs provide concise (sparse) representations for robust modeling of signals/data [3]. Such graph-based models are also useful for analyzing and visualizing the relations between their samples/features. Moreover, weighted graphs naturally emerge in networked data applications, such as learning, signal processing and analysis on computer, social, sensor, energy, transportation and biological networks [4], where the signals/data are inherently related to a graph associated with the underlying network.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

1611.05181

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method

Lei, Lihua, Jordan, Michael I.

arXiv.org Machine Learning | Jul-2-2017

We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales, with the number of updates at the faster scale being governed by a geometric random variable. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size $n$; indeed, these costs are independent of $n$ when the target accuracy is low. An experimental evaluation on real datasets confirms the effectiveness of SCSG.

artificial intelligence, convex case, machine learning, (17 more...)

arXiv.org Machine Learning

1609.03261

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Virginia (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.71)

Poisson intensity estimation with reproducing kernels

Flaxman, Seth, Teh, Yee Whye, Sejdinovic, Dino

arXiv.org Machine Learning | Jun-26-2017

Despite the fundamental nature of the inhomogeneous Poisson process in the theory and application of stochastic processes, and its attractive generalizations (e.g. Cox process), few tractable nonparametric modeling approaches of intensity functions exist, especially when observed points lie in a high-dimensional space. In this paper we develop a new, computationally tractable Reproducing Kernel Hilbert Space (RKHS) formulation for the inhomogeneous Poisson process. We model the square root of the intensity as an RKHS function. Whereas RKHS models used in supervised learning rely on the so-called representer theorem, the form of the inhomogeneous Poisson process likelihood means that the representer theorem does not apply. However, we prove that the representer theorem does hold in an appropriately transformed RKHS, guaranteeing that the optimization of the penalized likelihood can be cast as a tractable finite-dimensional problem. The resulting approach is simple to implement, and readily scales to high dimensions and large-scale datasets.
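The modeling choice can be written out as follows. This is a sketch in assumed notation, since the abstract introduces no symbols: $x_1, \dots, x_N$ are the observed points in a domain $S$, $f$ is the RKHS function with $\lambda(x) = f(x)^2$, $\mathcal{H}$ is the RKHS, and $\gamma > 0$ is a regularization weight. The first line is the standard inhomogeneous Poisson process log-likelihood up to an additive constant; the second is one natural form of the penalized objective and may differ in detail from the paper's.

```latex
\begin{align}
  \log L(f) &= \sum_{i=1}^{N} \log f(x_i)^2 \;-\; \int_S f(x)^2 \, dx, \\
  \hat{f} &= \arg\min_{f \in \mathcal{H}}
    \left\{ -\sum_{i=1}^{N} \log f(x_i)^2 + \int_S f(x)^2 \, dx
            + \gamma \, \lVert f \rVert_{\mathcal{H}}^2 \right\}.
\end{align}
```

The integral term depends on $f$ over the whole domain rather than only at the observed points, which is why the usual representer theorem does not apply directly.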

artificial intelligence, machine learning, poisson intensity estimation, (13 more...)

arXiv.org Machine Learning

1610.08623

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Oceania > New Zealand (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
