AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

Three myths about data scientists and big data

@machinelearnbot | Nov-6-2016, 12:15:03 GMT

What I found useful during my PhD (this could apply to a master's program too) is that I immediately started working for a company on GIS, digital cartography, and water management. The work involved predicting extreme floods locally: how much the water could rise, at worst in 100 years, at any (x,y) coordinate on a digital map. That meant modeling how any drop of water falling somewhere runs down, goes underground, eventually reaches low elevation, and merges with other water drops on the way down. The digital maps had elevation and land use data available for each pixel; by land use I mean crop, forest, water, rock and so on, which matters for modeling how water moves. Very applied and interesting stuff. My first paper (after an article about flood predictions in a local specialized journal) was in the Journal of Number Theory, though I never attended classes on number theory. I then started to publish in computational statistics journals, but also in IEEE Pattern Analysis and Machine Intelligence and the Journal of the Royal Statistical Society, Series B. I'm currently finishing a book on data science (Wiley, exp. The takeaway is that it helps to become versatile: ideally the PhD or master's student does applied work for a real company, hired and paid as a real employee (through a partnership between university and private sector), from the beginning of the program.

artificial intelligence, big data, data scientist, (3 more...)

@machinelearnbot

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.80)

Fast Eigenspace Approximation using Random Signals

Paratte, Johan, Martin, Lionel

arXiv.org Machine Learning | Nov-4-2016

We focus in this work on the estimation of the first $k$ eigenvectors of any graph Laplacian using filtering of Gaussian random signals. We prove that we only need $k$ such signals to be able to exactly recover as many of the smallest eigenvectors, regardless of the number of nodes in the graph. In addition, we address key issues in implementing the theoretical concepts in practice using accurate approximated methods. We also propose fast algorithms both for eigenspace approximation and for the determination of the $k$th smallest eigenvalue $\lambda_k$. The latter proves to be extremely efficient under the assumption of locally uniform distribution of the eigenvalue over the spectrum. Finally, we present experiments which show the validity of our method in practice and compare it to state-of-the-art methods for clustering and visualization both on synthetic small-scale datasets and larger real-world problems of millions of nodes. We show that our method allows a better scaling with the number of nodes than all previous methods while achieving an almost perfect reconstruction of the eigenspace formed by the first $k$ eigenvectors.

approximation, artificial intelligence, data mining, (18 more...)

arXiv.org Machine Learning

1611.00938

Country: North America > United States > California (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

Estimating the Size of a Large Network and its Communities from a Random Sample

Chen, Lin, Karbasi, Amin, Crawford, Forrest W.

arXiv.org Machine Learning | Oct-26-2016

Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V;E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that correctly estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. We conclude with extensions and directions for future work.
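The observation model described above can be sketched in a few lines. This is only an illustration of what data is available to an estimator, not an implementation of PULSE. The names `sample_sbm` and `observe` are hypothetical, the SBM is simplified to two edge probabilities (`p_in` within a block, `p_out` across blocks), and W is assumed to be drawn uniformly at random.

```python
import random


def sample_sbm(sizes, p_in, p_out, seed=0):
    """Return (block, adj) for a simplified SBM with the given block sizes."""
    rng = random.Random(seed)
    # block[v] is the community label of vertex v.
    block = [k for k, size in enumerate(sizes) for _ in range(size)]
    n = len(block)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if block[u] == block[v] else p_out
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return block, adj


def observe(block, adj, sample_size, seed=1):
    """Sample W (uniformly, by assumption) and return the per-vertex observations."""
    rng = random.Random(seed)
    W = set(rng.sample(range(len(block)), sample_size))
    return {
        v: {
            "block": block[v],                          # block membership
            "total_degree": len(adj[v]),                # degree in the full graph G
            "neighbors_in_sample": sorted(adj[v] & W),  # edges of the induced subgraph G(W)
        }
        for v in W
    }


block, adj = sample_sbm([30, 20], p_in=0.3, p_out=0.05)
obs = observe(block, adj, sample_size=10)
for v, record in sorted(obs.items()):
    print(v, record["block"], record["total_degree"], len(record["neighbors_in_sample"]))
```

The gap between a sampled vertex's total degree and its degree inside G(W) is the kind of signal a size estimator can exploit, since it reflects how much of the network lies outside the sample.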

immunology, internal medicine, vertex, (20 more...)

arXiv.org Machine Learning

1610.08473

Country:

South America > Brazil (0.14)
North America > Mexico (0.14)
Asia > Middle East > Iran (0.14)
Asia > China (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.94)
Health & Medicine > Epidemiology (0.67)

Technology:

Information Technology > Communications > Social Media (0.94)
Information Technology > Communications > Networks (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

How Well Do Local Algorithms Solve Semidefinite Programs?

Fan, Zhou, Montanari, Andrea

arXiv.org Machine Learning | Oct-17-2016

Several probabilistic models from high-dimensional statistics and machine learning reveal an intriguing --and yet poorly understood-- dichotomy. Either simple local algorithms succeed in estimating the object of interest, or even sophisticated semi-definite programming (SDP) relaxations fail. In order to explore this phenomenon, we study a classical SDP relaxation of the minimum graph bisection problem, when applied to Erdős-Rényi random graphs with bounded average degree $d>1$, and obtain several types of results. First, we use a dual witness construction (using the so-called non-backtracking matrix of the graph) to upper bound the SDP value. Second, we prove that a simple local algorithm approximately solves the SDP to within a factor $2d^2/(2d^2+d-1)$ of the upper bound. In particular, the local algorithm is at most $8/9$ suboptimal, and $1+O(1/d)$ suboptimal for large degree. We then analyze a more sophisticated local algorithm, which aggregates information according to the harmonic measure on the limiting Galton-Watson (GW) tree. The resulting lower bound is expressed in terms of the conductance of the GW tree and matches surprisingly well the empirically determined SDP values on large-scale Erdős-Rényi graphs. We finally consider the planted partition model. In this case, purely local algorithms are known to fail, but they do succeed if a small amount of side information is available. Our results imply quantitative bounds on the threshold for partial recovery using SDP in this model.
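The $8/9$ figure follows directly from the stated factor $2d^2/(2d^2+d-1)$. The short derivation below is an added worked step based only on that formula; the helper function $g$ is introduced here for convenience and is not notation from the paper.

```latex
\begin{align*}
f(d) &= \frac{2d^2}{2d^2 + d - 1} = \frac{1}{1 + g(d)},
  \qquad g(d) = \frac{d-1}{2d^2}, \\
g'(d) &= \frac{2d^2 - 4d(d-1)}{4d^4} = \frac{2-d}{2d^3},
  \quad\text{so } g \text{ is maximized at } d = 2 \text{ with } g(2) = \tfrac{1}{8}, \\
\min_{d > 1} f(d) &= f(2) = \frac{1}{1 + \tfrac{1}{8}} = \frac{8}{9}, \\
f(d) &= 1 - \frac{1}{2d} + O\!\left(\frac{1}{d^2}\right) \quad\text{as } d \to \infty .
\end{align*}
```

The worst case is therefore attained at average degree $d = 2$, and the factor approaches 1 at rate $1/d$ for large degree.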

artificial intelligence, optimization problem, vertex, (18 more...)

arXiv.org Machine Learning

1610.0535

Country: North America > United States (0.46)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.45)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues

Shah, Nihar B., Balakrishnan, Sivaraman, Guntuboyina, Adityanand, Wainwright, Martin J.

arXiv.org Machine Learning | Sep-27-2016

There are various parametric models for analyzing pairwise comparison data, including the Bradley-Terry-Luce (BTL) and Thurstone models, but their reliance on strong parametric assumptions is limiting. In this work, we study a flexible model for pairwise comparisons, under which the probabilities of outcomes are required only to satisfy a natural form of stochastic transitivity. This class includes parametric models including the BTL and Thurstone models as special cases, but is considerably more general. We provide various examples of models in this broader stochastically transitive class for which classical parametric models provide poor fits. Despite this greater flexibility, we show that the matrix of probabilities can be estimated at the same rate as in standard parametric models. On the other hand, unlike in the BTL and Thurstone models, computing the minimax-optimal estimator in the stochastically transitive model is non-trivial, and we explore various computationally tractable alternatives. We show that a simple singular value thresholding algorithm is statistically consistent but does not achieve the minimax rate. We then propose and study algorithms that achieve the minimax rate over interesting sub-classes of the full stochastically transitive class. We complement our theoretical results with thorough numerical simulations.
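To make the relationship between the BTL model and stochastic transitivity concrete, here is a minimal sketch. It assumes the strong form of stochastic transitivity (whenever $P_{ij} \ge 1/2$ and $P_{jk} \ge 1/2$, then $P_{ik} \ge \max(P_{ij}, P_{jk})$); the abstract does not spell out which form is meant. The function names and the item strengths are hypothetical.

```python
from itertools import permutations


def btl_matrix(weights):
    """Return P where P[i][j] = w_i / (w_i + w_j), the BTL probability that i beats j."""
    n = len(weights)
    return [[weights[i] / (weights[i] + weights[j]) for j in range(n)] for i in range(n)]


def is_strongly_stochastically_transitive(P, tol=1e-12):
    """Check P[i][k] >= max(P[i][j], P[j][k]) whenever P[i][j] >= 1/2 and P[j][k] >= 1/2."""
    n = len(P)
    for i, j, k in permutations(range(n), 3):
        if P[i][j] >= 0.5 and P[j][k] >= 0.5:
            if P[i][k] < max(P[i][j], P[j][k]) - tol:
                return False
    return True


# A BTL matrix built from hypothetical item strengths.
btl = btl_matrix([4.0, 2.0, 1.0, 0.5])

# A cyclic "rock-paper-scissors" matrix: 0 beats 1, 1 beats 2, 2 beats 0.
cyclic = [[0.5, 0.8, 0.2],
          [0.2, 0.5, 0.8],
          [0.8, 0.2, 0.5]]

print(is_strongly_stochastically_transitive(btl))
print(is_strongly_stochastically_transitive(cyclic))
```

The BTL matrix passes the check and the cyclic matrix fails it. The check itself never refers to the item strengths, which is why the transitive class is broader than any single parametric family.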

artificial intelligence, estimator, survey article, (17 more...)

arXiv.org Machine Learning

1510.0561

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.45)

Exact and Inexact Subsampled Newton Methods for Optimization

Bollapragada, Raghu, Byrd, Richard, Nocedal, Jorge

arXiv.org Machine Learning | Sep-27-2016

The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian to yield a superlinear rate of convergence in expectation. The second part of the paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact). We provide a complexity analysis for this method based on the properties of the CG iteration and the quality of the Hessian approximation, and compare it with a method that employs a stochastic gradient iteration instead of the CG method. We report preliminary numerical results that illustrate the performance of inexact subsampled Newton methods on machine learning applications based on logistic regression.
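The second setting (exact gradient, subsampled Hessian, linear system solved approximately by CG) can be sketched as follows for regularized logistic regression. This is an illustrative sketch, not the authors' implementation: the sample size, the number of CG iterations, the regularization constant, and the backtracking line search are all assumptions added here, and every function name is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def loss(w, X, y, lam):
    # Regularized logistic loss with labels in {0, 1}.
    total = 0.0
    for x, t in zip(X, y):
        z = dot(w, x)
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - t * z
    return total / len(X) + 0.5 * lam * dot(w, w)


def gradient(w, X, y, lam):
    g = [lam * wi for wi in w]
    for x, t in zip(X, y):
        r = (sigmoid(dot(w, x)) - t) / len(X)
        for j, xj in enumerate(x):
            g[j] += r * xj
    return g


def hessian_vector(w, v, X_sub, lam):
    # Subsampled Hessian times v, without forming the matrix.
    out = [lam * vj for vj in v]
    for x in X_sub:
        p = sigmoid(dot(w, x))
        c = p * (1.0 - p) * dot(x, v) / len(X_sub)
        for j, xj in enumerate(x):
            out[j] += c * xj
    return out


def conjugate_gradient(hv, b, iters):
    # Approximately solve H d = b with a fixed iteration budget, starting from d = 0.
    d, r, p = [0.0] * len(b), list(b), list(b)
    rs = dot(r, r)
    for _ in range(iters):
        if rs < 1e-20:
            break
        Hp = hv(p)
        alpha = rs / dot(p, Hp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return d


def subsampled_newton_cg(X, y, lam=1e-2, steps=10, sample=50, cg_iters=5, seed=0):
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(w, X, y, lam)       # exact gradient over all data
        X_sub = rng.sample(X, sample)    # only the Hessian is subsampled
        d = conjugate_gradient(lambda v: hessian_vector(w, v, X_sub, lam),
                               [-gi for gi in g], cg_iters)
        # Backtracking (Armijo) line search: an assumption added for robustness.
        step, f0, slope = 1.0, loss(w, X, y, lam), dot(g, d)
        while step > 1e-8:
            trial = [wi + step * di for wi, di in zip(w, d)]
            if loss(trial, X, y, lam) <= f0 + 1e-4 * step * slope:
                w = trial
                break
            step *= 0.5
    return w


# Synthetic data from a hypothetical ground-truth weight vector.
data_rng = random.Random(1)
true_w = [1.5, -2.0, 0.5]
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1 if data_rng.random() < sigmoid(dot(true_w, x)) else 0 for x in X]

w_hat = subsampled_newton_cg(X, y)
print(loss([0.0, 0.0, 0.0], X, y, 1e-2), loss(w_hat, X, y, 1e-2))
```

Using Hessian-vector products keeps each CG iteration at a cost proportional to the subsample size, which is the practical appeal of pairing subsampling with an inexact solve.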

artificial intelligence, newton method, optimization problem, (14 more...)

arXiv.org Machine Learning

1609.08502

Country:

North America > United States > Massachusetts > Middlesex County (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

An Impossibility Result for Reconstruction in a Degree-Corrected Planted-Partition Model

Gulikers, Lennart, Lelarge, Marc, Massoulié, Laurent

arXiv.org Machine Learning | Sep-22-2016

We consider a Degree-Corrected Planted-Partition model: a random graph on $n$ nodes with two asymptotically equal-sized clusters. The model parameters are two constants $a,b > 0$ and an i.i.d. sequence of weights $(\phi_u)_{u=1}^n$, with finite second moment $\Phi^{(2)}$. Vertices $u$ and $v$ are joined by an edge with probability $\frac{\phi_u \phi_v}{n}a$ when they are in the same class and with probability $\frac{\phi_u \phi_v}{n}b$ otherwise. We prove that it is information-theoretically impossible to estimate the spins in a way positively correlated with the true community structure when $(a-b)^2 \Phi^{(2)} \leq 2(a+b)$. A by-product of our proof is a precise coupling-result for local-neighbourhoods in Degree-Corrected Planted-Partition models, which could be of independent interest.
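The model and the impossibility condition are simple enough to write down directly. The sketch below is illustrative only: the function names are hypothetical, the two clusters are fixed as the first and second half of the vertices (an assumption made for simplicity), and the weights are drawn from a uniform distribution chosen purely as an example.

```python
import random


def reconstruction_impossible(a, b, phi2):
    """True when (a - b)^2 * phi2 <= 2 * (a + b), the condition stated in the abstract."""
    return (a - b) ** 2 * phi2 <= 2 * (a + b)


def sample_graph(n, a, b, weights, seed=0):
    """Sample edges with probability phi_u * phi_v * a / n (same class) or ... * b / n (otherwise)."""
    rng = random.Random(seed)
    # Illustrative choice: the first half of the vertices has spin +1, the rest -1.
    spins = [1 if u < n // 2 else -1 for u in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            rate = a if spins[u] == spins[v] else b
            p = min(1.0, weights[u] * weights[v] * rate / n)  # clip so p is a valid probability
            if rng.random() < p:
                edges.append((u, v))
    return spins, edges


n = 200
weight_rng = random.Random(42)
weights = [weight_rng.uniform(0.5, 1.5) for _ in range(n)]  # hypothetical weight distribution
phi2 = sum(w * w for w in weights) / n                      # empirical second moment

spins, edges = sample_graph(n, a=5.0, b=1.0, weights=weights)
print(len(edges), phi2)
print(reconstruction_impossible(5.0, 1.0, phi2))
print(reconstruction_impossible(3.0, 2.0, phi2))
```

With unit second moment, the pair $a = 3$, $b = 2$ gives $(a-b)^2 = 1 \le 10$, so no estimator can correlate with the true classes, while $a = 5$, $b = 1$ gives $16 > 12$ and falls outside the impossibility regime.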

artificial intelligence, probability, vertex, (15 more...)

arXiv.org Machine Learning

1511.00546

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)

Neural nets - learning with total gradient rather than stochastic gradients? • /r/MachineLearning

#artificialintelligence | Sep-12-2016, 20:15:51 GMT

The estimate of the gradient from just a mini-batch is usually good enough to point you in the right descent direction. It doesn't make sense to do the extra computation for a marginally better estimate. Plus, the inaccuracy or noise introduced by the mini-batch approximation can act as a regularizer. Here is an interesting paper that performs statistical tests during optimization: if the gradient is not statistically significant, more samples are added to the mini-batch.
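A small experiment illustrates the first point. The sketch below uses a hypothetical one-parameter least-squares problem (not a neural net) and compares the full-data gradient with gradients computed on mini-batches of 50 examples; the data model and batch size are assumptions chosen only for illustration.

```python
import random


def gradient(w, batch):
    """Gradient of the mean of 0.5 * (w * x - y)^2 over a batch, for a scalar weight w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


rng = random.Random(0)
data = []
for _ in range(1000):
    x = rng.uniform(-1, 1)
    data.append((x, 3.0 * x + rng.gauss(0, 0.1)))  # hypothetical model: y = 3x + noise

w = 0.0
full = gradient(w, data)

# Split the data into 20 equal mini-batches and compute one gradient per batch.
batches = [data[i:i + 50] for i in range(0, len(data), 50)]
mini = [gradient(w, b) for b in batches]

# Count how many mini-batch gradients share the sign of the full gradient.
agree = sum(1 for g in mini if g * full > 0)
print(full, agree, len(mini))
```

Each mini-batch gradient costs one twentieth of the full computation, and because the batches are equal-sized and partition the data, their average equals the full gradient. The spread of the individual values around that average is the noise the excerpt refers to.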

artificial intelligence, gradient, neural network, (3 more...)

#artificialintelligence

Industry: Energy > Oil & Gas (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Spark Technology Center

#artificialintelligence | Sep-11-2016, 06:15:29 GMT

The Best Paper award for this year's International Conference on Very Large Data Bases (VLDB) goes to "Compressed Linear Algebra for Large-Scale Machine Learning", authored by a PhD candidate at the University of Maryland and four senior researchers from IBM. Their method for compressing matrices for linear algebra operations promises to provide users significant increases in speed with less memory. In particular, the compression technology provides benefits at two different parts of the data science process. Before training a model, a data scientist typically goes through multiple iterations of feature engineering. Common feature engineering tasks include examining the data with descriptive statistics and transforming the values in columns to better suit the assumptions built into different types of machine learning models.

artificial intelligence, machine learning, spark technology center, (8 more...)

#artificialintelligence

Country: North America > United States > Maryland (0.26)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)

A possible implementation for an Intelligent Agent using Graph theories to crawl Reddit. (RedditSharp QuickGraph MongoDB)

#artificialintelligence | Sep-5-2016, 18:55:37 GMT

I cannot think for more than 2 hours without wondering how to introduce AI techniques into whatever I'm thinking about. The last time it happened was super interesting, so stay with me to see how I used graph theory to crawl reddit and build a knowledge base about Magic the Gathering card relations. Long story short, I was browsing magiccardmarket.eu to check which cards to buy when I found a guy selling a 9 card for 6. The card spiked over the weekend and I jumped on reddit to find out why. Is there a new deck using it?

artificial intelligence, intelligent agent, knowledge base, (11 more...)

#artificialintelligence

Industry: Media > News (0.96)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.79)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.63)
