 Mathematical & Statistical Methods


A simple efficient density estimator that enables fast systematic search

arXiv.org Machine Learning

This paper introduces a simple and efficient density estimator that enables fast systematic search. To show its advantage over the commonly used kernel density estimator, we apply it to outlying aspects mining. Outlying aspects mining discovers feature subsets (or subspaces) that describe how a query stands out from a given dataset. The task demands a systematic search of subspaces. We identify that existing outlying aspects miners are restricted to datasets with small data size and few dimensions because they employ a kernel density estimator, which is computationally expensive, for subspace assessments. We show that a recent outlying aspects miner can run orders of magnitude faster by simply replacing its density estimator with the proposed density estimator, enabling it to deal with large datasets with thousands of dimensions that would otherwise be infeasible to process.


A Simple Visual Proof of a Powerful Idea in Graph Theory - Facts So Romantic

Nautilus

A recent advance in geometry makes heavy use of Ramsey's theorem, an important idea in another field--graph theory. Ramsey's theorem states that in any graph where all points are connected by either red lines or blue lines, you're guaranteed to have a large subset of the graph that is completely uniform--that is, either all red or all blue. Equivalently, you can go the other way: Pick how big you want your uniform subset to be. Ramsey's theorem states that somewhere out there there's a graph in which a subset of that size must arise. It's not obvious why this is true.
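The smallest interesting case of this statement can be checked by brute force. The sketch below is an illustration and is not taken from the article: it assumes a uniform subset of size 3 (a triangle whose three lines share one color) and tries every red/blue coloring of the complete graphs on 5 and 6 points. The function names are hypothetical.

```python
from itertools import combinations, product


def has_mono_triangle(n, coloring):
    """Return True if some triangle has all three edges in one color.

    `coloring` maps each edge (i, j) with i < j to 0 (red) or 1 (blue).
    """
    for a, b, c in combinations(range(n), 3):
        # combinations yields a < b < c, so every key below has i < j.
        if coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]:
            return True
    return False


def every_coloring_has_mono_triangle(n):
    """Exhaustively test all 2-colorings of the complete graph on n points."""
    edges = list(combinations(range(n), 2))
    for colors in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, colors))):
            return False  # found a coloring with no uniform triangle
    return True


if __name__ == "__main__":
    print(every_coloring_has_mono_triangle(5))
    print(every_coloring_has_mono_triangle(6))
```

With 6 points there are only 2^15 colorings, so the exhaustive search finishes quickly; the same approach becomes impractical very fast as the graph or the target subset grows.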


Network Essence: PageRank Completion and Centrality-Conforming Markov Chains

arXiv.org Machine Learning

Jiří Matoušek (1963-2015) made many breakthrough contributions in mathematics and algorithm design. His milestone results are not only profound but also elegant. By going beyond the original objects (such as Euclidean spaces or linear programs), Jirka found the essence of challenging mathematical and algorithmic problems as well as beautiful solutions that were natural to him but were surprising discoveries to the field. In this short exploration article, I will first share with readers my initial encounter with Jirka and discuss one of his fundamental geometric results from the early 1990s. In the age of social and information networks, I will then turn the discussion from geometric structures to network structures, attempting to take a humble step towards the holy grail of network science, that is, to understand the network essence that underlies the observed sparse-and-multifaceted network data. I will discuss a simple result which summarizes some basic algebraic properties of personalized PageRank matrices. Unlike the traditional transitive closure of binary relations, the personalized PageRank matrices take the "accumulated Markovian closure" of network data. Some of these algebraic properties are known in various contexts. But I hope featuring them together in a broader context will help to illustrate the desirable properties of this Markovian completion of networks, and motivate systematic developments of a network theory for understanding vast and ubiquitous multifaceted network data.
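One way to picture a personalized PageRank matrix is the standard restart formulation, M = alpha * sum over k of (1 - alpha)^k P^k, where P is the random-walk transition matrix of the network. The sketch below assumes that formulation (the abstract does not give the article's exact definition), a restart probability `alpha` of 0.15, and a graph in which every node has at least one outgoing edge; all names are hypothetical.

```python
def transition_matrix(adj):
    """Row-normalize a weighted adjacency matrix (list of lists).

    Assumes every row has a positive sum, i.e. no dangling nodes.
    """
    return [[w / sum(row) for w in row] for row in adj]


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def personalized_pagerank_matrix(adj, alpha=0.15, terms=200):
    """Approximate M = alpha * sum_k (1 - alpha)^k P^k with a truncated sum.

    Row u of M is the personalized PageRank vector for restart node u.
    """
    P = transition_matrix(adj)
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weight = alpha
    for _ in range(terms):
        for i in range(n):
            for j in range(n):
                M[i][j] += weight * power[i][j]
        power = matmul(power, P)   # advance from P^k to P^(k+1)
        weight *= 1.0 - alpha      # geometric damping of longer walks
    return M


if __name__ == "__main__":
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # 3-node path graph
    for row in personalized_pagerank_matrix(path):
        print([round(x, 4) for x in row], round(sum(row), 6))
```

Under these assumptions each row of M is a probability distribution, and M satisfies the fixed-point relation M = alpha * I + (1 - alpha) * P * M up to truncation error, which is one of the basic algebraic properties this kind of matrix is known for.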


How to become a Data Scientist – freeCodeCamp

#artificialintelligence

The main topics concerning mathematics that you should familiarize yourself with if you want to go into data science are probability, statistics, and linear algebra. As you learn more about other topics such as statistical learning (machine learning), these core mathematical foundations will serve as a base for you to continue learning from. Let's briefly describe each and give you a few resources to learn from! Probability is the measure of the likelihood that an event will occur. A lot of data science is based on attempting to measure the likelihood of events, everything from the odds of an advertisement getting clicked on to the probability of failure for a part on an assembly line. For this classic topic I recommend going with a book, such as A First Course in Probability by Sheldon Ross or Probability Theory by E.T. Jaynes.



Curious Mathematical Object: Hyperlogarithms

@machinelearnbot

Logarithms turn a product of numbers into a sum of numbers: log(xy) = log(x) + log(y). Hyperlogarithms generalize the concept as follows: Hlog(XY) = Hlog(X) + Hlog(Y), where X and Y are any kind of objects, and the product and sum are replaced by operators in some arbitrary space. Here we focus exclusively on operations on sets: XY becomes the intersection of the sets X and Y, and X + Y the union of X and Y. The question is: which functions satisfy Hlog(XY) = Hlog(X) + Hlog(Y)? We assume here that the argument for Hlog is a set X, and the returned value Hlog(X) = Y is another set Y from the same set of sets. Let E = {X, Y, ...} be the set of all potential arguments for Hlog.
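Reading the identity on sets as Hlog(X ∩ Y) = Hlog(X) ∪ Hlog(Y), one function that satisfies it is the set complement, by De Morgan's law. The sketch below is an illustration and is not taken from the article: it assumes a small finite universe, takes E to be all of its subsets, and checks the identity exhaustively. The names `UNIVERSE`, `hlog`, and `satisfies_hlog_identity` are hypothetical.

```python
from itertools import combinations

UNIVERSE = frozenset(range(4))  # assumed small universe for the check


def hlog(x):
    """Candidate hyperlogarithm: the complement of x within UNIVERSE."""
    return UNIVERSE - x


def all_subsets(universe):
    """Enumerate every subset of the universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfies_hlog_identity(f, subsets):
    """Check f(X & Y) == f(X) | f(Y) for every pair of subsets."""
    return all(f(x & y) == f(x) | f(y) for x in subsets for y in subsets)


if __name__ == "__main__":
    E = all_subsets(UNIVERSE)
    print(satisfies_hlog_identity(hlog, E))          # complement
    print(satisfies_hlog_identity(lambda x: x, E))   # identity map
```

The identity map fails the check because X ∩ Y and X ∪ Y differ whenever X and Y differ, which shows the condition is a genuine restriction on the function.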


Hypotheses testing on infinite random graphs

arXiv.org Machine Learning

Drawing on some recent results that provide the formalism necessary to define stationarity for infinite random graphs, this paper initiates the study of statistical and learning questions pertaining to these objects. Specifically, a criterion for the existence of a consistent test for complex hypotheses is presented, generalizing the corresponding results on time series. As an application, it is shown how one can test that a tree has the Markov property or, more generally, estimate its memory.


Four Weird Mathematical Objects

@machinelearnbot

Here I discuss four interesting mathematical problems (mostly involving famous unsolved conjectures) that even high school kids can understand. The field itself has been a source of constant innovation, especially in developing distributed architectures, HPC (high performance computing), and quantum computing to try to solve (to no avail so far) these very difficult yet basic problems. And for those interested in mathematical logic and measure theory, here is an interesting paradox, which somehow allows you to duplicate a ball made out of gold into two balls, each having the same size as the original ball (though it does not double the mass). The Banach–Tarski paradox is a theorem which states the following: Given a solid ball in 3‑dimensional space, there exists a decomposition of the ball into a finite number of disjoint subsets, which can then be put back together in a different way to yield two identical copies of the original ball.


Number crunchers in demand as data, AI startups see potential - Times of India

#artificialintelligence

CHENNAI: With a PhD in mathematics, Bharat Ramakrishna was preparing content for school children when suddenly he found a well-paying job in the machine learning & data sciences space. No longer is a mathematics background purely academic. Maths majors are now in demand for a job in artificial intelligence and data sciences. "After graduating from the University of Utah, I was into preparing question banks for students. Now, concepts such as matrices, linear algebra and calculus are being used in artificial intelligence and it is easier for a mathematics graduate to learn coding than vice versa," said Ramakrishna, data scientist at Skillenza.


Local Asymptotics for Stochastic Optimization: Optimality, Constraint Identification, and Dual Averaging

arXiv.org Machine Learning

We study local complexity measures for stochastic convex optimization problems, providing a local minimax theory analogous to that of Hájek and Le Cam for classical statistical problems, and giving efficient procedures based on Nesterov's dual averaging that (often) adaptively achieve optimal convergence guarantees. Our results provide function-specific lower bounds and convergence results that make precise a correspondence between statistical difficulty and the geometric notion of tilt-stability from optimization. We show how variants of dual averaging (a stochastic gradient-based procedure) guarantee finite time identification of constraints in optimization problems, while stochastic gradient procedures provably fail. Additionally, we highlight a gap between optimization problems with linear and nonlinear constraints: standard stochastic-gradient-based procedures are suboptimal even for the simplest nonlinear constraints.