AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

A simple efficient density estimator that enables fast systematic search

arXiv.org Machine Learning | Sep-12-2017

This paper introduces a simple and efficient density estimator that enables fast systematic search. To show its advantage over the commonly used kernel density estimator, we apply it to outlying aspects mining. Outlying aspects mining discovers feature subsets (or subspaces) that describe how a query stands out from a given dataset. The task demands a systematic search of subspaces. We identify that existing outlying aspects miners are restricted to datasets with small data sizes and few dimensions because they employ a kernel density estimator, which is computationally expensive, for subspace assessments. We show that a recent outlying aspects miner can run orders of magnitude faster by simply replacing its density estimator with the proposed density estimator, enabling it to deal with large datasets with thousands of dimensions that would otherwise be impossible to handle.
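To make the cost argument concrete, here is a minimal sketch of a naive Gaussian product-kernel density estimate restricted to a feature subset. Every call visits every data point, so assessing many subspaces multiplies that cost. The function name `kde_density`, the fixed bandwidth, and the toy data are assumptions for illustration only; the abstract does not describe the proposed estimator, so it is not shown here.

```python
import math
import random

def kde_density(query, data, subspace, bandwidth=0.5):
    """Naive Gaussian product-kernel density of `query` within a feature subset.

    Each call costs O(n * |subspace|) because every data point is visited.
    """
    d = len(subspace)
    norm = (bandwidth * math.sqrt(2 * math.pi)) ** d
    total = 0.0
    for point in data:
        sq_dist = sum((query[j] - point[j]) ** 2 for j in subspace)
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / (len(data) * norm)

# Hypothetical toy data: 500 points with three standard-normal features.
random.seed(0)
data = [[random.gauss(0, 1) for _ in range(3)] for _ in range(500)]
query = [0.0, 4.0, 0.0]  # ordinary in features 0 and 2, unusual in feature 1

dens_typical = kde_density(query, data, subspace=(0,))
dens_outlying = kde_density(query, data, subspace=(1,))
```

In this toy setup the query has a much lower density in the subspace containing feature 1 than in the one containing feature 0, which is the kind of signal an outlying aspects miner looks for while searching over subspaces.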

artificial intelligence, data mining, subspace, (17 more...)

arXiv.org Machine Learning

1707.00783

Country:

Oceania > Australia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

A Simple Visual Proof of a Powerful Idea in Graph Theory - Facts So Romantic

Nautilus | Sep-8-2017, 19:10:08 GMT

A recent advance in geometry makes heavy use of Ramsey's theorem, an important idea in another field: graph theory. Ramsey's theorem states that in any graph where all points are connected by either red lines or blue lines, you're guaranteed to have a large subset of the graph that is completely uniform, that is, either all red or all blue. Equivalently, you can go the other way: Pick how big you want your uniform subset to be. Ramsey's theorem states that somewhere out there there's a graph in which a subset of that size must arise. It's not obvious why this is true.
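The smallest nontrivial case of this statement can be checked by brute force. The sketch below enumerates every red/blue coloring of the lines between 5 points and between 6 points, and tests whether a uniform triangle (a uniform subset of size 3) is forced. The function names are illustrative and not taken from the article.

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each edge (i, j) with i < j to 0 (red) or 1 (blue)."""
    return any(
        coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_coloring_has_mono_triangle(n):
    """Exhaustively check all 2-colorings of the complete graph on n points."""
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, colors)))
        for colors in product((0, 1), repeat=len(edges))
    )

forced_on_5 = every_coloring_has_mono_triangle(5)  # 2**10 colorings
forced_on_6 = every_coloring_has_mono_triangle(6)  # 2**15 colorings
```

With 5 points some coloring avoids a uniform triangle, while with 6 points none does, which is the classical fact that the Ramsey number R(3,3) equals 6. Exhaustive search is only feasible for such tiny cases, since the number of colorings doubles with every added line.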

artificial intelligence, graph, subset, (12 more...)

Nautilus

Country: North America > Canada > British Columbia (0.06)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.62)

Network Essence: PageRank Completion and Centrality-Conforming Markov Chains

Teng, Shang-Hua

arXiv.org Machine Learning | Aug-25-2017

Jiří Matoušek (1963-2015) made many breakthrough contributions to mathematics and algorithm design. His milestone results are not only profound but also elegant. By going beyond the original objects, such as Euclidean spaces or linear programs, Jirka found the essence of challenging mathematical and algorithmic problems, as well as beautiful solutions that were natural to him but surprising discoveries to the field. In this short exploration article, I will first share with readers my initial encounter with Jirka and discuss one of his fundamental geometric results from the early 1990s. In the age of social and information networks, I will then turn the discussion from geometric structures to network structures, attempting to take a humble step towards the holy grail of network science: understanding the network essence that underlies the observed sparse-and-multifaceted network data. I will discuss a simple result that summarizes some basic algebraic properties of personalized PageRank matrices. Unlike the traditional transitive closure of binary relations, the personalized PageRank matrices take the "accumulated Markovian closure" of network data. Some of these algebraic properties are known in various contexts, but I hope featuring them together in a broader context will help to illustrate the desirable properties of this Markovian completion of networks and motivate the systematic development of a network theory for understanding vast and ubiquitous multifaceted network data.
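As a rough illustration of a personalized PageRank matrix, the sketch below uses one common formulation: row u is the fixed point of p = alpha * e_u + (1 - alpha) * p P, where P is the row-normalized adjacency matrix and alpha is a restart probability. The abstract does not give the paper's exact definitions, so the formulation, the value of `alpha`, the function names, and the 4-node graph are all assumptions.

```python
def transition_matrix(adj):
    """Row-normalize a nonnegative adjacency matrix (every node needs an out-edge)."""
    return [[w / sum(row) for w in row] for row in adj]

def personalized_pagerank(P, source, alpha=0.15, iters=200):
    """Iterate p = alpha * e_source + (1 - alpha) * p P for a fixed number of steps."""
    n = len(P)
    p = [1.0 if i == source else 0.0 for i in range(n)]
    for _ in range(iters):
        # One random-walk step: distribution p moved along the rows of P.
        walked = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
        p = [(alpha if j == source else 0.0) + (1 - alpha) * walked[j]
             for j in range(n)]
    return p

# Hypothetical undirected graph on four nodes.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
P = transition_matrix(adj)
ppr_matrix = [personalized_pagerank(P, u) for u in range(len(P))]
row_sums = [sum(row) for row in ppr_matrix]
```

Under this formulation each row is a probability distribution over nodes, so the matrix is row-stochastic, and the fixed point equals the series of alpha * (1 - alpha)^k * P^k summed over k, which accumulates walks of every length rather than recording mere reachability.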

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Machine Learning

1708.07906

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.92)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Game Theory (1.00)
Information Technology > Data Science > Data Mining (1.00)
(4 more...)

How to become a Data Scientist – freeCodeCamp

#artificialintelligence | Aug-22-2017, 13:57:29 GMT

The main mathematical topics that you should familiarize yourself with if you want to go into data science are probability, statistics, and linear algebra. As you learn more about other topics such as statistical learning (machine learning), these core mathematical foundations will serve as a base for you to continue learning from. Let's briefly describe each and give you a few resources to learn from! Probability is the measure of the likelihood that an event will occur. A lot of data science is based on attempting to measure the likelihood of events, everything from the odds of an advertisement getting clicked on to the probability of failure for a part on an assembly line. For this classic topic I recommend going with a book, such as A First Course in Probability by Sheldon Ross or Probability Theory by E.T. Jaynes.

artificial intelligence, machine learning, social media, (12 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.37)

Industry: Education > Educational Setting > Continuing Education (0.73)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Communications > Social Media (0.57)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.52)

[1312.6120] Exact solutions to the nonlinear dynamics of learning in deep linear neural networks

@machinelearnbot | Aug-17-2017, 10:45:10 GMT

artificial intelligence, machine learning, nonlinear dynamic, (5 more...)

@machinelearnbot

Genre: Research Report (0.80)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Curious Mathematical Object: Hyperlogarithms

@machinelearnbot | Aug-16-2017, 19:07:22 GMT

Logarithms turn a product of numbers into a sum of numbers: log(xy) = log(x) + log(y). Hyperlogarithms generalize the concept as follows: Hlog(XY) = Hlog(X) + Hlog(Y), where X and Y are any kind of objects, and the product and sum are replaced by operators in some arbitrary space. Here we focus exclusively on operations on sets: XY becomes the intersection of the sets X and Y, and X + Y the union of X and Y. The question is: which functions satisfy Hlog(XY) = Hlog(X) + Hlog(Y)? We assume here that the argument for Hlog is a set X, and the returned value Hlog(X) = Y is another set Y from the same set of sets. Let E = {X, Y, ...} be the set of all potential arguments for Hlog.
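Reading the product as set intersection and the sum as set union, as the passage describes, one function that satisfies the identity is the set complement, by De Morgan's law. The sketch below checks this exhaustively on a small universe. The choice of E as the power set of a four-element universe and the name `hlog` are assumptions for illustration; this candidate is not taken from the article.

```python
from itertools import chain, combinations

UNIVERSE = frozenset({1, 2, 3, 4})

def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def hlog(x):
    """Candidate hyperlogarithm: the complement of x within UNIVERSE."""
    return UNIVERSE - x

# E is the set of all potential arguments; hlog maps E into E.
E = powerset(UNIVERSE)

# Check Hlog(X intersect Y) == Hlog(X) union Hlog(Y) for every pair in E.
identity_holds = all(hlog(x & y) == hlog(x) | hlog(y) for x in E for y in E)
```

The check passes for all 256 pairs, so the complement behaves like a logarithm for these two set operations. A constant function that returns the same set for every argument also satisfies the identity, so the solution is not unique.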

artificial intelligence, hlog, social media, (7 more...)

@machinelearnbot

Technology:

Information Technology > Communications > Social Media (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Hypotheses testing on infinite random graphs

Ryabko, Daniil

arXiv.org Machine Learning | Aug-10-2017

Drawing on some recent results that provide the formalism necessary to define stationarity for infinite random graphs, this paper initiates the study of statistical and learning questions pertaining to these objects. Specifically, a criterion for the existence of a consistent test for complex hypotheses is presented, generalizing the corresponding results on time series. As an application, it is shown how one can test that a tree has the Markov property or, more generally, estimate its memory.

artificial intelligence, graph, machine learning, (17 more...)

arXiv.org Machine Learning

1708.03131

Country:

Africa > Sudan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.42)

Four Weird Mathematical Objects

@machinelearnbot | Aug-7-2017, 23:58:34 GMT

Here I discuss four interesting mathematical problems (mostly involving famous unsolved conjectures) that even high school kids can understand. The field itself has been a source of constant innovation, especially in developing distributed architectures, as well as HPC (high performance computing) and quantum computing, to try to solve (to no avail so far) these very difficult yet basic problems. And for those interested in mathematical logic and measure theory, here is an interesting paradox, which somehow allows you to duplicate a ball made out of gold into two balls, each having the same size as the original ball (though it does not double the mass). The Banach–Tarski paradox is a theorem which states the following: Given a solid ball in 3-dimensional space, there exists a decomposition of the ball into a finite number of disjoint subsets, which can then be put back together in a different way to yield two identical copies of the original ball.

artificial intelligence, computing

@machinelearnbot

Industry: Education (0.60)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Number crunchers in demand as data, AI startups see potential - Times of India

#artificialintelligence | Aug-3-2017, 20:40:24 GMT

CHENNAI: With a PhD in mathematics, Bharat Ramakrishna was preparing content for school children when suddenly he found a well-paying job in the machine learning & data sciences space. No longer is a mathematics background purely academic. Maths majors are now in demand for a job in artificial intelligence and data sciences. "After graduating from the University of Utah, I was into preparing question banks for students. Now, concepts such as matrices, linear algebra and calculus are being used in artificial intelligence and it is easier for a mathematics graduate to learn coding than vice versa," said Ramakrishna, data scientist at Skillenza.

artificial intelligence, machine learning, student, (11 more...)

#artificialintelligence

Country:

Asia > India > Tamil Nadu > Chennai (0.64)
North America > United States > Utah (0.26)
Asia > India > West Bengal > Kharagpur (0.06)

Industry: Education > Curriculum > Subject-Specific Education (0.74)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.38)

Local Asymptotics for Stochastic Optimization: Optimality, Constraint Identification, and Dual Averaging

Duchi, John, Ruan, Feng

arXiv.org Machine Learning | Aug-2-2017

We study local complexity measures for stochastic convex optimization problems, providing a local minimax theory analogous to that of Hájek and Le Cam for classical statistical problems, and giving efficient procedures based on Nesterov's dual averaging that (often) adaptively achieve optimal convergence guarantees. Our results provide function-specific lower bounds and convergence results that make precise a correspondence between statistical difficulty and the geometric notion of tilt-stability from optimization. We show how variants of dual averaging (a stochastic gradient-based procedure) guarantee finite-time identification of constraints in optimization problems, while stochastic gradient procedures provably fail. Additionally, we highlight a gap between optimization problems with linear and nonlinear constraints: standard stochastic-gradient-based procedures are suboptimal even for the simplest nonlinear constraints.
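To give a feel for constraint identification, here is a minimal sketch of plain Euclidean dual averaging on a one-dimensional toy problem: minimize E[0.5 * (x - z)^2] with z drawn from N(1.5, 1), subject to x in [0, 1], whose solution sits on the boundary x = 1. This is not the authors' specific variant; the step-size schedule c / sqrt(k), the toy objective, and all names are assumptions for illustration.

```python
import math
import random

def dual_averaging_box(grad_sample, steps, lo=0.0, hi=1.0, c=1.0, seed=0):
    """Euclidean dual averaging on the interval [lo, hi].

    Each iterate minimizes <sum of past gradients, x> + x**2 / (2 * step)
    over the interval, which is the clipped value of -step * gradient_sum.
    """
    rng = random.Random(seed)
    x = lo
    grad_sum = 0.0
    iterates = []
    for k in range(1, steps + 1):
        grad_sum += grad_sample(x, rng)
        step = c / math.sqrt(k)
        x = min(hi, max(lo, -step * grad_sum))
        iterates.append(x)
    return iterates

def noisy_grad(x, rng):
    """Stochastic gradient of 0.5 * (x - z)**2 with z ~ N(1.5, 1)."""
    return x - rng.gauss(1.5, 1.0)

iterates = dual_averaging_box(noisy_grad, steps=2000)
tail_on_boundary = all(x == 1.0 for x in iterates[-500:])
```

Because the iterate depends on the accumulated gradient sum rather than on the latest noisy gradient alone, the unclipped value drifts far past the boundary, and the iterates in this toy run end up exactly at x = 1 instead of fluctuating around it.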

artificial intelligence, constraint, machine learning, (17 more...)

arXiv.org Machine Learning

1612.05612

Country:

North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)
