AITopics

In multi-step learning, where a final learning task is accomplished via a sequence of intermediate learning tasks, the intuition is that successive steps or levels transform the initial data into representations more and more "suited" to the final learning task. A related principle arises in transfer learning, where Baxter (2000) proposed a theoretical framework to study how learning multiple tasks transforms the inductive bias of a learner. The most widespread multi-step learning approach is semi-supervised learning (SSL) with two steps: unsupervised, then supervised. Several authors (Castelli-Cover, 1996; Balcan-Blum, 2005; Niyogi, 2008; Ben-David et al., 2008; Urner et al., 2011) have analyzed SSL, with Balcan-Blum (2005) proposing a version of the PAC learning framework augmented by a "compatibility function" to link concept class and unlabeled data distribution. We propose to analyze SSL and other multi-step learning approaches, much in the spirit of Baxter's framework, by defining a learning problem generatively as a joint statistical model on $X \times Y$. This determines in a natural way the class of conditional distributions that are possible with each marginal, and amounts to an abstract form of compatibility function. It also allows us to analyze both discrete and non-discrete settings. As a tool for our analysis, we define a notion of $\gamma$-uniform shattering for statistical models. We use this to give conditions on the marginal and conditional models which imply an advantage for multi-step learning approaches. In particular, we recover a more general version of a result of Poggio et al. (2012): under mild hypotheses, a multi-step approach which learns features invariant under successive factors of a finite group of invariances has sample complexity requirements that are additive rather than multiplicative in the size of the subgroups.

artificial intelligence, learning, machine learning

Country: North America > United States (0.28)

Industry: Education (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

David, Ofir, Moran, Shay, Yehudayoff, Amir

Supervised learning through the lens of compression

This work continues the study of the relationship between sample compression schemes and statistical learning, which has been mostly investigated within the framework of binary classification. We first extend the investigation to multiclass categorization: we prove that in this case learnability is equivalent to compression of logarithmic sample size and that the uniform convergence property implies compression of constant size. We use the compressibility-learnability equivalence (i) to show that, for multiclass categorization, PAC and agnostic PAC learnability are equivalent, and (ii) to derive a compactness theorem for learnability. We then consider supervised learning under general loss functions: we show that in this case, in order to maintain the compressibility-learnability equivalence, it is necessary to consider an approximate variant of compression. We use it to show that PAC and agnostic PAC are not equivalent, even when the loss function has only three values.
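As a concrete picture of what a sample compression scheme is (a textbook-style illustration, not taken from the paper), consider threshold classifiers on the real line, $h_t(x) = 1$ iff $x \ge t$. The sketch below assumes the sample is realizable by some threshold: the compressor keeps only the smallest positively labeled point, and the reconstructor rebuilds from that single point a hypothesis that is consistent with the whole sample. The names `compress` and `reconstruct` and the sample values are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Sample = List[Tuple[float, int]]  # (point, label) pairs with labels in {0, 1}


def compress(sample: Sample) -> Optional[float]:
    """Keep at most one example: the smallest positively labeled point."""
    positives = [x for x, y in sample if y == 1]
    return min(positives) if positives else None


def reconstruct(kept: Optional[float]) -> Callable[[float], int]:
    """Rebuild a threshold hypothesis from the compressed sample."""
    if kept is None:
        # No positive example was seen, so predict 0 everywhere.
        return lambda x: 0
    return lambda x: 1 if x >= kept else 0


# Hypothetical sample labeled by the threshold t = 2.5.
sample = [(0.3, 0), (1.9, 0), (2.7, 1), (3.1, 1), (4.0, 1)]
h = reconstruct(compress(sample))
consistent = all(h(x) == y for x, y in sample)
```

Here the compressed set has size one regardless of how many examples the sample contains, which is the kind of size bound that the compression results in the abstract quantify.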

artificial intelligence, compression scheme, machine learning

Country:

North America > United States (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.51)

Vuffray, Marc, Misra, Sidhant, Lokhov, Andrey, Chertkov, Michael

Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models

We consider the problem of learning the underlying graph of an unknown Ising model on p spins from a collection of i.i.d. samples generated from the model. We suggest a new estimator that is computationally efficient and requires a number of samples that is near-optimal with respect to the previously established information-theoretic lower bound. Our statistical estimator has a physical interpretation in terms of "interaction screening". The estimator is consistent and is efficiently implemented using convex optimization. We prove that, with appropriate regularization, the estimator recovers the underlying graph using a number of samples that is logarithmic in the system size p and exponential in the maximum coupling intensity and maximum node degree.
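The abstract does not spell out the estimator, so the following is only a sketch under an assumed form of the objective: for a fixed spin $u$, the empirical average of $\exp(-\sum_{i \ne u} \theta_{ui} \sigma_u \sigma_i)$ over the samples, plus an $\ell_1$ penalty on the couplings $\theta_{ui}$. This expression is convex in $\theta_u$, which matches the abstract's remark about convex optimization, but the exact form, the function name and the toy data are assumptions for illustration.

```python
import math
from typing import Sequence


def interaction_screening_objective(
    samples: Sequence[Sequence[int]],  # each sample is a list of spins in {-1, +1}
    u: int,                            # index of the spin whose couplings are fitted
    theta_u: Sequence[float],          # candidate couplings; the entry theta_u[u] is ignored
    lam: float = 0.0,                  # l1 regularization strength (hypothetical default)
) -> float:
    """Assumed form: mean of exp(-sum_i theta_ui * s_u * s_i) plus an l1 penalty."""
    total = 0.0
    for s in samples:
        local_energy = sum(theta_u[i] * s[u] * s[i] for i in range(len(s)) if i != u)
        total += math.exp(-local_energy)
    penalty = lam * sum(abs(theta_u[i]) for i in range(len(theta_u)) if i != u)
    return total / len(samples) + penalty


# Toy data on two spins that agree in three of four samples.
samples = [[1, 1], [1, 1], [-1, -1], [1, -1]]
at_zero = interaction_screening_objective(samples, u=0, theta_u=[0.0, 0.0])
at_positive = interaction_screening_objective(samples, u=0, theta_u=[0.0, 0.5])
```

On this toy data the objective is lower for a positive coupling than for zero coupling, reflecting that the two spins are mostly aligned. A full estimator would minimize this quantity over `theta_u` for each spin and read the graph off the nonzero couplings.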

artificial intelligence, ising model, machine learning

Country:

North America > United States (0.69)
Europe (0.46)

Industry:

Energy (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

On the Recursive Teaching Dimension of VC Classes

Chen, Xi, Chen, Xi, Cheng, Yu, Tang, Bo

The recursive teaching dimension (RTD) of a concept class $C \subseteq \{0, 1\}^n$, introduced by Zilles et al. [ZLHZ11], is a complexity parameter measured by the worst-case number of labeled examples needed to learn any target concept of $C$ in the recursive teaching model. In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension. Given a concept class $C \subseteq \{0, 1\}^n$ with $VCD(C) = d$, we first show that $RTD(C)$ is at most $d 2^{d+1}$. This is the first upper bound for $RTD(C)$ that depends only on $VCD(C)$, independent of the size of the concept class $|C|$ and its domain size $n$. Before our work, the best known upper bound for $RTD(C)$ was $O(d 2^d \log \log |C|)$, obtained by Moran et al. [MSWY15]. We remove the $\log \log |C|$ factor. We also improve the lower bound on the worst-case ratio of $RTD(C)$ to $VCD(C)$. We present a family of classes $\{ C_k \}_{k \ge 1}$ with $VCD(C_k) = 3k$ and $RTD(C_k)=5k$, which implies that the ratio of $RTD(C)$ to $VCD(C)$ in the worst case can be as large as $5/3$. Before our work, the largest ratio known was $3/2$, as obtained by Kuhlmann [Kuh99]. Since then, no finite concept class $C$ has been known to satisfy $RTD(C) > (3/2) VCD(C)$.
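To make the quantities in the bound concrete, here is a minimal brute-force sketch that computes $VCD(C)$ for a small finite class $C \subseteq \{0, 1\}^n$ and evaluates the stated upper bound $d 2^{d+1}$. It is an illustration only and not an algorithm from the paper; the function names and the example class are hypothetical, and the search is exponential in $n$.

```python
from itertools import combinations
from typing import Iterable, Tuple

Concept = Tuple[int, ...]  # a concept is a 0/1 labeling of the n domain points


def vc_dimension(concepts: Iterable[Concept]) -> int:
    """Largest d such that some d coordinates exhibit all 2**d label patterns."""
    concepts = list(set(concepts))
    if not concepts:
        return 0  # convention chosen here for an empty class
    n = len(concepts[0])
    best = 0
    for d in range(1, n + 1):
        shattered = any(
            len({tuple(c[i] for i in idx) for c in concepts}) == 2 ** d
            for idx in combinations(range(n), d)
        )
        if not shattered:
            break  # subsets of shattered sets are shattered, so larger sets fail too
        best = d
    return best


def rtd_upper_bound(d: int) -> int:
    """The bound d * 2**(d + 1) on RTD(C) stated in the abstract."""
    return d * 2 ** (d + 1)


# Hypothetical example: the class of singletons over a 3-point domain.
singletons = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
d = vc_dimension(singletons)
bound = rtd_upper_bound(d)
```

For the singleton class no pair of points is shattered, because no concept labels two points with 1, so the sketch reports a VC dimension of 1 and the bound evaluates to 4 independently of $|C|$ and $n$.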

artificial intelligence, dimension, machine learning

Country:

North America > United States > California (0.28)
Europe (0.28)

Industry: Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

#artificialintelligence Dec-28-2016, 06:45:27 GMT

The real prerequisite for machine learning isn't math, it's data analysis - SHARP SIGHT LABS

When beginners get started with machine learning, the inevitable question is "what are the prerequisites? What do I need to know to get started?" A list like this is enough to intimidate anyone but a person with an advanced math degree. It's unfortunate, because I think a lot of beginners lose heart and are scared away by this advice. If you're intimidated by the math, I have some good news for you: in order to get started building machine learning models (as opposed to doing machine learning theory), you need less math background than you think (and almost certainly less math than you've been told that you need).

artificial intelligence, data analysis, machine learning

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.54)

#artificialintelligence Dec-25-2016, 11:45:15 GMT

Intelligent Things It's all about machine learning

Evolving from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning explores software algorithms that can learn from, and make predictions on, volumes of data. Simply stated... Machine learning helps humans make data-driven decisions. Machine learning offers practical solutions that can maximize resource utilization, prolong the lifespan of IoT sensors, platforms and networks, and enable dynamic services architecture. Our connected world is increasingly dependent on big data -- at rest, and in years to come, streaming fast data -- in motion. With real-time predictive models, once a streaming fast data point has been observed it might never be seen again.

artificial intelligence, machine learning, prediction

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.56)

@machinelearnbot Dec-18-2016, 00:25:04 GMT

Artificial Intelligence and Machine Learning in Big Data and IoT: The Market for Data Capture …

NEW YORK, Dec. 16, 2016 /PRNewswire/ Overview: More than 50% of enterprise IT organizations are experimenting with Artificial Intelligence (AI) in various forms such as Machine Learning, Deep Learning, Computer Vision, Image Recognition, Voice Recognition, Artificial Neural Networks, and more. AI is not a single technology but a convergence of various technologies, statistical models, algorithms, and approaches. Machine Learning is a sub-field of computer science that evolved from the study of pattern recognition and computational learning theory in AI. Every large corporation collects and maintains a huge amount of human-oriented data associated with its customers including their preferences, purchases, habits, and other personal information. As the Internet of Things (IoT) progresses, there will be an increasingly large amount of unstructured machine data.

big data, data mining, pattern recognition

Country: North America > United States > New York (0.29)

Industry:

Information Technology (0.64)
Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.64)

#artificialintelligence Dec-7-2016, 23:45:28 GMT

Data Science & Machine Learning Training Workshop

Data Science Middle East Foundation, in partnership with EVERATI, is running a 3-day training workshop series across the Middle East to get you started on your data science and machine learning journey, as you learn how to use data and science to deliver insights, value and innovation. The Data Science and Machine Learning workshop is a 3-day practical training program offering an applied introduction to data science industry practices and models of machine learning. The workshop has a strong focus on gaining hands-on experience implementing algorithms and building predictive models on real datasets. By the end of the workshop, participants will be ready to implement machine learning algorithms on their own data and immediately generate business value. The workshop will take participants through the conceptual and applied foundations of the subject.

artificial intelligence, data mining, machine learning

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (0.89)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

#artificialintelligence Dec-6-2016, 17:40:45 GMT

Machine Learning Theory - Part 3: Regularization and the Bias-variance Trade-off

In the first part we explored the statistical model underlying the machine learning problem, and used it to formalize the problem in terms of obtaining the minimum generalization error. By noting that we cannot directly evaluate the generalization error of an ML model, we continued in the second part by establishing a theory that relates this elusive generalization error to another error metric that we can actually evaluate, which is the empirical error. That is: the generalization error (or the risk) $R(h)$ is bounded by the empirical risk (or the training error) plus a term that depends on the complexity (or the richness) of the hypothesis space $\mathcal{H}$, the dataset size $N$, and the degree of certainty $1 - \delta$ about the bound. Starting from this part, and based on this simplified theoretical result, we'll begin to draw some practical concepts for the process of solving the ML problem. We'll start by trying to get more intuition about why a more complex hypothesis space is bad.
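The shape of such a bound can be made concrete with a textbook special case, which is not necessarily the bound derived in the series: for a finite hypothesis space and a loss bounded in $[0, 1]$, Hoeffding's inequality with a union bound gives, with probability at least $1 - \delta$, $R(h) \le R_{emp}(h) + \sqrt{(\ln|\mathcal{H}| + \ln(2/\delta)) / (2N)}$ for every $h \in \mathcal{H}$. The sketch below evaluates that second term; the function name and the numbers plugged in are hypothetical.

```python
import math


def complexity_term(num_hypotheses: int, n_samples: int, delta: float) -> float:
    """Gap term sqrt((ln|H| + ln(2/delta)) / (2N)) for a finite hypothesis space."""
    return math.sqrt(
        (math.log(num_hypotheses) + math.log(2.0 / delta)) / (2.0 * n_samples)
    )


# Hypothetical settings: the same data size and confidence, two hypothesis-space sizes.
small_space = complexity_term(num_hypotheses=10, n_samples=1000, delta=0.05)
large_space = complexity_term(num_hypotheses=10**6, n_samples=1000, delta=0.05)

# The same large space, but with ten times more data.
large_space_more_data = complexity_term(num_hypotheses=10**6, n_samples=10000, delta=0.05)
```

Under these assumptions the term grows with the size of the hypothesis space and shrinks as $N$ grows, which is one way to see why a richer hypothesis space needs more data to give the same guarantee.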

artificial intelligence, hypothesis, machine learning

Country:

North America > United States > New York (0.05)
North America > United States > New Jersey > Hudson County > Secaucus (0.05)

Industry: Education (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.85)

#artificialintelligence Dec-3-2016, 21:45:24 GMT

The Mathematics of Machine Learning

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of many easy-to-use machine and deep learning packages such as scikit-learn, Weka, TensorFlow, etc. Machine Learning theory is a field that intersects statistical, probabilistic, computer science and algorithmic aspects arising from learning iteratively from data and finding hidden insights which can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and for getting good results.

artificial intelligence, bayesian inference, machine learning

Country: North America > United States > Texas (0.06)

Genre: Instructional Material (0.32)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)