 Inductive Learning


World record set for largest skinny dip on Co Wicklow beach

BBC News

More than 2,500 women have stripped naked on a beach in the Republic of Ireland to break the world record for the largest skinny dip.


Localized Structured Prediction

arXiv.org Machine Learning

Key to structured prediction is exploiting the problem structure to simplify the learning process. A major challenge arises when data exhibit a local structure (e.g., are made by "parts") that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, has shown that capturing these aspects is indeed essential to achieve state-of-the-art performance. While such algorithms are typically derived on a case-by-case basis, in this work we propose the first theoretical framework to deal with part-based data from a general perspective. We derive a novel approach to deal with these problems and study its generalization properties within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of the problem with respect to the learning rates of the proposed estimator.


AI (I): Machine learning

#artificialintelligence

Machine learning provides the foundation for artificial intelligence. Machine learning is a technique in which we train a software model using data. The model learns from the training cases, and we can then use the trained model to make predictions for new data cases. To have a computer make intelligent predictions from the data, we just need a way to train it to perform the correct calculations. We usually start with a data set that contains historical records, often called cases or observations.
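
As a rough illustration of the train-then-predict workflow described above, here is a minimal sketch in Python. The data set, the choice of a straight-line model, and the function names `fit_line` and `predict` are all assumptions made for this example; the article itself does not prescribe any particular model.

```python
def fit_line(cases):
    """Train: least-squares fit of y = slope * x + intercept to (x, y) cases."""
    n = len(cases)
    mean_x = sum(x for x, _ in cases) / n
    mean_y = sum(y for _, y in cases) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in cases)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in cases)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict: apply the trained model to a new data case."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical records (cases): each is an (input, outcome) pair.
history = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

model = fit_line(history)          # training step
forecast = predict(model, 5.0)     # prediction for a new, unseen case
print(forecast)
```

The training step only sees the historical cases; the prediction step reuses the learned parameters on an input that was not in the data set.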


Explaining supervised learning to a kid (or your boss)

#artificialintelligence

Now that you know what machine learning is, let's meet the easiest kind. My goal here is to get humans of all stripes and (almost) all ages comfy with its basic jargon: instance, label, feature, model, algorithm, and supervised learning. Instances are also called 'examples' or 'observations.' What do these examples look like when we put them in a table? Sticking with convention (because good manners are good), each row is an instance.


Learning to Follow Language Instructions with Adversarial Reward Induction

arXiv.org Artificial Intelligence

Recent work has shown that deep reinforcement-learning agents can learn to follow language-like instructions from infrequent environment rewards. However, for many real-world natural language commands that involve a degree of underspecification or ambiguity, such as "tidy the room", it would be challenging or impossible to program an appropriate reward function. To overcome this, we present a method for learning to follow commands from a training set of instructions and corresponding example goal-states, rather than an explicit reward function. Importantly, the example goal-states are not seen at test time. The approach effectively separates the representation of what instructions require from how they can be executed. In a simple grid world, the method enables an agent to learn a range of commands requiring interaction with blocks and understanding of spatial relations and underspecified abstract arrangements. We further show the method allows our agent to adapt to changes in the environment without requiring new training examples.


GuideR: a guided separate-and-conquer rule learning in classification, regression, and survival settings

arXiv.org Machine Learning

Marek Sikora (a, b), Łukasz Wróbel (a, b), Adam Gudyś (a)
(a) Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
(b) Institute of Innovative Technologies, EMAG, Leopolda 31, 40-189 Katowice, Poland

Abstract: This article presents GuideR, a user-guided rule induction algorithm, which overcomes the largest limitation of the existing methods: the lack of any way to introduce the user's preferences or domain knowledge into the rule learning process. Automatic selection of attributes and attribute ranges often leads to the situation in which resulting rules do not contain interesting information. We propose an induction algorithm which takes into account the user's requirements. Our method uses the sequential covering approach and is suitable for classification, regression, and survival analysis problems. The effectiveness of the algorithm in all these tasks has been verified experimentally, confirming guided rule induction to be a powerful data analysis tool.

Introduction: Sequential covering rule induction algorithms can be used for both predictive and descriptive purposes [1, 2, 3, 4]. In spite of the development of increasingly sophisticated versions of those algorithms [5, 6], the main principle remains unchanged and involves two phases: rule growing and rule pruning. In the latter, some of these conditions are removed. In comparison to other machine learning methods, rule sets obtained by the sequential covering algorithm, also known as the separate-and-conquer strategy (SnC), are characterized by good predictive as well as descriptive capabilities. Taking into consideration only the former, superior results can often be obtained using other methods, e.g. However, data models obtained this way are much less comprehensible than rule sets.
In the case of rule learning for descriptive purposes, the algorithms of association rule induction [12, 13, 14] or subgroup discovery [15, 6], are applied. The former leads to a very large number of rules which must then be limited by filtering according to rule interestingness measures [16, 17, 18]. Nevertheless, rule sets obtained by subgroup discovery are characterized by worse predictive abilities than those generated by the standard sequential covering approach. Therefore, if creating a prediction system with comprehensible data model is the main objective, the application of sequential covering rule induction algorithms provides the most sensible solution.
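
The two-phase principle mentioned in this excerpt (grow a rule, prune it, then repeat on the examples not yet covered) can be sketched in a few lines of Python. This is a generic toy version of separate-and-conquer for classification, not the GuideR algorithm: the precision-based quality measure, the attribute-equals-value conditions, the toy data, and all function names are assumptions made for illustration, and no user guidance is modeled.

```python
def covers(rule, example):
    """A rule is a dict of attribute == value conditions."""
    return all(example[attr] == value for attr, value in rule.items())


def precision(rule, examples, target):
    covered = [e for e in examples if covers(rule, e)]
    if not covered:
        return 0.0
    return sum(e["label"] == target for e in covered) / len(covered)


def grow_rule(examples, target, attributes):
    """Growing phase: greedily add the condition that most improves precision."""
    rule = {}
    while precision(rule, examples, target) < 1.0:
        candidates = [
            (attr, e[attr])
            for e in examples
            if covers(rule, e) and e["label"] == target
            for attr in attributes
            if attr not in rule
        ]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda c: precision({**rule, c[0]: c[1]}, examples, target),
        )
        extended = {**rule, best[0]: best[1]}
        if precision(extended, examples, target) <= precision(rule, examples, target):
            break  # no condition improves the rule any further
        rule = extended
    return rule


def prune_rule(rule, examples, target):
    """Pruning phase: drop conditions whose removal does not lower precision."""
    for attr in list(rule):
        trial = {a: v for a, v in rule.items() if a != attr}
        if precision(trial, examples, target) >= precision(rule, examples, target):
            rule = trial
    return rule


def learn_rules(examples, target, attributes):
    """Separate-and-conquer: learn a rule, remove what it covers, repeat."""
    remaining = list(examples)
    rules = []
    while any(e["label"] == target for e in remaining):
        rule = prune_rule(grow_rule(remaining, target, attributes), remaining, target)
        rules.append(rule)
        remaining = [e for e in remaining if not covers(rule, e)]
    return rules


# Hypothetical toy data set.
data = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "play"},
    {"outlook": "rainy", "windy": "no", "label": "play"},
    {"outlook": "rainy", "windy": "yes", "label": "stay"},
    {"outlook": "overcast", "windy": "yes", "label": "stay"},
]

rules = learn_rules(data, "play", ["outlook", "windy"])
print(rules)
```

Each learned rule is readable on its own, which is the comprehensibility argument the excerpt makes for rule sets over less transparent models.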


AI Method Could Speed Up Development of Specialized Nanoparticles

#artificialintelligence

Summary: A new artificial intelligence technique could speed up complex physics simulations and help create multilayered nanoparticles, researchers say. A new technique developed by MIT physicists could someday provide a way to custom-design multilayered nanoparticles with desired properties, potentially for use in displays, cloaking systems, or biomedical devices. It may also help physicists tackle a variety of thorny research problems, in ways that could in some cases be orders of magnitude faster than existing methods. The innovation uses computational neural networks, a form of artificial intelligence, to "learn" how a nanoparticle's structure affects its behavior, in this case the way it scatters different colors of light, based on thousands of training examples. Then, having learned the relationship, the program can essentially be run backward to design a particle with a desired set of light-scattering properties -- a process called inverse design.


Intentional Control of Type I Error over Unconscious Data Distortion: a Neyman-Pearson Approach to Text Classification

arXiv.org Machine Learning

Digital texts have become an increasingly important source of data for social studies. However, textual data from open platforms are vulnerable to manipulation (e.g., censorship and information inflation), often leading to bias in subsequent empirical analysis. This paper investigates the problem of data distortion in text classification when controlling type I error (a relevant textual message is classified as irrelevant) is the priority. The default classical classification paradigm that minimizes the overall classification error can yield an undesirably large type I error, and data distortion exacerbates this situation. As a solution, we propose the Neyman-Pearson (NP) classification paradigm which minimizes type II error under a user-specified type I error constraint. Theoretically, we show that while the classical oracle (i.e., optimal classifier) cannot be recovered under unknown data distortion even if one has the entire post-distortion population, the NP oracle is unaffected by data distortion and can be recovered under the same condition. Empirically, we illustrate the advantage of NP classification methods in a case study that classifies posts about strikes and corruption published on a leading Chinese blogging platform.
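
To make the Neyman-Pearson idea concrete, the sketch below picks a score threshold so that the empirical type I error stays at or below a user-specified level, and then reports the resulting type II error. It is a simplified empirical version written for illustration, not the procedure from the paper: the scores, the level `alpha`, and the function names are hypothetical, and no high-probability guarantee on the population type I error is attempted. Following the abstract's convention, class 0 is the relevant class, and a type I error is a class 0 item predicted as class 1.

```python
def np_threshold(class0_scores, alpha):
    """Smallest threshold t whose empirical type I error is at most alpha.

    An item is predicted as class 1 when its score is strictly greater than t,
    so the type I error is the fraction of class 0 scores above t.
    """
    ordered = sorted(class0_scores)
    n = len(ordered)
    for t in ordered:
        type1 = sum(s > t for s in ordered) / n
        if type1 <= alpha:
            return t
    return ordered[-1]  # unreachable in practice: the maximum gives error 0


def error_rates(class0_scores, class1_scores, t):
    """Empirical type I and type II errors at threshold t."""
    type1 = sum(s > t for s in class0_scores) / len(class0_scores)
    type2 = sum(s <= t for s in class1_scores) / len(class1_scores)
    return type1, type2


# Hypothetical classifier scores (higher means "more likely class 1").
class0 = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.6, 0.9]
class1 = [0.3, 0.5, 0.7, 0.8, 0.95]

threshold = np_threshold(class0, alpha=0.25)
type1, type2 = error_rates(class0, class1, threshold)
print(threshold, type1, type2)
```

Choosing the smallest admissible threshold matters: any larger threshold would also satisfy the type I constraint but would push more class 1 items below it, raising the type II error.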


AI-based method could speed development of specialized nanoparticles

#artificialintelligence

The innovation uses computational neural networks, a form of artificial intelligence, to "learn" how a nanoparticle's structure affects its behavior, in this case the way it scatters different colors of light, based on thousands of training examples. Then, having learned the relationship, the program can essentially be run backward to design a particle with a desired set of light-scattering properties -- a process called inverse design. The findings are being reported in the journal Science Advances, in a paper by MIT senior John Peurifoy, research affiliate Yichen Shen, graduate student Li Jing, professor of physics Marin Soljacic, and five others. While the approach could ultimately lead to practical applications, Soljacic says, the work is primarily of scientific interest as a way of predicting the physical properties of a variety of nanoengineered materials without requiring the computationally intensive simulation processes that are typically used to tackle such problems. Soljacic says that the goal was to look at neural networks, a field that has seen a lot of progress and generated excitement in recent years, to see "whether we can use some of those techniques in order to help us in our physics research. So basically, are computers 'intelligent' enough so that they can do some more intelligent tasks in helping us understand and work with some physical systems?"