AITopics | Education

Collaborating Authors

Education

Online Learning with Pairwise Loss Functions

Wang, Yuyang, Khardon, Roni, Pechyony, Dmitry, Jones, Rosie

arXiv.org Machine LearningJan-22-2013

Efficient online learning with pairwise loss functions is a crucial component in building large-scale learning system that maximizes the area under the Receiver Operator Characteristic (ROC) curve. In this paper we investigate the generalization performance of online learning algorithms with pairwise loss functions. We show that the existing proof techniques for generalization bounds of online algorithms with a univariate loss can not be directly applied to pairwise losses. In this paper, we derive the first result providing data-dependent bounds for the average risk of the sequence of hypotheses generated by an arbitrary online learner in terms of an easily computable statistic, and show how to extract a low risk hypothesis from the sequence. We demonstrate the generality of our results by applying it to two important problems in machine learning. First, we analyze two online algorithms for bipartite ranking; one being a natural extension of the perceptron algorithm and the other using online convex optimization. Secondly, we provide an analysis for the risk bound for an online algorithm for supervised metric learning.

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv.org Machine Learning

1301.5332

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Add feedback

Active Learning of Inverse Models with Intrinsically Motivated Goal Exploration in Robots

Baranes, Adrien, Oudeyer, Pierre-Yves

arXiv.org Artificial IntelligenceJan-21-2013

We introduce the Self-Adaptive Goal Generation - Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture as an intrinsi- cally motivated goal exploration mechanism which allows active learning of inverse models in high-dimensional redundant robots. This allows a robot to efficiently and actively learn distributions of parameterized motor skills/policies that solve a corresponding distribution of parameterized tasks/goals. The architecture makes the robot sample actively novel parameterized tasks in the task space, based on a measure of competence progress, each of which triggers low-level goal-directed learning of the motor policy pa- rameters that allow to solve it. For both learning and generalization, the system leverages regression techniques which allow to infer the motor policy parameters corresponding to a given novel parameterized task, and based on the previously learnt correspondences between policy and task parameters. We present experiments with high-dimensional continuous sensorimotor spaces in three different robotic setups: 1) learning the inverse kinematics in a highly-redundant robotic arm, 2) learning omnidirectional locomotion with motor primitives in a quadruped robot, 3) an arm learning to control a fishing rod with a flexible wire. We show that 1) exploration in the task space can be a lot faster than exploration in the actuator space for learning inverse models in redundant robots; 2) selecting goals maximizing competence progress creates developmental trajectories driving the robot to progressively focus on tasks of increasing complexity and is statistically significantly more efficient than selecting tasks randomly, as well as more efficient than different standard active motor babbling methods; 3) this architecture allows the robot to actively discover which parts of its task space it can learn to reach and which part it cannot.

artificial intelligence, machine learning, task space, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.robot.2012.05.008

1301.4862

Country:

Europe (1.00)
North America > United States > New York (0.28)
North America > United States > Wisconsin (0.27)

Genre: Research Report > New Finding (0.45)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback

A Spectral Algorithm for Latent Dirichlet Allocation

Anandkumar, Animashree, Foster, Dean P., Hsu, Daniel, Kakade, Sham M., Liu, Yi-Kai

arXiv.org Machine LearningJan-17-2013

The problem of topic modeling can be seen as a generalization of the clustering problem, in that it posits that observations are generated due to multiple latent factors (e.g., the words in each document are generated as a mixture of several active topics, as opposed to just one). This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic probability vectors (the distributions over words for each topic), when only the words are observed and the corresponding topics are hidden. We provide a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of mixture models, including the popular latent Dirichlet allocation (LDA) model. For LDA, the procedure correctly recovers both the topic probability vectors and the prior over the topics, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method, termed Excess Correlation Analysis (ECA), is based on a spectral decomposition of low order moments (third and fourth order) via two singular value decompositions (SVDs). Moreover, the algorithm is scalable since the SVD operations are carried out on $k\times k$ matrices, where $k$ is the number of latent factors (e.g. the number of topics), rather than in the $d$-dimensional observed space (typically $d \gg k$).

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1204.6703

Country:

Asia (1.00)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Sports (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.84)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Dynamic Bayesian Multinets

Bilmes, Jeff A.

arXiv.org Machine LearningJan-16-2013

In this work, dynamic Bayesian multinets are introduced where a Markov chain state at time t determines conditional independence patterns between random variables lying within a local time window surrounding t. It is shown how information-theoretic criterion functions can be used to induce sparse, discriminative, and class-conditional network structures that yield an optimal approximation to the class posterior probability, and therefore are useful for the classification task. Using a new structure learning heuristic, the resulting models are tested on a medium-vocabulary isolated-word speech recognition task. It is demonstrated that these discriminatively structured dynamic Bayesian multinets, when trained in a maximum likelihood setting using EM, can outperform both HMMs and other dynamic Bayesian networks with a similar number of parameters.

artificial intelligence, dependency, machine learning, (16 more...)

arXiv.org Machine Learning

1301.3837

Country:

North America > United States (0.28)
Asia > Japan (0.28)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Case Study in Knowledge Discovery and Elicitation in an Intelligent Tutoring Application

Nicholson, Ann, Boneh, Tal, Wilkin, Tim, Stacey, Kaye, Sonenberg, Liz, Steinle, Vicki

arXiv.org Artificial IntelligenceJan-10-2013

Most successful Bayesian network (BN) applications to datehave been built through knowledge elicitation from experts.This is difficult and time consuming, which has lead to recentinterest in automated methods for learning BNs from data. We present a case study in the construction of a BN in anintelligent tutoring application, specifically decimal misconceptions. Wedescribe the BN construction using expert elicitation and then investigate how certainexisting automated knowledge discovery methods might support the BN knowledge engineering process.

artificial intelligence, classification, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1301.2297

Genre: Research Report (1.00)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Add feedback

Syntactic Analysis Based on Morphological Characteristic Features of the Romanian Language

Patrut, Bogdan

arXiv.org Artificial IntelligenceJan-9-2013

This paper refers to the syntactic analysis of phrases in Romanian, as an important process of natural language processing. We will suggest a real-time solution, based on the idea of using some words or groups of words that indicate grammatical category; and some specific endings of some parts of sentence. Our idea is based on some characteristics of the Romanian language, where some prepositions, adverbs or some specific endings can provide a lot of information about the structure of a complex sentence. Such characteristics can be found in other languages, too, such as French. Using a special grammar, we developed a system (DIASEXP) that can perform a dialogue in natural language with assertive and interogative sentences about a "story" (a set of sentences describing some events from the real life).

artificial intelligence, helen, natural language computing, (16 more...)

arXiv.org Artificial Intelligence

1301.195

Country: Europe > Romania > Nord-Est Development Region (0.15)

Genre: Research Report (0.40)

Industry: Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

AAAI News

Hamilton, Carol M. (Association for the Advancement of Artificial Intelligence)

AI MagazineDec-31-2012

In 2013, the Robotics information, please visit the IAAI-13 iaai13 icwsm13 track is particularly interested in website at www.aaai.org/iaai13.

aaai, artificial intelligence, intelligence, (15 more...)

AI Magazine

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(14 more...)

Genre:

Personal > Honors (0.93)
Overview (0.68)

Industry:

Banking & Finance > Trading (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Communications > Social Media (0.94)
Information Technology > Communications > Web (0.93)
Information Technology > Artificial Intelligence > Applied AI (0.93)

Add feedback

Mirror Descent Meets Fixed Share (and feels no regret)

Cesa-bianchi, Nicolò, Gaillard, Pierre, Lugosi, Gabor, Stoltz, Gilles

Neural Information Processing SystemsDec-31-2012

Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension. This is done using either a carefully designed projection or by a weight sharing technique. Via a novel unified analysis, we show that these two approaches deliver essentially equivalent bounds on a notion of regret generalizing shifting, adaptive, discounted, and other related regrets. Our analysis also captures and extends the generalized weight sharing technique of Bousquet and Warmuth, and can be refined in several ways, including improvements for small losses and adaptive tuning of parameters.

algorithm, forecaster, sequence, (13 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Perceptron Learning of SAT

Flint, Alex, Blaschko, Matthew

Neural Information Processing SystemsDec-31-2012

Boolean satisfiability (SAT) as a canonical NP-complete decision problem is one of the most important problems in computer science. In practice, real-world SAT sentences are drawn from a distribution that may result in efficient algorithms for their solution. Such SAT instances are likely to have shared characteristics and substructures. This work approaches the exploration of a family of SAT solvers as a learning problem. In particular, we relate polynomial time solvability of a SAT subset to a notion of margin between sentences mapped by a feature function into a Hilbert space. Provided this mapping is based on polynomial time computable statistics of a sentence, we show that the existance of a margin between these data points implies the existance of a polynomial time solver for that SAT subset based on the Davis-Putnam-Logemann-Loveland algorithm. Furthermore, we show that a simple perceptron-style learning rule will find an optimal SAT solver with a bounded number of training updates. We derive a linear time computable set of features and show analytically that margins exist for important polynomial special cases of SAT. Empirical results show an order of magnitude improvement over a state-of-the-art SAT solver on a hardware verification task.

algorithm, feature space, polynomial time, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.35)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
(2 more...)

Add feedback

Local Supervised Learning through Space Partitioning

Wang, Joseph, Saligrama, Venkatesh

Neural Information Processing SystemsDec-31-2012

We develop a novel approach for supervised learning based on adaptively partitioning the feature space into different regions and learning local region-specific classifiers. We formulate an empirical risk minimization problem that incorporates both partitioning and classification in to a single global objective. We show that space partitioning can be equivalently reformulated as a supervised learning problem and consequently any discriminative learning method can be utilized in conjunction with our approach. Nevertheless, we consider locally linear schemes by learning linear partitions and linear region classifiers. Locally linear schemes can not only approximate complex decision boundaries and ensure low training error but also provide tight control on over-fitting and generalization error. We train locally linear classifiers by using LDA, logistic regression and perceptrons, and so our scheme is scalable to large data sizes and high-dimensions. We present experimental results demonstrating improved performance over state of the art classification techniques on benchmark datasets. We also show improved robustness to label noise.

classifier, linear classifier, region classifier, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin (0.05)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
South America > Paraguay > Asunción > Asunción (0.04)

Genre: Research Report > New Finding (0.70)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.51)

Add feedback