AITopics

Structure in a visual scene can be described at many levels of granularity. At a coarse level, the scene is composed of objects; at a finer level, each object is made up of parts, and the parts of subparts. In this work, I propose a simple principle by which such hierarchical structure can be extracted from visual scenes: Regularity in the relations among different parts of an object is weaker than in the internal structure of a part. This principle can be applied recursively to define part-whole relationships among elements in a scene. The principle does not make use of object models, categories, or other sorts of higher-level knowledge; rather, part-whole relationships can be established based on the statistics of a set of sample visual scenes. I illustrate with a model that performs unsupervised decomposition of simple scenes. The model can account for the results from a human learning experiment on the ontogeny of partwhole relationships.

category, regularity principle, representation, (13 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Liu, Zili, Weinshall, Daphna

Mechanisms of Generalization in Perceptual Learning

The learning of many visual perceptual tasks has been shown to be specific to practiced stimuli, while new stimuli require re-Iearning from scratch. Here we demonstrate generalization using a novel paradigm in motion discrimination where learning has been previously shown to be specific. We trained subjects to discriminate the directions of moving dots, and verified the previous results that learning does not transfer from the trained direction to a new one. However, by tracking the subjects' performance across time in the new direction, we found that their rate of learning doubled. Therefore, learning generalized in a task previously considered too difficult for generalization.

discrimination task, sensitivity, training direction, (16 more...)

Country: Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

Christianson, G. Bjorn, Becker, Suzanna

A Model for Associative Multiplication

Despite the fact that mental arithmetic is based on only a few hundred basic facts and some simple algorithms, humans have a difficult time mastering the subject, and even experienced individuals make mistakes. Associative multiplication, the process of doing multiplication by memory without the use of rules or algorithms, is especially problematic.

multiplication, operation error, representation, (12 more...)

Country:

North America > Canada > Ontario > Hamilton (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Bhushan, Nikhil, Shadmehr, Reza

Evidence for a Forward Dynamics Model in Human Adaptive Motor Control

Based on computational principles, the concept of an internal model for adaptive control has been divided into a forward and an inverse model. However, there is as yet little evidence that learning control by the eNS is through adaptation of one or the other. Here we examine two adaptive control architectures, one based only on the inverse model and other based on a combination of forward and inverse models. We then show that for reaching movements of the hand in novel force fields, only the learning of the forward model results in key characteristics of performance that match the kinematics of human subjects. In contrast, the adaptive control system that relies only on the inverse model fails to produce the kinematic patterns observed in the subjects, despite the fact that it is more stable.

forward model, inverse model, trajectory, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
Europe > Belgium > Flanders (0.05)
North America > United States > Maryland > Baltimore (0.04)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Control Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Artificial Intelligence > Cognitive Science (0.52)

Munos, Rémi, Moore, Andrew W.

Barycentric Interpolators for Continuous Space and Time Reinforcement Learning

In order to find the optimal control of continuous state-space and time reinforcement learning (RL) problems, we approximate the value function (VF) with a particular class of functions called the barycentric interpolators. We establish sufficient conditions under which a RL algorithm converges to the optimal VF, even when we use approximate models of the state dynamics and the reinforcement functions.

barycentric interpolator, convergence, equation, (14 more...)

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Grandvalet, Yves, Canu, Stéphane

Outcomes of the Equivalence of Adaptive Ridge with Least Absolute Shrinkage

In supervised learning, we have a set of explicative variables x from which we wish to predict a response variable y.

adaptive ridge, coefficient, covariate, (16 more...)

Country:

North America > United States > New York (0.05)
Europe > France > Hauts-de-France > Oise > Compiègne (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Williams, John K., Singh, Satinder P.

Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes

Partially Observable Markov Decision Processes (pO "MOPs) constitute an important class of reinforcement learning problems which present unique theoretical and computational difficulties. In the absence of the Markov property, popular reinforcement learning algorithms such as Q-Iearning may no longer be effective, and memory-based methods which remove partial observability via state-estimation are notoriously expensive. An alternative approach is to seek a stochastic memoryless policy which for each observation of the environment prescribes a probability distribution over available actions that maximizes the average reward per timestep. A reinforcement learning algorithm which learns a locally optimal stochastic memoryless policy has been proposed by Jaakkola, Singh and Jordan, but not empirically verified. We present a variation of this algorithm, discuss its implementation, and demonstrate its viability using four test problems.

algorithm, learner, memoryless policy, (12 more...)

Country:

Asia > Middle East > Jordan (0.25)
North America > United States > New York (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Sutton, Richard S., Singh, Satinder P., Precup, Doina, Ravindran, Balaraman

Improved Switching among Temporally Abstract Actions

In robotics and other control applications it is commonplace to have a preexisting set of controllers for solving subtasks, perhaps handcrafted or previously learned or planned, and still face a difficult problem of how to choose and switch among the controllers to solve an overall task as well as possible. In this paper we present a framework based on Markov decision processes and semi-Markov decision processes for phrasing this problem, a basic theorem regarding the improvement in performance that can be obtained by switching flexibly between given controllers, and example applications of the theorem. In particular, we show how an agent can plan with these high-level controllers and then use the results of such planning to find an even better plan, by modifying the existing controllers, with negligible additional cost and no re-planning. In one of our examples, the complexity of the problem is reduced from 24 billion state-action pairs to less than a million state-controller pairs. In many applications, solutions to parts of a task are known, either because they were handcrafted by people or because they were previously learned or planned. For example, in robotics applications, there may exist controllers for moving joints to positions, picking up objects, controlling eye movements, or navigating along hallways. More generally, an intelligent system may have available to it several temporally extended courses of action to choose from. In such cases, a key challenge is to take full advantage of the existing temporally extended actions, to choose or switch among them effectively, and to plan at their level rather than at the level of individual actions.

controller, improved switching, landmark, (15 more...)

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Industry: Government (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Suematsu, Nobuo, Hayashi, Akira

A Reinforcement Learning Algorithm in Partially Observable Environments Using Short-Term Memory

We have proved that the model learned by BLHT converges to the optimal model in given hypothesis space, 1{, which provides the most accurate predictions of percepts and rewards, given short-term memory. We believe this fact provides a solid basis for BLHT, and BLHT can be compared favorably with other methods using short-term memory.

blht, history tree, short-term memory, (14 more...)

Country:

Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.05)
Asia > Middle East > Jordan (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Sato, Masa-aki, Ishii, Shin

Reinforcement Learning Based on On-Line EM Algorithm

On the other hand, applications to continuous state/action problems (Werbos, 1990; Doya, 1996; Sofge & White, 1992) are much more difficult than the finite state/action cases. Good function approximation methods and fast learning algorithms are crucial for successful applications. In this article, we propose a new RL method that has the above-mentioned two features. This method is based on an actor-critic architecture (Barto et al., 1983), although the detailed implementations of the actor and the critic are quite differ- Reinforcement Learning Based on On-Line EM Algorithm 1053 ent from those in the original actor-critic model. The actor and the critic in our method estimate a policy and a Q-function, respectively, and are approximated by Normalized Gaussian Networks (NGnet) (l'doody & Darken, 1989).

algorithm, pendulum, rl method, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)