AITopics | visual target

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning to Edit Visual Programs with Self-Supervision

Neural Information Processing Systems, Feb-18-2026, 01:21:58 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)

Learning to Edit Visual Programs with Self-Supervision

Neural Information Processing Systems, Oct-11-2025, 00:41:41 GMT

edit network, edit operation, input program, (15 more...)

Neural Information Processing Systems

Country:

Asia (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)

Annotation-Free One-Shot Imitation Learning for Multi-Step Manipulation Tasks

Wichitwechkarn, Vijja, Williams, Emlyn, Fox, Charles, Choudhary, Ruchi

arXiv.org Artificial Intelligence, Sep-30-2025

Abstract: Recent advances in one-shot imitation learning have enabled robots to acquire new manipulation skills from a single human demonstration. While existing methods achieve strong performance on single-step tasks, they remain limited in their ability to handle long-horizon, multi-step tasks without additional model training or manual annotation. We propose a method that can be applied to this setting given a single demonstration, without additional model training or manual annotation. We evaluated our method on multi-step and single-step manipulation tasks, where it achieves average success rates of 82.5% and 90%, respectively. Our method matches or exceeds the performance of the baselines in both cases. We also compare the performance and computational efficiency of alternative pre-trained feature extractors within our framework.

I. INTRODUCTION

Recent advances in imitation learning have enabled robots to perform increasingly complex tasks. However, these methods still require hundreds to thousands of demonstrations per task [1], [2], [3], [4], making them impractical for real-world deployment.

demonstration, machine learning, visual target, (16 more...)

arXiv.org Artificial Intelligence

2509.24972

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Representing Positional Information in Generative World Models for Object Manipulation

Ferraro, Stefano, Mazzaglia, Pietro, Verbelen, Tim, Dhoedt, Bart, Rajeswar, Sai

arXiv.org Artificial Intelligence, Sep-19-2024

Object manipulation capabilities are essential skills that set apart embodied agents engaging with the world, especially in the realm of robotics. The ability to predict outcomes of interactions with objects is paramount in this setting. While model-based control methods have started to be employed for tackling manipulation tasks, they have faced challenges in accurately manipulating objects. Analyzing this limitation, we trace the underperformance to the way current world models represent crucial positional information, especially the target's goal specification in object-positioning tasks. We introduce a general approach that empowers world model-based agents to effectively solve object-positioning tasks. We propose two variants of this approach for generative world models: position-conditioned (PCP) and latent-conditioned (LCP) policy learning. In particular, LCP employs object-centric latent representations that explicitly capture object positional information for goal specification. This naturally leads to the emergence of multimodal capabilities, enabling the specification of goals through spatial coordinates or a visual goal. Our methods are rigorously evaluated across several manipulation environments, showing favorable performance compared to current model-based control approaches.

agent, artificial intelligence, information, (17 more...)

arXiv.org Artificial Intelligence

2409.12005

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Learning to Edit Visual Programs with Self-Supervision

Jones, R. Kenny, Zhang, Renhao, Ganeshan, Aditya, Ritchie, Daniel

arXiv.org Artificial Intelligence, Jun-4-2024

We design a system that learns how to edit visual programs. Our edit network consumes a complete input program and a visual target. From this input, we task our network with predicting a local edit operation that could be applied to the input program to improve its similarity to the target. In order to apply this scheme for domains that lack program annotations, we develop a self-supervised learning approach that integrates this edit network into a bootstrapped finetuning loop along with a network that predicts entire programs in one-shot. Our joint finetuning scheme, when coupled with an inference procedure that initializes a population from the one-shot model and evolves members of this population with the edit network, helps to infer more accurate visual programs. Over multiple domains, we experimentally compare our method against the alternative of using only the one-shot model, and find that even under equal search-time budgets, our editing-based paradigm provides significant advantages.
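The inference procedure described in this abstract (initialize a population from the one-shot model, then evolve its members with the edit network) can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: programs are short integer lists, `execute` is an identity "renderer", and `one_shot_model` and `edit_network` are hand-written stand-ins for the learned networks (the stand-in editor reads the target tokens directly, which a real visual target would not allow).

```python
import random


def execute(program):
    # Toy "renderer" (assumption): the output is just the token list.
    return list(program)


def similarity(program, target):
    # Higher is better; 0 means the output matches the target exactly.
    return -sum(abs(p - t) for p, t in zip(execute(program), target))


def one_shot_model(target, rng):
    # Hypothetical stand-in for the one-shot network:
    # a noisy guess at the entire program.
    return [t + rng.randint(-3, 3) for t in target]


def edit_network(program, target, rng):
    # Hypothetical stand-in for the edit network: propose one local
    # edit that nudges a single mismatched token one step closer.
    wrong = [i for i, (p, t) in enumerate(zip(program, target)) if p != t]
    if not wrong:
        return program
    i = rng.choice(wrong)
    edited = list(program)
    edited[i] += 1 if target[i] > edited[i] else -1
    return edited


def infer(target, population_size=8, rounds=16, seed=0):
    rng = random.Random(seed)
    # Initialize the population from the one-shot model.
    population = [one_shot_model(target, rng) for _ in range(population_size)]
    for _ in range(rounds):
        # Evolve: apply one predicted edit to every member.
        children = [edit_network(p, target, rng) for p in population]
        # Keep the most target-similar programs among parents and children.
        pool = population + children
        pool.sort(key=lambda p: similarity(p, target), reverse=True)
        population = pool[:population_size]
    return population[0]


target = [5, 2, 7, 1]
best = infer(target)
print(best, similarity(best, target))
```

In this toy setting every edit removes one unit of error, so the loop recovers the four-token target within the 16 rounds used here. The sketch shows only the control flow of population-based editing and omits the bootstrapped finetuning loop.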

edit network, edit operation, input program, (14 more...)

arXiv.org Artificial Intelligence

2406.02383

Country:

Asia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)

Tiny Eye Movements Are Under a Surprising Degree of Cognitive Control - Neuroscience News

#artificialintelligence, Apr-10-2023, 02:46:52 GMT

Summary: Ocular drift, tiny eye movements that seem random, can be influenced by prior knowledge of an expected visual target, researchers report. A very subtle and seemingly random type of eye movement called ocular drift can be influenced by prior knowledge of the expected visual target, suggesting a surprising level of cognitive control over the eyes, according to a study led by Weill Cornell Medicine neuroscientists. The discovery, described Apr. 3 in Current Biology, adds to the scientific understanding of how vision, far from being a mere absorption of incoming signals from the retina, is controlled and directed by cognitive processes. "These eye movements are so tiny that we're not even conscious of them, and yet our brains somehow can use the knowledge of the visual task to control them," says study lead author Dr. Yen-Chu Lin, who carried out the work as a Fred Plum Fellow in Systems Neurology and Neuroscience in the Feil Family Brain and Mind Research Institute at Weill Cornell Medicine. Dr. Lin works in the laboratory of study senior author Dr. Jonathan Victor, the Fred Plum Professor of Neurology at Weill Cornell Medicine. The study involved a close collaboration with the laboratory of Dr. Michele Rucci, professor of brain and cognitive sciences and neuroscience at the University of Rochester.

cognitive control, eye movement, ocular drift, (13 more...)

#artificialintelligence

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Further Explorations in Visually-Guided Reaching: Making MURPHY Smarter

Neural Information Processing Systems, Apr-6-2023, 20:02:19 GMT

MURPHY is a vision-based kinematic controller and path planner based on a connectionist architecture, and implemented with a video camera and Rhino XR-series robot arm. Imitative of the layout of sensory and motor maps in cerebral cortex, MURPHY's internal representations consist of four coarse-coded populations of simple units representing both static and dynamic aspects of the sensory-motor environment. In previously reported work [4], MURPHY first learned a direct kinematic model of his camera-arm system during a period of extended practice, and then used this "mental model" to heuristically guide his hand to unobstructed visual targets. MURPHY has since been extended in two ways: First, he now learns the inverse differential-kinematics of his arm in addition to ordinary direct kinematics, which allows him to push his hand directly towards a visual target without the need for search. Secondly, he now deals with the much more difficult problem of reaching in the presence of obstacles.
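To make the idea of inverse differential kinematics concrete, here is a minimal sketch of pushing a hand directly towards a target by mapping small hand displacements to joint increments. It is not MURPHY's method: MURPHY learns this mapping with a connectionist network from visual input, whereas this sketch assumes a hypothetical two-link planar arm with known link lengths (`LINK1`, `LINK2`) and an analytic Jacobian. All names and values are illustrative.

```python
import math

LINK1, LINK2 = 1.0, 1.0  # hypothetical link lengths


def forward(q):
    # Direct kinematics: joint angles (shoulder, elbow) -> hand position.
    a, b = q
    x = LINK1 * math.cos(a) + LINK2 * math.cos(a + b)
    y = LINK1 * math.sin(a) + LINK2 * math.sin(a + b)
    return x, y


def jacobian(q):
    # Partial derivatives of the hand position w.r.t. each joint angle.
    a, b = q
    return (
        (-LINK1 * math.sin(a) - LINK2 * math.sin(a + b), -LINK2 * math.sin(a + b)),
        (LINK1 * math.cos(a) + LINK2 * math.cos(a + b), LINK2 * math.cos(a + b)),
    )


def reach(q, target, gain=0.5, steps=200, tol=1e-6):
    # Repeatedly convert the remaining hand error into a joint increment
    # through the inverse of the 2x2 Jacobian (no search involved).
    for _ in range(steps):
        x, y = forward(q)
        ex, ey = target[0] - x, target[1] - y
        if math.hypot(ex, ey) < tol:
            break
        (j11, j12), (j21, j22) = jacobian(q)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-9:
            raise ValueError("arm is at a singular configuration")
        da = gain * (j22 * ex - j12 * ey) / det
        db = gain * (-j21 * ex + j11 * ey) / det
        q = (q[0] + da, q[1] + db)
    return q


q_final = reach((0.3, 0.5), (1.0, 1.0))
print(forward(q_final))
```

The `gain` below 1 takes partial steps so the linear approximation stays reasonable. Obstacle avoidance, the second extension described above, is outside the scope of this sketch.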

exploration, murphy smarter, visually-guided, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Robots (0.64)
Information Technology > Artificial Intelligence > Machine Learning (0.44)

Unifying the Sensory and Motor Components of Sensorimotor Adaptation

Haith, Adrian, Jackson, Carl P., Miall, R. C., Vijayakumar, Sethu

Neural Information Processing Systems, Dec-31-2009

Adaptation of visually guided reaching movements in novel visuomotor environments (e.g. wearing prism goggles) comprises not only motor adaptation but also substantial sensory adaptation, corresponding to shifts in the perceived spatial location of visual and proprioceptive cues. Previous computational models of the sensory component of visuomotor adaptation have assumed that it is driven purely by the discrepancy introduced between visual and proprioceptive estimates of hand position and is independent of any motor component of adaptation. We instead propose a unified model in which sensory and motor adaptation are jointly driven by optimal Bayesian estimation of the sensory and motor contributions to perceived errors. Our model is able to account for patterns of performance errors during visuomotor adaptation as well as the subsequent perceptual aftereffects. This unified model also makes the surprising prediction that force field adaptation will elicit similar perceptual shifts, even though there is never any discrepancy between visual and proprioceptive observations. We confirm this prediction with an experiment.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Genre: Research Report > New Finding (0.68)

Technology: