AITopics

2501.09783

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.70)

arXiv.org Artificial Intelligence Oct-3-2024

Learning Diverse Bimanual Dexterous Manipulation Skills from Human Demonstrations

Zhou, Bohan, Yuan, Haoqi, Fu, Yuhui, Lu, Zongqing

Bimanual dexterous manipulation is a critical yet underexplored area in robotics. Its high-dimensional action space and inherent task complexity present significant challenges for policy learning, and the limited task diversity in existing benchmarks hinders general-purpose skill development. Existing approaches largely depend on reinforcement learning, often constrained by intricately designed reward functions tailored to a narrow set of tasks. In this work, we present a novel approach for efficiently learning diverse bimanual dexterous skills from abundant human demonstrations. Specifically, we introduce BiDexHD, a framework that unifies task construction from existing bimanual datasets and employs teacher-student policy learning to address all tasks. The teacher learns state-based policies using a general two-stage reward function across tasks with shared behaviors, while the student distills the learned multi-task policies into a vision-based policy. With BiDexHD, scalable learning of numerous bimanual dexterous skills from auto-constructed tasks becomes feasible, offering promising advances toward universal bimanual dexterous manipulation. Our empirical evaluation on the TACO dataset, spanning 141 tasks across six categories, demonstrates a task fulfillment rate of 74.59% on trained tasks and 51.07% on unseen tasks, showcasing the effectiveness and competitive zero-shot generalization capabilities of BiDexHD. For videos and more information, visit our project page https://sites.google.com/view/bidexhd.

arxiv preprint arxiv, manipulation, teapot, (11 more...)

2410.02477

Country: Asia (0.04)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence Sep-3-2024

ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation

Huang, Wenlong, Wang, Chen, Li, Yunzhu, Zhang, Ruohan, Fei-Fei, Li

Representing robotic manipulation tasks as constraints that associate the robot and the environment is a promising way to encode desired robot behaviors. However, it remains unclear how to formulate the constraints such that they are 1) versatile to diverse tasks, 2) free of manual labeling, and 3) optimizable by off-the-shelf solvers to produce robot actions in real-time. In this work, we introduce Relational Keypoint Constraints (ReKep), a visually-grounded representation for constraints in robotic manipulation. Specifically, ReKep is expressed as Python functions mapping a set of 3D keypoints in the environment to a numerical cost. We demonstrate that by representing a manipulation task as a sequence of Relational Keypoint Constraints, we can employ a hierarchical optimization procedure to solve for robot actions (represented by a sequence of end-effector poses in SE(3)) with a perception-action loop at a real-time frequency. Furthermore, in order to circumvent the need for manual specification of ReKep for each new task, we devise an automated procedure that leverages large vision models and vision-language models to produce ReKep from free-form language instructions and RGB-D observations. We present system implementations on a wheeled single-arm platform and a stationary dual-arm platform that can perform a large variety of manipulation tasks, featuring multi-stage, in-the-wild, bimanual, and reactive behaviors, all without task-specific data or environment models. Website at https://rekep-robot.github.io.
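The abstract's description of a constraint as a Python function from 3D keypoints to a numerical cost can be illustrated with a minimal sketch. This is not the authors' code: the task (holding a teapot spout above a cup), the keypoint ordering, the `min_height` parameter, the convention that a cost of zero means "satisfied", and the helper `stage_cost` are all assumptions made for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

# A keypoint is an (x, y, z) position in metres; a constraint maps the
# current keypoints to a scalar cost. Both aliases are hypothetical.
Point = Tuple[float, float, float]
Constraint = Callable[[Sequence[Point]], float]


def spout_above_cup(keypoints: Sequence[Point], min_height: float = 0.10) -> float:
    """Hypothetical constraint: keypoint 0 (spout) should sit directly
    above keypoint 1 (cup opening) by at least `min_height`.

    Assumed convention: the cost is 0.0 when the constraint holds and
    grows as the keypoints move away from the desired relation.
    """
    spout, cup = keypoints[0], keypoints[1]
    # Horizontal misalignment between spout and cup.
    horizontal = math.hypot(spout[0] - cup[0], spout[1] - cup[1])
    # Penalise only the shortfall in vertical clearance.
    shortfall = max(0.0, min_height - (spout[2] - cup[2]))
    return horizontal + shortfall


def stage_cost(constraints: Sequence[Constraint], keypoints: Sequence[Point]) -> float:
    """Sum the costs of all constraints active in one task stage."""
    return sum(c(keypoints) for c in constraints)


aligned = [(0.0, 0.0, 0.3), (0.0, 0.0, 0.1)]
offset = [(0.3, 0.4, 0.1), (0.0, 0.0, 0.1)]
cost_aligned = stage_cost([spout_above_cup], aligned)
cost_offset = stage_cost([spout_above_cup], offset)
```

Because each constraint is an ordinary scalar-valued function, a generic numerical optimizer could in principle minimize `stage_cost` over candidate end-effector poses; how ReKep actually structures that optimization is described in the paper, not here.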

arxiv preprint arxiv, large language model, machine learning, (20 more...)

2409.01652

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Finzel, Bettina, Hilme, Patrick, Rabold, Johannes, Schmid, Ute

When a Relation Tells More Than a Concept: Exploring and Evaluating Classifier Decisions with CoReX

arXiv.org Artificial Intelligence May-2-2024

Explanations for Convolutional Neural Networks (CNNs) based on relevance of input pixels might be too unspecific to evaluate which and how input features impact model decisions. Especially in complex real-world domains like biomedicine, the presence of specific concepts (e.g., a certain type of cell) and of relations between concepts (e.g., one cell type is next to another) might be discriminative between classes (e.g., different types of tissue). Pixel relevance is not expressive enough to convey this type of information. In consequence, model evaluation is limited and relevant aspects present in the data and influencing the model decisions might be overlooked. This work presents a novel method to explain and evaluate CNN models, which uses a concept- and relation-based explainer (CoReX). It explains the predictive behavior of a model on a set of images by masking (ir-)relevant concepts from the decision-making process and by constraining relations in a learned interpretable surrogate model. We test our approach with several image data sets and CNN architectures. Results show that CoReX explanations are faithful to the CNN model in terms of predictive outcomes. We further demonstrate that CoReX is a suitable tool for evaluating CNNs supporting identification and re-classification of incorrect or ambiguous classifications.

explanation, machine learning, natural language, (16 more...)

2405.01661

Country:

Europe (1.00)
North America > United States (0.67)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.48)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial Intelligence Mar-17-2024

PhD: A Prompted Visual Hallucination Evaluation Dataset

Liu, Jiazhen, Fu, Yuhan, Xie, Ruobing, Xie, Runquan, Sun, Xingwu, Lian, Fengzong, Kang, Zhanhui, Li, Xirong

The rapid growth of Large Language Models (LLMs) has driven the development of Large Vision-Language Models (LVLMs). The challenge of hallucination, prevalent in LLMs, also emerges in LVLMs. However, most existing efforts focus mainly on object hallucination in LVLMs, ignoring other types of LVLM hallucination. In this study, we delve into the Intrinsic Vision-Language Hallucination (IVL-Hallu) issue, thoroughly analyzing the causes and manifestations of different types of IVL-Hallu. Specifically, we propose several novel IVL-Hallu tasks and categorize them into four types: (a) object hallucination, which arises from the misidentification of objects, (b) attribute hallucination, which is caused by the misidentification of attributes, (c) multi-modal conflicting hallucination, which derives from contradictions between textual and visual information, and (d) counter-common-sense hallucination, which stems from contradictions between the LVLM's knowledge and actual images. Based on this taxonomy, we propose a more challenging benchmark named PhD to evaluate and explore IVL-Hallu. An automated pipeline is proposed for generating different types of IVL-Hallu data. Extensive experiments on five SOTA LVLMs reveal their inability to effectively tackle our proposed IVL-Hallu tasks, with detailed analyses and insights on the origins and possible solutions of these new challenging IVL-Hallu tasks, facilitating future research on IVL-Hallu and LVLMs. The benchmark can be accessed at https://github.com/jiazhen-code/IntrinsicHallu

hallucination, knowledge, lvlm, (14 more...)

2403.11116

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

#artificialintelligence Mar-16-2023, 11:00:16 GMT

Teaching Robots to Perform Tasks Like Humans - USC Viterbi

Can language models reason in a real-world setting? USC researchers explored this question in a recent paper published at AAAI. Your coffee has gone cold. You pick up your cup, place it in the microwave, and zap it. For a robot, however, the task is not easy – even if it has been "taught" by language models (LMs) where the water, cup and microwave are.

artificial intelligence, natural language, robot, (13 more...)

Country: North America > United States > District of Columbia > Washington (0.05)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

#artificialintelligence Feb-14-2022, 06:43:15 GMT

OpenAI top scientist says AI might already be conscious. Researchers respond furiously

It's a long-standing debate, one that this weekend made headlines: will artificial intelligence (AI) ever be conscious or is it already so? OpenAI top researcher Ilya Sutskever took to Twitter to declare his view on the matter and saw backlash from many scientists in the field, as first spotted by Futurism. The question that remains is: who is right? It all began when Sutskever tweeted on Thursday "it may be that today's large neural networks are slightly conscious." This might seem like a harmless enough statement but it was met with immediate backlash. According to UNSW Sydney AI researcher Toby Walsh, it's because the topic derails the conversation and perhaps even the evolution of AI. "Every time such speculative comments get an airing, it takes months of effort to get the conversation back to the more realistic opportunities and threats posed by AI," tweeted Walsh.

backlash, openai top scientist, researcher respond furiously, (3 more...)

Country: Europe > Denmark > Capital Region > Copenhagen (0.07)

Industry: Information Technology > Services (0.59)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.63)

#artificialintelligence Jan-7-2019, 21:33:23 GMT

Can artificial intelligence tell a teapot from a golf ball? Severe limitations of 'deep learning' machines

Supporters have expressed enthusiasm for the use of these networks to do many individual tasks, and even jobs, traditionally performed by people. However, results of the five experiments in this study showed that it's easy to fool the networks, and the networks' method of identifying objects using computer vision differs substantially from human vision. "The machines have severe limitations that we need to understand," said Philip Kellman, a UCLA distinguished professor of psychology and a senior author of the study. Machine vision, he said, has drawbacks. In the first experiment, the psychologists showed one of the best deep learning networks, called VGG-19, color images of animals and objects.

artificial intelligence, experiment, machine learning, (16 more...)

Genre: Research Report > New Finding (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Jan-7-2019, 21:33:07 GMT

Can artificial intelligence tell a polar bear from a can opener?

How smart is the form of artificial intelligence known as deep learning computer networks, and how closely do these machines mimic the human brain? They have improved greatly in recent years, but still have a long way to go, a team of UCLA cognitive psychologists reports in the journal PLOS Computational Biology. Supporters have expressed enthusiasm for the use of these networks to do many individual tasks, and even jobs, traditionally performed by people. However, results of the five experiments in this study showed that it's easy to fool the networks, and the networks' method of identifying objects using computer vision differs substantially from human vision. "The machines have severe limitations that we need to understand," said Philip Kellman, a UCLA distinguished professor of psychology and a senior author of the study.

artificial intelligence, experiment, machine learning, (17 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.15)

Genre: Research Report > New Finding (0.70)

Industry: Education (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

#artificialintelligence Aug-13-2018, 13:33:47 GMT

How the Turing Test inspired AI

Computer pioneer and artificial intelligence (AI) theorist Alan Turing would have been 100 years old this Saturday. To mark the anniversary the BBC has commissioned a series of essays. In this, the fourth article, his influence on AI research and the resulting controversy are explored. Alan Turing was clearly a man ahead of his time. In 1950, at the dawn of computing, he was already grappling with the question: "Can machines think?"

artificial intelligence, turing, turing test, (9 more...)

Country:

North America > United States > New York (0.05)
Asia > China (0.05)

Technology:

Information Technology > Artificial Intelligence > Issues > Turing's Test (1.00)
Information Technology > Artificial Intelligence > History (1.00)