robobrain
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete
Ji, Yuheng, Tan, Huajie, Shi, Jiayu, Hao, Xiaoshuai, Zhang, Yuan, Zhang, Hengyuan, Wang, Pengwei, Zhao, Mengdi, Mu, Yao, An, Pengju, Xue, Xinda, Su, Qinghang, Lyu, Huaihai, Zheng, Xiaolong, Liu, Jiaming, Wang, Zhongyuan, Zhang, Shanghang
Recent advancements in Multimodal Large Language Models (MLLMs) have shown remarkable capabilities across various multimodal contexts. However, their application in robotic scenarios, particularly for long-horizon manipulation tasks, reveals significant limitations. These limitations arise because current MLLMs lack three essential robotic brain capabilities: Planning Capability, which involves decomposing complex manipulation instructions into manageable sub-tasks; Affordance Perception, the ability to recognize and interpret the affordances of interactive objects; and Trajectory Prediction, the foresight to anticipate the complete manipulation trajectory necessary for successful execution. To enhance the robotic brain's core capabilities from abstract to concrete, we introduce ShareRobot, a high-quality heterogeneous dataset that labels multi-dimensional information such as task planning, object affordance, and end-effector trajectory. ShareRobot's diversity and accuracy have been meticulously refined by three human annotators. Building on this dataset, we developed RoboBrain, an MLLM-based model that combines robotic and general multimodal data, utilizes a multi-stage training strategy, and incorporates long videos and high-resolution images to improve its robotic manipulation capabilities. Extensive experiments demonstrate that RoboBrain achieves state-of-the-art performance across various robotic tasks, highlighting its potential to advance robotic brain capabilities.
Enhancing customer experience with AI-powered chatbots The MSP Hub – owned by Expandi Group
AI-powered chatbots come with many benefits for the businesses that adopt them, but in some instances, they can have greater impact for the everyday user. In this blog, we list four recent articles giving examples of where AI-powered applications and chatbots have been put into practice to help customers and the common man. The developer of the 'world's first robot lawyer' application, which helped overturn more than one hundred parking fines, is now adapting the functionality of the integrated chatbot to provide legal aid to refugees seeking asylum in the US and Canada, as well as asylum support in the UK. The original DoNotPay AI-powered application gives legal aid through a simple chat interface, where a chatbot asks a series of questions to help determine which application a refugee needs to fill out and whether they are eligible for asylum protection under international law. After this, the chatbot takes note of the relevant details required for asylum applications in the US or Canada, auto-fills the application form and sends.
UBank brings AI to customer agents with RoboBrain
NAB-owned UBank is expanding its artificial intelligence focus with the creation of an "agent assist" enterprise search tool called RoboBrain. The digital bank entered the AI fray in May last year with a chatbot to aid home loan applications, which it called Robochat. It is now building on the internal competency it created for that project, and expanding the reach of AI in the business in the process. Both Robochat and RoboBrain are powered largely by IBM Watson components. They also owe their genesis to the same process - internal hackathons designed to flush out potential use cases and ideas.
The Plan to Build a Massive Online Brain for All the World's Robots
If you walk into the computer science building at Stanford University, Mobi is standing in the lobby, encased in glass. He looks a bit like a garbage can, with a rod for a neck and a camera for eyes. He was one of several robots developed at Stanford in the 1980s to study how machines might learn to navigate their environment--a stepping stone toward intelligent robots that could live and work alongside humans. He worked, but not especially well. The best he could do was follow a path along a wall.
Reporters' Roundtable: Debating the robobrains
Big news in AI this week: IBM's Watson project defeated "Jeopardy" champions Ken Jennings and Brad Rutter in a three-night prime-time demo match. What does that win mean for computing, and more importantly, for humanity? That's the topic for this week's Reporters' Roundtable, and to discuss it we have two great guests, both with current books on the topics of computer vs. human competition. First up is Stephen Baker, author of Final Jeopardy: Man vs. Machine and the Quest to Know Everything. Baker reported on the development of Watson from inside IBM headquarters to write this book.
RoboBrain: Large-Scale Knowledge Engine for Robots
Saxena, Ashutosh, Jain, Ashesh, Sener, Ozan, Jami, Aditya, Misra, Dipendra K., Koppula, Hema S.
In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of dealing with multiple data modalities including symbols, natural language, haptic senses, robot trajectories, visual features and many others. The knowledge stored in the engine comes from multiple sources including physical interactions that robots have while performing tasks (perception, planning and control), knowledge bases from the Internet and learned representations from several robotics research groups. We discuss various technical aspects and associated challenges such as modeling the correctness of knowledge, inferring latent information and formulating different robotic tasks as queries to the knowledge engine. We describe the system architecture and how it supports different mechanisms for users and robots to interact with the engine. Finally, we demonstrate its use in three important research areas: grounding natural language, perception, and planning, which are the key building blocks for many robotic tasks. This knowledge engine is a collaborative effort and we call it RoboBrain.
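The abstract above mentions modeling the correctness of knowledge and formulating robotic tasks as queries. As a rough illustration only, the Python sketch below stores (subject, relation, object) edges with a belief score and answers wildcard queries. Everything in it is an assumption made for this example and not the RoboBrain system's actual design or API: the class name `ToyKnowledgeEngine`, the `add` and `query` methods, the relation name `grasped_by`, the sample beliefs, and the noisy-OR rule used to merge beliefs from several sources.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str, str]  # (subject, relation, object)


@dataclass
class ToyKnowledgeEngine:
    """Hypothetical in-memory store of edges with one belief score per edge."""

    beliefs: Dict[Edge, float] = field(default_factory=dict)
    sources: Dict[Edge, List[str]] = field(default_factory=dict)

    def add(self, subject: str, relation: str, obj: str,
            belief: float, source: str) -> None:
        """Record one observation; the noisy-OR merge is this sketch's assumption."""
        if not 0.0 <= belief <= 1.0:
            raise ValueError("belief must lie in [0, 1]")
        edge = (subject, relation, obj)
        prior = self.beliefs.get(edge, 0.0)
        # Agreement between independent sources raises the combined belief.
        self.beliefs[edge] = 1.0 - (1.0 - prior) * (1.0 - belief)
        self.sources.setdefault(edge, []).append(source)

    def query(self, subject: Optional[str] = None,
              relation: Optional[str] = None,
              obj: Optional[str] = None,
              min_belief: float = 0.0) -> List[Tuple[Edge, float]]:
        """Return matching edges, highest belief first; None is a wildcard."""
        pattern = (subject, relation, obj)
        hits = [
            (edge, belief)
            for edge, belief in self.beliefs.items()
            if belief >= min_belief
            and all(p is None or p == e for p, e in zip(pattern, edge))
        ]
        return sorted(hits, key=lambda hit: hit[1], reverse=True)


engine = ToyKnowledgeEngine()
engine.add("mug", "grasped_by", "handle", 0.5, source="robot_trial")
engine.add("mug", "grasped_by", "handle", 0.5, source="web_text")
engine.add("mug", "grasped_by", "rim", 0.25, source="web_text")

# A planning-style query: which grasps for a mug have belief of at least 0.5?
best = engine.query(subject="mug", relation="grasped_by", min_belief=0.5)
```

Here the two agreeing observations for the handle combine to a belief of 0.75, so only that edge passes the 0.5 threshold. A real engine would need a principled belief model and persistent, multimodal storage; this sketch only shows the query pattern.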