Meet the AI-powered robotic dog ready to help with emergency response
Developed by Texas A&M University engineering students, this AI-powered robotic dog doesn't just follow commands. Designed to navigate chaos with precision, the robot could help revolutionize search-and-rescue missions, disaster response and many other emergency operations. Sandun Vitharana, an engineering technology master's student, and Sanjaya Mallikarachchi, an interdisciplinary engineering doctoral student, spearheaded the invention of the robotic dog. It can process voice commands and uses AI and camera input to perform path planning and identify objects. A roboticist would describe it as a terrestrial robot that uses a memory-driven navigation system powered by a multimodal large language model (MLLM).
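The article's description of the system suggests a loop in which camera input and a rolling memory of past observations are fed to a multimodal model that picks the next action. A minimal sketch of such a memory-driven navigation loop is below; every name in it (`Memory`, `query_mllm`, the action set) is an illustrative assumption, not the students' actual design, and the model call is stubbed with a toy heuristic.

```python
# Hypothetical sketch of a memory-driven MLLM navigation loop.
# All identifiers are illustrative stand-ins, not the real system's API.
from dataclasses import dataclass, field

ACTIONS = ["forward", "left", "right", "stop"]

@dataclass
class Memory:
    """Rolling log of past observations and actions the model can condition on."""
    entries: list = field(default_factory=list)

    def add(self, observation: str, action: str) -> None:
        self.entries.append((observation, action))

    def summary(self, last_n: int = 5) -> str:
        return "; ".join(f"saw {o}, did {a}" for o, a in self.entries[-last_n:])

def query_mllm(instruction: str, camera_frame: str, memory_text: str) -> str:
    """Stand-in for the MLLM call: a real system would send the camera frame
    plus a text prompt to a multimodal model and parse the chosen action."""
    if "person" in camera_frame:   # toy heuristic in place of the model
        return "stop"
    return "forward"

def navigation_step(instruction: str, camera_frame: str, memory: Memory) -> str:
    """One tick of the loop: query the model, record the step, return the action."""
    action = query_mllm(instruction, camera_frame, memory.summary())
    assert action in ACTIONS
    memory.add(camera_frame, action)
    return action
```

The memory summary is what lets the agent's decisions depend on where it has already been, rather than on the current frame alone.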
Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision
Keji He, Yan Huang, Qi Wu, Jianhua Yang
In the Vision-and-Language Navigation (VLN) task, an agent is asked to navigate inside 3D indoor environments following given instructions. Cross-modal alignment is one of the most critical challenges in VLN because the predicted trajectory needs to match the given instruction accurately. In this paper, we address the cross-modal alignment challenge at a fine-grained level. First, to alleviate the weak cross-modal alignment supervision provided by coarse-grained data, we introduce a human-annotated fine-grained VLN dataset, Landmark-RxR. Second, to further enhance local cross-modal alignment under fine-grained supervision, we investigate focal-oriented rewards in soft and hard forms, focusing on the critical points sampled from the fine-grained Landmark-RxR. Moreover, to evaluate the navigation process more fully, we propose a re-initialization mechanism that makes metrics insensitive to difficult points, which can otherwise cause the agent to deviate from the correct trajectory. Experimental results show that our agent achieves superior navigation performance on Landmark-RxR, en-RxR and R2R.
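The abstract does not spell out the reward formulas, but the stated idea of soft and hard focal-oriented rewards, concentrating credit on annotated critical points, can be sketched as follows. Both functions, their signatures, and the Gaussian weighting are assumptions for illustration, not the paper's exact definitions.

```python
# Illustrative sketch of "focal-oriented" reward shaping around critical points.
# Formulas are assumed for illustration, not taken from the Landmark-RxR paper.
import math

def hard_focal_reward(distance_reduced: float, at_critical_point: bool) -> float:
    """Hard form (assumed): grant the progress reward only when the agent
    is at an annotated critical point of the fine-grained trajectory."""
    return distance_reduced if at_critical_point else 0.0

def soft_focal_reward(distance_reduced: float,
                      dist_to_critical: float,
                      sigma: float = 1.0) -> float:
    """Soft form (assumed): scale the progress reward by a weight that
    decays smoothly with distance to the nearest critical point."""
    weight = math.exp(-(dist_to_critical ** 2) / (2 * sigma ** 2))
    return weight * distance_reduced
```

The hard form gives sparse but unambiguous supervision; the soft form trades some focus for a denser learning signal near the critical points.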
AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents
Recent advances in large language models (LLMs) have empowered AI agents capable of performing various sequential decision-making tasks. However, effectively guiding LLMs to perform well in unfamiliar domains like web navigation, where they lack sufficient knowledge, has proven to be difficult with the demonstration-based in-context learning paradigm. In this paper, we introduce a novel framework, called AutoGuide, which addresses this limitation by automatically generating context-aware guidelines from offline experiences. Importantly, each context-aware guideline is expressed in concise natural language and follows a conditional structure, clearly describing the context where it is applicable. As a result, our guidelines facilitate the provision of relevant knowledge for the agent's current decision-making process, overcoming the limitations of the conventional demonstration-based learning paradigm. Our evaluation demonstrates that AutoGuide significantly outperforms competitive baselines in complex benchmark domains, including real-world web navigation.
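The abstract's core mechanism, guidelines with a conditional "when this context applies, do this" structure retrieved at decision time, can be sketched minimally as below. The class and function names are assumptions, and the keyword-overlap retrieval is a toy stand-in for however the paper actually judges whether a guideline's context matches the agent's state.

```python
# Illustrative sketch of context-aware guideline selection in the spirit of
# AutoGuide. Names and the retrieval heuristic are assumptions, not the
# paper's actual method or API.
from dataclasses import dataclass

@dataclass
class Guideline:
    context: str   # natural-language condition, e.g. "on a login page"
    advice: str    # what to do when that condition holds

def select_guidelines(state: str, guidelines: list[Guideline],
                      top_k: int = 2) -> list[Guideline]:
    """Toy keyword-overlap retrieval of guidelines applicable to `state`."""
    def overlap(g: Guideline) -> int:
        return len(set(g.context.lower().split()) & set(state.lower().split()))
    ranked = sorted(guidelines, key=overlap, reverse=True)
    return [g for g in ranked[:top_k] if overlap(g) > 0]

def build_prompt(task: str, state: str, guidelines: list[Guideline]) -> str:
    """Prepend the applicable conditional guidelines to the agent's prompt."""
    tips = "\n".join(f"- If {g.context}: {g.advice}"
                     for g in select_guidelines(state, guidelines))
    return f"Task: {task}\nGuidelines:\n{tips}\nCurrent state: {state}\nNext action:"
```

Because each guideline names its own applicability condition, only the relevant ones are injected into the prompt, which is what distinguishes this from dumping fixed demonstrations into the context window.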