 fireplace


"Don't Do That!": Guiding Embodied Systems through Large Language Model-based Constraint Generation

Djuhera, Aladin, Seffo, Amin, Asai, Masataro, Boche, Holger

arXiv.org Artificial Intelligence

Recent advancements in large language models (LLMs) have spurred interest in robotic navigation that incorporates complex spatial, mathematical, and conditional constraints from natural language into the planning problem. Such constraints can be informal yet highly complex, making them challenging to translate into a formal description that can be passed on to a planning algorithm. In this paper, we propose STPR, a constraint generation framework that uses LLMs to translate constraints (expressed as instructions on "what not to do") into executable Python functions. STPR leverages the LLM's strong coding capabilities to shift the problem description from language into structured and transparent code, thus circumventing complex reasoning and avoiding potential hallucinations. We show that these LLM-generated functions accurately describe even complex mathematical constraints, and apply them to point cloud representations with traditional search algorithms. Experiments in a simulated Gazebo environment show that STPR ensures full compliance across several constraints and scenarios while keeping runtimes short. We also verify that STPR can be used with smaller, code-specific LLMs, making it applicable to a wide range of compact models at low inference cost.
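
To make the idea in this abstract concrete, here is a minimal sketch of a "what not to do" instruction expressed as a Python predicate and used to filter states in a classical search. It is not STPR's code: the names `FIREPLACE`, `violates_constraint`, and `plan` are hypothetical, the fireplace position and the 1.5 m radius are invented for illustration, a small integer grid stands in for the point cloud, and breadth-first search stands in for whichever traditional search algorithm the paper uses.

```python
import math
from collections import deque

# Assumed obstacle position (illustrative only, not from the paper).
FIREPLACE = (2.0, 2.0)


def violates_constraint(point):
    """Hypothetical generated constraint: "don't go within 1.5 m of the fireplace".

    Returns True when the point is forbidden.
    """
    return math.dist(point, FIREPLACE) < 1.5


def plan(start, goal, size=5):
    """Breadth-first search on a size x size, 4-connected grid.

    Cells rejected by violates_constraint are never expanded, so any
    returned path complies with the constraint by construction.
    """
    queue = deque([start])
    parent = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if in_bounds and nxt not in parent and not violates_constraint(nxt):
                parent[nxt] = cell
                queue.append(nxt)
    return None  # No compliant path exists.


path = plan((0, 0), (4, 4))
```

In this sketch the constraint is a plain function that can be read and tested independently of the planner, which is the transparency benefit the abstract describes. Swapping in a different instruction only means replacing `violates_constraint`.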


TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation

Zhong, Linqing, Gao, Chen, Ding, Zihan, Liao, Yue, Liu, Si

arXiv.org Artificial Intelligence

The Zero-Shot Object Navigation (ZSON) task requires embodied agents to find a previously unseen object by navigating in unfamiliar environments. Such goal-oriented exploration relies heavily on the ability to perceive, understand, and reason based on the spatial information of the environment. However, current LLM-based approaches convert visual observations to language descriptions and reason in the linguistic space, leading to the loss of spatial information. In this paper, we introduce TopV-Nav, an MLLM-based method that reasons directly on the top-view map with complete spatial information. To fully unlock the MLLM's spatial reasoning potential in the top-view perspective, we propose the Adaptive Visual Prompt Generation (AVPG) method to adaptively construct a semantically rich top-view map. It enables the agent to directly utilize the spatial information contained in the top-view map to conduct thorough reasoning. In addition, we design a Dynamic Map Scaling (DMS) mechanism to dynamically zoom the top-view map to preferred scales, enhancing local fine-grained reasoning. We also devise a Target-Guided Navigation (TGN) mechanism to predict and utilize target locations, facilitating global and human-like exploration. Experiments on the MP3D and HM3D benchmarks demonstrate the superiority of TopV-Nav, e.g., $+3.9\%$ SR and $+2.0\%$ SPL absolute improvements on HM3D.


Winter is coming • AI Blog

#artificialintelligence

The recent heatwave has been tough to bear. The days are long and humid, and the nights offer little relief. I find myself cranky and short-tempered, and even the simplest tasks seem to take twice as much effort. I know I'm not alone in feeling this way - the entire city seems to be struggling under the weight of the heat. Even so, I can't help but appreciate the beauty of a summer day.


R.U.R. (Rossum's Universal Robots): PROPERTY LIST

#artificialintelligence

R.U.R. (Rossum's Universal Robots), by Karel Capek is part of HackerNoon's Book Blog Post series. You can jump to any chapter in this book here. Box candy. 1 Pad and blotter. 1 Letter opener. 1 Cigarette box. 1 Inkwell stand. 1 Practical buzzer (6 buttons). Off L.: 1 Fountain pen (for Busman). 1 Telephone buzzer. 1 Siren whistle. On Table L.C.: 2 Book ends (wooden).


What we bought: Our favorite gadgets of 2021

Engadget

While plenty of gadgets cross our desks, we at Engadget also end up buying a lot of things for ourselves throughout the year. In 2021, some of us invested in smart home devices and others (re)discovered passions for things like e-books and vinyl, but there are plenty of things we bought and loved that didn't make it onto the site. Here, our staffers look back on the year that was by gushing about their favorite items they bought this year. After a few years of waffling, I finally pulled the trigger in 2021 and bought a Dyson stick vacuum. You could say I fell for the hype, but honestly it's been one of my favorite purchases of the year and arguably the most useful. Until now, we had been relying on a few-years-old Roomba (lovingly named Dale) to clean our two-bedroom apartment -- Dale did a good job, but the Dyson is even better.


Will artificial intelligence ever out-design designers?

#artificialintelligence

There's a concept in artificial intelligence called "the singularity." It refers to the idea that AI will one day be able to reproduce and improve upon itself at increasingly rapid speeds, resulting in a computerized brain exponentially more powerful than human intelligence, capable of transforming civilization as we know it. Some scholars are confident the singularity is only a matter of time. Others say it's pure science fiction. For the time being, let's leave the issue to the Ph.D.s and focus on a few simpler questions.


Vision Artificial Intelligence Can Help Minimize Angst, For Companies And Customers, In The Relocation/Moving Industry

#artificialintelligence

Much of the focus on deep learning systems for vision has been in three areas: autonomous vehicles, facial recognition, and robotics. However, as with many other areas of artificial intelligence (AI), vision will have a far wider impact on society than in those three areas. The logistics of relocation depend heavily, no surprise, on what is being moved. Vision can be applied to that challenge to create more accurate estimates much faster than before. As one news article points out, "about one in five Americans (23 percent) think that moving is more stressful than planning a wedding, according to new research. Twenty-seven percent think it's more stressful than a job interview, and more than one in 10 (13 percent) even go as far as to say it's more stressful than a week in jail."


Reinforcement Learning: a Subtle Introduction

#artificialintelligence

Reinforcement learning is a branch of Machine Learning and AI. It takes a very specific approach to creating models to do certain things. The objective of reinforcement learning is to teach a computer/machine to perform a certain task with a high degree of success. It is also important to note what reinforcement learning isn't. These models are artificial specific intelligence (ASIs), meaning they can only perform very specific tasks.


Gigaom: Are There Robot-Proof Jobs?

#artificialintelligence

The following is an excerpt from GigaOm publisher Byron Reese's new book, The Fourth Age: Smart Robots, Conscious Computers, and the Future of Humanity. You can purchase the book here. The Fourth Age explores the implications of automation and AI on humanity, and has been described by Ethernet inventor and 3Com founder Bob Metcalfe as framing "the deepest questions of our time in clear language that invites the reader to make their own choices. Using 100,000 years of human history as his guide, he explores the issues around artificial general intelligence, robots, consciousness, automation, the end of work, abundance, and immortality." When the topic of automation and AI comes up, one of the chief concerns is always technology's potential impact on jobs.


Blizzard president Mike Morhaime expands his realm in Rancho Mirage

Los Angeles Times

Mike Morhaime, the co-founder and president of Blizzard Entertainment, has bought a home in a gated Rancho Mirage community for $2.25 million. The hacienda-style estate, built in 2003, opens to a gated drive that ends at a circular motor court. The more than one-acre property includes a main house and three casitas that combine to offer seven bedrooms and 8.5 bathrooms in just under 6,300 square feet of living space. Among the features are a two-story great room with a stacked stone fireplace, a wine room and a kitchen updated with an island and a wrap-around bar. The master suite has his and hers quarter bathrooms and walk-ins. Pocket glass doors extend the living space outside, where patios surround a resort-style swimming pool with a waterfall feature.