MIT Technology Review
When you might start speaking to robots
There are lots of ways to incorporate AI into robots, starting with improving how they are trained to do tasks. But using large language models to give instructions, as Google has done, is particularly interesting. The robotics startup Figure went viral a year ago for a video in which humans gave a humanoid robot verbal instructions to put dishes away. Around the same time, a startup spun off from OpenAI, called Covariant, built something similar for robotic arms in warehouses. I saw a demo where you could give the robot instructions via images, text, or video to do things like "move the tennis balls from this bin to that one."
The Download: Google playing AI search catchup, and forming relationships with chatbots
I've been mulling over something that Will Heaven, our senior editor for AI, pointed out not too long ago: all the big players in AI seem to be moving in the same directions and converging on the same things. Google just announced it's adding new AI features from Gemini to search, and adding search features to Gemini. What strikes me more than how well they work is that they are really just about catching up with OpenAI's ChatGPT. And their belated appearance in March 2025 doesn't seem like a great sign for Google. This story originally appeared in The Debrief with Mat Honan, a weekly newsletter about the biggest stories in tech from our editor in chief.
Is Google playing catchup on search with OpenAI?
Take AI Mode, which Google announced March 5. It's cool. But it pretty much follows what OpenAI was already doing. (Google already had something called AI Overviews in search, but AI Mode is different and deeper.) As the company explained in a blog post, "This new Search mode expands what AI Overviews can do with more advanced reasoning, thinking and multimodal capabilities so you can get help with even your toughest questions." Rather than a brief overview with links out, the AI will dig in and offer more robust answers. You can ask follow-up questions too, something AI Overviews doesn't support.
The Download: Google DeepMind's plans for robots, and Eastern Europe's changing tech sector
The news: Google DeepMind has released a new model, Gemini Robotics, that combines its best large language model with robotics. Plugging in the LLM seems to give robots the ability to be more dexterous, work from natural-language commands, and generalize across tasks. All three are things that robots have struggled to do until now. Why it matters: The team hopes their work could usher in an era of robots that are far more useful and require less detailed training for each task. Incorporating LLMs into robotics is part of a growing trend, and this may be the most impressive example yet.
Gemini Robotics uses Google's top language model to make robots more useful
Google DeepMind also announced that it is partnering with a number of robotics companies, like Agility Robotics and Boston Dynamics, on a second model, Gemini Robotics-ER, a vision-language model focused on spatial reasoning, to continue refining it. "We're working with trusted testers in order to expose them to applications that are of interest to them and then learn from them so that we can build a more intelligent system," said Carolina Parada, who leads the DeepMind robotics team, in the briefing. Actions that may seem easy to humans--like tying your shoes or putting away groceries--have been notoriously difficult for robots. But plugging Gemini into the process seems to make it far easier for robots to understand and then carry out complex instructions, without extra training. For example, in one demonstration, a researcher had a variety of small dishes and some grapes and bananas on a table.
The Download: testing new AI agent Manus, and Waabi's virtual robotruck ambitions
For many years, researchers have been working to build devices that can mimic photosynthesis--the process by which plants use sunlight, water, and carbon dioxide to make their food. These artificial leaves use sunlight to separate water into oxygen and hydrogen, which could then be used to fuel cars or generate electricity. Now a research team from the University of Cambridge has taken aim at creating more energy-dense fuels. The group's device produces ethylene and ethane, proving that artificial leaves can create hydrocarbons. The development could offer a cheaper, cleaner way to make fuels, chemicals, and plastics--with the ultimate goal of creating fuels that don't leave a harmful carbon footprint after they're burned.
Everyone in AI is talking about Manus. We put it to the test.
Despite all the hype, very few people have had a chance to use it. Currently, under 1% of the users on the wait list have received an invite code. MIT Technology Review was able to obtain access to Manus, and when I gave it a test-drive, I found that using it feels like collaborating with a highly intelligent and efficient intern: While it occasionally lacks understanding of what it's being asked to do, makes incorrect assumptions, or cuts corners to expedite tasks, it explains its reasoning clearly, is remarkably adaptable, and can improve substantially when provided with detailed instructions or feedback. Just like its parent company's previous product, an AI assistant called Monica that was released in 2023, Manus is intended for a global audience. English is set as the default language, and its design is clean and minimalist.
Waabi says its virtual robotrucks are realistic enough to prove the real ones are safe
"It brings accountability to the industry," says Raquel Urtasun, Waabi's firebrand founder and CEO (who is also a professor at the University of Toronto). "There are no more excuses." After quitting Uber, where she led the ride-sharing firm's driverless-car division, Urtasun founded Waabi in 2021 with a different vision for how autonomous vehicles should be made. The firm, which has partnerships with Uber Freight and Volvo, has been running real trucks on real roads in Texas since 2023, but it carries out the majority of its development inside a simulation called Waabi World. Waabi is now taking its sim-first approach to the next level, using Waabi World not only to train and test its driving models but to prove their real-world safety.
The Download: making AI fairer, and why everyone's talking about AGI
What's new: A new pair of AI benchmarks could help developers reduce bias in AI models, potentially making them fairer and less likely to cause harm. The benchmarks evaluate AI systems based on their awareness of different scenarios and contexts, and could offer a more nuanced way to measure AI's bias and its understanding of the world. Why it matters: The researchers were inspired to look into the problem of bias after witnessing clumsy missteps in previous approaches, which showed how ignoring differences between groups can in fact make AI systems less fair. But while these new benchmarks could help teams better judge fairness in AI models, actually fixing them may require other techniques altogether.
AGI is suddenly a dinner table topic
First, let's get the pesky business of defining AGI out of the way. In practice, it's a deeply hazy and changeable term shaped by the researchers or companies set on building the technology. But it usually refers to a future AI that outperforms humans on cognitive tasks. Which humans and which tasks we're talking about makes all the difference in assessing AGI's achievability, safety, and impact on labor markets, war, and society. That's why defining AGI, though an unglamorous pursuit, is not pedantic but actually quite important, as illustrated in a new paper published this week by authors from Hugging Face and Google, among others.