Collaborating Authors

 fernando


Stay Focused: Problem Drift in Multi-Agent Debate

Becker, Jonas, Kaesberg, Lars Benedikt, Stephan, Andreas, Wahle, Jan Philip, Ruas, Terry, Gipp, Bela

arXiv.org Artificial Intelligence

Multi-agent debate - multiple instances of large language models discussing problems in turn-based interaction - has shown promise for solving knowledge and reasoning tasks. However, these methods show limitations, particularly when scaling them to longer reasoning chains. In this study, we unveil a new issue of multi-agent debate: discussions drift away from the initial problem over multiple turns. We define this phenomenon as problem drift and quantify its presence across ten tasks (i.e., three generative, three knowledge, three reasoning, and one instruction-following task). To identify the reasons for this issue, we perform a human study with eight experts on discussions suffering from problem drift, who find the most common issues are a lack of progress (35% of cases), low-quality feedback (26% of cases), and a lack of clarity (25% of cases). To systematically address the issue of problem drift, we propose DRIFTJudge, a method based on LLM-as-a-judge, to detect problem drift at test-time. We further propose DRIFTPolicy, a method to mitigate 31% of problem drift cases. Our study can be seen as a first step to understanding a key limitation of multi-agent debate, highlighting pathways for improving their effectiveness in the future.


The Daring Robot Surgery That Saved a Man's Life

WIRED

IN EARLY APRIL 2020, shortly after the British prime minister Boris Johnson had announced the first pandemic lockdown in the United Kingdom, a urologist named Archie Fernando reached out to one of her colleagues, Nadine Hachach-Haram. The two doctors worked at Guy's and St Thomas' hospital, one of the busiest in the country, at a time when nearly a thousand people were dying of Covid-19 every week. Most surgeries were being deferred, except for life-or-limb cases and urgent cancer surgeries, and Hachach-Haram, who is a reconstructive plastic surgeon, recalls how useless she felt. "I would just walk into the wards and ask the nurses what I could do to help," she says. "I started doing everything, like portering and proning, turning patients over to make their breathing slightly better."


Watson's Creator Wants to Teach AI a New Trick: Common Sense

WIRED

David Ferrucci, the man who built IBM's Jeopardy-playing machine, Watson, is explaining a children's story to his new creation. In the tale, Fernando and Zoey buy some plants. Fernando places his plant on a windowsill while Zoey tucks hers away in a darkened room. After a few days, Fernando's plant is green and healthy but the leaves of Zoey's have browned. She moves her plant to the windowsill, and it flourishes.


Fernando A. Cabal posted on LinkedIn

#artificialintelligence

Now that self driving cars are delayed a decade or more and Facebook's director of AI said they are expecting to hit a "wall" with #deeplearning, the key technology within #artificial_intelligence, the evangelists of #AI are waking up to a new reality where many things turned out to be more difficult than expected;-) some other things are simply too expensive to accomplish #artificialintelligence Is the party over?


AI success depends on good datasets, strategic alignment

#artificialintelligence

Given all the relentless hype about artificial intelligence and its transformative potential for healthcare, it would be understandable if some health systems were casting about in search of AI or machine learning projects they could try. But that sort of rushed, ad hoc approach is precisely the wrong one to take, says Tushar Mehrotra, senior vice president of analytics at Optum. "The only way you are going to get value out of AI is to link the clinical or business problem to the organization's overall strategy and make sure you have a rich enough data set to train the model so it generates actionable insights," said Mehrotra. "Making sure you are building and designing your AI effort the right way means putting in the work up front to create a clear understanding of what you are trying to solve so it can be embedded in the decision-making workflow," he said. "Too often, AI projects start with a quest for academic insight."


Your barista is a robot. Should it be friendly?

#artificialintelligence

The cold, steely arm of Fernando the Barista swirled the foam of my matcha latte, set it down gently and waved goodbye from inside a glass case. Where you can get robot pizza and robot salad, and now, a robot matcha. There were humans inside the small coffee shop on Market Street, but only some of them ordered drinks. Some of them came in just to gawk at Fernando: The machine was sleek and white, like an Apple product, and its glass enclosure made it seem like a small animal on display. "They all have 'it' pronouns," said Sam Blum, Cafe X's community manager.


Data Science Simplified Part 6: Model Selection Methods

@machinelearnbot

In the last article of this series, we discussed the multivariate linear regression model. Fernando creates a model that estimates the price of the car based on five input parameters. Fernando indeed has a better model. Yet, he wants to select the best set of variables for input. The idea behind model selection methods is intuitive. How is an optimal model defined?
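One common way to make "optimal" concrete is to score each candidate set of input variables with a criterion that penalizes model size. The excerpt does not say which criterion the article uses, so the sketch below assumes adjusted R-squared, and the two candidate models and their numbers are hypothetical.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R-squared: R-squared penalized for the number of predictors."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical candidates: (label, plain R-squared, number of input variables),
# both assumed to be fitted on the same 50 cars.
candidates = [
    ("five inputs", 0.800, 5),
    ("eight inputs", 0.805, 8),
]

scores = {label: adjusted_r2(r2, 50, p) for label, r2, p in candidates}
best = max(scores, key=scores.get)
print(best, scores)
```

Under this assumed criterion, the small gain in plain R-squared from three extra inputs does not offset the size penalty, so the smaller model scores higher.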


Data Science Simplified Part 7: Log-Log Regression Models

@machinelearnbot

In the last few blog posts of this series, we discussed the simple linear regression model. We discussed the multivariate regression model and methods for selecting the right model. Fernando has now created a better model. This article will address that question. This article will elaborate on Log-Log regression models.
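As a rough illustration of what a log-log regression does, the sketch below fits log(y) = intercept + slope * log(x) by ordinary least squares, so the slope reads as the percentage change in y per one-percent change in x. The function name `fit_log_log` and the data are hypothetical and are not taken from the article.

```python
import math


def fit_log_log(xs, ys):
    """Fit log(y) = intercept + slope * log(x) by ordinary least squares."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Closed-form simple linear regression on the log-transformed values.
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data generated exactly from price = 3 * size ** 2.
sizes = [1.0, 2.0, 4.0, 8.0]
prices = [3.0 * s ** 2 for s in sizes]

intercept, slope = fit_log_log(sizes, prices)
print(f"slope (elasticity) = {slope:.3f}, scale = {math.exp(intercept):.3f}")
```

Because the data here follow an exact power law, the fitted slope recovers the exponent and the exponentiated intercept recovers the scale factor; on real data both would only be estimates.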

