 human debater


Debate Helps Supervise Unreliable Experts

arXiv.org Artificial Intelligence

As AI systems are used to answer more difficult questions and potentially help create new knowledge, judging the truthfulness of their outputs becomes more difficult and more important. How can we supervise unreliable experts, which have access to the truth but may not accurately report it, to give answers that are systematically true and don't just superficially seem true, when the supervisor can't tell the difference between the two on their own? In this work, we show that debate between two unreliable experts can help a non-expert judge more reliably identify the truth. We collect a dataset of human-written debates on hard reading comprehension questions where the judge has not read the source passage, only ever seeing expert arguments and short quotes selectively revealed by 'expert' debaters who have access to the passage. In our debates, one expert argues for the correct answer, and the other for an incorrect answer. Comparing debate to a baseline we call consultancy, where a single expert argues for only one answer which is correct half of the time, we find that debate performs significantly better, with 84% judge accuracy compared to consultancy's 74%. Debates are also more efficient, being 68% of the length of consultancies. By comparing human to AI debaters, we find evidence that with more skilled (in this case, human) debaters, the performance of debate goes up but the performance of consultancy goes down. Our error analysis also supports this trend, with 46% of errors in human debate attributable to mistakes by the honest debater (which should go away with increased skill); whereas 52% of errors in human consultancy are due to debaters obfuscating the relevant evidence from the judge (which should become worse with increased skill). Overall, these results show that debate is a promising approach for supervising increasingly capable but potentially unreliable AI systems.


An IBM AI Debates Humans--but It's Not Yet the Deep Blue of Oratory

#artificialintelligence

In 2019 Harish Natarajan took part in a debate with a five-and-a-half-foot tall rectangular computer screen in front of a live audience of about 800 people. The computer was Project Debater, an artificial intelligence system designed by IBM. Natarajan is a globally recognized debate champion. And the topic at hand was whether or not preschool should be subsidized. Based on an audience vote, Project Debater lost the contest.


Argument technology for debating with humans

#artificialintelligence

The study of arguments has an academic pedigree stretching back to the ancient Greeks, and spans disciplines from theoretical philosophy to computational engineering. Developing computer systems that can recognize arguments in natural human language is one of the most demanding challenges in the field of artificial intelligence (AI). Writing in Nature, Slonim et al. [1] report an impressive development in this field: Project Debater, an AI system that can engage with humans in debating competitions. The findings showcase how far research in this area has come, and emphasize the importance of robust engineering that combines different components, each of which handles a particular task, in the development of technology that can recognize, generate and critique arguments in debates. Less than a decade ago, the analysis of human discourse to identify the ways in which evidence is adduced to support conclusions -- a process now known as argument mining [2] -- was firmly beyond the capabilities of state-of-the-art AI. Since then, a combination of technical advances in AI and increasing maturity in the engineering of argument technology, coupled with intense commercial demand, has led to rapid expansion of the field.


IBM's AI Machine Makes A Convincing Case That It's Mastering The Human Art Of Persuasion

#artificialintelligence

"You are about to hear a speech supporting the idea that Gambling should be banned…" The 332-word speech arguing that gambling should be banned offered three reasons (with evidence) to support its case: "Gambling is addictive," "facilitates criminal activity," and "has ruined many individuals and families." The second speech--arguing that gambling should not be banned--also provided three reasons. Regular readers of my column know that "the rule of three" is a fundamental component of persuasion. Overloading a listener with too much information at any one time makes it difficult for humans to process the content. Project Debater already knows it. Project Debater marks a major milestone toward understanding language. The AI system can complement human decision-making by bringing in facts and evidence in a persuasive, logical structure. By understanding people's opinions on different topics, politicians, public servants and business leaders can get a better understanding of what people think about a policy or corporate decision--and why they think the way they do.


IBM pits computer against human debaters

Washington Post - Technology News

IBM is testing a computer against two human debaters in the first public demonstration of artificial intelligence technology it's been working on for more than five years. The company unveiled its Project Debater in San Francisco on Monday. The argumentative computer system is embodied in a 5-foot-tall (1.5 meter) machine shaped like a monolith. Asked to debate in favor of government-subsidized space exploration -- a topic it hadn't studied -- the computer quickly delivered an opening argument, pulling in evidence collected from its repository of newspaper articles and journals. It then listened to a human's counter-argument and gave a 4-minute rebuttal.


IBM pits computer against human debaters

Washington Post - Technology News

IBM pitted a computer against two human debaters in the first public demonstration of artificial intelligence technology it's been working on for more than five years. The company unveiled its Project Debater in San Francisco on Monday, asking it to make a case for government-subsidized space research -- a topic it hadn't studied in advance but championed fiercely with just a few awkward gaps in reasoning. "Subsidizing space exploration is like investing in really good tires," argued the computer system, its female voice embodied in a 5-foot-tall machine shaped like a monolith with TV screens on its sides. Such research would enrich the human mind, inspire young people and be a "very sound investment," it said, making it more important even than good roads, schools or health care. The computer delivered its opening argument by pulling in evidence from its huge internal repository of newspapers, journals and other sources.


IBM Pits Computer Against Human Debaters

#artificialintelligence

IBM pitted a computer against two human debaters in the first public demonstration of artificial intelligence technology it's been working on for more than five years. The company unveiled its Project Debater in San Francisco, asking it to make a case for government-subsidized space research--a topic it hadn't studied in advance but championed fiercely with just a few awkward gaps in reasoning. "Subsidizing space exploration is like investing in really good tires," argued the computer system, its female voice embodied in a 5-foot-tall machine shaped like a monolith with TV screens on its sides. Such research would be a "very sound investment," it said.


7 Years After Watson, IBM's AI Turns Heads Again

International Business Times

Watson, an artificial intelligence system built by International Business Machines (NYSE:IBM), made headlines in 2011 by beating two champions on the game show Jeopardy! This initial version of Watson was built for answering Jeopardy!-style questions, backed by a vast database of information. Since then, Watson has been adapted to a wide variety of business applications, in industries such as healthcare and financial services. This article originally appeared in the Motley Fool. Soon after Watson's debut, IBM began work on another AI project.


IBM pits computer against human debaters

The Japan Times

SAN FRANCISCO – IBM pitted a computer against two human debaters in the first public demonstration of artificial intelligence technology it's been working on for more than five years. The company unveiled its Project Debater in San Francisco on Monday, asking it to make a case for government-subsidized space research -- a topic it hadn't studied in advance but championed fiercely with just a few awkward gaps in reasoning. "Subsidizing space exploration is like investing in really good tires," argued the computer system, its female voice embodied in a 5-foot-tall machine shaped like a monolith with TV screens on its sides. Such research would enrich the human mind, inspire young people and be a "very sound investment," it said, making it more important even than good roads, schools or health care. The computer delivered its opening argument by pulling in evidence from its huge internal repository of newspapers, journals and other sources.