Jeopardy!


Jeopardy!'s Most Infamous Moment Haunted the Show's Fans, Its Stars, and Even Alex Trebek. It's Clear Why Now.

Slate

Jeopardy!'s most controversial moment was years in the making. It took many more for the fallout to come into full view. One morning in 2010, Alex Trebek walked onto the IBM campus not far outside New York City and prepared to inspect what would become the most unusual player in Jeopardy!'s history. The trip, clear across the country from the show's Culver City set, had been carefully planned. David Ferrucci, a computer scientist at IBM, had spent years leading a team to develop what would become the first and, so far, last nonhuman ever to compete on Jeopardy! Longtime host Trebek would watch three practice games played with "Watson," as the system was named, and two human contestants. Then the team would be taken to lunch nearby, and Trebek would ultimately take the stage and host two more Watson practice games himself. By then the preparations for a future televised contest with IBM's creation were well underway, but this was the first time Trebek would encounter the technology in person, and his approval was crucial. Ferrucci was eager to show off one element in particular: the display, which had been rigged to show Watson's top three guesses whenever it answered, along with its numerical confidence in each one. For Ferrucci, this feature was central to demonstrating the computer's language-processing capabilities, because it showed that Watson wasn't just spitting out answers--it was reasoning. If Watson were ever going to be deployed to industries like health care, its human users wouldn't just want to know its best guess. It would be infinitely more valuable to know whether Watson was 95 percent confident or just 30 percent, and whether those confidence levels were in line with its actual accuracy. It also made for better viewing. Ferrucci had brought his young daughter to the lab earlier in the process and shown her Watson as it played against human opponents. When Watson declined to ring in, Ferrucci's daughter turned to him and asked if the computer had crashed. He struggled to explain that it hadn't--it just wasn't confident enough to hazard a guess.
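The buzz-in behavior Ferrucci describes is easy to make concrete. Below is a minimal Python sketch, not IBM's actual code: rank candidate answers by confidence, ring in only when the top guess clears a threshold, and check whether stated confidence tracks observed accuracy. The threshold value and data structures are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    answer: str
    confidence: float  # model's estimated probability that the answer is correct

BUZZ_THRESHOLD = 0.50  # hypothetical cutoff; Watson's real threshold was tuned to game state

def decide_buzz(candidates: list[Candidate]) -> tuple[bool, list[Candidate]]:
    """Return whether to ring in, plus the top-3 display the passage describes."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    top3 = ranked[:3]
    should_buzz = bool(top3) and top3[0].confidence >= BUZZ_THRESHOLD
    return should_buzz, top3

def calibration_report(history: list[tuple[float, bool]], bins: int = 10) -> None:
    """Compare stated confidence to actual accuracy in equal-width bins.

    history holds (confidence, was_correct) pairs from past games.
    """
    for b in range(bins):
        lo, hi = b / bins, (b + 1) / bins
        in_bin = [correct for conf, correct in history if lo <= conf < hi]
        if in_bin:
            accuracy = sum(in_bin) / len(in_bin)
            print(f"confidence {lo:.1f}-{hi:.1f}: accuracy {accuracy:.2f} (n={len(in_bin)})")

# A 30-percent-confident guess stays silent, like the moment with Ferrucci's daughter.
buzz, display = decide_buzz([Candidate("Toronto", 0.32), Candidate("Chicago", 0.28)])
print(buzz)  # False -- not confident enough to hazard a guess
```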


A Russian Jeopardy! Data Set for Question-Answering Systems

Mikhalkova, Elena

arXiv.org Artificial Intelligence

Question answering (QA) is one of the most common NLP tasks and relates to named entity recognition, fact extraction, semantic search, and other fields. In industry, it is much appreciated in chatbots and corporate information systems. It is also a challenging task that attracted the attention of a broad general audience through the quiz show Jeopardy! In this article we describe a Jeopardy!-like Russian QA data set collected from the official Russian quiz database Chgk (che ge ka). The data set includes 379,284 quiz-like questions, with 29,375 from the Russian analogue of Jeopardy! - "Own Game". We analyze its linguistic features and the related QA task, and we discuss the prospects of a QA competition based on the data set collected from this database.
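As a rough illustration of working with such a data set, here is a minimal Python sketch assuming a CSV export with hypothetical column names ("tournament", "question", "answer"); the paper's actual schema may differ. It pulls out the "Own Game" subset the abstract singles out.

```python
import csv

def load_own_game(path: str) -> list[dict]:
    """Load the quiz data set and keep only the Jeopardy!-like 'Own Game' rows."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    # Per the abstract, 29,375 of the 379,284 questions come from "Own Game".
    return [row for row in rows if row["tournament"] == "Own Game"]

questions = load_own_game("chgk_questions.csv")  # hypothetical file name
print(len(questions))
```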


PEDANTS (Precise Evaluations of Diverse Answer Nominee Text for Skinflints): Efficient Evaluation Analysis and Benchmarking for Open-Domain Question Answering

Li, Zongxia, Mondal, Ishani, Liang, Yijun, Nghiem, Huy, Boyd-Graber, Jordan Lee

arXiv.org Artificial Intelligence

Question answering (QA) can only make progress if we know whether an answer is correct, but for many of the most challenging and interesting QA examples, current efficient answer correctness (AC) metrics do not align with human judgments, particularly for verbose, free-form answers from large language models (LLMs). There are two challenges: a lack of diverse evaluation data, and the fact that models are too big and non-transparent; LLM-based scorers correlate better with humans, but this expensive approach has only been tested on limited QA datasets. We rectify these issues by providing guidelines and datasets for evaluating machine QA adapted from the human QA community. We also propose an efficient, low-resource, and interpretable QA evaluation method that is more stable than exact match and neural methods.
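To make the contrast with exact match concrete, here is a generic normalize-then-token-F1 scorer in Python: a common efficient answer-correctness baseline, not the PEDANTS method itself.

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop articles and punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted answer and a gold answer."""
    pred, ref = normalize(prediction).split(), normalize(gold).split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if not common:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

# Exact match fails on verbose LLM answers; token F1 gives partial credit.
print(normalize("The Watson computer") == normalize("Watson"))            # False
print(token_f1("It is the Watson computer built by IBM", "Watson"))       # 0.25
```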


CFMatch: Aligning Automated Answer Equivalence Evaluation with Expert Judgments For Open-Domain Question Answering

Li, Zongxia, Mondal, Ishani, Liang, Yijun, Nghiem, Huy, Boyd-Graber, Jordan

arXiv.org Artificial Intelligence

Question answering (QA) can only make progress if we know whether an answer is correct, but for many of the most challenging and interesting QA examples, current evaluation metrics for answer equivalence (AE) often do not align with human judgments, particularly for more verbose, free-form answers from large language models (LLMs). There are two challenges: a lack of data, and the fact that models are too big: LLM-based scorers can correlate better with human judges, but this approach has only been tested on limited QA datasets, and even when available, updating the model is difficult because LLMs are large and often expensive. We rectify both of these issues by providing clear and consistent guidelines for evaluating AE in machine QA, adapted from professional human QA contests. We also introduce a combination of standard evaluation and a more efficient, robust, and lightweight discriminative AE classifier-based matching method (CFMatch, smaller than 1 MB), trained and validated to more accurately evaluate answer correctness in accordance with adopted expert AE rules that are better aligned with human judgments.
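As a sense of scale, a hedged sketch of a lightweight classifier-based AE matcher follows; the features, training pairs, and model choice are illustrative assumptions in the spirit of CFMatch, not the paper's actual design.

```python
from sklearn.linear_model import LogisticRegression

def features(prediction: str, gold: str) -> list[float]:
    """A few cheap lexical features over the two answer strings."""
    p, g = set(prediction.lower().split()), set(gold.lower().split())
    overlap = len(p & g) / max(len(p | g), 1)          # Jaccard overlap
    containment = float(g <= p)                        # all gold tokens present
    length_ratio = min(len(p), len(g)) / max(len(p), len(g), 1)
    return [overlap, containment, length_ratio]

# Tiny illustrative training set of (prediction, gold, equivalent?) triples.
train = [
    ("Watson", "Watson", 1),
    ("the IBM Watson system", "Watson", 1),
    ("Ken Jennings", "Brad Rutter", 0),
    ("a computer", "Watson", 0),
]
X = [features(p, g) for p, g, _ in train]
y = [label for _, _, label in train]

clf = LogisticRegression().fit(X, y)  # a model this small stays well under 1 MB
print(clf.predict([features("IBM's Watson computer", "Watson")]))
```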


'Jeopardy!' contestant torn apart by fans after huge mistake: 'Such a buffoon'

FOX News

A "Jeopardy!" contestant is going viral this week after making what many fans consider one of the biggest blunders in the show's history. On Wednesday's episode, a woman named Karen had a huge lead over the other two contestants as they neared the end of the second round: she had earned $21,800, while her competitors had earned $7,100 and $6,400. When there were only a few clues left on the Double Jeopardy board, Karen found a Daily Double in the "Hans, Solo" category. If she had made a modest bet, she would have been sure to win the entire game after Final Jeopardy, as the other players couldn't possibly have caught up to her lead.
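The wagering arithmetic behind that claim can be made explicit. The sketch below assumes, for illustration, that roughly $2,000 in clue value remained for the rival to pick up; the actual board state isn't given in the article.

```python
leader, rival = 21_800, 7_100   # Karen vs. her nearest competitor
opponent_max_gain = 2_000       # assumed total value of the remaining clues (hypothetical)

def max_safe_wager(leader: int, rival: int, opponent_max_gain: int) -> int:
    """Largest Daily Double wager that still guarantees a 'lock':
    even if the leader misses and the rival sweeps the remaining board,
    the leader enters Final Jeopardy with more than double the rival's total."""
    rival_ceiling = rival + opponent_max_gain
    # Need (leader - wager) > 2 * rival_ceiling, i.e. wager < leader - 2 * rival_ceiling.
    return max(leader - 2 * rival_ceiling - 1, 0)

print(max_safe_wager(leader, rival, opponent_max_gain))  # $3,599 under these assumptions
```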


Jeopardy champion's 23-day winning streak ends after losing by $1

FOX News

Mattea Roach, a 23-year-old tutor from Toronto, Canada, had won $560,983 over the course of her winning streak on "Jeopardy!" Heading into the final round of Friday's match, Roach was leading with $19,200 and wagered $3,001 on the Final Jeopardy question.
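The $1 margin follows directly from the numbers reported: assuming Roach missed the Final Jeopardy clue, which the loss implies, her wager comes straight off her total.

```python
leading_score, wager = 19_200, 3_001

final_score = leading_score - wager  # an incorrect Final Jeopardy response deducts the wager
print(final_score)                   # 16,199 -- implying the winner finished with $16,200
```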


A Decade Of Advancements As We Enter A New Age Of AI

#artificialintelligence

As we embark on the next decade of innovations in AI, Daniel Pitchford looks back at the five biggest industry milestones of the 2010s, how they impacted investment in the sector, and how they've shaped the advance of technology. The 2010s will be known for the advent of one of the most powerful technologies on the planet: Artificial Intelligence. As more funding is made available for its development and it becomes more accepted by companies and consumers alike, it is worth reviewing some of the major milestones of the last decade that made this advancement possible. The game is on, Watson: IBM's Jeopardy triumph. The first major milestone of AI hitting the mainstream came when IBM's "super-computer" Watson beat long-standing Jeopardy champions Ken Jennings and Brad Rutter in 2011. Watson won the $1m TV game show with $77,147, leaving Jennings and Rutter far behind at $24,000 and $21,600 respectively.


Playing Games with AI

#artificialintelligence

"The challenges of machine learning have long been tied to games as a testbed for computer intelligence." Jeopardy Champion Emma Boettcher's Master's paper on using text mining to predict how hard a Jeopardy clue might be didn't win her a title on its own, but it is an interesting thought experiment. Futurism's mission is to empower our readers and drive the development of transformative technologies towards maximizing human potential.



The Secret Farm Team for Jeopardy! Players

Slate

As she met her fellow captains and competitors, all multiweek winners on the game show (including me), she was surprised how familiar everyone seemed to be with each other. Back in 2014, when she made her first appearance, "I didn't know a single person who had ever been on the show," Julia told me. But this time, she marveled, "everyone else seems to have known each other, either personally or by reputation, for decades." They shared years of experience on Jeopardy's secret farm team: quiz bowl. Of the 18 "All-Stars" in the tourney, all but Julia and two others had played the academic competition known as quiz bowl in high school or college.