imitation game
In Defense of the Turing Test and its Legacy
I argue that Turing's original test was co-opted by Weizenbaum and that six of the most common criticisms of the Turing test are unfair to both Turing's argument and the historical development of AI. The Turing test has faced criticism for decades, most recently at the Royal Society event "Celebrating the 75th Anniversary of the Turing Test." The question of the Turing test's significance has intensified with recent advances in large language model technology, which now enable machines to pass it. In this article, I address six of the most common criticisms of the Turing test: (1) the Turing test encourages fooling people; (2) Turing overestimated human intelligence, as people can be easily fooled (the ELIZA effect); (3) the Turing test is not a good benchmark for AI; (4) Turing's 1950 paper is not serious and/or has contradictions; (5) imitation should not be a goal for AI, and it is also harmful to society; (6) passing the Turing test teaches nothing about AI. All six criticisms largely derive from Joseph Weizenbaum's influential reinterpretation of the Turing test. The first four fail to withstand a close examination of the internal logic of Turing's 1950 paper, particularly when the paper is situated within its mid-twentieth-century context.
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
ChatGPT passed the Turing Test. Now what?
The AI fooled 73% of people into thinking it was human, raising new questions about machine intelligence. As artificial intelligence gets better and better, people face machines that look, and act, surprisingly human. It seems that every day brings a new headline about the burgeoning capabilities of large language models (LLMs) like ChatGPT and Google's Gemini, headlines that are either exciting or increasingly apocalyptic, depending on one's point of view. One particularly striking story arrived earlier this year: a paper that described how an LLM had passed the Turing Test, an experiment devised in the 1950s by computer science pioneer Alan Turing to determine whether machine intelligence could be distinguished from that of a human. The LLM in question was ChatGPT 4.5, and the paper found that it had been strikingly successful in fooling people into thinking it was human: in an experiment where participants were asked to choose whether the chatbot or an actual human was the real person, nearly three out of four chose the former.
- North America > United States > New York (0.04)
- North America > United States > Illinois (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Research Report (0.48)
- Personal > Honors (0.46)
The Imitation Game for Educational AI
Sonkar, Shashank, Liu, Naiming, Chen, Xinghe, Baraniuk, Richard G.
As artificial intelligence systems become increasingly prevalent in education, a fundamental challenge emerges: how can we verify if an AI truly understands how students think and reason? Traditional evaluation methods like measuring learning gains require lengthy studies confounded by numerous variables. We present a novel evaluation framework based on a two-phase Turing-like test. In Phase 1, students provide open-ended responses to questions, revealing natural misconceptions. In Phase 2, both AI and human experts, conditioned on each student's specific mistakes, generate distractors for new related questions. By analyzing whether students select AI-generated distractors at rates similar to human expert-generated ones, we can validate if the AI models student cognition. We prove this evaluation must be conditioned on individual responses; unconditioned approaches merely target common misconceptions. Through rigorous statistical sampling theory, we establish precise requirements for high-confidence validation. Our research positions conditioned distractor generation as a probe into an AI system's fundamental ability to model student thinking, a capability that enables adapting tutoring, feedback, and assessments to each student's specific needs.
- North America > United States > Texas > Harris County > Houston (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.94)
- Information Technology > Artificial Intelligence > Cognitive Science (0.94)
- Information Technology > Artificial Intelligence > Issues > Turing's Test (0.51)
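The comparison of selection rates described in the abstract above ("The Imitation Game for Educational AI") can be illustrated with a simple two-proportion z-test. This is only a sketch under stated assumptions: the abstract does not say which statistical procedure the authors use, and the function name `two_proportion_z` and the counts passed to it are hypothetical.

```python
import math


def two_proportion_z(ai_selected, ai_shown, human_selected, human_shown):
    """Return (z, two-sided p-value) for the difference between two selection rates.

    Assumption: a pooled two-proportion z-test stands in for whatever
    procedure the paper actually uses; the abstract does not specify one.
    """
    p_ai = ai_selected / ai_shown
    p_human = human_selected / human_shown
    pooled = (ai_selected + human_selected) / (ai_shown + human_shown)
    se = math.sqrt(pooled * (1 - pooled) * (1 / ai_shown + 1 / human_shown))
    if se == 0:
        # Both rates are 0 or both are 1: no observable difference.
        return 0.0, 1.0
    z = (p_ai - p_human) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: how often students picked each kind of distractor.
z, p = two_proportion_z(ai_selected=30, ai_shown=100,
                        human_selected=30, human_shown=100)
```

A large p-value here only means that no difference was detected. Showing that the two rates are actually similar would call for an equivalence test with a pre-specified margin, which this sketch does not implement.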
Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns
Liu, Naiming, Sonkar, Shashank, Baraniuk, Richard G.
Large Language Models (LLMs) have demonstrated remarkable capabilities in various educational tasks, yet their alignment with human learning patterns, particularly in predicting which incorrect options students are most likely to select in multiple-choice questions (MCQs), remains underexplored. Our work investigates the relationship between LLM generation likelihood and student response distributions in MCQs with a specific focus on distractor selections. We collect a comprehensive dataset of MCQs with real-world student response distributions to explore two fundamental research questions: (RQ1) Do the distractors that students more frequently select correspond to those that LLMs assign higher generation likelihood to? (RQ2) When an LLM selects an incorrect choice, does it choose the same distractor that most students pick? Our experiments reveal moderate correlations between LLM-assigned probabilities and student selection patterns for distractors in MCQs. Additionally, when LLMs make mistakes, they are more likely to select the same incorrect answers that commonly mislead students, a pattern consistent across both small and large language models. Our work provides empirical evidence that despite LLMs' strong performance on generating educational content, there remains a gap between LLMs' underlying reasoning process and human cognitive processes in identifying confusing distractors. Our findings also have significant implications for educational assessment development. Smaller language models could be efficiently utilized for automated distractor generation, as they demonstrate patterns similar to those of larger language models in identifying confusing answer choices. This observed alignment between LLMs and student misconception patterns opens new opportunities for generating high-quality distractors that complement traditional human-designed distractors.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.34)
- Education > Educational Technology > Educational Software > Computer Based Training (0.94)
- Education > Educational Setting (0.68)
- Education > Assessment & Standards (0.67)
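The first research question in the abstract above ("Do LLMs Make Mistakes Like Students?") asks whether distractors that students pick more often also receive higher LLM likelihood. One way to check this for a single question is a rank correlation. The sketch below assumes Spearman's coefficient; the abstract reports "moderate correlations" without naming the measure, and the function names and example numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for the three distractors of one question.
llm_probability = [0.50, 0.30, 0.20]      # likelihood the LLM assigns to each
student_share = [0.60, 0.30, 0.10]        # share of wrong answers students chose
rho = spearman(llm_probability, student_share)
```

With only three or four distractors per question, a per-question coefficient is very noisy, so in practice one would aggregate over many questions before drawing conclusions.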
Imitation Game for Adversarial Disillusion with Multimodal Generative Chain-of-Thought Role-Play
Chang, Ching-Chun, Chen, Fan-Yun, Gu, Shih-Hong, Gao, Kai, Wang, Hanrui, Echizen, Isao
As the cornerstone of artificial intelligence, machine perception confronts a fundamental threat posed by adversarial illusions. These adversarial attacks manifest in two primary forms: deductive illusion, where specific stimuli are crafted based on the victim model's general decision logic, and inductive illusion, where the victim model's general decision logic is shaped by specific stimuli. The former exploits the model's decision boundaries to create a stimulus that, when applied, interferes with its decision-making process. The latter reinforces a conditioned reflex in the model, embedding a backdoor during its learning phase that, when triggered by a stimulus, causes aberrant behaviours. The multifaceted nature of adversarial illusions calls for a unified defence framework, addressing vulnerabilities across various forms of attack. In this study, we propose a disillusion paradigm based on the concept of an imitation game. At the heart of the imitation game lies a multimodal generative agent, steered by chain-of-thought reasoning, which observes, internalises and reconstructs the semantic essence of a sample, liberated from the classic pursuit of reversing the sample to its original state. As a proof of concept, we conduct experimental simulations using a multimodal generative dialogue agent and evaluate the methodology under a variety of attack scenarios.
- Europe > Austria > Vienna (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
- Information Technology > Security & Privacy (1.00)
- Government > Military (0.87)
The Imitation Game According To Turing
Temtsin, Sharon, Proudfoot, Diane, Kaber, David, Bartneck, Christoph
The current cycle of hype and anxiety concerning the benefits and risks to human society of Artificial Intelligence is fuelled, not only by the increasing use of generative AI and other AI tools by the general public, but also by claims made on behalf of such technology by popularizers and scientists. In particular, recent studies have claimed that Large Language Models (LLMs) can pass the Turing Test (a goal for AI since the 1950s) and therefore can "think". Large-scale impacts on society have been predicted as a result. Upon detailed examination, however, none of these studies has faithfully applied Turing's original instructions. Consequently, we conducted a rigorous Turing Test with GPT-4-Turbo that adhered closely to Turing's instructions for a three-player imitation game. We followed established scientific standards where Turing's instructions were ambiguous or missing. For example, we performed a Computer-Imitates-Human Game (CIHG) without constraining the time duration and conducted a Man-Imitates-Woman Game (MIWG) as a benchmark. All but one participant correctly identified the LLM, showing that one of today's most advanced LLMs is unable to pass a rigorous Turing Test. We conclude that recent extravagant claims for such models are unsupported, and do not warrant either optimism or concern about the social impact of thinking machines.
- Oceania > New Zealand > South Island > Canterbury Region > Christchurch (0.04)
- North America > United States > Oregon (0.04)
- Europe > United Kingdom (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Issues > Turing's Test (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
Passed the Turing Test: Living in Turing Futures
The world has seen the emergence of machines based on pretrained models, transformers, also known as generative artificial intelligences for their ability to produce various types of content, including text, images, audio, and synthetic data. Without resorting to preprogramming or special tricks, their intelligence grows as they learn from experience, and to ordinary people, they can appear human-like in conversation. This means that they can pass the Turing test, and that we are now living in one of many possible Turing futures where machines can pass for what they are not. However, the learning machines that Turing imagined would pass his imitation tests were machines inspired by the natural development of the low-energy human cortex. They would be raised like human children and naturally learn the ability to deceive an observer. These "child machines," Turing hoped, would be powerful enough to have an impact on society and nature.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- South America > Brazil > São Paulo (0.04)
- North America > United States > New York (0.04)
The Original Turing Test Was a Drag Show
ChatGPT can now easily pass any Turing test, a measure of successful A.I. proposed by a founder of computer science, Alan Turing. But contemporary Turing tests leave out the most interesting part of Turing's original test: the gender-bending. I can usually spot A.I. writing in my students' work by the overuse of words like "delve," but the accuracy of artificial intelligence is impossible to deny. A.I. is being integrated into every aspect of our written culture, from news sources to classrooms to medicine. But in 1950, Turing's ideas about A.I. were prescient, creative, and, when I read them, surprisingly queer.
- North America > United States > California (0.05)
- Europe > United Kingdom > England (0.05)
- Information Technology > Security & Privacy (0.61)
- Health & Medicine (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Issues > Turing's Test (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.42)
Turing's Test, a Beautiful Thought Experiment
In the wake of large language models, there has been a resurgence of claims and questions about the Turing test and its value for AI, which are reminiscent of decades of practical "Turing" tests. If AI were quantum physics, by now several "Schrödinger's" cats could have been killed. Better late than never, it is time for a historical reconstruction of Turing's beautiful thought experiment. In this paper I present a wealth of evidence, including new archival sources, give original answers to several open questions about Turing's 1950 paper, and address the core question of the value of Turing's test.
- South America > Brazil > São Paulo (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Wisconsin (0.04)
- Health & Medicine (0.94)
- Leisure & Entertainment > Games > Chess (0.69)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Artificial Intelligence > Issues > Turing's Test (1.00)
- Information Technology > Artificial Intelligence > History (1.00)
Robot at the Mirror: Learning to Imitate via Associating Self-supervised Models
Lúčny, Andrej, Malinovská, Kristína, Farkaš, Igor
We introduce an approach to building a custom model from ready-made self-supervised models by associating them instead of training and fine-tuning. We demonstrate it with an example of a humanoid robot looking at the mirror and learning to detect the 3D pose of its own body from the image it perceives. To build our model, we first obtain features from the visual input and the postures of the robot's body via models prepared before the robot's operation. Then, we map their corresponding latent spaces through the robot's sample-efficient self-exploration at the mirror. In this way, the robot builds the desired 3D pose detector, whose quality is immediately perfect on the acquired samples rather than improving gradually. The mapping, which associates pairs of feature vectors, is then implemented in the same way as the key-value mechanism of the well-known transformer models. Finally, deploying our model for imitation to a simulated robot allows us to study, tune up, and systematically evaluate its hyperparameters without the involvement of the human counterpart, advancing our previous research.
- Europe > Slovakia > Bratislava > Bratislava (0.04)
- Europe > Austria > Vienna (0.04)
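The key-value association mentioned in the abstract above ("Robot at the Mirror") can be sketched as a soft lookup over stored feature pairs. This is a minimal illustration and not the authors' implementation: it assumes unit-length key vectors, dot-product similarity, and a hypothetical `sharpness` parameter that controls how strongly the closest stored key dominates.

```python
import math


def associate(query, keys, values, sharpness=50.0):
    """Return a softmax-weighted blend of stored values, attention style.

    keys[i] and values[i] form one stored association, for example an
    image feature vector and the matching body-posture feature vector.
    `sharpness` is a hypothetical scaling factor, not taken from the paper.
    """
    scores = [sharpness * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * value[d] for w, value in zip(weights, values)) / total
        for d in range(dim)
    ]


# Two hypothetical stored associations with unit-length keys.
stored_keys = [[1.0, 0.0], [0.0, 1.0]]
stored_values = [[10.0, 0.0], [0.0, 5.0]]
recalled = associate([1.0, 0.0], stored_keys, stored_values)
```

Querying with a stored key returns a value very close to its stored partner, which is one way to read the abstract's remark that quality is immediate on the acquired samples: adding an association is just appending a pair, with no gradient training involved.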