guinea pig
How Susceptible are LLMs to Influence in Prompts?
Anagnostidis, Sotiris, Bulian, Jannis
Large Language Models (LLMs) are highly sensitive to prompts, including additional context provided therein. As LLMs grow in capability, understanding their prompt-sensitivity becomes increasingly crucial for ensuring reliable and robust performance, particularly since evaluating these models becomes more challenging. In this work, we investigate how current models (Llama, Mixtral, Falcon) respond when presented with additional input from another model, mimicking a scenario where a more capable model -- or a system with access to more external information -- provides supplementary information to the target model. Across a diverse spectrum of question-answering tasks, we study how an LLM's response to multiple-choice questions changes when the prompt includes a prediction and explanation from another model. Specifically, we explore the influence of the presence of an explanation, the stated authoritativeness of the source, and the stated confidence of the supplementary input. Our findings reveal that models are strongly influenced, and when explanations are provided they are swayed irrespective of the quality of the explanation. The models are more likely to be swayed if the input is presented as being authoritative or confident, but the effect is small in size. This study underscores the significant prompt-sensitivity of LLMs and highlights the potential risks of incorporating outputs from external sources without thorough scrutiny and further validation. As LLMs continue to advance, understanding and mitigating such sensitivities will be crucial for their reliable and trustworthy deployment.
With AI, We Are All Once Again Tech Companies' Guinea Pigs
The companies touting new chat-based artificial-intelligence systems are running a massive experiment--and we are the test subjects. In this experiment, Microsoft, OpenAI and others are rolling out on the internet an alien intelligence that no one really understands, which has been granted the ability to influence our assessment of what's true in the world.
For Chat-Based AI, We Are All Once Again Tech Companies' Guinea Pigs - WSJ
The companies touting new chat-based artificial-intelligence systems are running a massive experiment--and we are the test subjects. In this experiment, Microsoft, OpenAI and others are rolling out on the internet an alien intelligence that no one really understands, which has been granted the ability to influence our assessment of what's true in the world.
'Yeah, we're spooked': AI starting to have big real-world impact, says expert
A scientist who wrote a leading textbook on artificial intelligence has said experts are "spooked" by their own success in the field, comparing the advance of AI to the development of the atom bomb. Prof Stuart Russell, the founder of the Center for Human-Compatible Artificial Intelligence at the University of California, Berkeley, said most experts believed that machines more intelligent than humans would be developed this century, and he called for international treaties to regulate the development of the technology. "The AI community has not yet adjusted to the fact that we are now starting to have a really big impact in the real world," he told the Guardian. "That simply wasn't the case for most of the history of the field – we were just in the lab, developing things, trying to get stuff to work, mostly failing to get stuff to work. So the question of real-world impact was just not germane at all. And we have to grow up very quickly to catch up."
Quantifying the Conceptual Error in Dimensionality Reduction
Dimension reduction of data sets is a standard problem in the realm of machine learning and knowledge reasoning. Such reductions affect patterns in and dependencies on data dimensions and ultimately influence any decision-making processes. Therefore, a wide variety of reduction procedures are in use, each pursuing different objectives. A criterion not considered so far is the conceptual continuity of the reduction mapping, i.e., the preservation of the conceptual structure with respect to the original data set. Based on the notion of scale-measure from formal concept analysis, we present in this work a) the theoretical foundations to detect and quantify conceptual errors in data scalings; b) an experimental investigation of our approach on eleven data sets that were each treated with a variant of non-negative matrix factorization.
Federal investigators warn Tesla is using customers as 'guinea pigs' to test its 'Full Self-Driving'
The National Transportation Safety Board (NTSB) suggests Tesla is using customers as 'guinea pigs' to test its autonomous driving technology before it is officially approved, and is blaming its sister agency for letting it happen. In a letter to the National Highway Traffic Safety Administration (NHTSA), the NTSB is calling for stricter requirements for the design and use of automated driving systems on public roads, CNBC reports. Tesla is named 16 times in the document, mainly because it released its 'Full Self-Driving' (FSD) beta version to the public 'with limited oversight or reporting requirements.' Although the NTSB points to the Elon Musk-owned firm for its lack of safeguarding, the agency is also slamming NHTSA for its 'hands-off approach' to monitoring such testing on public roads. Tesla first launched its FSD beta program in October to a limited number of customers who were deemed 'expert and careful drivers.'
'I choose to thrive': the man fighting motor neurone disease with cyborg technology
In November 2017, Peter B Scott-Morgan received the news that almost nothing can prepare you for – he was told he had just two years to live. Peter had been diagnosed with motor neurone disease (MND). It kills a third of those who have it within a year, rising to a half by the end of year two, with no known cure. Devastated as Peter was, he'd already decided this was negotiable. Fortunately, long before his own diagnosis, he had been fascinated by the idea of harnessing the power of modern technology to prolong human life.
Tech-savvy turn out to be most leery of self-driving cars
Karen Brenchley is a computer scientist with expertise in training artificial intelligence, but the longtime Silicon Valley resident has pangs of anxiety whenever she sees Waymo self-driving cars maneuver the streets near her home. The former product manager, who has worked for Microsoft and Hewlett-Packard, wonders how engineers could teach the robocars operating on her tree-lined streets to make snap decisions, speed and slow with the flow of traffic, and yield to pedestrians walking from the park. She has asked her husband, an award-winning science-fiction author who doesn't drive, to wear a shiny vest while cycling to ensure that autonomous vehicles spot him in a rush of activity. The problem isn't that she doesn't understand the technology. It's that she does, and she knows how flawed nascent technology can be. "I'm not skeptical long-term," said Brenchley, who has lived in Silicon Valley for 30 years.
Tech-savvy residents go nimby on self-driving cars
Karen Brenchley is a computer scientist with expertise in training artificial intelligence, but this longtime Silicon Valley resident has pangs of anxiety whenever she sees Waymo self-driving cars manoeuvre the streets near her home. The former product manager, who has worked for Microsoft and Hewlett-Packard, wonders how engineers could teach the robocars operating on her tree-lined streets to make snap decisions, speed and slow with the flow of traffic and yield to pedestrians coming from the nearby park. She has asked her husband, an award-winning science-fiction author who does not drive, to wear a shiny vest while cycling to ensure autonomous vehicles spot him in a rush of activity. The problem is not that she does not understand the technology. It is that she does, and she knows how flawed nascent technology can be.