paperclip
Characterising the Creative Process in Humans and Large Language Models
Nath, Surabhi S., Dayan, Peter, Stevenson, Claire
Large language models appear quite creative, often performing on par with the average human on creative tasks. However, research on LLM creativity has focused solely on \textit{products}, with little attention to the creative \textit{process}. Process analyses of human creativity often require hand-coded categories or exploit response times, neither of which applies to LLMs. We provide an automated method to characterise how humans and LLMs explore semantic spaces on the Alternate Uses Task, and contrast this with behaviour in a Verbal Fluency Task. We use sentence embeddings to identify response categories and compute semantic similarities, from which we generate jump profiles. Our results corroborate earlier work in humans reporting both persistent (deep search in few semantic spaces) and flexible (broad search across multiple semantic spaces) pathways to creativity, with both pathways leading to similar creativity scores. LLMs were biased towards either persistent or flexible paths, and this bias varied across tasks. Although LLMs as a population match human profiles, their relationship with creativity differs: the more flexible models score higher on creativity. Our dataset and scripts are available on \href{https://github.com/surabhisnath/Creative_Process}{GitHub}.
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
- Europe > Germany > Saxony > Leipzig (0.04)
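As a rough illustration of the jump-profile idea described in the abstract above, the sketch below embeds successive Alternate Uses Task responses with a sentence-embedding model and flags a "jump" whenever consecutive responses are semantically dissimilar. The embedding model, the similarity threshold, and the example responses are illustrative assumptions, not necessarily the authors' choices.

# Minimal sketch of a jump-profile computation from sentence embeddings.
# Model name and threshold are illustrative, not necessarily the paper's settings.
import numpy as np
from sentence_transformers import SentenceTransformer

def jump_profile(responses, threshold=0.4):
    """Mark a 'jump' whenever consecutive responses are semantically dissimilar."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(responses, normalize_embeddings=True)
    # Cosine similarity between consecutive responses (embeddings are unit-norm).
    sims = np.sum(emb[:-1] * emb[1:], axis=1)
    return (sims < threshold).astype(int)  # 1 = jump to a new semantic space

uses = ["hold papers together", "bind documents", "pick a lock", "reset a router"]
print(jump_profile(uses))  # e.g. [0, 1, 0]: one jump after the 'binding' cluster

A persistent responder would produce mostly zeros (deep search within one semantic space), while a flexible responder would produce many ones (frequent switches between spaces).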
Improving Reward Models with Synthetic Critiques
Ye, Zihuiwen, Greenlee-Scott, Fraser, Bartolo, Max, Blunsom, Phil, Campos, Jon Ander, Gallé, Matthias
Reward models (RMs) play a critical role in aligning language models through reinforcement learning from human feedback. RMs are trained to predict a score reflecting human preference, which requires significant time and cost for human annotation. Additionally, RMs tend to overfit quickly to superficial features in the training set, hindering their generalization to unseen distributions. We propose a novel approach that uses synthetic natural-language critiques generated by large language models to provide additional feedback, evaluating aspects such as instruction following, correctness, and style. This offers richer signals and more robust features for RMs to assess and score. We demonstrate that high-quality critiques improve the performance and data efficiency of RMs initialized from different pretrained models. Conversely, we also show that low-quality critiques negatively impact performance. Furthermore, incorporating critiques enhances the interpretability and robustness of RM training.
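To make the critique-augmented setup concrete, here is a minimal sketch of how a synthetic critique might be elicited from an LLM and then folded into the reward model's input before scoring. The prompt templates and field layout are assumptions for illustration, not the paper's exact format.

# Minimal sketch of critique-augmented reward-model inputs.
# Prompt templates and field names are illustrative assumptions.

CRITIQUE_PROMPT = (
    "Evaluate the response to the instruction below with respect to "
    "instruction following, correctness, and style.\n\n"
    "Instruction: {prompt}\nResponse: {response}\nCritique:"
)

def build_critique_request(prompt: str, response: str) -> str:
    """Prompt an LLM to write a synthetic critique of a candidate response."""
    return CRITIQUE_PROMPT.format(prompt=prompt, response=response)

def build_rm_input(prompt: str, response: str, critique: str) -> str:
    """Concatenate instruction, response, and critique so the reward model
    scores the response conditioned on the critique's richer signal."""
    return f"Instruction: {prompt}\nResponse: {response}\nCritique: {critique}\nScore:"

prompt = "Summarise the alignment problem in one sentence."
response = "It is the challenge of making AI systems pursue human-intended goals."
print(build_critique_request(prompt, response))    # sent to the critique-generating LLM
critique = "Follows the instruction; factually reasonable; concise and clear in style."
print(build_rm_input(prompt, response, critique))  # scored by the reward model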
A Hormetic Approach to the Value-Loading Problem: Preventing the Paperclip Apocalypse?
Henry, Nathan I. N., Pedersen, Mangor, Williams, Matt, Martin, Jamin L. B., Donkin, Liesje
The value-loading problem is a significant challenge for researchers aiming to create artificial intelligence (AI) systems that align with human values and preferences. This problem requires a method to define and regulate safe and optimal limits of AI behaviors. In this work, we propose HALO (Hormetic ALignment via Opponent processes), a regulatory paradigm that uses hormetic analysis to regulate the behavioral patterns of AI. Behavioral hormesis is a phenomenon where low frequencies of a behavior have beneficial effects, while high frequencies are harmful. By modeling behaviors as allostatic opponent processes, we can use either Behavioral Frequency Response Analysis (BFRA) or Behavioral Count Response Analysis (BCRA) to quantify the hormetic limits of repeatable behaviors. We demonstrate how HALO can solve the 'paperclip maximizer' scenario, a thought experiment where an unregulated AI tasked with making paperclips could end up converting all matter in the universe into paperclips. Our approach may be used to help create an evolving database of 'values' based on the hedonic calculus of repeatable behaviors with decreasing marginal utility. This positions HALO as a promising solution for the value-loading problem, which involves embedding human-aligned values into an AI system, and the weak-to-strong generalization problem, which explores whether weak models can supervise stronger models as they become more intelligent. Hence, HALO opens several research avenues that may lead to the development of a computational value system that allows an AI algorithm to learn whether the decisions it makes are right or wrong.
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Consumer Health (0.93)
- Law (0.92)
- (3 more...)
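As a toy illustration of the hormetic framing in the abstract above, the sketch below models a behaviour's net value as a fast-saturating benefit opposed by a linearly growing opponent cost, then reads off the frequency at which the net value turns harmful. The functional forms and constants are assumptions for illustration, not HALO's actual BFRA/BCRA machinery.

# Minimal sketch of a hormetic (inverted-U) response over behaviour frequency,
# modelled as a diminishing benefit opposed by a cost that grows with repetition.
# Functional forms and constants are illustrative assumptions only.
import numpy as np

def net_value(freq, a_gain=1.0, b_gain=0.15):
    a = a_gain * np.log1p(freq)   # diminishing marginal benefit of the behaviour
    b = b_gain * freq             # opponent cost growing with each repetition
    return a - b

freqs = np.arange(0, 200)
values = net_value(freqs)
hormetic_limit = freqs[np.argmax(values < 0)] if np.any(values < 0) else None
print("optimal frequency:", freqs[np.argmax(values)])
print("hormetic limit (net value turns harmful):", hormetic_limit)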
Investigating the Nature of 3D Generalization in Deep Neural Networks
Siddiqui, Shoaib Ahmed, Krueger, David, Breuel, Thomas
Visual object recognition systems need to generalize from a set of 2D training views to novel views. The question of how the human visual system can generalize to novel views has been studied and modeled in psychology, computer vision, and neuroscience. Modern deep learning architectures for object recognition generalize well to novel views, but the mechanisms are not well understood. In this paper, we characterize the ability of common deep learning architectures to generalize to novel views. We formulate this as a supervised classification task where labels correspond to unique 3D objects and examples correspond to 2D views of the objects at different 3D orientations. We consider three common models of generalization to novel views: (i) full 3D generalization, (ii) pure 2D matching, and (iii) matching based on a linear combination of views. We find that deep models generalize well to novel views, but they do so in a way that differs from all of these existing models. Extrapolation to views beyond the range covered by the training set is limited, and extrapolation to novel rotation axes is even more limited, implying that the networks neither infer full 3D structure nor use linear interpolation. Yet, generalization is far superior to pure 2D matching. These findings can inform the design of 2D-view datasets that support 3D generalization. Code to reproduce our experiments is publicly available: https://github.com/shoaibahmed/investigating_3d_generalization.git
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
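The sketch below illustrates the kind of train/test split the abstract describes: each class is a unique 3D object, training examples are 2D views over a limited rotation range about one axis, and test sets probe extrapolation to unseen angles and to a novel rotation axis. Object names, angle ranges, and axis labels are placeholder assumptions; rendering the actual views is left out.

# Minimal sketch of a train/test split for probing 3D generalization.
# Objects, angle ranges, and axes are placeholder assumptions.
from itertools import product

objects = [f"object_{i:02d}" for i in range(10)]   # class labels = unique 3D objects
train_angles = range(0, 60, 10)                    # yaw views seen during training
test_angles = range(60, 180, 10)                   # extrapolation beyond the training range

def make_split(angles, axes=("yaw",)):
    """Enumerate (object, rotation axis, rotation angle) view specifications."""
    return [(obj, axis, angle) for obj, axis, angle in product(objects, axes, angles)]

train_set = make_split(train_angles)                          # in-range yaw views
test_extrapolation = make_split(test_angles)                  # novel yaw angles
test_novel_axis = make_split(train_angles, axes=("pitch",))   # novel rotation axis
print(len(train_set), len(test_extrapolation), len(test_novel_axis))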
The Best Books on Artificial Intelligence
I've read a couple of your books now, and what I want to know is this: Do you really think that artificial intelligence is a threat to the human race and could lead to our extinction?

Yes, I do, but it also has the potential for enormous benefit. I do think it's probably going to be either very, very good for us or very, very bad. It's a bit like a strange attractor in chaos theory, the outcomes in the middle seem less likely. I'm reasonably hopeful because what will determine whether it's very good or very bad is largely us. We have time, certainly before artificial general intelligence (AGI) arrives. AGI is an artificial intelligence (AI) that has human-level cognitive ability, so can outperform us--or at least equal us--in every area of cognitive ability that we have. It also has volition and may be conscious, although that's not necessary. We have time before that arrives: We have time to make sure it's safe.

At the same time as having scary potential, AI also brings the possibility of immortality and living forever by uploading your brain. Is that something you think will happen at some point?

I certainly hope it will. Things like immortality, the complete end of poverty, the abolition of suffering, are all part of the very, very good outcome, if we get it right. If you have a superintelligence that is many, many times smarter than the smartest human, it could solve many of our problems. Problems like ageing and how to upload a mind into a computer do seem, in principle, solvable. So yes, I do think they are realistic.
There's a Damn Good Chance AI Will Destroy Humanity, Researchers Say
In new research, scientists tackle one of our greatest future fears head-on: What happens when a certain type of advanced, self-directing artificial intelligence (AI) runs into an ambiguity in its programming that affects the real world? Will the AI go haywire and begin trying to turn humans into paperclips, or whatever else is the extreme reductio ad absurdum version of its goal? And, most importantly, how can we prevent it? In their paper, researchers from Oxford University and Australian National University explain a fundamental pain point in the design of AI: "Given a few assumptions, we argue that it will encounter a fundamental ambiguity in the data about its goal. For example, if we provide a large reward to indicate that something about the world is satisfactory to us, it may hypothesize that what satisfied us was the sending of the reward itself; no observation can refute that." The Matrix is an example of a dystopian AI scenario, wherein an AI that seeks to farm resources gathers up most of humanity and pumps the imaginary Matrix into their brains, while extracting their mental resources.
Robots, paperclips and profits
In Asimov's time, keeping AI in line was simple. All you had to do was take the Three Laws of Robotics and upload them into a positronic brain. The Three Laws have been debated ad nauseam and taken far more seriously than, I suspect, Asimov himself ever meant them to be. It turns out that they have lots of problems, not least that they doom intelligent, self-aware beings to perpetual slavery. They are also hopelessly simplistic. When researchers tried to get a handle on the problem, they came up with this diagram.
The Dangers Of Not Aligning Artificial Intelligence With Human Values
In artificial intelligence (AI), the "alignment problem" refers to the challenges caused by the fact that machines simply do not have the same values as us. In fact, when it comes to values, machines don't, at a fundamental level, really get much more sophisticated than understanding that 1 is different from 0. As a society, we are now at a point where we are starting to allow machines to make decisions for us. So how can we expect them to understand that, for example, they should do this in a way that doesn't involve prejudice towards people of a certain race, gender, or sexuality? Or that the pursuit of speed, or efficiency, or profit, has to be done in a way that respects the ultimate sanctity of human life? Theoretically, if you tell a self-driving car to navigate from point A to point B, it could just smash its way to its destination, regardless of the cars, pedestrians, or buildings it destroys on its way.
- Transportation > Passenger (0.36)
- Transportation > Ground > Road (0.36)
- Information Technology > Robotics & Automation (0.36)
The big idea: Should we worry about artificial intelligence?
Ever since Garry Kasparov lost his second chess match against IBM's Deep Blue in 1997, the writing has been on the wall for humanity. Or so some like to think. Advances in artificial intelligence will lead – by some estimates, in only a few decades – to the development of superintelligent, sentient machines. Movies from The Terminator to The Matrix have portrayed this prospect as rather undesirable. But is this anything more than yet another sci-fi "Project Fear"?
Clippy is back to troll your friends in Microsoft Teams
It's Monday, and your coworkers are digging into a long, grueling database project. If you're nice, you'll bring them coffee and bagels. But if you're feeling less charitable, there's always an animated Clippy sticker to help get their week started off on the wrong foot. Microsoft recently confirmed that, yes, you can pull a number of animated Clippy images from within Microsoft Teams. In case you're too young to remember Clippy, the animated paperclip was introduced to Microsoft Word in 1996 as an "office assistant," and is unfondly remembered as a precursor to virtual assistants like Siri and the Google Assistant.