
Beyond World Models: Rethinking Understanding in AI Models

Gupta, Tarun, Pruthi, Danish

arXiv.org Artificial Intelligence

World models have garnered substantial interest in the AI community. These are internal representations that simulate aspects of the external world, track entities and states, capture causal relationships, and enable prediction of consequences. This contrasts with representations based solely on statistical correlations. A key motivation behind this research direction is that humans possess such mental world models, and finding evidence of similar representations in AI models might indicate that these models "understand" the world in a human-like way. In this paper, we use case studies from the philosophy of science literature to critically examine whether the world model framework adequately characterizes human-level understanding. We focus on specific philosophical analyses where the distinction between world model capabilities and human understanding is most pronounced. While these represent particular views of understanding rather than universal definitions, they help us explore the limits of world models.


The Case That A.I. Is Thinking

The New Yorker

The Case That A.I. Is Thinking ChatGPT does not have an inner life. Yet it seems to know what it's talking about. How convincing does the illusion of understanding have to be before you stop calling it an illusion? Dario Amodei, the C.E.O. of the artificial-intelligence company Anthropic, has been predicting that an A.I. "smarter than a Nobel Prize winner" in such fields as biology, math, engineering, and writing might come online by 2027. He envisions millions of copies of a model whirring away, each conducting its own research: a "country of geniuses in a datacenter." In June, Sam Altman, of OpenAI, wrote that the industry was on the cusp of building "digital superintelligence." "The 2030s are likely going to be wildly different from any time that has come before," he asserted. Meanwhile, the A.I. tools that most people currently interact with on a day-to-day basis are reminiscent of Clippy, the onetime Microsoft Office "assistant" that was actually more of a gadfly. A Zoom A.I. tool suggests that you ask it "What are some meeting icebreakers?" or instruct it to "Write a short message to share gratitude." Siri is good at setting reminders but not much else. A friend of mine saw a button in Gmail that said "Thank and tell anecdote." When he clicked it, Google's A.I. invented a funny story about a trip to Turkey that he never took. The rushed and uneven rollout of A.I. has created a fog in which it is tempting to conclude that there is nothing to see here--that it's all hype. There is, to be sure, plenty of hype: Amodei's timeline is science-fictional.


The '10 Martini' Proof Connects Quantum Mechanics With Infinitely Intricate Mathematical Structures

WIRED

The proof, known to be so hard that a mathematician once offered 10 martinis to whoever could figure it out, uses number theory to explain quantum fractals. In 1974, five years before he wrote his Pulitzer Prize-winning book, Douglas Hofstadter was a graduate student in physics at the University of Oregon. When his doctoral adviser went on sabbatical to Regensburg, Germany, Hofstadter tagged along, hoping to practice his German. The pair joined a group of brilliant theoretical physicists who were agonizing over a particular problem in quantum theory. They wanted to determine the energy levels of an electron in a crystal grid placed near a magnet. Hofstadter was the odd one out, unable to follow the others' line of thought. "Part of my luck was that I couldn't keep up with them," he said.


What's Happening to Reading?

The New Yorker

What do you read, and why? Reading was an unremarkable activity, essentially unchanged since the advent of the modern publishing industry, in the nineteenth century. In a 2017 Shouts & Murmurs titled "Before the Internet," the writer Emma Rathbone captured the spirit of reading as it used to be: "Before the Internet, you could laze around on a park bench in Chicago reading some Dean Koontz, and that would be a legit thing to do and no one would ever know you had done it unless you told them." Reading was just reading, and no matter what you chose to read--the paper, Proust, "The Power Broker"--you basically did it by moving your eyes across a page, in silence, at your own pace and on your own schedule. Today, the nature of reading has shifted.


Did AI mania rush Apple into making a rare misstep with Siri? John Naughton

The Guardian

After ChatGPT broke cover in late 2022 and the tech industry embarked on its contemporary rendering of tulip mania, people started to wonder why the biggest tech giant of all – Apple – was keeping its distance from the madness. Eventually, the tech commentariat decided that there could be only two possible interpretations of this corporate standoffishness: either Apple was way behind the game being played by OpenAI et al; or it had cunning plans to unleash upon the world its own world-beating take on the technology. Finally, at its annual Worldwide Developers Conference (WWDC) on 10 June last year, Apple came clean. For Apple, "AI" would not mean what those vulgar louts at OpenAI, Google, Microsoft and Meta raved about, but something altogether more refined and sophisticated – something called "Apple Intelligence". It was not, as the veteran Apple-watcher John Gruber put it, a single thing or product but "a marketing term for a collection of features, apps, and services". Putting it all under a single, memorable label made it easier for users to understand that Apple was launching something really novel.


The hard truth about AI? It might produce some better software John Naughton

The Guardian

As you have doubtless noticed, we are in the middle of a feeding frenzy about something called generative AI. Legions of hitherto normal people – and economists – are surfing a wave of irrational exuberance about its transformative potential. For anyone suffering from the fever, two antidotes are recommended. The first is the hype cycle monitor produced by consultants Gartner, which shows the technology currently perched on the "peak of inflated expectations", before a steep decline into the "trough of disillusionment". The other is Hofstadter's law, about the difficulty of estimating how long difficult tasks will take, which says that "It always takes longer than you expect, even when you take into account Hofstadter's law".


I am a Strange Dataset: Metalinguistic Tests for Language Models

Thrush, Tristan, Moore, Jared, Monares, Miguel, Potts, Christopher, Kiela, Douwe

arXiv.org Artificial Intelligence

Statements involving metalinguistic self-reference ("This paper has six sections.") are prevalent in many domains. Can large language models (LLMs) handle such language? In this paper, we present "I am a Strange Dataset", a new dataset for addressing this question. There are two subtasks: generation and verification. In generation, models continue statements like "The penultimate word in this sentence is" (where a correct continuation is "is"). In verification, models judge the truth of statements like "The penultimate word in this sentence is sentence." (false). We also provide minimally different metalinguistic non-self-referential examples to complement the main dataset, probing whether models can handle metalinguistic language at all. The dataset is hand-crafted by experts and validated by non-expert annotators. We test a variety of open-source LLMs (7B to 70B parameters) as well as closed-source LLMs through APIs. All models perform close to chance across both subtasks, and even on the non-self-referential metalinguistic control data, though we find some steady improvement with model scale. GPT-4 is the only model to consistently do significantly better than chance, and it is still only in the 60% range, while our untrained human annotators score in the 89-93% range. The dataset and evaluation toolkit are available at https://github.com/TristanThrush/i-am-a-strange-dataset.
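The verification subtask above can be made concrete with a small sketch. The function below is not from the paper's evaluation toolkit; it is a minimal illustration of why the example statement in the abstract is false: tokenized naively, the penultimate word of "The penultimate word in this sentence is sentence." is actually "is".

```python
import re

def penultimate_word(sentence: str) -> str:
    """Return the second-to-last word of a sentence, ignoring punctuation."""
    words = re.findall(r"[A-Za-z']+", sentence)
    return words[-2]

claim = "The penultimate word in this sentence is sentence."
actual = penultimate_word(claim)
# The statement asserts its penultimate word is "sentence",
# but the actual penultimate word is "is", so the claim is false.
print(actual)          # prints "is"
print(actual == "sentence")  # prints False
```

Note that in the generation subtask, continuing "The penultimate word in this sentence is" with "is" makes the completed sentence true of itself, which is why "is" counts as a correct continuation.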


Logic-based similarity

Antić, Christian

arXiv.org Artificial Intelligence

This paper develops a qualitative and logic-based notion of similarity from the ground up using only elementary concepts of first-order logic centered around the fundamental model-theoretic notion of type.
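For readers unfamiliar with the model-theoretic notion the abstract invokes: the standard definition (stated here for context, not taken from the paper itself) is that the type of a tuple collects all formulas it satisfies in a structure,

\[
\mathrm{tp}^{\mathcal{M}}(\bar{a}) \;=\; \{\, \varphi(\bar{x}) \;:\; \mathcal{M} \models \varphi(\bar{a}) \,\},
\]

so two elements realizing the same type are indistinguishable by first-order formulas, which is a natural starting point for a qualitative notion of similarity.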


Metaphors We Learn By

Memisevic, Roland

arXiv.org Artificial Intelligence

Gradient-based learning using error back-propagation ("backprop") is a well-known contributor to much of the recent progress in AI. A less obvious, but arguably equally important, ingredient is parameter sharing - most well-known in the context of convolutional networks. In this essay we relate parameter sharing ("weight sharing") to analogy making and the school of thought of cognitive metaphor. We discuss how recurrent and auto-regressive models can be thought of as extending analogy making from static features to dynamic skills and procedures. We also discuss corollaries of this perspective, for example, how it can challenge the currently entrenched dichotomy between connectionist and "classic" rule-based views of computation. It is well-known that neural networks, regardless of whether training is supervised or self-supervised, require large amounts of training data to work well. To ensure generalization, one can maximize the number of training examples, minimize the number of tunable parameters, or do both. Parameter sharing is a common principle to reduce the number of tunable parameters without having to reduce the number of actual parameters (synaptic connections) in the network. In fact, it is hard to find any neural network architecture in the literature that does not make use of parameter sharing in some way.
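The parameter-counting point in the abstract can be sketched numerically. The snippet below (an illustration, not code from the essay) compares a 1-D convolution, which slides one shared kernel across every input position, against a dense layer computing the same-shaped map with an independent weight per (input, output) pair:

```python
import numpy as np

n, k = 100, 3                      # input length, kernel size

conv_params = k                    # one shared kernel of size k
dense_params = n * (n - k + 1)     # unshared fully connected equivalent

def conv1d(x, w):
    """Valid-mode 1-D convolution: the same weights w slide over x."""
    return np.array([x[i:i + len(w)] @ w for i in range(len(x) - len(w) + 1)])

x = np.arange(n, dtype=float)
w = np.array([1.0, 0.0, -1.0])     # e.g. a simple difference filter
y = conv1d(x, w)                   # 98 outputs from just 3 tunable weights

print(conv_params, dense_params)   # prints "3 9800"
```

The network still has 300 synaptic connections (3 weights reused at 98 positions, each spanning 3 inputs), but only 3 of them are independently tunable - exactly the distinction the essay draws between actual and tunable parameters.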


When we might meet the first intelligent machines

#artificialintelligence

How close are we to living in a world where human-level intelligence is exceeded by machines? Over the course of my career, I've regularly engaged in a thought experiment where I try to "think like the computer" in order to imagine a solution to a programming challenge or opportunity. The gulf between human reasoning and software code was always pretty clear.