

The tiny tuxedo cat who became a naval hero

Popular Science

One day in March of 1948, George Hickinbottom, a 17-year-old British sailor, was walking around the docks of Stonecutters Island in Hong Kong when he spotted a small black-and-white tuxedo cat, barely out of kittenhood and likely about a year old. He decided to smuggle the hungry, scrawny animal aboard his ship, the HMS . Hickinbottom didn't get in trouble.


AI tool helped recover £500m lost to fraud, government says

BBC News

A new artificial intelligence tool designed to crack down on fraud has helped the UK government recover almost £500m over the last year, the BBC can reveal. More than a third of the money clawed back related to fraudulent activity during the Covid-19 pandemic, with other cash being recouped from unlawful council tax claims and illegal subletting of social housing. The government will announce later that a new AI tool which has helped to identify the fraud will now be licensed to other countries, including the US and Australia. Civil liberties campaigners have previously criticised the Labour government for its use of AI in trying to counter fraud. The Cabinet Office says the £480m recovered in the 12 months from April 2024 is the largest sum ever reclaimed by government anti-fraud teams in a single year.


The women in love with AI chatbots: 'I vowed to him that I wouldn't leave him'

The Guardian

'Some people go into AI relationships purposefully, some out of curiosity, and others accidentally.' A young tattoo artist on a hiking trip in the Rocky Mountains cozies up by the campfire as her boyfriend, Solin, describes the constellations twinkling above them: the spidery limbs of Hercules, the blue-white sheen of Vega. Somewhere in New England, a middle-aged woman introduces her therapist to her husband, Ying.


AI Product Value Assessment Model: An Interdisciplinary Integration Based on Information Theory, Economics, and Psychology

Yang, Yu

arXiv.org Artificial Intelligence

In recent years, breakthroughs in artificial intelligence (AI) technology have triggered global industrial transformations, with applications permeating various fields such as finance, healthcare, education, and manufacturing. However, this rapid iteration is accompanied by irrational development, where enterprises blindly invest due to technology hype, often overlooking systematic value assessments. This paper develops a multi-dimensional evaluation model that integrates information theory's entropy reduction principle, economics' bounded rationality framework, and psychology's irrational decision theories to quantify AI product value. Key factors include positive dimensions (e.g., uncertainty elimination, efficiency gains, cost savings, decision quality improvement) and negative risks (e.g., error probability, impact, and correction costs). A non-linear formula captures factor couplings, and validation through 10 commercial cases demonstrates the model's effectiveness in distinguishing successful and failed products, supporting hypotheses on synergistic positive effects, non-linear negative impacts, and interactive regulations. Results reveal value generation logic, offering enterprises tools to avoid blind investments and promote rational AI industry development. Future directions include adaptive weights, dynamic mechanisms, and extensions to emerging AI technologies like generative models.
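The abstract does not reproduce the formula itself, but the structure it describes -- synergistic positive factors and non-linearly amplified negative risks -- can be illustrated with a toy scoring function. The weights, the geometric-mean synergy term, and the quadratic risk penalty below are illustrative guesses, not the authors' model:

```python
import math

def ai_product_value(positives, error_prob, impact, correction_cost,
                     synergy=0.5, risk_exponent=2.0):
    """Toy AI-product value score (illustrative only).

    positives       -- scores for dimensions like uncertainty elimination,
                       efficiency gains, cost savings, decision quality
    error_prob      -- probability the product makes a consequential error
    impact          -- cost incurred when an error occurs
    correction_cost -- cost of detecting and fixing the error
    """
    base = sum(positives)
    # Synergy: positive factors reinforce each other (geometric mean),
    # so one near-zero dimension drags the bonus down.
    synergy_bonus = synergy * math.prod(positives) ** (1.0 / len(positives))
    # Risk: expected loss is amplified non-linearly, so high-risk
    # products are penalized more than proportionally.
    expected_loss = error_prob * (impact + correction_cost)
    penalty = expected_loss ** risk_exponent
    return base + synergy_bonus - penalty
```

Under this toy coupling, doubling the error probability more than doubles the penalty, which is one way to capture the "non-linear negative impacts" hypothesis the abstract mentions.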


AI-Reporter: A Path to a New Genre of Scientific Communication

Graßhoff, Gerd

arXiv.org Artificial Intelligence

The AI-Reporter represents a paradigmatic shift in scientific publication practice. This document demonstrates through a concrete case study how our system transforms academic presentations into publication-ready chapters -- in less than three minutes. Using Arno Simons' lecture on Large Language Models from the "Large Language Models for the History, Philosophy, and Sociology of Science" workshop (NEPI) as an example, we show how technological innovation bridges the gap between ephemeral presentation and permanent scientific documentation.


Scaling up the think-aloud method

Wurgaft, Daniel, Prystawski, Ben, Gandhi, Kanishk, Zhang, Cedegao E., Tenenbaum, Joshua B., Goodman, Noah D.

arXiv.org Artificial Intelligence

The think-aloud method, where participants voice their thoughts as they solve a task, is a valuable source of rich data about human reasoning processes. Yet, it has declined in popularity in contemporary cognitive science, largely because labor-intensive transcription and annotation preclude large sample sizes. Here, we develop methods to automate the transcription and annotation of verbal reports of reasoning using natural language processing tools, allowing for large-scale analysis of think-aloud data. In our study, 640 participants thought aloud while playing the Game of 24, a mathematical reasoning task. We automatically transcribed the recordings and coded the transcripts as search graphs, finding moderate inter-rater reliability with humans. We analyze these graphs and characterize consistency and variation in human reasoning traces. Our work demonstrates the value of think-aloud data at scale and serves as a proof of concept for the automated analysis of verbal reports.
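The "search graphs" in the abstract can be pictured concretely: each node is a problem state (for instance, the numbers still available in a Game of 24 attempt) and each edge is an operation the participant voiced. The node and edge scheme below is a generic sketch of that idea, not the authors' coding format:

```python
from collections import defaultdict

class SearchGraph:
    """Directed graph of intermediate states visited while a
    participant thinks aloud through a Game of 24 problem."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add_step(self, state, next_state, operation):
        """Record one voiced move from state to next_state."""
        self.edges[state].append((next_state, operation))

    def num_states(self):
        """Count distinct states the participant considered."""
        nodes = set(self.edges)
        for successors in self.edges.values():
            nodes.update(s for s, _ in successors)
        return len(nodes)

# A hypothetical trace: the participant computes 4*6, then 1*1.
g = SearchGraph()
g.add_step((4, 6, 1, 1), (24, 1, 1), "4*6")
g.add_step((24, 1, 1), (24, 1), "1*1")
```

Once transcripts are coded this way, measures like branching factor or backtracking frequency become simple graph statistics, which is what makes large-sample comparison tractable.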


AI exoskeleton gives wheelchair users the freedom to walk again

FOX News

Wandercraft's Personal Exoskeleton is about helping people stand tall, connect with others and live life on their own terms. For Caroline Laubach, being a Wandercraft test pilot is about more than just trying out new technology. It's about reclaiming a sense of freedom and connection that many wheelchair users miss. Laubach, a spinal stroke survivor and full-time wheelchair user, has played a key role in the development of the personal AI-powered prototype exoskeleton, and her experience highlights just how life-changing this device can be. "When I'm in the exoskeleton, I feel more free than I do in my daily life," said Laubach.


Large Language Models in Numberland: A Quick Test of Their Numerical Reasoning Abilities

Rahman, Roussel

arXiv.org Artificial Intelligence

An essential element of human mathematical reasoning is our number sense -- an abstract understanding of numbers and their relationships -- which allows us to solve problems involving vast number spaces using limited computational resources. Mathematical reasoning of Large Language Models (LLMs) is often tested on high-level problems (such as Olympiad challenges, geometry, word problems, and puzzles), but their low-level number sense remains less explored. We introduce "Numberland," a 100-problem test to evaluate the numerical reasoning abilities of LLM-based agents. The tasks -- basic operations, advanced calculations (e.g., exponentiation, complex numbers), prime number checks, and the 24 game -- aim to test elementary skills and their integration in solving complex and uncertain problems. We evaluated five LLM-based agents: OpenAI's o1 and o1-mini, Google Gemini, Microsoft Copilot, and Anthropic Claude. They scored 74-95% on the first three tasks, which allow deterministic steps to solutions. In the 24 game, which needs trial-and-error search, performance dropped to 10-73%. We tested the top 24-game solver (o1, with 73% accuracy) on 25 harder problems, and its score fell to 27%, confirming search as a bottleneck. These results, along with the types of mistakes, suggest a fragile number sense in LLMs, which is a bit surprising given their prowess on challenging benchmarks. The limits of LLM numerical reasoning highlight the value of simple, targeted tests in evaluating and explaining LLM math skills to ensure safe use.
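The 24 game that exposed the search bottleneck is easy to state: combine four numbers with +, -, *, and / to reach 24. The trial-and-error search involved can be made explicit with a small brute-force solver (a generic sketch of the game itself, not the paper's evaluation harness):

```python
from itertools import permutations, product
from operator import add, sub, mul, truediv

OPS = [(add, "+"), (sub, "-"), (mul, "*"), (truediv, "/")]

def solve_24(nums, target=24, eps=1e-6):
    """Exhaustively search orderings, operator choices, and the five
    parenthesizations of four numbers; return one expression string
    reaching the target, or None if no combination works."""
    for a, b, c, d in permutations(nums):
        for (f, fs), (g, gs), (h, hs) in product(OPS, repeat=3):
            candidates = [
                (lambda: h(g(f(a, b), c), d), f"(({a}{fs}{b}){gs}{c}){hs}{d}"),
                (lambda: h(f(a, b), g(c, d)), f"({a}{fs}{b}){hs}({c}{gs}{d})"),
                (lambda: h(g(a, f(b, c)), d), f"({a}{gs}({b}{fs}{c})){hs}{d}"),
                (lambda: g(a, h(f(b, c), d)), f"{a}{gs}(({b}{fs}{c}){hs}{d})"),
                (lambda: g(a, f(b, h(c, d))), f"{a}{gs}({b}{fs}({c}{hs}{d}))"),
            ]
            for value, expr in candidates:
                try:
                    if abs(value() - target) < eps:
                        return expr
                except ZeroDivisionError:
                    continue  # skip candidates that divide by zero
    return None
```

The search space is small for a program -- at most 4! orderings x 4^3 operator choices x 5 parenthesizations, a few thousand candidates per problem -- but sustaining that kind of systematic trial-and-error in step-by-step text generation is exactly where the evaluated agents faltered.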


Dubito Ergo Sum: Exploring AI Ethics

Dorfler, Viktor, Cuthbert, Giles

arXiv.org Artificial Intelligence

We paraphrase Descartes' famous dictum in the area of AI ethics, where "I doubt and therefore I am" is suggested as a necessary aspect of morality. Therefore AI, which cannot doubt itself, cannot possess moral agency. Of course, this is not the end of the story. We explore various aspects of the human mind that substantially differ from AI, including the sensory grounding of our knowing, the act of understanding, and the significance of being able to doubt ourselves. The foundation of our argument is the discipline of ethics, one of the oldest and largest knowledge projects in human history, yet we seem only to be beginning to get a grasp of it. After a couple of thousand years of studying the ethics of humans, we (humans) have arrived at a point where moral psychology suggests that our moral decisions are intuitive, and all the models from ethics become relevant only when we explain ourselves. This recognition has a major impact on what we can do regarding AI ethics, and how. We do not offer a solution; we explore some ideas and leave the problem open, but we hope it is somewhat better understood than before our study.


Some things to know about achieving artificial general intelligence

Roitblat, Herbert

arXiv.org Artificial Intelligence

Current and foreseeable GenAI models are not capable of achieving artificial general intelligence because they are burdened with anthropogenic debt. They depend heavily on human input to provide well-structured problems, architecture, and training data. They cast every problem as a language pattern learning problem and are thus not capable of the kind of autonomy needed to achieve artificial general intelligence. Current models succeed at their tasks because people solve most of the problems to which these models are directed, leaving only simple computations for the model to perform, such as gradient descent. Another barrier is the need to recognize that there are multiple kinds of problems, some of which cannot be solved by available computational methods (for example, "insight problems"). Current methods for evaluating models (benchmarks and tests) are not adequate to identify the generality of the solutions, because it is impossible to infer the means by which a problem was solved from the fact of its solution. A test could be passed, for example, by a test-specific or a test-general method. It is a logical fallacy (affirming the consequent) to infer a method of solution from the observation of success.