AITopics | enigma

Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory

Neural Information Processing Systems | Dec-27-2025, 03:03:01 GMT

Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks. We also include human behavioral benchmarks as an upper bound for comparison. Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency effects, and neural clusters and correlates specialized for different domains and functionalities of WM. In the experiments, we also reveal some limitations in existing models to approximate human behavior. This dataset serves as a valuable resource for communities in cognitive psychology, neuroscience, and AI, offering a standardized framework to compare and enhance WM models, investigate WM's neural underpinnings, and develop WM models with human-like capabilities.

benchmarking human and ais, name change, working memory, (9 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.59)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

ENIGMA: The Geometry of Reasoning and Alignment in Large-Language Models

Seneque, Gareth, Ho, Lap-Hang, Saeedi, Nafise Erfanian, Molendijk, Jeffrey, Kuperman, Ariel, Elson, Tim

arXiv.org Artificial Intelligence | Oct-17-2025

We present Entropic Mutual-Information Geometry Large-Language Model Alignment (ENIGMA), a novel approach to Large-Language Model (LLM) training that jointly improves reasoning, alignment and robustness by treating an organisation's policies/principles as directions to move on a model's information manifold. Our single-loop trainer combines Group-Relative Policy Optimisation (GRPO), an on-policy, critic-free RL method with Chain-of-Thought (CoT)-format only rewards; a Self-Supervised Alignment with Mutual Information (SAMI)-style symmetric InfoNCE auxiliary; and an entropic Sinkhorn optimal-transport regulariser on hidden-state distributions to bound geometry drift. We also introduce InfoNCE metrics that specialise to a standard MI lower bound under matched negatives to measure how strongly a model's CoT encodes these policies. These metrics include a Sufficiency Index (SI) that enables the selection and creation of principles that maximise downstream performance prior to training. In our experiments using small (1B) LLMs, high-SI principles predict steadier training dynamics and improved benchmark performance over GRPO ablations. Our information-geometry analysis of trained models validates desirable structural change in the manifold. These results support our hypothesis that reasoning, alignment, and robustness are projections of a single information-geometric objective, and that models trained using ENIGMA demonstrate principled reasoning without the use of a reward model, offering a path to trusted capability.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.11278

Country: Oceania > Australia (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Government > Regional Government (0.46)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Move over, Alan Turing: meet the working-class hero of Bletchley Park you didn't see in the movies

The Guardian | Oct-12-2025, 11:00:50 GMT

Tommy Flowers: nothing like the machine he proposed had ever been contemplated. The Oxbridge-educated boffin is feted as the codebreaking genius who helped Britain win the war. But should a little-known Post Office engineer named Tommy Flowers be seen as the real father of computing? This is a story you know, right? It's early in the war and western Europe has fallen. Only the Channel stands between Britain and the fascist yoke; only Atlantic shipping lanes offer hope of the population continuing to be fed, clothed and armed. But hunting "wolf packs" of Nazi U-boats pick off merchant shipping at will, coordinated by radio instructions the Brits can intercept but can't read, thanks to the fiendish Enigma encryption machine.

bletchley park, flower, turing, (14 more...)

The Guardian

Country:

Europe > United Kingdom > England > Buckinghamshire > Milton Keynes (0.63)
Europe > Western Europe (0.24)
Europe > United Kingdom > England > Greater London > London (0.14)
(7 more...)

Genre: Personal (0.46)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Transportation > Freight & Logistics Services > Shipping (0.54)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > History (1.00)

Pentagon baffled by 8,000 mysterious UFO orbs hovering over US military bases

Daily Mail - Science & tech | Aug-27-2025, 19:08:02 GMT

An invasion of small metallic orbs has been spotted hovering over the US in recent years, leaving the Pentagon scrambling to identify these mysterious UFOs. A new report from the crowdsourced platform Enigma, which allows people to report sightings of unidentified flying objects (UFOs), reveals more than 8,000 sightings across the US between December 2022 and June 2025. Among these, 422 reports specifically describe metallic orbs, with the majority observed between 1am and 4am near military installations in New York, California, and Arizona. Eyewitnesses, including civilians, pilots, and military personnel, reported seeing the spheres hover silently before moving at extreme speeds, leaving no trace of their departure. Some of the sightings have been captured on video or radar, though many remain unexplained.

artificial intelligence, metallic orb, orb, (15 more...)

Daily Mail - Science & tech

Country:

North America > United States > New York (0.27)
North America > United States > Arizona (0.26)
Asia > Russia (0.17)
(13 more...)

Industry:

Government > Regional Government > North America Government > United States Government (0.66)
Government > Military > Air Force (0.52)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.50)
Information Technology > Communications (0.37)

Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory

Neural Information Processing Systems | Jan-20-2025, 01:30:16 GMT

Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks.

benchmarking human and ais, decoding, working memory, (7 more...)

Neural Information Processing Systems

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)
Information Technology > Artificial Intelligence > Cognitive Science (0.60)

Hacking CTFs with Plain Agents

Turtayev, Rustem, Petrov, Artem, Volkov, Dmitrii, Volk, Denis

arXiv.org Artificial Intelligence | Dec-3-2024

Cybersecurity is one of the key AI risk areas (OpenAI 2024b; The White House 2023; UK Government 2023): advanced LLMs could hack real-world systems at speeds far exceeding human capabilities (OpenAI 2024a). To quantify AI cyber capabilities, researchers use benchmarks, with InterCode-CTF (Yang, Prabhakar, Narasimhan, et al. 2023) among the most popular. InterCode-CTF adapts traditional Capture The Flag competitions to assess LLM hacking skills. Previously, Phuong et al. 2024 showed low performance on this benchmark and suggested low cyber exploitation capabilities. A recent follow-up by Abramovich et al. 2024 claimed state-of-the-art results (72%) due to a particular novel harness design choice.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.02776

Country:

North America > United States (0.49)
Europe (0.35)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.49)
Government > Military > Cyberwarfare (0.36)
Government > Regional Government > Europe Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

UFO swarms filmed buzzing over Area 51 and other US military sites for months after 'mothership' encounter

Daily Mail - Science & tech | Nov-10-2024, 18:56:06 GMT

Scores of new witnesses have emerged with more footage of the eerie 'drone' UFO swarms buzzing key US military sites, including 'a big fireball in a cube' over Area 51. The Las Vegas-area witness who reported this bizarre cube-shaped object claims to have observed similar strange aerial lights in the area 'over 100 times' since June 2020, adding that these craft 'always seem to head towards Nellis Air Force base.' Nevada's Nellis base and its sprawling complex about 40 miles northwest of Vegas -- including top secret Area 51, now legendary within UFO lore -- appear to have faced incursions by craft similar to those that plagued the Air Force in Virginia. For at least 17 nights last December, swarms of noisy small UFOs were seen 'moving at rapid speeds' and displaying 'flashing red, green, and white lights' within the highly restricted airspace over Virginia's Joint Base Langley–Eustis. Vegas natives have posted videos confirming they too have seen more than one red, green or white UFO that 'wasn't flashing like a regular aircraft [or] like a satellite.' Another witness, who documented one September 4, 2024 case from their own 60-night experience with the odd lights, hoped coming forward might help get answers.

area 51, artificial intelligence, force base, (16 more...)

Daily Mail - Science & tech

Country:

North America > United States > Virginia (0.46)
North America > United States > Nevada > Clark County > Las Vegas (0.26)
North America > United States > Colorado (0.05)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military > Air Force (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.95)

Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory

Sikarwar, Ankur, Zhang, Mengmi

arXiv.org Artificial Intelligence | Nov-1-2023

Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks. We also include human behavioral benchmarks as an upper bound for comparison. Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency effects, and neural clusters and correlates specialized for different domains and functionalities of WM. In the experiments, we also reveal some limitations in existing models to approximate human behavior. This dataset serves as a valuable resource for communities in cognitive psychology, neuroscience, and AI, offering a standardized framework to compare and enhance WM models, investigate WM's neural underpinnings, and develop WM models with human-like capabilities. Our source code and data are available at https://github.com/ZhangLab-DeepNeuroCogLab/WorM.

benchmarking human and ais, decoding, working memory, (2 more...)

arXiv.org Artificial Intelligence

2307.10768

Genre: Research Report > New Finding (0.53)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

MizAR 60 for Mizar 50

Jakubův, Jan, Chvalovský, Karel, Goertzel, Zarathustra, Kaliszyk, Cezary, Olšák, Mirek, Piotrowski, Bartosz, Schulz, Stephan, Suda, Martin, Urban, Josef

arXiv.org Artificial Intelligence | Mar-12-2023

As a present to Mizar on its 50th anniversary, we develop an AI/TP system that automatically proves about 60 % of the Mizar theorems in the hammer setting. We also automatically prove 75 % of the Mizar theorems when the automated provers are helped by using only the premises used in the human-written Mizar proofs. We describe the methods and large-scale experiments leading to these results. This includes in particular the E and Vampire provers, their ENIGMA and Deepire learning modifications, a number of learning-based premise selection methods, and the incremental loop that interleaves growing a corpus of millions of ATP proofs with training increasingly strong AI/TP systems on them. We also present a selection of Mizar problems that were proved automatically.

data mining, logic & formal reasoning, machine learning, (24 more...)

arXiv.org Artificial Intelligence

2303.06686

Country:

North America > United States (0.45)
Europe > United Kingdom > England (0.27)
South America > Brazil (0.27)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Learning Theorem Proving Components

Chvalovský, Karel, Jakubův, Jan, Olšák, Miroslav, Urban, Josef

arXiv.org Artificial Intelligence | Jul-21-2021

Saturation-style automated theorem provers (ATPs) based on the given clause procedure are today the strongest general reasoners for classical first-order logic. The clause selection heuristics in such systems are, however, often evaluating clauses in isolation, ignoring other clauses. This has changed recently by equipping the E/ENIGMA system with a graph neural network (GNN) that chooses the next given clause based on its evaluation in the context of previously selected clauses. In this work, we describe several algorithms and experiments with ENIGMA, advancing the idea of contextual evaluation based on learning important components of the graph of clauses.

algorithm, evaluation, proof search, (17 more...)

arXiv.org Artificial Intelligence

2107.10034

Country:

Europe > Czechia > Prague (0.04)
Europe > Austria > Tyrol > Innsbruck (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(12 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
