diamond
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (2 more...)
- Leisure & Entertainment > Games (0.72)
- Materials > Metals & Mining > Iron (0.31)
Diffusion for World Modeling: Visual Details Matter in Atari
World models constitute a promising approach for training reinforcement learning agents in a safe and sample-efficient manner. Recent world models predominantly operate on sequences of discrete latent variables to model environment dynamics. However, this compression into a compact discrete representation may ignore visual details that are important for reinforcement learning. Concurrently, diffusion models have become a dominant approach for image generation, challenging well-established methods modeling discrete latents. Motivated by this paradigm shift, we introduce DIAMOND (DIffusion As a Model Of eNvironment Dreams), a reinforcement learning agent trained in a diffusion world model. We analyze the key design choices that are required to make diffusion suitable for world modeling, and demonstrate how improved visual details can lead to improved agent performance. DIAMOND achieves a mean human-normalized score of 1.46 on the competitive Atari 100k benchmark, a new best for agents trained entirely within a world model. We further demonstrate that DIAMOND's diffusion world model can stand alone as an interactive neural game engine by training on static gameplay. To foster future research on diffusion for world modeling, we release our code, agents, videos and playable world models at https://diamond-wm.github.io.
Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability?
MacDermott, Matt, Wei, Qiyao, Djoneva, Rada, Ward, Francis Rhys
AI systems that output their reasoning in natural language offer an opportunity for safety: we can monitor their chain of thought (CoT) for undesirable reasoning, such as the pursuit of harmful objectives. However, the extent to which CoT faithfully reflects the underlying reasoning process, and hence the extent to which it can be usefully monitored, may be influenced by certain aspects of training. We investigate how different training incentives, applied to a reasoning model, affect its monitorability. We introduce a novel methodology for measuring monitorability according to whether a monitor can predict a key latent variable using the model's reasoning. When controlling for accuracy, we do not find evidence for consistent effects from commonly used incentives (length penalties and KL regularisation), but we find that adversarial optimisation (penalising monitor accuracy) degrades monitor performance, while direct optimisation for monitorability does not reliably lead to improvements. Our code is available at https://github.com/QiyaoWei/reasoning-under-pressure.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Monaco (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.65)
Conversational no-code and multi-agentic disease module identification and drug repurposing prediction with ChatDRex
Süwer, Simon, Bagemihl, Kester, Baier, Sylvie, Dicunta, Lucia, List, Markus, Baumbach, Jan, Maier, Andreas, Delgado-Chaves, Fernando M.
Repurposing approved drugs offers a time-efficient and cost-effective alternative to traditional drug development. However, in silico prediction of repurposing candidates is challenging and requires the effective collaboration of specialists in various fields, including pharmacology, medicine, biology, and bioinformatics. Fragmented, specialized algorithms and tools often address only narrow aspects of the overall problem, and heterogeneous, unstructured data landscapes require specialized users to be involved. Hence, these data services do not integrate smoothly across workflows. With ChatDRex, we present a conversation-based, multi-agent system that facilitates the execution of complex bioinformatic analyses aiming for network-based drug repurposing prediction. It builds on the integrated systems medicine knowledge graph NeDRex. ChatDRex provides natural language access to its extensive biomedical KG and integrates bioinformatics agents for network analysis and drug repurposing, complemented by agents for functional coherence evaluation for in silico validation, as well as agents for literature mining and for discussing the obtained results in a scientific context. Its flexible multi-agent design assigns specific tasks to specialized agents, including query routing, data retrieval, algorithm execution, and result visualization. A dedicated reasoning module keeps the user in the loop and allows for hallucination detection. By enabling physicians and researchers without computer science expertise to control complex analyses in natural language, ChatDRex democratizes access to bioinformatics as an important resource for drug repurposing. It enables clinical experts to generate hypotheses and explore drug repurposing opportunities, ultimately accelerating the discovery of novel therapies and advancing personalized medicine and translational research.
Will quantum be bigger than AI?
There's an old adage among tech journalists like me - you can either explain quantum accurately, or in a way that people understand, but you can't do both. That's because quantum mechanics - a strange and partly theoretical branch of physics - is a fiendishly difficult concept to get your head around. It involves tiny particles behaving in weird ways. And this odd activity has opened up the potential of a whole new world of scientific superpower. Its mind-boggling complexity is probably a factor in why quantum has ended up with a lower profile than tech's current rockstar - artificial intelligence (AI).
- Europe > France (0.15)
- South America (0.14)
- North America > Central America (0.14)
- (15 more...)
- Information Technology > Security & Privacy (0.95)
- Health & Medicine (0.71)
- Leisure & Entertainment (0.71)
- Government > Regional Government > Europe Government (0.48)
Quantifying truth and authenticity in AI-assisted candidate evaluation: A multi-domain pilot analysis
Lee, Eldred, Worley, Nicholas, Takatsuji, Koshu
This paper presents a retrospective analysis of anonymized candidate-evaluation data collected during pilot hiring campaigns conducted through AlteraSF, an AI-native resume-verification platform. The system evaluates resume claims, generates context-sensitive verification questions, and measures performance along quantitative axes of factual validity and job fit, complemented by qualitative integrity detection. Across six job families and 1,700 applications, the platform achieved a 90-95% reduction in screening time and detected measurable linguistic patterns consistent with AI-assisted or copied responses. The analysis demonstrates that candidate truthfulness can be assessed not only through factual accuracy but also through patterns of linguistic authenticity. The results suggest that a multi-dimensional verification framework can improve both hiring efficiency and trust in AI-mediated evaluation systems.
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- Europe > Eastern Europe (0.04)
- Asia > Southeast Asia (0.04)
What to Know About the Shocking Louvre Jewelry Heist
In just seven minutes, the thieves took off with crown jewels containing thousands of diamonds along with other precious gems. Could the French TV series have been prophetic? The show envisioned a heist at the Louvre, an event that became reality on the morning of October 19, when a group of professional thieves managed to break into the world-famous Paris museum. In just seven minutes, they stole a host of priceless French crown jewels.
- Europe > France (0.30)
- North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.05)
- North America > United States > California (0.05)
- (3 more...)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.94)
- Retail (0.86)
- Government > Regional Government (0.71)
- Information Technology (0.93)
- Leisure & Entertainment > Games > Computer Games (0.93)
Mars: Situated Inductive Reasoning in an Open-World Environment
Xiaojuan Tang
Large Language Models (LLMs) trained on massive corpora have shown remarkable success in knowledge-intensive tasks. Yet, most of them rely on pre-stored knowledge. Inducing new general knowledge from a specific environment and performing reasoning with the acquired knowledge, known as situated inductive reasoning, is crucial and challenging for machine intelligence. In this paper, we design Mars, an interactive environment devised for situated inductive reasoning. It introduces counter-commonsense game mechanisms by modifying terrain, survival setting and task dependency while adhering to certain principles.
- Asia > China (0.04)
- North America > United States > Massachusetts (0.04)
- North America > Montserrat (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Workflow (0.93)
- Research Report > New Finding (0.92)
- Materials > Metals & Mining > Diamonds (0.46)
- Materials > Metals & Mining > Coal (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.83)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Materials > Metals & Mining (0.96)
- Leisure & Entertainment > Sports > Golf (0.95)
- Leisure & Entertainment > Games (0.72)