



Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

arXiv.org Artificial Intelligence

Reinforcement Learning, particularly through policy gradient methods, has played a central role in enabling reasoning capabilities of Large Language Models. However, the optimization stability of policy gradients in this setting remains understudied. As a result, existing implementations often resort to conservative hyperparameter choices to ensure stability, which requires more training samples and increases computational costs. Hence, models that reliably track the underlying optimization dynamics and feed them back into training would enable more sample-efficient regimes and more scalable post-training. We address this gap by formalizing the stochastic optimization problem of policy gradients with explicit consideration of second-order geometry. We propose a tractable computational framework that tracks and leverages curvature information during policy updates. We further employ this framework to design interventions in the optimization process through data selection. The resultant algorithm, Curvature-Aware Policy Optimization (CAPO), identifies samples that contribute to unstable updates and masks them out. Theoretically, we establish monotonic improvement guarantees under realistic assumptions. On standard math reasoning benchmarks, we empirically show that CAPO ensures stable updates under aggressive learning regimes where baselines catastrophically fail. With minimal intervention (rejecting fewer than 8% of tokens), CAPO achieves up to 30x improvement in sample efficiency over standard GRPO for LLM reasoning.
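The masking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how CAPO decides which tokens are unstable, so the mask is taken here as a given input. The function name `masked_pg_loss` and the plain REINFORCE-style surrogate are assumptions made for illustration, not the paper's objective or GRPO's clipped loss.

```python
def masked_pg_loss(token_logprobs, token_advantages, keep_mask):
    """REINFORCE-style surrogate loss averaged over kept tokens only.

    keep_mask[i] is False for tokens that some external (hypothetical)
    criterion has flagged as contributing to unstable updates.
    """
    if not (len(token_logprobs) == len(token_advantages) == len(keep_mask)):
        raise ValueError("all inputs must have the same length")

    # Keep only the (log-prob, advantage) pairs that were not rejected.
    kept = [
        (logprob, advantage)
        for logprob, advantage, keep in zip(token_logprobs, token_advantages, keep_mask)
        if keep
    ]
    if not kept:
        # Every token was rejected, so there is nothing to optimize.
        return 0.0

    # Negative sign: minimizing this loss increases advantage-weighted log-probability.
    return -sum(logprob * advantage for logprob, advantage in kept) / len(kept)


# Illustrative values only; the third token is treated as rejected.
loss = masked_pg_loss(
    token_logprobs=[-0.5, -1.0, -2.0, -0.25],
    token_advantages=[1.0, 1.0, -1.0, 2.0],
    keep_mask=[True, True, False, True],
)
```

Averaging over the kept tokens rather than all tokens keeps the loss scale comparable when the rejection rate varies between batches; whether CAPO normalizes this way is not stated in the abstract.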


Equivariant Geometric Scattering Networks via Vector Diffusion Wavelets

arXiv.org Machine Learning

We introduce a novel version of the geometric scattering transform for geometric graphs containing scalar and vector node features. This new scattering transform has desirable symmetries with respect to rigid-body roto-translations (i.e., $SE(3)$-equivariance) and may be incorporated into a geometric GNN framework. We empirically show that our equivariant scattering-based GNN achieves comparable performance to other equivariant message-passing-based GNNs at a fraction of the parameter count.


CINDES: Classification induced neural density estimator and simulator

arXiv.org Machine Learning

Neural network-based methods for (un)conditional density estimation have recently gained substantial attention, as various neural density estimators have outperformed classical approaches in real-data experiments. Despite these empirical successes, implementation can be challenging due to the need to ensure non-negativity and unit-mass constraints, and theoretical understanding remains limited. In particular, it is unclear whether such estimators can adaptively achieve faster convergence rates when the underlying density exhibits a low-dimensional structure. This paper addresses these gaps by proposing a structure-agnostic neural density estimator that is (i) straightforward to implement and (ii) provably adaptive, attaining faster rates when the true density admits a low-dimensional composition structure. Another key contribution of our work is to show that the proposed estimator integrates naturally into generative sampling pipelines, most notably score-based diffusion models, where it achieves provably faster convergence when the underlying density is structured. We validate its performance through extensive simulations and a real-data application.


Intelligent 5S Audit: Application of Artificial Intelligence for Continuous Improvement in the Automotive Industry

arXiv.org Artificial Intelligence

The evolution of the 5S methodology with the support of artificial intelligence techniques represents a significant opportunity to improve industrial organization audits in the automotive chain, making them more objective, efficient and aligned with Industry 4.0 standards. This work developed an automated 5S audit system based on large language models (LLMs), capable of assessing the five senses (Seiri, Seiton, Seiso, Seiketsu, Shitsuke) in a standardized way through intelligent image analysis. The system's reliability was validated using Cohen's concordance coefficient (κ = 0.75), showing strong alignment between the automated assessments and the corresponding human audits. The results indicate that the proposed solution contributes significantly to continuous improvement in automotive manufacturing environments, cutting audit time by 50% relative to the traditional process while maintaining the consistency of the assessments, with a 99.8% reduction in operating costs compared to traditional manual audits. The methodology presented establishes a new paradigm for integrating lean systems with emerging AI technologies, offering scalability for implementation in automotive plants of different sizes. The global automotive industry faces growing competitiveness challenges demanding maximized operational efficiency and production quality. The 5S methodology, recognized worldwide as the foundation for workplace organization and cleanliness, plays a strategic role in operational excellence.
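For readers unfamiliar with Cohen's concordance coefficient, the standard definition is κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement rate between two raters and p_e is the agreement expected by chance from each rater's label frequencies. The sketch below computes it for two label sequences. The function name and the example labels are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)

    # p_o: fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # p_e: chance agreement from the product of each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical audit outcomes for eight workstations (not from the paper).
human_audit = ["ok", "ok", "fail", "ok", "fail", "fail", "ok", "ok"]
automated_audit = ["ok", "ok", "fail", "ok", "ok", "fail", "ok", "fail"]
kappa = cohens_kappa(human_audit, automated_audit)
```

Because κ discounts chance agreement, it comes out lower than the raw agreement rate whenever chance agreement is above zero, which is why it is preferred over simple percent agreement when comparing automated and human audits.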


REAL: Reading Out Transformer Activations for Precise Localization in Language Model Steering

arXiv.org Artificial Intelligence

Inference-time steering aims to alter a large language model's (LLM's) responses without changing its parameters, but a central challenge is identifying the internal modules that most strongly govern the target behavior. Existing approaches often rely on simplistic cues or ad hoc heuristics, leading to suboptimal or unintended effects. We introduce REAL, a framework for identifying behavior-relevant modules (attention heads or layers) in Transformer models. For each module, REAL trains a vector-quantized autoencoder (VQ-AE) on its hidden activations and uses a shared, learnable codebook to partition the latent space into behavior-relevant and behavior-irrelevant subspaces. REAL quantifies a module's behavioral relevance by how well its VQ-AE encodings discriminate behavior-aligned from behavior-violating responses via a binary classification metric; this score guides both module selection and steering strength. We evaluate REAL across eight LLMs from the Llama and Qwen families and nine datasets spanning truthfulness enhancement, open-domain QA under knowledge conflicts, and general alignment tasks. REAL enables more effective inference-time interventions, achieving an average relative improvement of 20% (up to 81.5%) over the ITI method on truthfulness steering. In addition, the modules selected by REAL exhibit strong zero-shot generalization in cross-domain truthfulness-steering scenarios.



French troops board oil tanker linked to Russian 'shadow fleet'

BBC News

French soldiers have boarded an oil tanker believed to be part of Russia's shadow fleet, used to evade sanctions imposed because of the war in Ukraine. The Boracay left Russia last month and was off the coast of Denmark when unidentified drones forced the temporary closure of several airports last week. It has been anchored off western France for a few days. French President Emmanuel Macron said at an EU leaders' summit in Copenhagen on Wednesday that the crew had committed serious offences, but did not elaborate. Kremlin spokesman Dmitry Peskov said Russia had no knowledge of the vessel.


Danish PM warns that Russia is waging hybrid war on Europe

Al Jazeera

Macron, Meloni argue for caution in responding to Russian 'provocations'. Danish Prime Minister Mette Frederiksen has called on Europe to arm itself to respond to Russia's hybrid warfare, but other major continental leaders have argued for caution against getting trapped in a tit-for-tat cycle of escalation with Moscow. "I hope that everybody recognises now that there is a hybrid war and one day it's Poland, the other day it's Denmark, and next week it will probably be somewhere else that we see sabotage or we see drones flying," Frederiksen told reporters on Wednesday.


Interstellar object spotted 'bleeding' mysterious metals that defy science: 'It's extremely puzzling'

Daily Mail - Science & tech

The mysterious interstellar object speeding through our Solar System has been spotted shedding metals in a way that defies the rules of a comet.