A Differential and Pointwise Control Approach to Reinforcement Learning
Reinforcement learning (RL) in continuous state-action spaces remains challenging in scientific computing due to poor sample efficiency and lack of pathwise physical consistency. We introduce Differential Reinforcement Learning (Differential RL), a novel framework that reformulates RL from a continuous-time control perspective via a differential dual formulation. This induces a Hamiltonian structure that embeds physics priors and ensures consistent trajectories without requiring explicit constraints. To implement Differential RL, we develop Differential Policy Optimization (dfPO), a pointwise, stage-wise algorithm that refines local movement operators along the trajectory for improved sample efficiency and dynamic alignment. We establish pointwise convergence guarantees, a property not available in standard RL, and derive a competitive theoretical regret bound of O(K^{5/6}). Empirically, dfPO outperforms standard RL baselines on representative scientific computing tasks, including surface modeling, grid control, and molecular dynamics, under low-data and physics-constrained conditions.
Addressing Mark Imbalance in Integration-free Neural Marked Temporal Point Processes
Marked Temporal Point Process (MTPP) has been well studied to model the event distribution in marked event streams, which can be used to predict the mark and arrival time of the next event. However, existing studies overlook that the distribution of event marks is highly imbalanced in many real-world applications, with some marks being frequent but others rare. The imbalance poses a significant challenge to the performance of the next event prediction, especially for events of rare marks. To address this issue, we propose a thresholding method, which learns thresholds to tune the mark probability normalized by the mark's prior probability to optimize mark prediction, rather than predicting the mark directly based on the mark probability as in existing studies. In conjunction with this method, we predict the mark first and then the time. In particular, we develop a novel neural MTPP model to support effective time sampling and estimation of mark probability without computationally expensive numerical improper integration. Extensive experiments on real-world datasets demonstrate the superior performance of our solution against various baselines for the next event mark and time prediction.
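The thresholding idea described above can be illustrated with a small sketch. The abstract does not give the exact decision rule, so the rule used here (divide each mark's probability by its prior, then divide by a per-mark threshold and take the argmax) is an assumption, as are the function name `predict_mark` and the toy numbers; in the paper the thresholds are learned, whereas here they are fixed by hand.

```python
import numpy as np

def predict_mark(mark_probs, prior_probs, thresholds):
    """Pick a mark from prior-normalized probabilities (hypothetical rule).

    mark_probs:  model probability of each mark for the next event
    prior_probs: empirical frequency of each mark in the training data
    thresholds:  per-mark positive thresholds (assumed given, not learned here)
    """
    mark_probs = np.asarray(mark_probs, dtype=float)
    prior_probs = np.asarray(prior_probs, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # Normalizing by the prior removes the head start that frequent marks get.
    normalized = mark_probs / prior_probs
    # Assumed decision rule: largest normalized probability relative to its threshold.
    scores = normalized / thresholds
    return int(np.argmax(scores))

# Toy example: mark 0 is frequent (90% of events), mark 1 is rare (10%).
mark_probs = [0.70, 0.30]
prior_probs = [0.90, 0.10]
thresholds = [1.0, 1.0]

naive = int(np.argmax(mark_probs))  # direct prediction from mark probability
adjusted = predict_mark(mark_probs, prior_probs, thresholds)
print(naive, adjusted)
```

In this toy case the direct argmax picks the frequent mark, while the prior-normalized rule picks the rare mark, because 0.30 is three times the rare mark's prior whereas 0.70 is below the frequent mark's prior.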
Inference-Time Reward Hacking in Large Language Models
A common paradigm to improve the performance of large language models is optimizing for a reward model. Reward models assign a numerical score to an LLM's output that indicates, for example, how likely it is to align with user preferences or safety goals. However, reward models are never perfect. They inevitably function as proxies for complex desiderata such as correctness, helpfulness, and safety. By overoptimizing for a misspecified reward, we can subvert intended alignment goals and reduce overall performance - a phenomenon commonly referred to as reward hacking.
Attention on the Sphere
We introduce a generalized attention mechanism for spherical domains, enabling Transformer architectures to natively process data defined on the two-dimensional sphere - a critical need in fields such as atmospheric physics, cosmology, and robotics, where preserving spherical symmetries and topology is essential for physical accuracy. By integrating numerical quadrature weights into the attention mechanism, we obtain a geometrically faithful spherical attention that is approximately rotationally equivariant, providing strong inductive biases and leading to better performance than Cartesian approaches. To further enhance both scalability and model performance, we propose neighborhood attention on the sphere, which confines interactions to geodesic neighborhoods. This approach reduces computational complexity and introduces the additional inductive bias for locality, while retaining the symmetry properties of our method. We provide optimized CUDA kernels and memory-efficient implementations to ensure practical applicability. The method is validated on three diverse tasks: simulating shallow water equations on the rotating sphere, spherical image segmentation, and spherical depth estimation. Across all tasks, our spherical Transformers consistently outperform their planar counterparts, highlighting the advantage of geometric priors for learning on spherical domains.
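One way to read "integrating numerical quadrature weights into the attention mechanism" is to weight each key's softmax contribution by the area its grid point represents on the sphere. The NumPy sketch below follows that reading; it is an assumption, not the authors' implementation, and the names `sphere_quadrature_weights` and `quadrature_attention`, the Gauss-Legendre latitude grid, and the toy sizes are all chosen for illustration.

```python
import numpy as np

def sphere_quadrature_weights(nlat, nlon):
    """Quadrature weights for a Gauss-Legendre (latitude) x uniform (longitude) grid.

    Returns a flat array of length nlat * nlon whose entries sum to 4*pi,
    the surface area of the unit sphere.
    """
    _, w_lat = np.polynomial.legendre.leggauss(nlat)  # weights in cos(colatitude)
    return np.repeat(w_lat, nlon) * (2.0 * np.pi / nlon)

def quadrature_attention(q, k, v, quad_weights):
    """Softmax attention where key j contributes w_j * exp(q . k_j / sqrt(d)).

    q: (n_q, d), k: (n_k, d), v: (n_k, d_v), quad_weights: (n_k,)
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    unnorm = np.exp(logits) * quad_weights[None, :]
    attn = unnorm / unnorm.sum(axis=-1, keepdims=True)
    return attn @ v, attn

nlat, nlon = 8, 16
w = sphere_quadrature_weights(nlat, nlon)

rng = np.random.default_rng(0)
k = rng.normal(size=(nlat * nlon, 4))
v = rng.normal(size=(nlat * nlon, 3))
q = np.zeros((1, 4))  # a zero query gives identical logits for every key

out, attn = quadrature_attention(q, k, v, w)
```

With a zero query all logits are equal, so the attention weights reduce to the normalized quadrature weights and the output is the area-weighted mean of `v`. An unweighted softmax would instead return the plain mean over grid points, which over-represents the densely sampled polar regions.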
You Only Communicate Once: One-shot Federated Low-Rank Adaptation of MLLM
Multimodal Large Language Models (MLLMs) with Federated Learning (FL) can quickly adapt to privacy-sensitive tasks, but face significant challenges such as high communication costs and increased attack risks, due to their reliance on multi-round communication. To address this, One-shot FL (OFL) has emerged, aiming to complete adaptation in a single client-server communication. However, existing adaptive ensemble OFL methods still need more than one round of communication, because correcting heterogeneity-induced local bias relies on aggregated global supervision, meaning they still do not achieve true one-shot communication. In this work, we make the first attempt to achieve true one-shot communication for MLLMs under OFL, by investigating whether implicit (i.e., initial rather than aggregated) global supervision alone can effectively correct local training bias. Our key finding from the empirical study is that imposing directional supervision on local training substantially mitigates client conflicts and local bias. Building on this insight, we propose YOCO, in which directional supervision with sign-regularized LoRA B enforces global consistency, while sparsely regularized LoRA A preserves client-specific adaptability. Experiments demonstrate that YOCO cuts communication to 0.03% of multi-round FL while surpassing those methods in several multimodal scenarios and consistently outperforming all one-shot competitors.
MS-GS: Multi-Appearance Sparse-View 3D Gaussian Splatting in the Wild
In-the-wild photo collections often contain limited volumes of imagery and exhibit multiple appearances, e.g., taken at different times of day or seasons, posing significant challenges to scene reconstruction and novel view synthesis. Although recent adaptations of Neural Radiance Field (NeRF) and 3D Gaussian Splatting (3DGS) have improved in these areas, they tend to oversmooth and are prone to overfitting. In this paper, we present MS-GS, a novel framework designed with Multi-appearance capabilities in Sparse-view scenarios using 3DGS. To address the lack of support due to sparse initializations, our approach is built on the geometric priors elicited from monocular depth estimations. The key lies in extracting and utilizing local semantic regions with a Structure-from-Motion (SfM) points anchored algorithm for reliable alignment and geometry cues. Then, to introduce multi-view constraints, we propose a series of geometry-guided supervision steps at virtual views in pixel and feature levels to encourage 3D consistency and reduce overfitting. We also introduce a dataset and an in-the-wild experiment setting to set up more realistic benchmarks. We demonstrate that MS-GS achieves photorealistic renderings under various challenging sparse-view and multi-appearance conditions, and outperforms existing approaches significantly across different datasets.
Who will win the World Cup? Mathematician's 11 models predict four possible champions (but NOT England!)
England's World Cup journey begins tonight, but a mathematician warns that fans shouldn't get their hopes up. Dr Ari Joury, a particle physicist and founder of AI firm Wangari, created 11 different models to predict who will win this year's tournament. These digital tipsters crowned four different champions between them, but not a single one picked England. Seven models backed Spain, two singled out Argentina as the likeliest winner, while France and the Netherlands were each the favourite of one prediction system.
Smooth Quadratic Prediction Markets
When agents trade in a Duality-based Cost Function prediction market, they collectively implement the learning algorithm Follow-The-Regularized-Leader [Abernethy et al., 2013]. We ask whether other learning algorithms could be used to inspire the design of prediction markets. By decomposing and modifying the Duality-based Cost Function Market Maker's (DCFMM) pricing mechanism, we propose a new prediction market, called the Smooth Quadratic Prediction Market, that incentivizes agents to collectively implement general steepest gradient descent. Relative to the DCFMM, the Smooth Quadratic Prediction Market has a better worst-case monetary loss for AD securities while preserving axiom guarantees such as the existence of instantaneous price, information incorporation, expressiveness, no arbitrage, and a form of incentive compatibility. To motivate the application of the Smooth Quadratic Prediction Market, we independently examine agents' trading behavior under two realistic constraints: bounded budgets and buy-only securities. Finally, we provide an introductory analysis of an approach to facilitate adaptive liquidity using the Smooth Quadratic Prediction Market. Our results suggest future designs where the price update rule is separate from the fee structure, yet guarantees are preserved.
Snap unveils £1,995 smart glasses after previous flops
Snapchat's parent company has announced it is releasing new smart glasses, a decade after its original pair lost the company tens of millions of dollars. The new augmented reality (AR) glasses, called Specs, will allow users to see digital elements overlaid onto the world. They will cost £1,995 in the UK and $2,195 in the US when shipping begins this autumn. That makes them cheaper than Apple's Vision Pro mixed-reality headset and its $3,499 starting price, but far more than Meta's smart glasses, which start at $224. Evan Spiegel, co-founder and chief executive of Snap Inc, said the glasses marked the beginning of a new era in computing.