POCO: Scalable Neural Forecasting through Population Conditioning

Neural Information Processing Systems

Predicting future neural activity is a core challenge in modeling brain dynamics, with applications ranging from scientific investigation to closed-loop neurotechnology. While recent models of population activity emphasize interpretability and behavioral decoding, neural forecasting--particularly across multi-session, spontaneous calcium recordings--remains underexplored. We introduce POCO, a unified forecasting model that combines a lightweight univariate forecaster with a population-level encoder to capture both neuron-specific and brain-wide dynamics in calcium imaging recordings. Trained across five calcium imaging datasets spanning zebrafish, mice, and C. elegans, POCO achieves state-of-the-art accuracy at cellular resolution in spontaneous behaviors.



Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs

Neural Information Processing Systems

Reinforcement Learning (RL) traditionally relies on scalar reward signals, limiting its ability to leverage the rich semantic knowledge often available in real-world tasks. In contrast, humans learn efficiently by combining numerical feedback with language, prior knowledge, and common sense. We introduce Prompted Policy Search (ProPS), a novel RL method that unifies numerical and linguistic reasoning within a single framework. Unlike prior work that augments existing RL components with language, ProPS places a large language model (LLM) at the center of the policy optimization loop--directly proposing policy updates based on both reward feedback and natural language input. We show that LLMs can perform numerical optimization in-context, and that incorporating semantic signals, such as goals, domain knowledge, and strategy hints, can lead to more informed exploration and sample-efficient learning. ProPS is evaluated across 15 Gymnasium tasks, spanning classic control, Atari games, and MuJoCo environments, and compared to seven widely-adopted RL algorithms (e.g., PPO, SAC, TRPO). It outperforms all baselines on 8 out of 15 tasks and demonstrates substantial gains when provided with domain knowledge.


Steven Spielberg claims aliens have already visited Earth - now scientists say he might be right

Daily Mail - Science & tech



Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation

Neural Information Processing Systems

Feature Transformation (FT) crafts new features from original ones via mathematical operations to enhance dataset expressiveness for downstream models. However, existing FT methods exhibit critical limitations: discrete search struggles with enormous combinatorial spaces, impeding practical use; and continuous search, being highly sensitive to initialization and step sizes, often becomes trapped in local optima, restricting global exploration. To overcome these limitations, DIFFT redefines FT as a reward-guided generative task. It first learns a compact and expressive latent space for feature sets using a Variational Auto-Encoder (VAE). A Latent Diffusion Model (LDM) then navigates this space to generate high-quality feature embeddings, its trajectory guided by a performance evaluator towards task-specific optima. This synthesis of global distribution learning (from LDM) and targeted optimization (reward guidance) produces potent embeddings, which a novel semi-autoregressive decoder efficiently converts into structured, discrete features, preserving intra-feature dependencies while allowing parallel inter-feature generation. Extensive experiments on 14 benchmark datasets show DIFFT consistently outperforms state-of-the-art baselines in predictive accuracy and robustness, with significantly lower training and inference times. Our code and data are publicly available at https://github.com/NanxuGong/DIFFT



A Counterfactual Semantics for Hybrid Dynamical Systems

Neural Information Processing Systems

Models of hybrid dynamical systems are widely used to answer questions about the causes and effects of dynamic events in time. Unfortunately, existing causal reasoning formalisms lack support for queries involving the dynamically triggered, discontinuous interventions that characterize hybrid dynamical systems. This mismatch can lead to ad-hoc and error-prone causal analysis workflows in practice. To bridge the gap between the needs of hybrid systems users and current causal inference capabilities, we develop a rigorous counterfactual semantics by formalizing interventions as transformations to the constraints of hybrid systems. Unlike interventions in a typical structural causal model, however, interventions in hybrid systems can easily render the model ill-posed. Thus, we identify mild conditions under which our interventions maintain solution existence, uniqueness, and measurability by making explicit connections to established hybrid systems theory. To illustrate the utility of our framework, we formalize a number of canonical causal estimands and explore a case study on the probabilities of causation with applications to fishery management. Our work simultaneously expands the modeling possibilities available to causal inference practitioners and begins to unlock decades of causality research for users of hybrid systems.


GUARD: Constructing Realistic Two-Player Matrix and Security Games for Benchmarking Game-Theoretic Algorithms

Neural Information Processing Systems

Game-theoretic algorithms are commonly benchmarked on recreational games, classical constructs from economic theory such as congestion and dispersion games, or entirely random game instances. While the past two decades have seen the rise of security games - grounded in real-world scenarios like patrolling and infrastructure protection - their practical evaluation has been hindered by limited access to the datasets used to generate them. In particular, although the structural components of these games (e.g., patrol paths derived from maps) can be replicated, the critical data defining target values - central to utility modeling - remain inaccessible. In this paper, we introduce a flexible framework that leverages open-access datasets to generate realistic matrix and security game instances. These include animal movement data for modeling anti-poaching scenarios and demographic and infrastructure data for infrastructure protection. Our framework allows users to customize utility functions and game parameters, while also offering a suite of preconfigured instances. We provide theoretical results highlighting the degeneracy and limitations of benchmarking on random games, and empirically compare our generated games against random baselines across a variety of standard algorithms for computing Nash and Stackelberg equilibria, including linear programming, incremental strategy generation, and self-play with no-regret learners.
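As a rough illustration of the last technique named above, self-play with no-regret learners, the following minimal sketch runs regret matching for both players of a two-player zero-sum matrix game. Regret matching is only one well-known no-regret learner and is not necessarily the one used in the paper. The payoff matrix, the function names, and the iteration count are all hypothetical choices made for this example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def regret_matching_strategy(regrets: List[float]) -> List[float]:
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret yet: fall back to uniform play.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(payoff: Matrix, iterations: int) -> Tuple[List[float], List[float]]:
    """Return the time-averaged strategies of the row and column player.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative (zero-sum assumption).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret = [0.0] * n_rows
    col_regret = [0.0] * n_cols
    row_sum = [0.0] * n_rows
    col_sum = [0.0] * n_cols

    for _ in range(iterations):
        x = regret_matching_strategy(row_regret)
        y = regret_matching_strategy(col_regret)

        # Expected payoff of each pure action against the opponent's mix.
        row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_expected = sum(x[i] * row_values[i] for i in range(n_rows))
        col_expected = sum(y[j] * col_values[j] for j in range(n_cols))

        # Regret of an action = its value minus the value actually obtained.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_expected
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_expected
            col_sum[j] += y[j]

    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Hypothetical 2x2 zero-sum game, not taken from the paper.
example_payoff = [[2.0, -1.0], [-1.0, 1.0]]
row_avg, col_avg = self_play(example_payoff, 20000)
print(row_avg, col_avg)
```

For this particular matrix the unique equilibrium has both players choosing their first action with probability 2/5, so the averaged strategies should drift toward that mix as the number of iterations grows. The current iterates themselves need not converge; only the averages carry the guarantee.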


Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective

Neural Information Processing Systems

Time Series Forecasting has made significant progress with the help of the patching technique, which partitions a time series into multiple patches to retain contextual semantic information in a representation space beneficial for modeling long-term dependencies. However, conventional patching partitions a time series into adjacent patches, which yields a fixed representation space and therefore insufficiently expressive representations. In this paper, we pioneer the exploration of constructing a selective representation space to flexibly include the most informative patches for forecasting. Specifically, we propose the Selective Representation Space (SRS) module, which utilizes the learnable Selective Patching and Dynamic Reassembly techniques to adaptively select and shuffle the patches from the contextual time series, aiming to fully exploit the information in the contextual time series and enhance the forecasting performance of patch-based models. To demonstrate the effectiveness of the SRS module, we propose a simple yet effective SRSNet consisting of SRS and an MLP head, which achieves state-of-the-art performance on real-world datasets from multiple domains. Furthermore, as a novel plug-and-play module, SRS can also enhance the performance of existing patch-based models.
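To make the contrast in this abstract concrete, the sketch below shows conventional adjacent patching next to a hand-specified selection of patches. It is only an illustration under stated assumptions: in SRS the selection and ordering are learned, whereas here the start indices are supplied manually. The function names `adjacent_patches` and `select_patches` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def adjacent_patches(series: Sequence[float], patch_len: int,
                     stride: int) -> List[List[float]]:
    """Conventional patching: fixed-length windows at a regular stride."""
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [list(series[i:i + patch_len])
            for i in range(0, last_start + 1, stride)]


def select_patches(series: Sequence[float], patch_len: int,
                   starts: Sequence[int]) -> List[List[float]]:
    """Gather patches at arbitrary start positions, in the given order.

    Hypothetical stand-in for a learned selection: the caller picks the
    start indices by hand, so patches may overlap or appear out of
    temporal order.
    """
    patches = []
    for s in starts:
        if s < 0 or s + patch_len > len(series):
            raise ValueError(f"patch starting at {s} falls outside the series")
        patches.append(list(series[s:s + patch_len]))
    return patches


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(adjacent_patches(series, patch_len=4, stride=4))
print(select_patches(series, patch_len=4, starts=[3, 0]))
```

With adjacent patching the set of patches is fully determined by the patch length and stride, which is the fixed representation space the abstract refers to. The second function shows how the same series admits other patch sets once start positions and order are free to vary.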


SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Neural Information Processing Systems

LLM-based agents have shown promising capabilities in a growing range of software engineering (SWE) tasks. However, advancing this field faces two critical challenges. First, high-quality training data is scarce, especially data that reflects real-world SWE scenarios, where agents must interact with development environments, execute code, and adapt their behavior based on the outcomes of their actions. Existing datasets are either limited to one-shot code generation or comprise small, manually curated collections of interactive tasks, lacking both scale and diversity. Second, the lack of fresh interactive SWE tasks hampers the evaluation of rapidly improving models, as static benchmarks quickly become outdated due to contamination issues. To address these limitations, we introduce a novel, automated, and scalable pipeline to continuously extract real-world interactive SWE tasks from diverse GitHub repositories. Using this pipeline, we construct SWE-rebench, a public dataset comprising over 21,000 interactive Python-based SWE tasks, suitable for reinforcement learning of SWE agents at scale. Additionally, we use a continuous supply of fresh tasks collected with the SWE-rebench methodology to build a contamination-free benchmark for agentic software engineering. We compare the results of various LLMs on this benchmark to their results on SWE-bench Verified and show that the performance of some language models might be inflated due to contamination issues.