
From Self-Check to Consensus: Bayesian Strategic Decoding in Large Language Models

Neural Information Processing Systems

Large Language Models exhibit logical inconsistency across multi-turn inference processes, undermining correctness in complex inferential tasks. Challenges arise from ensuring that outputs align with both factual correctness and human intent.


OligoGym: Curated Datasets and Benchmarks for Oligonucleotide Drug Discovery

Neural Information Processing Systems

Oligonucleotide therapeutics offer great potential to address previously undruggable targets and enable personalized medicine. However, their progress is often hindered by insufficient safety and efficacy profiles. Predictive modeling and machine learning could significantly accelerate oligonucleotide drug discovery by identifying suboptimal compounds early on, but their application in this area lags behind other modalities. A key obstacle to the adoption of machine learning in the field is the scarcity of readily accessible and standardized datasets for model development, as data are often scattered across diverse experiments with inconsistent molecular representations. To overcome this challenge, we introduce OligoGym, a curated collection of standardized, machine learning-ready datasets encompassing various oligonucleotide therapeutic modalities and endpoints. We used OligoGym to benchmark diverse classical and deep learning methods, establishing performance baselines for each dataset across different featurization techniques, model configurations, and splitting strategies. Our work represents a crucial first step in creating a more unified framework for oligonucleotide therapeutic dataset generation and model training.


Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness-Generalization Perspective

Neural Information Processing Systems

Graph Neural Networks (GNNs) have achieved great success but are often considered to be challenged by varying levels of homophily in graphs. Recent empirical studies have surprisingly shown that homophilic GNNs can perform well across datasets of different homophily levels with proper hyperparameter tuning, but the underlying theory and effective architectures remain unclear. To advance GNN universality across varying homophily, we theoretically revisit GNN message passing and uncover a novel smoothness-generalization dilemma, where increasing hops inevitably enhances smoothness at the cost of generalization. This dilemma hinders learning in high-order homophilic neighborhoods and all heterophilic ones, where generalization is critical due to complex neighborhood class distributions that are sensitive to shifts induced by noise or sparsity. To address this, we introduce the Inceptive Graph Neural Network (IGNN) built on three simple yet effective design principles, which alleviate the dilemma by enabling distinct hop-wise generalization alongside improved overall generalization with adaptive smoothness. Benchmarking against 30 baselines demonstrates IGNN's superiority and reveals notable universality in certain homophilic GNN variants. Our code and datasets are available at https://github.com/galogm/IGNN.


Scaling Data-Driven Probabilistic Robustness Analysis for Semantic Segmentation Neural Networks

Neural Information Processing Systems

Semantic segmentation neural networks (SSNs) are increasingly essential in high-stakes fields such as medical imaging, autonomous driving, and environmental monitoring, where robustness to input uncertainties and adversarial examples is crucial for ensuring safety and reliability. However, traditional probabilistic verification methods struggle to scale effectively with the size and depth of modern SSNs, especially when dealing with their high-dimensional, structured inputs/outputs. As the output dimension increases, these methods tend to become overly conservative, resulting in unnecessarily restrictive safety guarantees. In this work, we propose a probabilistic, data-driven verification algorithm that is architecture-agnostic and scalable, capable of handling the high-dimensional outputs of SSNs without introducing conservative and loose guarantees. We leverage efficient sampling-based reachability analysis to explore the space of possible outputs while maintaining computational feasibility.


A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Neural Information Processing Systems

Polyak-Ruppert averaging is a widely used technique to achieve the optimal asymptotic variance of stochastic approximation (SA) algorithms, yet its high-probability performance guarantees remain underexplored in general settings. In this paper, we present a general framework for establishing non-asymptotic concentration bounds for the error of averaged SA iterates. Our approach assumes access to individual concentration bounds for the unaveraged iterates and yields a sharp bound on the averaged iterates. We also construct an example, showing the tightness of our result up to constant multiplicative factors. As direct applications, we derive tight concentration bounds for contractive SA algorithms and for algorithms such as temporal difference learning and Q-learning with averaging, obtaining new bounds in settings where traditional analysis is challenging.
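
As a rough illustration of the averaging technique this abstract refers to, the sketch below runs a scalar stochastic approximation recursion and keeps a running Polyak-Ruppert average of its iterates. It is not taken from the paper: the function name `averaged_sa`, the mean-estimation problem, the Gaussian noise, and the step size k^(-0.7) are all assumptions chosen to keep the example small.

```python
import random


def averaged_sa(target, n_steps, noise_std=1.0, step_exponent=0.7, seed=0):
    """Estimate `target` from noisy observations (hypothetical toy problem).

    Returns the last unaveraged iterate and the Polyak-Ruppert average
    of all iterates.
    """
    rng = random.Random(seed)
    x = 0.0    # unaveraged SA iterate
    avg = 0.0  # running average of x_1, ..., x_k
    for k in range(1, n_steps + 1):
        noisy_obs = target + rng.gauss(0.0, noise_std)
        step = k ** (-step_exponent)  # assumed step-size schedule
        x += step * (noisy_obs - x)   # SA update toward the noisy observation
        avg += (x - avg) / k          # incremental mean of the iterates
    return x, avg


if __name__ == "__main__":
    last, averaged = averaged_sa(target=2.0, n_steps=5000)
    print("last iterate:", last)
    print("averaged iterate:", averaged)
```

In this toy setting the averaged iterate typically fluctuates less than the last iterate, which is the behaviour that high-probability bounds on averaged iterates aim to quantify.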


This Time is Different: An Observability Perspective on Time Series Foundation Models

Neural Information Processing Systems

We introduce TOTO, a time series forecasting foundation model with 151 million parameters. TOTO uses a modern decoder-only architecture coupled with architectural innovations designed to account for specific challenges found in multivariate observability time series data. TOTO's pre-training corpus is a mixture of observability data, open datasets, and synthetic data, and is 4-10× larger than those of leading time series foundation models. Additionally, we introduce BOOM, a large-scale benchmark consisting of 350 million observations across 2,807 real-world time series. For both TOTO and BOOM, we source observability data exclusively from Datadog's own telemetry and internal observability metrics. Extensive evaluations demonstrate that TOTO achieves state-of-the-art performance on both BOOM and established general-purpose time series forecasting benchmarks.


OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Neural Information Processing Systems

Large Language Model (LLM)-based multi-agent systems show promise for automating real-world tasks but struggle to transfer across domains due to their domain-specific nature. Current approaches face two critical shortcomings: they require complete architectural redesign and full retraining of all components when applied to new domains. We introduce WORKFORCE, a hierarchical multi-agent framework that decouples strategic planning from specialized execution through a modular architecture comprising: (i) a domain-agnostic Planner for task decomposition, (ii) a Coordinator for subtask management, and (iii) specialized Workers with domain-specific tool-calling capabilities.


SpaceX overtakes Amazon in value as post-IPO rally reaches 49%

The Japan Times

Its market capitalization is roughly $8 billion higher than Amazon's at the close of trading on Tuesday, making it the fifth-largest stock in the world.


Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Neural Information Processing Systems

This paradigm has had limited impact in value-based reinforcement learning (RL), where improvements are often driven by small models trained in a single-task context. This is because in multi-task RL sparse rewards and gradient conflicts make optimization of temporal difference brittle. Practical workflows for generalist policies therefore avoid online training, instead cloning expert trajectories or distilling collections of single-task policies into one agent. In this work, we show that the use of high-capacity value models trained via cross-entropy and conditioned on learnable task embeddings addresses the problem of task interference in online RL, allowing for robust and scalable multi-task training. We test our approach on 7 multi-task benchmarks with over 280 unique tasks, spanning high degree-of-freedom humanoid control and discrete vision-based RL. We find that, despite its simplicity, the proposed approach leads to state-of-the-art single and multi-task performance, as well as sample-efficient transfer to new tasks.
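
One common way to train a value function with cross-entropy is to represent each scalar target as a distribution over fixed bins. The sketch below shows a two-hot encoding and the matching loss. The abstract does not say which categorical encoding the authors use, so the two-hot scheme, the bin layout, and the helper names `two_hot`, `cross_entropy`, and `expected_value` are assumptions for illustration only.

```python
import math


def two_hot(value, bin_centers):
    """Spread a scalar over its two neighbouring bins (assumed encoding)."""
    value = min(max(value, bin_centers[0]), bin_centers[-1])  # clip to range
    probs = [0.0] * len(bin_centers)
    for i in range(len(bin_centers) - 1):
        left, right = bin_centers[i], bin_centers[i + 1]
        if left <= value <= right:
            weight = (value - left) / (right - left)
            probs[i] = 1.0 - weight
            probs[i + 1] = weight
            break
    return probs


def cross_entropy(logits, target_probs):
    """Cross-entropy between softmax(logits) and a target distribution."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -sum(p * (l - log_z) for l, p in zip(logits, target_probs))


def expected_value(logits, bin_centers):
    """Decode a scalar value as the mean of the predicted distribution."""
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return sum(w / z * c for w, c in zip(weights, bin_centers))


if __name__ == "__main__":
    bins = [0.0, 1.0, 2.0]
    target = two_hot(0.25, bins)
    print("target distribution:", target)
    print("loss at uniform logits:", cross_entropy([0.0, 0.0, 0.0], target))
```

The loss replaces a squared-error regression on the scalar target with a classification loss over bins, and the scalar estimate is recovered from the predicted distribution by taking its mean.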


STNet: Spectral Transformation Network for Solving Operator Eigenvalue Problem

Neural Information Processing Systems

Operator eigenvalue problems play a critical role in various scientific fields and engineering applications, yet numerical methods are hindered by the curse of dimensionality. Recent deep learning methods provide an efficient approach to address this challenge by iteratively updating neural networks. These methods' performance relies heavily on the spectral distribution of the given operator: larger gaps between the operator's eigenvalues will improve precision, thus tailored spectral transformations that leverage the spectral distribution can enhance their performance. Based on this observation, we propose the Spectral Transformation Network (STNet). During each iteration, STNet uses approximate eigenvalues and eigenfunctions to perform spectral transformations on the original operator, turning it into an equivalent but easier problem. Specifically, we employ deflation projection to exclude the subspace corresponding to already solved eigenfunctions, thereby reducing the search space and avoiding converging to existing eigenfunctions. Additionally, our filter transform magnifies eigenvalues in the desired region and suppresses those outside, further improving performance. Extensive experiments demonstrate that STNet consistently outperforms existing learning-based methods, achieving state-of-the-art performance in accuracy.
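
The deflation idea can be illustrated in a classical finite-dimensional setting. The sketch below is not STNet: it swaps the neural eigenfunction solver for plain power iteration on a small symmetric matrix, and after each eigenvector is found it projects that direction out of every later iterate so the search cannot converge to it again. All helper names are hypothetical, and the code assumes a symmetric positive definite matrix with distinct eigenvalues.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def project_out(v, basis):
    """Deflation step: remove components along already found eigenvectors."""
    for b in basis:
        coeff = dot(v, b)
        v = [vi - coeff * bi for vi, bi in zip(v, b)]
    return v


def top_eigenpairs(a, k, n_iter=500):
    """Power iteration with deflation for the k largest eigenpairs."""
    values, vectors = [], []
    for _ in range(k):
        start = [1.0 / (i + 1) for i in range(len(a))]  # generic start vector
        v = normalize(project_out(start, vectors))
        for _ in range(n_iter):
            v = normalize(project_out(matvec(a, v), vectors))
        values.append(dot(v, matvec(a, v)))  # Rayleigh quotient
        vectors.append(v)
    return values, vectors


if __name__ == "__main__":
    vals, _ = top_eigenpairs([[2.0, 1.0], [1.0, 2.0]], k=2)
    print("eigenvalues:", vals)
```

Without the `project_out` call, the second run would converge to the dominant eigenvector again, which is the failure mode that deflation is meant to avoid.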