Mixture-of-Recursions (MoR)

Neural Information Processing Systems

Scaling language models unlocks impressive capabilities, but the accompanying computational and memory demands make both training and deployment expensive. Existing efficiency efforts typically target either parameter sharing or adaptive computation, leaving open the question of how to attain both simultaneously. We introduce Mixture-of-Recursions (MoR), a unified framework that combines the two axes of efficiency inside a single Recursive Transformer. MoR reuses a shared stack of layers across recursion steps to achieve parameter efficiency, while lightweight routers enable adaptive token-level thinking by dynamically assigning different recursion depths to individual tokens. This allows MoR to focus quadratic attention computation only among tokens still active at a given recursion depth, further improving memory access efficiency by selectively caching only their key-value pairs. Beyond these core mechanisms, we also propose a KV sharing variant that reuses KV pairs from the first recursion, specifically designed to further decrease memory footprint. Across model scales ranging from 135M to 1.7B parameters, MoR forms a new Pareto frontier: at equal training FLOPs and smaller model sizes, it significantly lowers validation perplexity and improves few-shot accuracy, while delivering higher throughput compared with vanilla and existing recursive baselines.
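The two ideas in this abstract, one shared block reused at every recursion step and a per-token recursion depth, can be illustrated with a toy sketch. This is not the authors' implementation: `shared_block` is a hypothetical stand-in for the shared layer stack (it mixes each active state with the mean of the active states, loosely mimicking interaction restricted to active tokens), and `depths` stands in for the output of a router, which is not modeled here.

```python
def shared_block(active_states):
    """Hypothetical stand-in for the shared layer stack.

    Mixes every active state with the mean of the active states only,
    so tokens that already exited do not take part in the update.
    """
    mean = sum(active_states) / len(active_states)
    return [0.5 * s + 0.5 * mean for s in active_states]


def mixture_of_recursions(states, depths, max_depth):
    """Apply the same block repeatedly; token i is updated for depths[i] steps.

    `depths` is assumed to come from a router (not modeled in this sketch).
    """
    states = list(states)
    for step in range(max_depth):
        # Only tokens whose assigned depth exceeds the current step stay active.
        active = [i for i, d in enumerate(depths) if d > step]
        if not active:
            break
        updated = shared_block([states[i] for i in active])
        for i, new_state in zip(active, updated):
            states[i] = new_state
    return states


result = mixture_of_recursions([0.0, 2.0, 4.0], depths=[1, 2, 3], max_depth=3)
print(result)
```

The same function (same parameters, in a real model) runs at every step, and the work done per step shrinks as tokens exit, which is the combination of parameter sharing and adaptive computation the abstract describes.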


The Best Art TVs

WIRED

After you're done bingeing your favorite movies, these art televisions are designed to liven up your wall. I have watched so many times I've lost count. For years, the Andrew Wyeth painting took a prominent place in my living room. Art televisions, the category of TV pioneered by Samsung's Frame and now rapidly expanding with models from many of the major TV producers, combine my passion for movies and shows with an even greater interest in art and photography. When it comes to their performance as televisions, even the best art TVs don't have quite the same punchy colors and speedy refresh rates found on similarly priced standard televisions. However, when the movie is finished, art TVs look a lot better in a room, displaying art and photos on a matte screen with a pristine clarity in a space otherwise wasted by a black box. Art televisions are typically just a little more expensive than a normal 4K TV.


Pump.Fun's Bounties Platform Is a Black Hole of Circular Grifting

WIRED

The crypto platform claims you can "pay anyone to do anything," from quitting a job on camera to getting a memecoin-themed tattoo. But it mostly seems like people trying to scam each other. Would you run into a crowded university lecture hall, fart into a megaphone, and bellow "fartcoin" at the top of your lungs? If so, and should you have the means to document this stunt on video, preferably capturing the audience's reaction, you may claim a reward of approximately $1,000. The money, of course, will be dispensed in fartcoin, a meme cryptocurrency trading at a little over 10 cents at time of publication, with a total market capitalization hovering around $130 million. Such is the promise of Pump.Fun GO, a new feature on Pump.Fun, one of the fastest-growing crypto businesses of the past few years.


Flow-Based Policy for Online Reinforcement Learning

Neural Information Processing Systems

We argue that in addition to training signals, enhancing the expressiveness of the policy class is crucial for the performance gains in RL. Flow-based generative models offer such potential, excelling at capturing complex, multimodal action distributions. However, their direct application in online RL is challenging due to a fundamental objective mismatch: standard flow training optimizes for static data imitation, while RL requires value-based policy optimization through a dynamic buffer, leading to difficult optimization landscapes.


BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks

Neural Information Processing Systems

Binary (0-1) integer programming (BIP) is pivotal in scientific domains requiring discrete decision-making. With the advance of AI computing, recent works explore neural network-based solvers for integer linear programming (ILP) problems. Yet, they lack scalability for tackling nonlinear challenges. To handle nonlinearities, state-of-the-art Branch-and-Cut solvers employ linear relaxations, leading to exponential growth in auxiliary variables and severe computation limitations. To overcome these limitations, we propose BIPNN (Binary Integer Programming Neural Network), an unsupervised learning framework to solve nonlinear BIP problems via hypergraph neural networks (HyperGNN). Specifically, (I) BIPNN reformulates BIPs, which are constrained, discrete, and nonlinear (sin, log, exp) optimization problems, into unconstrained, differentiable, and polynomial loss functions.


Knowledge Starts with Practice: Knowledge-Aware Exercise Generative Recommendation with Adaptive Multi-Agent Cooperation

Neural Information Processing Systems

Adaptive learning, which requires an in-depth understanding of students' learning processes and rational planning of learning resources, plays a crucial role in intelligent education. However, how to effectively model these two processes and seamlessly integrate them poses significant implementation challenges for adaptive learning. As core learning resources, exercises have the potential to diagnose students' knowledge states during the learning processes and provide personalized learning recommendations to strengthen students' knowledge, thereby serving as a bridge to boost student-oriented adaptive learning. Therefore, we introduce a novel task called Knowledge-aware Exercise Generative Recommendation (KEGR). It aims to dynamically infer students' knowledge states from their past exercise responses and generate customized new exercises. To achieve KEGR, we propose an adaptive multi-agent cooperation framework, called ExeGen, inspired by the excellent reasoning and generative capabilities of LLM-based AI agents. Specifically, ExeGen coordinates four specialized agents for supervision, knowledge state perception, exercise generation, and quality refinement through an adaptive loop workflow pipeline. More importantly, we devise two enhancement mechanisms in ExeGen: 1) A human-simulated knowledge perception mechanism mimics students' cognitive processes and generates interpretable knowledge state descriptions via demonstration-based In-Context Learning (ICL). In this mechanism, a dual-matching strategy is further designed to retrieve highly relevant demonstrations for reliable ICL reasoning.


Scalable, Explainable and Provably Robust Anomaly Detection with One-Step Flow Matching

Neural Information Processing Systems

We introduce Time-Conditioned Contraction Matching (TCCM), a novel method for semi-supervised anomaly detection in tabular data. TCCM is inspired by flow matching, a recent generative modeling framework that learns velocity fields between probability distributions and has shown strong performance compared to diffusion models and generative adversarial networks. Instead of directly applying flow matching as originally formulated, TCCM builds on its core idea--learning velocity fields between distributions--but simplifies the framework by predicting a time-conditioned contraction vector toward a fixed target (the origin) at each sampled time step. This design offers three key advantages: (1) a lightweight and scalable training objective that removes the need for solving ordinary differential equations during training and inference; (2) an efficient scoring strategy called one time-step deviation, which quantifies deviation from expected contraction behavior in a single forward pass, addressing the inference bottleneck of existing continuous-time models such as DTE (a diffusion-based model with leading anomaly detection accuracy but heavy inference cost); and (3) explainability and provable robustness, as the learned velocity field operates directly in input space, making the anomaly score inherently feature-wise attributable; moreover, the score function is Lipschitz-continuous with respect to the input, providing theoretical guarantees under small perturbations. Extensive experiments on the ADBench benchmark show that TCCM strikes a favorable balance between detection accuracy and inference cost, outperforming state-of-the-art methods--especially on high-dimensional and large-scale datasets.
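The scoring idea in this abstract can be sketched in a few lines, under loud assumptions. The sketch assumes the expected contraction vector at a point x is -x (pointing at the origin) and that the anomaly score is the squared deviation of the predicted vector from that target, computed in one forward pass. The "model" here is a hypothetical toy, not a neural network and not the authors' method: it only memorizes per-feature ranges of normal training data and ignores the time input `t`.

```python
def fit_clipped_contraction(train_rows):
    """Toy stand-in for a trained velocity model (an assumption, not TCCM itself).

    It remembers the per-feature min/max of normal data and predicts the
    contraction -x accurately only inside that range.
    """
    bounds = [(min(col), max(col)) for col in zip(*train_rows)]

    def predict_velocity(x, t):
        # t is accepted for interface parity but unused by this toy model.
        return [-min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    return predict_velocity


def one_step_deviation(predict_velocity, x, t=1.0):
    """Score x by how far the predicted vector is from the assumed target -x.

    Returns the total score and its per-feature contributions, which is what
    makes the score attributable to individual input features.
    """
    velocity = predict_velocity(x, t)
    per_feature = [(v + xi) ** 2 for v, xi in zip(velocity, x)]
    return sum(per_feature), per_feature


model = fit_clipped_contraction([[0.0, 1.0], [1.0, 2.0], [0.5, 1.5]])
print(one_step_deviation(model, [0.5, 1.5]))  # point inside the normal range
print(one_step_deviation(model, [0.5, 5.0]))  # second feature is out of range
```

Because the deviation is computed directly in input space, each entry of the per-feature list shows how much that feature contributes to the score, and no differential equation needs to be solved at inference time.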


Microsoft's new Outlook now supports offline email attachments

PCWorld

PCWorld reports that Microsoft's new Outlook app for Windows 11 now supports adding email attachments while offline, with messages automatically sending once internet reconnects. This update addresses a key limitation for users who frequently work without reliable internet connections or need to prepare emails in advance. Despite these improvements, many users continue preferring the classic Outlook or web version over Microsoft's repackaged web app approach. For several years, Microsoft has been trying to persuade users to move from the classic Outlook app for Windows 11 to the new version, which is (essentially) a repackaged web app. The latest update offers improved offline support, making it possible to add attachments to emails without an internet connection. The emails will send automatically once you've got a working connection again. According to Windows Latest, Microsoft has been testing this feature since October 2025, with a wider rollout only now beginning. Although the "new" Outlook has improved recently, many still prefer the classic app or the web version. This article originally appeared on our sister publication PC för Alla and was translated and localized from Swedish.


Inspired by Ukraine, and worried by China: Taiwan teaches its citizens how to fly drones

The Guardian

In a small, crowded room in Taipei, Pan Chien-chin is trying to keep a drone hovering steadily. Imagining himself flying a plane, he gently nudges controller joysticks to guide the insect-like device as it hums through the air. Cheers break out as Pan, who has never flown a drone before, steers it around a rectangular course marked by traffic cones without crashing. Around him are about two dozen fellow trainees, all signed up for the same course: Taiwan's first civil defence drone training programme. "The war in Ukraine has really changed how drones are used," says Pan, 48, a food company worker. "It's like giving myself another skill, something I can use if it's ever needed one day," he adds.


MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

Neural Information Processing Systems

Existing AutoML systems have advanced the automation of machine learning (ML); however, they still require substantial manual configuration and expert input, particularly when handling multimodal data. We introduce MLZero, a novel multi-agent framework powered by Large Language Models (LLMs) that enables end-to-end ML automation across diverse data modalities with minimal human intervention. A cognitive perception module is first employed, transforming raw multimodal inputs into perceptual context that effectively guides the subsequent workflow. To address key limitations of LLMs, such as hallucinated code generation and outdated API knowledge, we enhance the iterative code generation process with semantic and episodic memory. MLZero demonstrates superior performance on MLE-Bench Lite, outperforming all competitors in both success rate and solution quality, securing six gold medals. Additionally, when evaluated on our Multimodal AutoML Agent Benchmark, which includes 25 more challenging tasks spanning diverse data modalities, MLZero outperforms the competing methods by a large margin with a success rate of 0.92 (+263.6%).