Dennis Whyte's fusion quest

MIT Technology Review

When the US Department of Energy announced that it would stop funding the tokamak at MIT's Plasma Science and Fusion Center, Dennis Whyte considered giving up on fusion research. But then he had a brainstorm--and challenged his students to bring the idea to life. A full-scale high-temperature superconducting magnet designed and built by Commonwealth Fusion Systems and MIT's Plasma Science and Fusion Center (PSFC) has demonstrated a record-breaking 20-tesla magnetic field, making it the strongest fusion magnet in the world. Ever since nuclear fusion was discovered in the 1930s, scientists have wondered whether we could somehow replicate and harness the phenomenon behind starlight--the smashing together of hydrogen atoms to form helium and a stupendous amount of clean energy. Fusing hydrogen would yield vastly more energy than simply burning it. Unlike nuclear fission, which powers the world's 440 atomic reactors, hydrogen fusion produces no harmful radiation, only neutrons that are captured and added back to the reaction.


Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Xia, Peng, Zeng, Kaide, Liu, Jiaqi, Qin, Can, Wu, Fang, Zhou, Yiyang, Xiong, Caiming, Yao, Huaxiu

arXiv.org Artificial Intelligence

Large Language Model (LLM) Agents, often trained with Reinforcement Learning (RL), are constrained by a dependency on human-curated data, limiting scalability and tethering AI to human knowledge. Existing self-evolution frameworks offer an alternative but are typically restricted by the model's inherent capabilities and single-round interactions, hindering the development of complex curricula involving tool use or dynamic reasoning. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data through multi-step co-evolution and seamless tool integration. Agent0 establishes a symbiotic competition between two agents initialized from the same base LLM: a curriculum agent that proposes increasingly challenging frontier tasks, and an executor agent that learns to solve them. We integrate external tools to enhance the executor's problem-solving capacity; this improvement, in turn, pressures the curriculum agent to construct more complex, tool-aware tasks. Through this iterative process, Agent0 establishes a self-reinforcing cycle that continuously produces high-quality curricula. Empirically, Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks. Code is available at https://github.com/aiming-lab/Agent0.
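The co-evolution loop the abstract describes can be sketched in a few lines. The following is a toy illustration, not the paper's implementation: the function names (`propose_task`, `solve_task`, `co_evolve`) and the arithmetic "tasks" are invented stand-ins, and the real executor would call external tools (e.g. a code interpreter) rather than exact arithmetic.

```python
import random

# Toy sketch of an Agent0-style co-evolution loop. All names and the
# arithmetic task family are illustrative, not the paper's actual code.

def propose_task(difficulty, rng):
    # Curriculum agent: propose a task whose size grows with difficulty.
    return [rng.randint(1, 10) for _ in range(difficulty + 2)]

def solve_task(terms):
    # Executor agent: here the "tool" is exact arithmetic; in Agent0 the
    # executor's tool use is what pressures the curriculum to get harder.
    return sum(terms)

def co_evolve(rounds, seed=0):
    rng = random.Random(seed)
    difficulty, history = 0, []
    for _ in range(rounds):
        terms = propose_task(difficulty, rng)
        answer = solve_task(terms)
        correct = answer == sum(terms)   # self-verification stand-in
        history.append((difficulty, correct))
        if correct:
            difficulty += 1              # executor success drives the
                                         # curriculum toward harder tasks
    return difficulty, history

final_difficulty, history = co_evolve(rounds=5)
```

The point of the sketch is the feedback structure: the same signal that rewards the executor (solving a task) also advances the curriculum, producing the self-reinforcing cycle the abstract claims.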


Compiling to recurrent neurons

Velez-Ginorio, Joey, Amin, Nada, Kording, Konrad, Zdancewic, Steve

arXiv.org Artificial Intelligence

Discrete structures are currently second-class in differentiable programming. Since functions over discrete structures lack overt derivatives, differentiable programs do not differentiate through them and limit where they can be used. For example, when programming a neural network, conditionals and iteration cannot be used everywhere; they can break the derivatives necessary for gradient-based learning to work. This limits the class of differentiable algorithms we can directly express, imposing restraints on how we build neural networks and differentiable programs more generally. However, these restraints are not fundamental. Recent work shows conditionals can be first-class, by compiling them into differentiable form as linear neurons. Similarly, this work shows iteration can be first-class -- by compiling to linear recurrent neurons. We present a minimal typed, higher-order and linear programming language with iteration called $\textsf{Cajal}(\multimap, \mathbb{2}, \mathbb{N})$. We prove its programs compile correctly to recurrent neurons, allowing discrete algorithms to be expressed in a differentiable form compatible with gradient-based learning. With our implementation, we conduct two experiments where we link these recurrent neurons against a neural network solving an iterative image transformation task. This determines part of its function prior to learning. As a result, the network learns faster and with greater data-efficiency relative to a neural network programmed without first-class iteration. A key lesson is that recurrent neurons enable a rich interplay between learning and the discrete structures of ordinary programming.






A Proofs

Neural Information Processing Systems

When CondInstanceNorm++ is added, we name them "CondResBlock" and "CondRefineBlock". We use the ELU activation function [25] throughout all architectures. The latter is configured according to Techniques 1-4. The learning rates and batch sizes are provided in Appendix B.1 and Table 4. We use EMA with momentum 0.9 to smooth the curves in Figure 1. We can interpolate between two different samples from NCSN/NCSNv2 by interpolating the Gaussian random noise injected by annealed Langevin dynamics. As indicated by Figs. 4 and 8, EMA can stabilize training and remove artifacts from samples. FID scores should be interpreted with caution because they may not align well with human judgement.
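The annealed Langevin dynamics mentioned above can be sketched compactly. The following is a toy version under stated assumptions: the `score` function here is a hand-written stand-in (an NCSN-style sampler would query a trained score network), and the step-size schedule `eps * (sigma / sigmas[-1])**2` follows the common NCSN convention, with illustrative constants.

```python
import math
import random

# Toy sketch of annealed Langevin dynamics (NCSN-style). The score
# function and all constants are illustrative stand-ins.

def annealed_langevin(score, sigmas, x0, eps=2e-3, steps=10, seed=0):
    rng = random.Random(seed)
    x = x0
    for sigma in sigmas:                          # anneal large -> small noise
        alpha = eps * (sigma / sigmas[-1]) ** 2   # per-level step size
        for _ in range(steps):
            z = rng.gauss(0.0, 1.0)               # the injected Gaussian noise
            x = x + 0.5 * alpha * score(x, sigma) + math.sqrt(alpha) * z
    return x

# Hand-written score of a 1-D Gaussian target N(mu, sigma^2 + 1):
# it pulls samples toward mu, more strongly as sigma shrinks.
mu = 2.0
score = lambda x, sigma: (mu - x) / (sigma ** 2 + 1.0)
sample = annealed_langevin(score, sigmas=[1.0, 0.5, 0.1], x0=0.0)
```

Interpolating the stream of `z` draws between two runs (rather than redrawing them) is what produces the sample interpolations described above.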



Dual Manifold Adversarial Robustness: Defense against $L_p$ and non-$L_p$ Adversarial Attacks (Appendix A: OM-ImageNet Details, A.1 Overview)

Neural Information Processing Systems

Figure 1: Visual comparison between original images and projected images. All classification models are trained using two P6000 GPUs with a batch size of 64 for 20 epochs. We study how different choices affect the robustness of the trained networks against unseen attacks. Table 4: Classification accuracy against unseen attacks applied to the OM-ImageNet test set. Table 5: Classification accuracy against known (PGD-50 and OM-PGD-50) and unseen attacks. Brighter colors indicate larger absolute differences.