
 topology


NeuroMAS: Multi-Agent Systems as Neural Networks with Joint Reinforcement Learning

arXiv.org Machine Learning

Multi-agent language systems are often built as hand-designed workflows, where agents are assigned semantic roles and communication protocols are specified in advance. We propose NeuroMAS, a method that first treats a multi-agent language system as a trainable and scalable neural-network-like architecture with LLM agents as nodes and intermediate textual signals as edges. In NeuroMAS, agent nodes are role-free but structure-aware: the topology only determines how information can flow in general, while reinforcement learning training determines how nodes communicate, specialize, and coordinate. This formulation shifts multi-agent design from workflow engineering toward architecture design, where depth, width, connectivity, and growth protocol become scalable sources of capability. Further, we provide a theoretical perspective showing why such modular textual computation is more parameter-efficient when tasks admit hierarchical decompositions. Experiments show that NeuroMAS improves significantly over both inference-time and trained multi-agent baselines. We further find that organizational scaling is path-dependent: larger systems can be challenging to train from scratch, but become feasible when grown progressively from smaller trained systems. These results suggest that learned neural multi-agent systems are a promising scaling axis for LLMs.
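As a rough illustration of the "agents as nodes, textual signals as edges" framing, the sketch below wires stub agents into a layered network parameterized by depth and width. It is not the NeuroMAS implementation: the names `make_stub_agent`, `build_layers`, and `run_network` are hypothetical, the agents are plain string functions standing in for LLM calls, full connectivity between consecutive layers is an assumption, and no reinforcement learning is performed.

```python
from typing import Callable, List

# An agent maps the text messages it receives to one outgoing text message.
Agent = Callable[[List[str]], str]


def make_stub_agent(name: str) -> Agent:
    """Hypothetical stand-in for an LLM agent: it only records what it received."""
    def agent(messages: List[str]) -> str:
        return f"{name}({' | '.join(messages)})"
    return agent


def build_layers(depth: int, width: int) -> List[List[Agent]]:
    """Role-free nodes arranged in `depth` layers of `width` agents each."""
    return [
        [make_stub_agent(f"n{layer}_{i}") for i in range(width)]
        for layer in range(depth)
    ]


def run_network(layers: List[List[Agent]], task: str) -> List[str]:
    """Forward pass: every agent reads all messages from the previous layer
    (assumed full connectivity) and emits one message to the next layer."""
    signals = [task]
    for layer in layers:
        signals = [agent(signals) for agent in layer]
    return signals


layers = build_layers(depth=2, width=2)
outputs = run_network(layers, "task")
```

In this toy setup the topology fixes only who can read whose messages; in the method described above, what each node actually writes would be shaped by training rather than by a hand-assigned role.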


Design, Cups, and Blankets. A Free-Energy-Principle-Based Approach to Product Design

arXiv.org Machine Learning

Classical design theory treats the type of an object as a given: the designer decides in advance that this will be a cup, then optimizes its parameters. This paper argues that object type is not a presupposition but an inference, something that can be determined from physical data and functional requirements jointly. We call this problem requirement-steered interface type inference and show that it is inexpressible within existing design frameworks. This paper makes two contributions that are jointly necessary and individually incomplete. The first is the problem itself, which classical design cannot pose because it presupposes the very thing our problem seeks to determine. The second is C-DMBD, a constrained extension of the Dynamic Markov Blanket Detection algorithm, which makes requirement-steered inference computationally tractable. Drawing on the free-energy principle and active inference, established frameworks in theoretical neuroscience and Bayesian mechanics, we model a product's surface as a Markov blanket: the minimal boundary through which all causal exchange between object and environment must pass. Different blanket structures correspond to different object types; different parameterizations of the same structure correspond to different functional modes of the same type. This paper is a proof of concept and a theoretical proposal. It reframes design as inference rather than optimization, and as a relation between generative models rather than a specification of parameters.



Re-Think and Re-Design Graph Neural Networks in Spaces of Continuous Graph Diffusion Functionals

Neural Information Processing Systems

Graphs are ubiquitous in various domains, such as social networks and biological systems. Despite the great successes of graph neural networks (GNNs) in modeling and analyzing complex graph data, the inductive bias of the locality assumption, under which information is exchanged only between directly connected nodes, limits the ability of GNNs to capture long-range dependencies and global patterns in graphs. Inspired by the classic Brachistochrone problem, we seek to devise a new inductive bias for cutting-edge graph applications and present a general framework through the lens of variational analysis. The backbone of our framework is a two-way mapping between the discrete GNN model and a continuous diffusion functional, which allows us to design application-specific objective functions in the continuous domain and engineer discrete deep models with mathematical guarantees. First, we address over-smoothing in current GNNs.
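The two limitations named above, locality and over-smoothing, can be seen in a small numerical sketch. The example assumes the simplest possible message-passing rule (each node averages its own feature with those of its neighbours) on a six-node path graph; this rule and the helper names `mean_aggregate` and `spread` are illustrative assumptions, not the model proposed in the paper.

```python
def mean_aggregate(adj, x):
    """One round of local message passing: each node averages its own
    feature with the features of its direct neighbours."""
    out = []
    for i, nbrs in enumerate(adj):
        group = [i] + nbrs
        out.append(sum(x[j] for j in group) / len(group))
    return out


def spread(x):
    """Gap between the largest and smallest node feature."""
    return max(x) - min(x)


# Path graph 0 - 1 - 2 - 3 - 4 - 5, given as adjacency lists.
adj = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
x0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # signal placed on node 0 only

# Locality: after one layer the signal has reached only node 1.
one_hop = mean_aggregate(adj, x0)

# Over-smoothing: stacking many layers drives all features together.
x = x0
spreads = [spread(x)]
for _ in range(100):
    x = mean_aggregate(adj, x)
    spreads.append(spread(x))
```

Here `one_hop` is still zero at nodes 2 through 5, so a single layer cannot see beyond direct neighbours, while `spreads` shrinks toward zero as layers are stacked, which is the over-smoothing effect: depth buys reach at the cost of distinguishable node features.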


A The Algorithm

Neural Information Processing Systems

Construct optimistic MDP M̃k and compute optimistic policy πk (Algorithm 5). When the counter is 0 it gets (s,a), i.e., Ωi,e = (s,a,). When the counter is 1, we take (s,a) from ωn and map them to ωn/2 while eliminating half of the factors in consideration with the consistent scope Zi chosen by the policy (stored in factor 2d + 1 + i of the state). It is handled similarly to the previous item, but considers the reward consistent scope zj chosen by the policy (stored in factor 3d + 1 + j of the state). For i = 1, ..., d, the i-th factor is taken from factor i of the previous state when the counter is not log n + 1, and otherwise performs the optimistic transition of factor i. Denote the value in the last factor of Ωi,e by ve, the policy's chosen scope by Zi (stored in factor 2d + 1 + i of the state) and the policy's chosen next state direction by s′i (stored in factor d + 1 + i of the state).


A Topological Perspective on Causal Inference

Neural Information Processing Systems

This paper presents a topological learning-theoretic perspective on causal inference by introducing a series of topologies defined on general spaces of structural causal models (SCMs). As an illustration of the framework we prove a topological causal hierarchy theorem, showing that substantive assumption-free causal inference is possible only in a meager set of SCMs. Thanks to a known correspondence between open sets in the weak topology and statistically verifiable hypotheses, our results show that inductive assumptions sufficient to license valid causal inferences are statistically unverifiable in principle. Similar to no-free-lunch theorems for statistical inference, the present results clarify the inevitability of substantial assumptions for causal inference. An additional benefit of our topological approach is that it easily accommodates SCMs with infinitely many variables. We finally suggest that the framework may be helpful for the positive project of exploring and assessing alternative causal-inductive assumptions.




0aa800df4298539770b57824afc77a89-Supplemental-Conference.pdf

Neural Information Processing Systems

Figure 8: The average values during training of the two components used in the criteria for neuron importance in the input layer: the absolute gradient of the loss with respect to the reconstructed samples and the sum of the absolute weights connected to a neuron.

A.1 Implementation Details

For all datasets, we used standard normalization that scales the features to have zero mean and standard deviation of one. The architecture of the autoencoder consists of one hidden layer with sigmoid activation. A linear activation is used for the output layer. We use a hidden layer of 200 neurons for all datasets.
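The setup described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the population standard deviation is used for normalization, weights are drawn uniformly from [-0.1, 0.1], biases start at zero, and the function names (`standardize`, `init_autoencoder`, `forward`, `input_strength`) and the toy data are hypothetical. No training is performed, and the gradient-based component of the importance criterion is not computed because the loss is not specified here.

```python
import math
import random


def standardize(rows):
    """Scale each feature to zero mean and unit (population) standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    stds = [s if s > 0 else 1.0 for s in stds]  # guard against constant features
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_autoencoder(n_features, n_hidden=200, seed=0):
    """One hidden layer of 200 neurons, as stated above; the init scheme is assumed."""
    rng = random.Random(seed)
    w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_hidden)]
    w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_features)]
    return w_enc, [0.0] * n_hidden, w_dec, [0.0] * n_features


def forward(params, x):
    """Sigmoid hidden layer followed by a linear output layer."""
    w_enc, b_enc, w_dec, b_dec = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_enc, b_enc)]
    recon = [sum(w * h for w, h in zip(row, hidden)) + b
             for row, b in zip(w_dec, b_dec)]
    return hidden, recon


def input_strength(w_enc):
    """Sum of absolute weights connected to each input neuron."""
    return [sum(abs(row[j]) for row in w_enc) for j in range(len(w_enc[0]))]


data = [[1.0, 10.0, 3.0], [2.0, 20.0, 3.5], [3.0, 30.0, 1.0], [4.0, 40.0, 0.5]]
z = standardize(data)
params = init_autoencoder(n_features=3)
hidden, recon = forward(params, z[0])
strength = input_strength(params[0])
```

`strength` corresponds to the second component named in the Figure 8 caption (the sum of absolute weights connected to an input neuron); in a trained model it would be combined with the gradient term to rank input features.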


Graphs

Neural Information Processing Systems

A.1 Construction of a D-EquiStatic graph

A practical method to construct a D-EquiStatic weight matrix $W$ is provided in Alg. 3. We should mention that the "while" loop in the algorithm is adopted to guarantee $\|\Pi W\|_2 \le \rho$.

Construct $W$ by (3); end
Output: The D-EquiStatic weight matrix $W$ and its associated basis indices $\{u_t\}_{t=1}^{M}$

A.2 Proof of Theorem 1

Before showing properties of $W$ defined by (3), we provide two lemmas as follows. Referring to Theorem 1.6 of [32], we have the following result for a sequence of random matrices.

Lemma 1 (Matrix Bernstein) Consider a sequence of $K$ independent random $n \times n$ matrices $\{M_i\}_{i=1}^{K}$. Assume that each random matrix satisfies $E[M_i] = 0$, and $\|M_i\|_2 \le R$ almost surely.

Theorem 1 (Formal restatement of Theorem 1) Let $A^{(u)}$ be defined by (2) for any $u \in [n-1]$ and the D-EquiStatic weight matrix $W$ be constructed by (3) with $\{u_i\}_{i=1}^{M}$ drawn independently and identically from the uniform distribution on $[n-1]$.