Goto

Collaborating Authors

 Technology


Thought Communication in Multiagent Collaboration

Neural Information Processing Systems

Natural language has long enabled human cooperation, but its lossy, ambiguous, and indirect nature limits the potential of collective intelligence. While machines are not subject to these constraints, most LLM-based multi-agent systems still rely solely on natural language, exchanging tokens or their embeddings. To go beyond language, we introduce a new paradigm,, which enables agents to interact directly mind-to-mind, akin to telepathy. To uncover these latent thoughts in a principled way, we formalize the process as a general latent variable model, where agent states are generated by an unknown function of underlying thoughts. We prove that, in a nonparametric setting without auxiliary information, both shared and private latent thoughts between any pair of agents can be identified. Moreover, the global structure of thought sharing, including which agents share which thoughts and how these relationships are structured, can also be recovered with theoretical guarantees.


On Agnostic PAC Learning in the Small Error Regime

Neural Information Processing Systems

Binary classification in the classic PAC model exhibits a curious phenomenon: Empirical Risk Minimization (ERM) learners are suboptimal in the realizable case yet optimal in the agnostic case. Roughly speaking, this owes itself to the fact that non-realizable distributions $\\mathcal{D}$ are more difficult to learn than realizable distributions -- even when one discounts a learner's error by $\\mathrm{err}(h^\\ast_\\mathcal{D})$, i.e., the error of the best hypothesis in $\\mathcal{H}$. Thus, optimal agnostic learners are permitted to incur excess error on (easier-to-learn) distributions $\\mathcal{D}$ for which $\\tau = \\mathrm{err}(h^\\ast_\\mathcal{D})$ is small.


FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic

Neural Information Processing Systems

Low-bit floating-point (FP) formats, such as FP8, provide significant acceleration and memory savings in model training thanks to native hardware support on modern GPUs and NPUs. However, we analyze that FP8 quantization offers speedup primarily for large-dimensional matrix multiplications, while inherent quantization overheads diminish speedup when applied to low-rank adaptation (LoRA), which uses small-dimensional matrices for efficient fine-tuning of large language models (LLMs). To address this limitation, we propose FALQON, a novel framework that eliminates the quantization overhead from separate LoRA computational paths by directly merging LoRA adapters into an FP8-quantized backbone during fine-tuning. Furthermore, we reformulate the forward and backward computations for merged adapters to significantly reduce quantization overhead, and introduce a row-wise proxy update mechanism that efficiently integrates substantial updates into the quantized backbone. Experimental evaluations demonstrate that FALQON achieves approximately a 3$\times$ training speedup over existing quantized LoRA methods with a similar level of accuracy, providing a practical solution for efficient large-scale model fine-tuning. Moreover, FALQON's end-to-end FP8 workflow removes the need for post-training quantization, facilitating efficient deployment.


RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis

Neural Information Processing Systems

Clinical diagnosis is a highly specialized discipline requiring both domain expertise and strict adherence to rigorous guidelines. While current AI-driven medical research predominantly focuses on knowledge graphs or natural text pretraining paradigms to incorporate medical knowledge, these approaches primarily rely on implicitly encoded knowledge within model parameters, neglecting task-specific knowledge required by diverse downstream tasks.


OrdShap: Feature Position Importance for Sequential Black-Box Models

Neural Information Processing Systems

Sequential deep learning models excel in domains with temporal or sequential dependencies, but their complexity necessitates post-hoc feature attribution methods for understanding their predictions. While existing techniques quantify feature importance, they inherently assume fixed feature ordering -- conflating the effects of (1) feature values and (2) their positions within input sequences. To address this gap, we introduce OrdShap, a novel attribution method that disentangles these effects by quantifying how a model's predictions change in response to permuting feature position. We establish a game-theoretic connection between OrdShap and Sanchez-Bergantiños values, providing a theoretically grounded approach to position-sensitive attribution. Empirical results from health, natural language, and synthetic datasets highlight OrdShap's effectiveness in capturing feature value and feature position attributions, and provide deeper insight into model behavior.


HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell data

Neural Information Processing Systems

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Our work is motivated by single-cell data which can have very high-dimensionality --exceeding the capabilities of existing methods for point clouds which are mostly tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one data set for every patient), necessitating models that can process large, high-dimensional point-clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric and topological information. In contrast, HiPoNet models the point-cloud as a set of higher-order simplicial complexes, with each particular complex being created using a reweighting of features. This method thus generates multiple constructs corresponding to different views of high-dimensional data, which in biology offers the possibility of disentangling distinct cellular processes. It then employs simplicial wavelet transforms to extract multiscale features, capturing both local and global topology from each view. We show that geometric and topological information is preserved in this framework both theoretically and empirically.


Impartial Selection with Predictions

Neural Information Processing Systems

We study the selection of agents based on mutual nominations, a theoretical problem with many applications from committee selection to AI alignment. As agents both select and are selected, they may be incentivized to misrepresent their true opinion about the eligibility of others to influence their own chances of selection. Impartial mechanisms circumvent this issue by guaranteeing that the selection of an agent is independent of the nominations cast by that agent. Previous research has established strong bounds on the performance of impartial mechanisms, measured by their ability to approximate the number of nominations for the most highly nominated agents. We study to what extent the performance of impartial mechanisms can be improved if they are given a prediction of a set of agents receiving a maximum number of nominations. Specifically, we provide bounds on the consistency and robustness of such mechanisms, where consistency measures the performance of the mechanisms when the prediction is correct and robustness its performance when the prediction is incorrect. For the general setting where up to $k$ agents are to be selected and agents nominate any number of other agents, we give a mechanism with consistency $1-O\big(\frac{1}{k}\big)$ and robustness $1-\frac{1}{e}-O\big(\frac{1}{k}\big)$. For the special case of selecting a single agent based on a single nomination per agent, we prove that $1$-consistency can be achieved while guaranteeing $\frac{1}{2}$-robustness. A close comparison with previous results shows that (asymptotically) optimal consistency can be achieved with little to no sacrifice in terms of robustness.


BrainEC-LLM: Brain Effective Connectivity Estimation by Multiscale Mixing LLM

Neural Information Processing Systems

Pre-trained Large language models (LLMs) have shown impressive advancements in functional magnetic resonance imaging (fMRI) analysis and causal discovery. Considering the unique nature of the causal discovery field, which focuses on extracting causal graphs from observed data, research on LLMs in this field is still at an early exploratory stage. As a subfield of causal discovery, effective connectivity (EC) has received even less attention, and LLM-based approaches in EC remain unexplored. Existing LLM-based approaches for causal discovery typically rely on iterative querying to assess the causal influence between variable pairs, without any model adaptation or fine-tuning, making them ill-suited for handling the cross-modal gap and complex causal structures. To this end, we propose BrainEC-LLM, the first method to fine-tune LLMs for estimating brain EC from fMRI data. Specifically, multiscale decomposition mixing module decomposes fMRI time series data into short-term and long-term multiscale trends, then mixing them in bottom-up (fine to coarse) and top-down (coarse to fine) manner to extract multiscale temporal variations. And cross attention is applied with pre-trained word embeddings to ensure consistency between the fMRI input and pre-trained natural language. The experimental results on simulated and real resting-state fMRI datasets demonstrate that BrainEC-LLM can achieve superior performance when compared to state-of-the-art baselines.


Less Greedy Equivalence Search

Neural Information Processing Systems

Greedy Equivalence Search (GES) is a classic score-based algorithm for causal discovery from observational data. In the sample limit, it recovers the Markov equivalence class of graphs that describe the data. Still, it faces two challenges in practice: computational cost and finite-sample accuracy. In this paper, we develop Less Greedy Equivalence Search (LGES), a variant of GES that retains its theoretical guarantees while partially addressing these limitations. LGES modifies the greedy step; rather than always applying the highest-scoring insertion, it avoids edge insertions between variables for which the score implies some conditional independence.


Crucible: Quantifying the Potential of Control Algorithms through LLM Agents

Neural Information Processing Systems

Control algorithms in production environments typically require domain experts to tune their parameters and logic for specific scenarios. However, existing research predominantly focuses on algorithmic performance under ideal or default configurations, overlooking the critical aspect of Tuning Potential. To bridge this gap, we introduce \texttt{Crucible}, an agent that employs an LLM-driven, multi-level expert simulation to turn algorithms and defines a formalized metric to quantitatively evaluate their Tuning Potential. We demonstrate \texttt{Crucible}'s effectiveness across a wide spectrum of case studies, from classic control tasks to complex computer systems, and validate its findings in a real-world deployment. Our experimental results reveal that \texttt{Crucible} systematically quantifies the tunable space across different algorithms. Furthermore, \texttt{Crucible} provides a new dimension for algorithm analysis and design, which ultimately leads to performance improvements.