AITopics | Statistical Learning

Statistical Learning

GAMMA: Gated Multi-hop Message Passing for Homophily-Agnostic Node Representation in GNNs

Neural Information Processing Systems | Jun-14-2026, 20:42:50 GMT

The success of Graph Neural Networks (GNNs) leverages the homophily principle, where connected nodes share similar features and labels. However, this assumption breaks down in heterophilic graphs, where same-class nodes are often distributed across distant neighborhoods rather than immediate connections. Recent attempts expand the receptive field through multi-hop aggregation schemes that explicitly preserve intermediate representations from each hop distance. While effective at capturing heterophilic patterns, these methods require separate weight matrices per hop and feature concatenation, causing parameters to scale linearly with hop count. This leads to high computational complexity and GPU memory consumption. We propose Gated Multi-hop Message Passing (GAMMA), where nodes assess how relevant the aggregated information from their k-hop neighbors is. This assessment occurs through multiple refinement steps in which the node compares each hop's embedding with its current representation, allowing it to focus on the most informative hops. During the forward pass, GAMMA finds the optimal mix of multi-hop information local to each node using a single feature vector, without needing separate representations for each hop, thereby maintaining dimensionality comparable to single-hop GNNs. In addition, we propose a weight-sharing scheme that leverages a unified transformation for aggregated features from multiple hops, so the global heterophilic patterns specific to each hop are learned during training.
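
The gating idea in this abstract can be illustrated with a small sketch. The abstract does not specify the gate, so everything below is an assumption for illustration: the function name `gated_hop_mix`, the dot-product relevance score, the softmax normalization, and the choice to overwrite the node state at each refinement step are all hypothetical and are not taken from the paper.

```python
import math

def gated_hop_mix(h, hop_embeddings, steps=2):
    """Mix k hop embeddings into one vector of the same size as h.

    h: current node representation (list of floats).
    hop_embeddings: list of k vectors, one aggregated embedding per hop.
    Assumption: relevance is a dot product followed by a softmax gate.
    """
    gates = []
    for _ in range(steps):
        # Compare each hop's embedding with the current representation.
        scores = [sum(a * b for a, b in zip(h, e)) for e in hop_embeddings]
        # Numerically stable softmax over hops.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        gates = [x / total for x in exps]
        # Gate-weighted sum keeps a single feature vector (no concatenation).
        h = [sum(g * e[d] for g, e in zip(gates, hop_embeddings))
             for d in range(len(h))]
    return h, gates
```

The point of the sketch is only the shape of the computation: the output has the same dimensionality as a single-hop representation regardless of how many hops are mixed.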

artificial intelligence, information, machine learning, (18 more...)

Country: North America > United States > Florida (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Energy (0.92)
Information Technology (0.68)
Health & Medicine (0.67)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Localist Topographic Expert Routing: A Barrel Cortex-Inspired Modular Network for Sensorimotor Processing

Neural Information Processing Systems | Jun-14-2026, 20:41:09 GMT

Biological sensorimotor systems process information through spatially organized, functionally specialized modules. A canonical example is the rodent barrel cortex, in which each vibrissa (whisker) projects to a dedicated cortical column, forming a precise somatotopic map. This anatomical organization stands in stark contrast to the architectures of most artificial neural networks, which are typically monolithic or rely on expert-isolated mixture-of-experts (MoE) mechanisms. In this work, we introduce a brain-inspired modular architecture that treats the barrel cortex as a biologically constrained instantiation of an expert system. Each module (or "expert") corresponds to a cortical column composed of multiple neuron subtypes spanning vertical cortical layers.

artificial intelligence, barrel cortex, machine learning, (18 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Cascaded Language Models for Cost-Effective Human-AI Decision-Making

Neural Information Processing Systems | Jun-14-2026, 20:27:14 GMT

A challenge in human-AI decision-making is to balance three factors: the correctness of predictions, the cost of knowledge and reasoning complexity, and the confidence about whether to abstain from automated answers or escalate to human experts. In this work, we present a cascaded LLM decision framework that adaptively delegates tasks across multiple tiers of expertise: a base model for initial candidate answers, a more capable and knowledgeable (but costlier) large model, and a human expert for when the model cascade abstains.
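
The three-tier delegation described above can be sketched as a simple confidence-threshold cascade. This is a minimal illustration, not the paper's method: the abstract does not say how deferral is decided, so the thresholds, the per-tier costs, the `Decision` record, and the `cascade` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumption: a model is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]

@dataclass
class Decision:
    answer: str
    tier: str    # "base", "large", or "human"
    cost: float  # accumulated cost of every tier that was consulted

def cascade(question: str, base: Model, large: Model,
            human: Callable[[str], str],
            accept_base: float = 0.9, accept_large: float = 0.8,
            cost_base: float = 1.0, cost_large: float = 10.0,
            cost_human: float = 100.0) -> Decision:
    # Tier 1: cheap base model proposes a candidate answer.
    answer, conf = base(question)
    cost = cost_base
    if conf >= accept_base:
        return Decision(answer, "base", cost)
    # Tier 2: defer to the more capable, costlier model.
    answer, conf = large(question)
    cost += cost_large
    if conf >= accept_large:
        return Decision(answer, "large", cost)
    # Tier 3: the cascade abstains and escalates to a human expert.
    cost += cost_human
    return Decision(human(question), "human", cost)
```

Raising either threshold trades higher cost for fewer automated errors, which is the balance between correctness, cost, and abstention that the abstract describes.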

large language model, machine learning, natural language, (19 more...)

Country:

Asia (0.68)
Europe (0.46)
North America (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs

Neural Information Processing Systems | Jun-14-2026, 20:24:37 GMT

Machine learning-based decision support systems are increasingly deployed in clinical settings, where probabilistic scoring functions are used to inform and prioritize patient management decisions. However, widely used scoring rules, such as accuracy and AUC-ROC, fail to adequately reflect key clinical priorities, including calibration, robustness to distributional shifts, and sensitivity to asymmetric error costs. In this work, we propose a principled yet practical evaluation framework for selecting calibrated thresholded classifiers that explicitly accounts for uncertainty in class prevalences and domain-specific cost asymmetries. Building on the theory of proper scoring rules, particularly the Schervish representation, we derive an adjusted variant of cross-entropy (log score) that averages cost-weighted performance over clinically relevant ranges of class balance. The resulting evaluation is simple to apply, sensitive to clinical deployment conditions, and designed to prioritize models that are both calibrated and robust to real-world variations.

accuracy, decision support system, machine learning, (19 more...)

Country:

North America > United States > New York (0.46)
North America > United States > Massachusetts (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis

Neural Information Processing Systems | Jun-14-2026, 19:55:02 GMT

Pathology foundation models (PFMs) have emerged as powerful tools for analyzing whole slide images (WSIs). However, adapting these pretrained PFMs for specific clinical tasks presents considerable challenges, primarily because only weak (WSI-level) labels are available for gigapixel images, necessitating a multiple instance learning (MIL) paradigm for effective WSI analysis. This paper proposes a novel approach for single-GPU Task Adaptation of PFMs (TAPFM) that uses vision transformer (ViT) attention for MIL aggregation while optimizing for both feature representations and attention weights. The proposed approach maintains separate computational graphs for the MIL aggregator and the PFM to create stable training dynamics that align with downstream task objectives during end-to-end adaptation. Evaluated on mutation prediction tasks for bladder cancer and lung adenocarcinoma across institutional and The Cancer Genome Atlas (TCGA) cohorts, TAPFM consistently outperforms conventional approaches, with H-Optimus-0 (TAPFM) outperforming the benchmarks. TAPFM also effectively handles multi-label classification of actionable mutations. Thus, TAPFM makes adaptation of powerful pre-trained PFMs practical on standard hardware for various clinical applications.

machine learning, natural language, tapfm, (20 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

Neural Information Processing Systems | Jun-14-2026, 19:23:21 GMT

Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. First, most approaches rely on modeling continuous embeddings, which are not well-suited for inherently discrete data distributions.

artificial intelligence, machine learning, natural language, (18 more...)

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment (0.93)
Media > Music (0.67)
Transportation > Ground > Road (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

0fa694fb9f1e265117e8da75966820fe-Paper-Conference.pdf

Neural Information Processing Systems | Jun-14-2026, 19:11:57 GMT

We consider how to construct state abstractions compatible with a given set of abstract actions, to obtain a well-formed abstract Markov decision process (MDP). We show that the Bellman equation suggests that abstract states should represent distributions over states in the ground MDP; we characterize the conditions under which the resulting process is Markov and approximately model-preserving, derive an algorithm for constructing the abstract MDP, and apply it to visual chain and maze tasks. We generalize these results to the factored actions case, characterize the conditions that lead to factored abstract states, and apply the resulting algorithm to a visual grid and Montezuma's Revenge. These results provide a principled, powerful framework for learning neurosymbolic abstract Markov decision processes.

abstraction, artificial intelligence, machine learning, (16 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression

Neural Information Processing Systems | Jun-14-2026, 19:03:27 GMT

State-space models (SSMs), particularly Mamba, have emerged as an efficient Transformer alternative with linear complexity for long-sequence modeling. Recent empirical works demonstrate that Mamba's in-context learning (ICL) capabilities, a critical capacity for large foundation models, are competitive with those of Transformers. However, theoretical understanding of Mamba's ICL remains limited, restricting deeper insights into its underlying mechanisms. Even fundamental tasks such as linear regression ICL, widely studied as a standard theoretical benchmark for Transformers, have not been thoroughly analyzed in the context of Mamba. To address this gap, we study the training dynamics of Mamba on the linear regression ICL task. By developing novel techniques for tackling the non-convex optimization with gradient descent that arises from Mamba's structure, we establish an exponential convergence rate to the ICL solution and derive a loss bound comparable to that of Transformers. Importantly, our results reveal that Mamba can perform a variant of online gradient descent to learn the latent function in context. This mechanism differs from that of Transformers, which are typically understood to achieve ICL through gradient descent emulation. The theoretical results are verified by experimental simulation.
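
For readers unfamiliar with the reference algorithm, the following is a minimal sketch of plain online gradient descent on squared loss for in-context linear regression. It is not the paper's Mamba construction or the specific variant the authors analyze; the function name `online_gd_predict`, the zero initialization, and the fixed learning rate `lr` are assumptions for illustration.

```python
def online_gd_predict(examples, query, lr=0.1):
    """Predict y for `query` after one pass over in-context (x, y) pairs.

    examples: iterable of (x, y) with x a list of floats and y a float.
    Each pair is seen once, in order, and triggers a single update.
    """
    w = [0.0] * len(query)  # assumed zero initialization
    for x, y in examples:
        # Residual of the current linear predictor on this example.
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        # Gradient step on 0.5 * err**2 with respect to w.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return sum(wi * qi for wi, qi in zip(w, query))
```

Unlike batch gradient descent, which revisits the whole context at every step, this procedure keeps only the current weight vector as state and consumes the context sequentially, which is the kind of recurrent computation a state-space model can carry.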

artificial intelligence, exp, machine learning, (16 more...)

Country: Asia (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

Beyond Prediction: Managing the Repercussions of Machine Learning Applications

Neural Information Processing Systems | Jun-14-2026, 19:03:07 GMT

Machine learning models are often designed to maximize a primary goal, such as accuracy. However, as these models are increasingly used to inform decisions that affect people's lives or well-being, it is often unclear what the real-world repercussions of their deployment might be, making it crucial to understand and manage such repercussions effectively. For example, models maximizing user engagement on social media platforms may inadvertently contribute to the spread of misinformation and content that deepens political polarization. This issue is not limited to social media; it extends to other applications where machine learning-informed decisions can have real-world repercussions, such as education, employment, and lending. Existing methods addressing this issue require prior knowledge or estimates of analytical models describing the relationship between a classifier's predictions and their corresponding repercussions. We introduce THEIA, a novel classification algorithm capable of optimizing a primary objective, such as accuracy, while providing high-confidence guarantees about its potential repercussions. Importantly, THEIA solves the open problem of providing such guarantees based solely on existing data with observations of previous repercussions. We prove that it satisfies constraints on a model's repercussions with high confidence and that it is guaranteed to identify a solution, if one exists, given sufficient data. We empirically demonstrate, using real-life data, that THEIA can identify models that achieve high accuracy while ensuring, with high confidence, that constraints on their repercussions are satisfied.

artificial intelligence, machine learning, social media, (19 more...)

Country: North America > United States > Massachusetts (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Banking & Finance (1.00)
Education (0.86)
Media > News (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.45)

Synthetic Series-Symbol Data Generation for Time Series Foundation Models

Neural Information Processing Systems | Jun-14-2026, 18:48:04 GMT

Foundation models for time series analysis (TSA) have attracted significant attention. However, challenges such as training data scarcity and imbalance continue to hinder their development. Inspired by complex dynamic system theories, we design a series-symbol data generation mechanism, enabling the unrestricted creation of high-quality time series data paired with corresponding symbolic expressions. To leverage series-symbol data pairs with strong correlations, we develop SymTime, a pre-trained foundation model for enhancing time series representation using symbolic information. SymTime demonstrates competitive performance across five major TSA tasks when fine-tuned on downstream tasks, rivaling foundation models pre-trained on real-world datasets. This approach underscores the potential of series-symbol data generation and pretraining mechanisms in overcoming data scarcity and enhancing task performance.

data mining, large language model, machine learning, (20 more...)

Country: North America > United States (0.93)

Genre: