AITopics | eqc

Hardware-specific optimizations in machine learning (ML) frameworks can cause numerical deviations of inference results. Quite surprisingly, despite using a fixed trained model and fixed input data, inference results are not consistent across platforms, and sometimes not even deterministic on the same platform. We study the causes of these numerical deviations for convolutional neural networks (CNN) on realistic end-to-end inference pipelines and in isolated experiments. Results from 75 distinct platforms suggest that the main causes of deviations on CPUs are differences in SIMD use, and the selection of convolution algorithms at runtime on GPUs. We link the causes and propagation effects to properties of the ML model and evaluate potential mitigations. We make our research code publicly available.
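One well-known mechanism behind such deviations is that floating-point addition is not associative, so a vectorized (SIMD) reduction that accumulates in several lanes can round differently from a strictly sequential loop. The sketch below is only an illustration of that general effect, not the paper's code: the function names `sequential_sum` and `chunked_sum` and the `lanes` parameter are hypothetical, and the lane-wise loop merely imitates how a SIMD reduction might group its additions.

```python
import math


def sequential_sum(values):
    """Add the values strictly left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, lanes=4):
    """Accumulate into `lanes` partial sums, then combine them.

    This imitates (as an assumption, not a hardware model) the grouping
    a SIMD reduction might use; the grouping changes the rounding order.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sequential_sum(partial)


# The same three numbers, grouped differently, give different doubles.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Both orderings agree to within rounding error, but need not be bit-identical.
values = [0.1] * 10
print(sequential_sum(values), chunked_sum(values))
```

Differences of this size are harmless in isolation; the abstract's point is that they can propagate through a network and surface in the inference results.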

artificial intelligence, deviation, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Austria > Tyrol > Innsbruck (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Modular Gaussian Processes for Transfer Learning Supplementary Material

Neural Information Processing Systems, Feb-11-2026, 06:25:50 GMT

More insights about the predictive GP posterior used for the contrastive expectation integrals are in the first section. Our coding idea is that the Python user only specifies a dictionary of models as input: models = {model1, model2, ..., modelK}.

artificial intelligence, logp, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Spain (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

1dba3025b159cd9354da65e2d0436a31-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 18:15:06 GMT

Traditionally, federated learning (FL) aims to train a single global model while collaboratively using multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with diverse resources.

artificial intelligence, eqc, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Understanding the Nature of Depth-1 Equivariant Quantum Circuit

Teo, Jonathan, Wei, Lee Xin, Lau, Hoong Chuin

arXiv.org Artificial Intelligence, Nov-20-2025

The Equivariant Quantum Circuit (EQC) for the Travelling Salesman Problem (TSP) has been shown to achieve near-optimal performance in solving small TSP instances (up to 20 nodes) using only two parameters at depth 1. However, extending EQCs to larger TSP instances remains challenging due to the exponential time and memory required for quantum circuit simulation, as well as increasing noise and decoherence when running on actual quantum hardware. In this work, we propose the Size-Invariant Grid Search (SIGS), an efficient training optimization for Quantum Reinforcement Learning (QRL), and use it to simulate the outputs of a trained Depth-1 EQC on TSP instances of up to 350 nodes, well beyond previously tractable limits. On TSP with 100 nodes, we reduce total simulation time by 96.4% compared with RL simulations that use the analytical expression (from 151 minutes using RL to under 6 minutes using SIGS on TSP-100), while achieving a mean optimality gap within 0.005 of the RL-trained model on the test set. SIGS provides a practical benchmarking tool for the QRL community, allowing us to efficiently analyze the performance of QRL algorithms on larger problem sizes. We provide a theoretical explanation for SIGS, called the Size-Invariant Properties, that goes beyond the concept of equivariance discussed in prior literature.

machine learning, node, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2511.10756

Genre: Research Report (0.64)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)

Supplementary Material Causes and Effects of Unanticipated Numerical Deviations in Neural Network Inference Frameworks

Neural Information Processing Systems, Oct-9-2025, 04:50:19 GMT

For CIFAR-10, the Cifar10-small model reaches 53.18 % accuracy and the Cifar10-R18 model reaches 60.25 % accuracy. These accuracies are not competitive with the state of the art, but they are sufficiently better than random guessing that we can safely assume the kernels learn meaningful weights. Experiment samples: We process three samples for each of our models to measure the consistency of our results. The first sample is the first test sample (for simplicity). We additionally use a sample from a different class (sample index 1 for CIFAR-10, and index 6 for Deep Weeds) and a sample from the same class as the first sample (index 6 for CIFAR-10, and index 1 for Deep Weeds). All sample indexes refer to the unshuffled test set of the respective dataset.

artificial intelligence, machine learning, precision, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Austria > Tyrol > Innsbruck (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.82)

Lifelong Graph Summarization with Neural Networks: 2012, 2022, and a Time Warp

Frank, Jonatan, Hoffmann, Marcel, Lell, Nicolas, Richerby, David, Scherp, Ansgar

arXiv.org Artificial Intelligence, Jul-25-2024

Summarizing web graphs is challenging due to the heterogeneity of the modeled information and its changes over time. We investigate the use of neural networks for lifelong graph summarization. Assuming we observe the web graph at a certain time, we train the networks to summarize graph vertices. We apply this trained network to summarize the vertices of the changed graph at the next point in time. Subsequently, we continue training and evaluating the network to perform lifelong graph summarization. We use the GNNs Graph-MLP and GraphSAINT, as well as an MLP baseline, to summarize the temporal graphs. We compare $1$-hop and $2$-hop summaries. We investigate the impact of reusing parameters from a previous snapshot by measuring the backward and forward transfer and the forgetting rate of the neural networks. Our extensive experiments on ten weekly snapshots of a web graph with over $100$M edges, sampled in 2012 and 2022, show that all networks predominantly use $1$-hop information to determine the summary, even when performing $2$-hop summarization. Due to the heterogeneity of web graphs, in some snapshots, the $2$-hop summary produces over ten times more vertex summaries than the $1$-hop summary. When using the network trained on the last snapshot from 2012 and applying it to the first snapshot of 2022, we observe a strong drop in accuracy. We attribute this drop over the ten-year time warp to the strongly increased heterogeneity of the web graph in 2022.

graph, snapshot, vertex, (14 more...)

arXiv.org Artificial Intelligence

2407.18042

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(15 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ensemble Quantile Classifier

Lai, Yuanhao, McLeod, Ian

arXiv.org Machine Learning, Oct-28-2019

Both the median-based classifier and the quantile-based classifier are useful for discriminating high-dimensional data with heavy-tailed or skewed inputs. But these methods are restricted as they assign equal weight to each variable in an unregularized way. The ensemble quantile classifier is a more flexible regularized classifier that provides better performance with high-dimensional data, asymmetric data or when there are many irrelevant extraneous inputs. The improved performance is demonstrated by a simulation study as well as an application to text categorization. It is proven that the estimated parameters of the ensemble quantile classifier consistently estimate the minimal population loss under suitable general model assumptions. It is also shown that the ensemble quantile classifier is Bayes optimal under suitable assumptions with asymmetric Laplace distribution inputs.

classifier, equation, error rate, (15 more...)

arXiv.org Machine Learning

doi: 10.1016/j.csda.2019.106849

1910.1296

Country: