AITopics | Statistical Learning

Please refer to Table 5. Table 5: Architecture of E4-Net for MNIST-rot classification; p denotes the dropout rate. The hyperparameters we use in this architecture are kernel size k = 5, reduction ratio r = 1, and number of slices s = 2. In the large model, we increase the channel dimension to 24, the number of slices to 12, and the reduction ratio to 2, and keep the other hyperparameters the same. We take ResNet-18 [2], which is composed of an initial convolution layer, followed by four stages of Res-Blocks and one final classification layer.

artificial intelligence, machine learning, object-oriented architecture, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.41)

Efficient Equivariant Network

Neural Information Processing Systems, Apr-25-2026, 06:00:51 GMT

Convolutional neural networks (CNNs) have dominated the field of computer vision and achieved great success due to their built-in translation equivariance. Group equivariant CNNs (G-CNNs), which incorporate more equivariance, can significantly improve on the performance of conventional CNNs. However, G-CNNs face two major challenges: the spatial-agnostic problem and high computational cost. In this work, we propose a general framework for previous equivariant models, which includes G-CNNs and equivariant self-attention layers as special cases.

artificial intelligence, machine learning, neural network, (17 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(2 more...)

Uncertainty Estimation for Multi-view Data: The Power of Seeing the Whole Picture

Neural Information Processing Systems, Apr-25-2026, 05:49:28 GMT

Uncertainty estimation is essential to make neural networks trustworthy in real-world applications. Extensive research efforts have been made to quantify and reduce predictive uncertainty. However, most existing works are designed for unimodal data, whereas multi-view uncertainty estimation has not been sufficiently investigated. Therefore, we propose a new multi-view classification framework for better uncertainty estimation and out-of-domain sample detection, where we associate each view with an uncertainty-aware classifier and combine the predictions of all the views in a principled way.

artificial intelligence, machine learning, uncertainty estimation, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Reconciling Competing Sampling Strategies of Network Embedding

Neural Information Processing Systems, Apr-25-2026, 05:48:52 GMT

Network embedding plays a significant role in a variety of applications. To capture the topology of the network, most existing network embedding algorithms follow a sampling-based training procedure, which maximizes the similarity (e.g., the dot product of embedding vectors) between positively sampled node pairs and minimizes the similarity between negatively sampled node pairs in the embedding space. Typically, close node pairs serve as positive samples, while distant node pairs are treated as negative samples. However, under different or even competing sampling strategies, some methods champion sampling distant node pairs as positive samples to encapsulate longer-distance information in link prediction, whereas others advocate adding close nodes to the negative sample set to boost the performance of node recommendation. In this paper, we seek to understand the intrinsic relationships between these competing strategies. To this end, we identify two properties (discrimination and monotonicity) that node embeddings should satisfy for any given node-pair proximity distribution. Moreover, we quantify the empirical error of the trained similarity score w.r.t. the sampling strategy, which leads to an important finding: the discrimination property and the monotonicity property cannot be satisfied simultaneously for all node pairs in real-world applications. Guided by this analysis, a simple yet novel model (SENSEI) is proposed, which seamlessly fulfills the discrimination property and partial monotonicity within the top-K ranking list. Extensive experiments show that SENSEI outperforms state-of-the-art methods in plain network embedding.
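The sampling-based training procedure described in this abstract can be illustrated with a minimal sketch. It assumes a logistic (skip-gram-style) loss over dot products, which the abstract does not specify; the function names, toy embeddings, and node pairs are hypothetical, and this is not the SENSEI model itself.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def sampling_loss(emb, positive_pairs, negative_pairs):
    """Assumed logistic loss: small when positive pairs have large dot
    products and negative pairs have small (negative) dot products."""
    loss = 0.0
    for u, v in positive_pairs:
        loss -= log_sigmoid(dot(emb[u], emb[v]))
    for u, v in negative_pairs:
        loss -= log_sigmoid(-dot(emb[u], emb[v]))
    return loss

# Hypothetical toy data: "a" and "b" are close nodes, "a" and "c" are distant.
positive_pairs = [("a", "b")]
negative_pairs = [("a", "c")]

# Embeddings that agree with the sampling strategy ...
emb_good = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [-1.0, 0.0]}
# ... and embeddings that contradict it.
emb_bad = {"a": [1.0, 0.0], "b": [-1.0, 0.0], "c": [1.0, 0.1]}

loss_good = sampling_loss(emb_good, positive_pairs, negative_pairs)
loss_bad = sampling_loss(emb_bad, positive_pairs, negative_pairs)
print(loss_good < loss_bad)
```

Swapping which pairs count as positive or negative changes which embeddings the loss favors, which is the sense in which the choice of sampling strategy shapes the learned similarity scores.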

data mining, machine learning, node, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre: Research Report (0.48)

Industry: Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Information Management (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Learning where to learn: Gradient sparsity in meta and continual learning

Neural Information Processing Systems, Apr-25-2026, 05:48:45 GMT

Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to learn a weight initialization such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis.
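The idea of changing only a subset of weights during adaptation can be sketched as a masked gradient step. This is a minimal illustration under stated assumptions: the mask here is fixed by hand, whereas the abstract describes learning it; the quadratic loss and all names (`masked_sgd_step`, `adapt`, `init`, `target`) are hypothetical.

```python
def masked_sgd_step(weights, grads, mask, lr=0.1):
    """One gradient step in which only weights with mask == 1 are updated."""
    return [w - lr * m * g for w, g, m in zip(weights, grads, mask)]

def quadratic_grads(weights, target):
    """Gradient of the toy loss sum((w - t)^2) with respect to each weight."""
    return [2.0 * (w - t) for w, t in zip(weights, target)]

def adapt(init, target, mask, steps=5, lr=0.1):
    """Run a few masked steps starting from a shared initialization."""
    weights = list(init)
    for _ in range(steps):
        grads = quadratic_grads(weights, target)
        weights = masked_sgd_step(weights, grads, mask, lr)
    return weights

init = [0.0, 0.0, 0.0]
target = [1.0, 1.0, 1.0]
mask = [1, 0, 1]  # assumption: hand-picked here, meta-learned in the paper

adapted = adapt(init, target, mask)
print(adapted)
```

Weights whose mask entry is zero keep their initial value, so the adaptation touches only the selected coordinates.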

artificial intelligence, machine learning, sparsity, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

2a7157c84dcf263f77b37d6c11d7d149-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 05:47:39 GMT

artificial intelligence, data augmentation, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

2a7157c84dcf263f77b37d6c11d7d149-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 05:47:35 GMT

artificial intelligence, inductive learning, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems, Apr-25-2026, 05:46:55 GMT

Recent studies in reinforcement learning (RL) have made significant progress by leveraging function approximation to alleviate the sample complexity hurdle for better performance. Despite this success, existing provably efficient algorithms typically rely on the accessibility of immediate feedback upon taking actions. Failing to account for the impact of delay in observations can significantly degrade the performance of real-world systems due to regret blow-up. In this work, we tackle the challenge of delayed feedback in RL with linear function approximation by employing posterior sampling, which has been shown to empirically outperform the popular UCB algorithms in a wide range of regimes. We first introduce Delayed-PSVI, an optimistic value-based algorithm that effectively explores the value function space via noise perturbation with posterior sampling. We provide the first analysis for posterior sampling algorithms with delayed feedback in RL and show that our algorithm achieves $\widetilde{O}(\sqrt{d^3H^3T} + d^2H^2\mathbb{E}[\tau])$ worst-case regret in the presence of unknown stochastic delays. Here $\mathbb{E}[\tau]$ is the expected delay. To further improve its computational efficiency and to expand its applicability in high-dimensional RL problems, we incorporate a gradient-based approximate sampling scheme via Langevin dynamics for Delayed-LPSVI, which maintains the same order-optimal regret guarantee with $\widetilde{O}(dHK)$ computational cost. Empirical evaluations are performed to demonstrate the statistical and computational efficacy of our algorithms.
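To make the notion of posterior sampling under delayed feedback concrete, here is a minimal sketch in a much simpler setting than the one in the abstract: Thompson sampling on a Bernoulli bandit whose rewards arrive after a random delay. It is not Delayed-PSVI and involves no linear function approximation; the function name, arm means, and uniform delay distribution are all assumptions made for illustration.

```python
import random

def delayed_thompson(true_means, rounds=2000, max_delay=10, seed=0):
    """Thompson sampling where each reward is observed only after a delay."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [1] * n_arms  # Beta(1, 1) prior for every arm
    failures = [1] * n_arms
    pending = []              # (arrival_round, arm, reward) not yet observed
    pulls = [0] * n_arms

    for t in range(rounds):
        # Update the posterior only with feedback whose delay has elapsed.
        arrived = [p for p in pending if p[0] <= t]
        pending = [p for p in pending if p[0] > t]
        for _, arm, reward in arrived:
            if reward:
                successes[arm] += 1
            else:
                failures[arm] += 1

        # Sample one plausible mean per arm and act greedily on the samples.
        samples = [rng.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        pulls[arm] += 1

        # The reward is generated now but revealed after a random delay.
        reward = 1 if rng.random() < true_means[arm] else 0
        delay = rng.randint(0, max_delay)
        pending.append((t + 1 + delay, arm, reward))

    return pulls

pulls = delayed_thompson([0.2, 0.8])
print(pulls)
```

Because the posterior lags behind the actions taken, early rounds explore more than they would with immediate feedback, which is the effect that the expected-delay term in a regret bound accounts for.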

machine learning, probability 1, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report (0.87)

Industry: