An Information Theoretic Perspective on Conformal Prediction
Qualcomm AI Research

Neural Information Processing Systems

Conformal Prediction (CP) is a distribution-free uncertainty estimation framework that constructs prediction sets guaranteed to contain the true answer with a user-specified probability. Intuitively, the size of the prediction set encodes a general notion of uncertainty, with larger sets associated with higher degrees of uncertainty. In this work, we leverage information theory to connect conformal prediction to other notions of uncertainty. More precisely, we prove three different ways to upper bound the intrinsic uncertainty, as described by the conditional entropy of the target variable given the inputs, by combining CP with information-theoretic inequalities. Moreover, we demonstrate two direct and useful applications of this connection between conformal prediction and information theory: (i) more principled and effective conformal training objectives that generalize previous approaches and enable end-to-end training of machine learning models from scratch, and (ii) a natural mechanism to incorporate side information into conformal prediction. We empirically validate both applications in centralized and federated learning settings, showing that our theoretical results translate to lower inefficiency (average prediction set size) for popular CP methods.
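To make the set-valued prediction concrete, here is a minimal sketch of split conformal prediction for classification. The scoring rule, data splits, and function names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of split conformal prediction for classification.
# All names and the choice of nonconformity score are illustrative.
import numpy as np

def split_conformal_sets(probs_cal, labels_cal, probs_test, alpha=0.1):
    """Build prediction sets with marginal coverage >= 1 - alpha.

    probs_cal:  (n_cal, K) predicted class probabilities on a held-out
                calibration split.
    labels_cal: (n_cal,) integer true labels for the calibration split.
    probs_test: (n_test, K) predicted probabilities for new inputs.
    """
    n = len(labels_cal)
    # Nonconformity score: 1 minus the probability assigned to the true class.
    scores = 1.0 - probs_cal[np.arange(n), labels_cal]
    # Conformal quantile with the finite-sample correction.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    qhat = np.quantile(scores, min(q_level, 1.0), method="higher")
    # A class enters the prediction set if its score does not exceed qhat.
    return [np.where(1.0 - p <= qhat)[0] for p in probs_test]

# The average set size, np.mean([len(s) for s in ...]), is the
# "inefficiency" the paper seeks to reduce.
```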


Using Statistics to Automate Stochastic Optimization

Neural Information Processing Systems

Despite the development of numerous adaptive optimizers, tuning the learning rate of stochastic gradient methods remains a major roadblock to obtaining good practical performance in machine learning. Rather than changing the learning rate at each iteration, we propose an approach that automates the most common hand-tuning heuristic: use a constant learning rate until "progress stops", then drop it. We design an explicit statistical test that determines when the dynamics of stochastic gradient descent reach a stationary distribution. This test can be performed easily during training, and when it fires, we decrease the learning rate by a constant multiplicative factor. Our experiments on several deep learning tasks demonstrate that this statistical adaptive stochastic approximation (SASA) method can automatically find good learning rate schedules and match the performance of hand-tuned methods while using default settings for its own parameters. The statistical testing helps to control the variance of this procedure and improves its robustness.
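As a rough illustration of the constant-then-drop schedule described above, the sketch below lowers the learning rate once a monitored training statistic looks stationary. The two-sample t-test on window halves is a simplified stand-in, not the paper's actual test on SGD dynamics, and all thresholds are made-up defaults.

```python
# Hedged sketch of a "constant learning rate until stationary, then drop"
# schedule. The stationarity check (t-test on two halves of a recent window
# of a monitored statistic, e.g. training loss) is a toy surrogate for the
# paper's statistical test.
from collections import deque
from scipy import stats

class DropOnStationarity:
    def __init__(self, lr=0.1, drop_factor=0.1, window=200, p_threshold=0.2):
        self.lr = lr
        self.drop_factor = drop_factor
        self.window = window
        self.p_threshold = p_threshold
        self.history = deque(maxlen=window)

    def step(self, monitored_value):
        """Record one training statistic; drop the LR when it looks stationary."""
        self.history.append(monitored_value)
        if len(self.history) == self.window:
            half = self.window // 2
            first = list(self.history)[:half]
            second = list(self.history)[half:]
            # If the two halves are statistically indistinguishable, treat the
            # dynamics as stationary and drop the learning rate.
            result = stats.ttest_ind(first, second)
            if result.pvalue > self.p_threshold:
                self.lr *= self.drop_factor
                self.history.clear()
        return self.lr
```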


Bandit Learning with Implicit Feedback
Yi Qi

Neural Information Processing Systems

Implicit feedback, such as user clicks, although abundant in online information service systems, does not provide substantial evidence of users' evaluation of the system's output. Without proper modeling, such incomplete supervision inevitably misleads model estimation, especially in a bandit learning setting where the feedback is acquired on the fly. In this work, we perform contextual bandit learning with implicit feedback by modeling the feedback as a composition of user result examination and relevance judgment. Since users' examination behavior is unobserved, we introduce latent variables to model it. We perform Thompson sampling on top of variational Bayesian inference for arm selection and model update. Our upper regret bound analysis of the proposed algorithm proves the feasibility of learning from implicit feedback in a bandit setting, and extensive empirical evaluations on click logs collected from a major MOOC platform further demonstrate its learning effectiveness in practice.
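The toy, non-contextual sketch below only conveys the examination/relevance decomposition behind such a click model, combined with Thompson sampling on Beta posteriors. The paper's method is contextual and uses variational Bayesian inference; the arm count, examination probability, environment, and update rule here are all illustrative assumptions.

```python
# Toy illustration of the click model  P(click) = P(examined) * P(relevant),
# with Thompson sampling. NOT the paper's contextual, variational algorithm.
import numpy as np

rng = np.random.default_rng(0)
K = 5                                   # number of arms (hypothetical)
alpha = np.ones(K)                      # Beta posterior over relevance, per arm
beta = np.ones(K)
p_examine = 0.7                         # assumed known examination probability

def true_click_prob(arm):               # hypothetical environment
    relevance = np.linspace(0.1, 0.6, K)[arm]
    return p_examine * relevance

for t in range(10_000):
    # Thompson sampling: draw a relevance estimate per arm, play the best.
    sampled = rng.beta(alpha, beta)
    arm = int(np.argmax(sampled))
    click = rng.random() < true_click_prob(arm)
    # A click implies examination and relevance; a non-click is ambiguous,
    # so we discount it by the examination probability (a crude surrogate
    # for the latent-variable update performed in the paper).
    if click:
        alpha[arm] += 1.0
    else:
        beta[arm] += p_examine

print("arm believed most relevant:", int(np.argmax(alpha / (alpha + beta))))
```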



How to Start Training: The Effect of Initialization and Architecture

Neural Information Processing Systems

We identify and study two common failure modes for early training in deep ReLU nets. For each, we give a rigorous proof of when it occurs and how to avoid it, for fully connected, convolutional, and residual architectures. We show that the first failure mode, exploding or vanishing mean activation length, can be avoided by initializing weights from a symmetric distribution with variance 2/fan-in and, for ResNets, by correctly scaling the residual modules. We prove that the second failure mode, exponentially large variance of activation length, never occurs in residual nets once the first failure mode is avoided. In contrast, for fully connected nets, we prove that this failure mode can happen and is avoided by keeping constant the sum of the reciprocals of layer widths. We demonstrate empirically the effectiveness of our theoretical results in predicting when networks are able to start training. In particular, we note that many popular initializations fail our criteria, whereas correct initialization and architecture allow much deeper networks to be trained.
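A minimal sketch of the variance-2/fan-in rule for a fully connected ReLU stack follows. The layer widths, the choice of a Gaussian as the symmetric distribution, and the activation-length check are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch of variance-2/fan-in ("He-style") initialization for a
# fully connected ReLU network. Widths and inputs are made up for illustration.
import numpy as np

def init_layer(fan_in, fan_out, rng):
    # Symmetric (zero-mean) Gaussian with variance 2 / fan_in per weight,
    # which keeps the mean activation length from exploding or vanishing.
    W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
    b = np.zeros(fan_out)
    return W, b

def forward(x, layers):
    for W, b in layers:
        x = np.maximum(x @ W + b, 0.0)  # ReLU
    return x

rng = np.random.default_rng(0)
widths = [784, 512, 512, 256, 10]       # hypothetical layer widths
layers = [init_layer(fan_in, fan_out, rng)
          for fan_in, fan_out in zip(widths, widths[1:])]

x = rng.normal(size=(32, widths[0]))
# With this initialization the mean squared activation stays roughly on the
# order of the input scale across depth, rather than exploding or vanishing.
print(np.mean(np.square(forward(x, layers))))
```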



Acknowledgement

Neural Information Processing Systems

This research was funded by the Natural Sciences and Engineering Research Council of Canada. We wish to thank Tao Yu and Hongjin Su for running our code on the held-out test set of Spider, and Jinyang Li, Binyuan Hui, Reynold Cheng, Ge Qu, and the other authors of BIRD for running our code on the held-out test set of BIRD. We also wish to thank Csaba Szepesvári, Dale Schuurmans, and the anonymous reviewers of NeurIPS for their constructive comments to improve this work.



Dataset and Analysis of Long-Term Skill Acquisition in Robot-Assisted Minimally Invasive Surgery

arXiv.org Artificial Intelligence

Objective: We aim to investigate long-term robotic surgical skill acquisition among surgical residents and the effects of training intervals and fatigue on performance. Methods: For six months, surgical residents participated once a month in three training sessions scheduled around a single 26-hour hospital shift: one before, one during, and one after the shift. In each training session, they performed three dry-lab training tasks: Ring Tower Transfer, Knot-Tying, and Suturing. We collected a comprehensive dataset, including videos synchronized with kinematic data, activity tracking, and scans of the suturing pads. Results: We collected a dataset of 972 trials performed by 18 residents of different surgical specializations. Participants demonstrated consistent performance improvement across all tasks. In addition, we found variations in between-shift learning and forgetting across metrics and tasks, as well as hints of possible effects of fatigue. Conclusion: The findings from our first analysis shed light on the long-term learning processes of robotic surgical skills with extended intervals and varying levels of fatigue. Significance: This study lays the groundwork for future research aimed at optimizing training protocols and enhancing AI applications in surgery, ultimately contributing to improved patient outcomes. The dataset will be made available upon acceptance of our journal submission.