How to Start Training: The Effect of Initialization and Architecture
We identify and study two common failure modes for early training in deep ReLU nets. For each, we give a rigorous proof of when it occurs and how to avoid it, for fully connected, convolutional, and residual architectures. We show that the first failure mode, exploding or vanishing mean activation length, can be avoided by initializing weights from a symmetric distribution with variance 2/fan-in and, for ResNets, by correctly scaling the residual modules. We prove that the second failure mode, exponentially large variance of activation length, never occurs in residual nets once the first failure mode is avoided. In contrast, for fully connected nets, we prove that this failure mode can happen and is avoided by keeping constant the sum of the reciprocals of layer widths. We demonstrate empirically the effectiveness of our theoretical results in predicting when networks are able to start training. In particular, we note that many popular initializations fail our criteria, whereas correct initialization and architecture allow much deeper networks to be trained.
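As a concrete reference point for the first criterion above, here is a minimal sketch, assuming a PyTorch implementation (the abstract does not prescribe a framework, and the layer sizes are hypothetical), of drawing weights from a zero-mean symmetric distribution with variance 2/fan-in; this is exactly what Kaiming-normal initialization with a ReLU gain produces.

```python
import torch.nn as nn

def init_relu_net_(model: nn.Module) -> None:
    """Draw every Linear/Conv2d weight from N(0, 2 / fan-in) and zero the biases."""
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            # With mode="fan_in" and nonlinearity="relu", kaiming_normal_ uses
            # gain sqrt(2), i.e. std = sqrt(2 / fan_in): a symmetric distribution
            # with variance 2 / fan-in, matching the abstract's first criterion.
            nn.init.kaiming_normal_(m.weight, mode="fan_in", nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)

# Hypothetical fully connected ReLU net (sizes are illustrative only).
net = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
init_relu_net_(net)
```

For the second criterion, widening the hidden layers shrinks each 1/width term, which is one way to keep the sum of reciprocal layer widths, and hence the variance of activation lengths, under control as depth grows.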
Acknowledgement
This research was funded by the Natural Sciences and Engineering Research Council of Canada. We wish to thank Tao Yu and Hongjin Su for running our code on the held-out test set of Spider, and Jinyang Li, Binyuan Hui, Reynold Cheng, Ge Qu, and the other authors of BIRD for running our code on the held-out test set of BIRD. We also wish to thank Csaba Szepesvári, Dale Schuurmans, and the anonymous NeurIPS reviewers for their constructive comments, which improved this work.
Dataset and Analysis of Long-Term Skill Acquisition in Robot-Assisted Minimally Invasive Surgery
Yarden Sharon, Alex Geftler, Hanna Kossowsky Lev, Ilana Nisky
Objective: We aim to investigate long-term robotic surgical skill acquisition among surgical residents and the effects of training intervals and fatigue on performance. Methods: For six months, surgical residents participated in three training sessions once a month, scheduled around a single 26-hour hospital shift: one before, one during, and one after the shift. In each training session, they performed three dry-lab training tasks: Ring Tower Transfer, Knot-Tying, and Suturing. We collected a comprehensive dataset, including videos synchronized with kinematic data, activity tracking, and scans of the suturing pads. Results: We collected a dataset of 972 trials performed by 18 residents from different surgical specializations. Participants demonstrated consistent performance improvement across all tasks. In addition, we found variations in between-shift learning and forgetting across metrics and tasks, as well as hints of possible fatigue effects. Conclusion: The findings from our first analysis shed light on the long-term learning process for robotic surgical skills under extended training intervals and varying levels of fatigue. Significance: This study lays the groundwork for future research aimed at optimizing training protocols and enhancing AI applications in surgery, ultimately contributing to improved patient outcomes. The dataset will be made available upon acceptance of our journal submission.
Mixed-Initiative Multiagent Apprenticeship Learning for Human Training of Robot Teams
Extending recent advances in Learning from Demonstration (LfD) frameworks to multi-robot settings poses critical challenges, such as environment non-stationarity due to partial observability, which is detrimental to the applicability of existing methods. Although prior work has shown that enabling communication among agents of a robot team can alleviate such issues, creating inter-agent communication under existing Multi-Agent LfD (MA-LfD) frameworks requires the human expert to provide demonstrations for both environment actions and communication actions, which necessitates an efficient communication strategy over a known message space. To address this problem, we propose Mixed-Initiative Multi-Agent Apprenticeship Learning (MixTURE). MixTURE enables robot teams to learn, from human expert-generated data, a preferred policy for accomplishing a collaborative task, while simultaneously learning emergent inter-agent communication to enhance team coordination. The key ingredient of MixTURE's success is automatically learning a communication policy, enhanced by a mutual-information-maximizing reverse model that rationalizes the underlying expert demonstrations without the need for human-generated communication data or an auxiliary reward function. MixTURE outperforms a variety of relevant baselines on diverse data generated by human experts in complex heterogeneous domains. MixTURE is the first MA-LfD framework to enable learning multi-robot collaborative policies directly from real human data, resulting in 44% less human workload and a 46% higher usability score.
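The abstract above describes, at a high level, a mutual-information-maximizing reverse model that lets communication be learned without demonstrated messages. The sketch below is only an illustration of that general idea under stated assumptions (a continuous message channel, a categorical expert action, and hypothetical dimensions), not MixTURE's actual architecture: maximizing the reverse model's log-likelihood of the expert action given the emitted message is a standard variational lower bound on their mutual information.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, MSG_DIM, ACT_DIM = 32, 8, 5  # hypothetical sizes

# Sender: maps an agent's local observation to a continuous message.
sender = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, MSG_DIM))

# Reverse model: tries to recover the demonstrated action from the message.
# Maximizing its log-likelihood is a variational lower bound on the mutual
# information between the emitted message and the expert action.
reverse_model = nn.Sequential(nn.Linear(MSG_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))

def mi_comm_loss(obs: torch.Tensor, expert_actions: torch.Tensor) -> torch.Tensor:
    """Cross-entropy of the reverse model's prediction; minimizing it maximizes the MI bound."""
    msg = sender(obs)
    logits = reverse_model(msg)
    return F.cross_entropy(logits, expert_actions)

# Usage with a random placeholder batch of demonstrations.
obs = torch.randn(16, OBS_DIM)
acts = torch.randint(0, ACT_DIM, (16,))
mi_comm_loss(obs, acts).backward()
```

In practice such a loss would be added to the imitation objective so that messages are shaped to carry information that explains the expert's behavior, rather than being specified by hand.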
Biomedical Visual Instruction Tuning with Clinician Preference Alignment
Recent advancements in multimodal foundation models have showcased impressive capabilities in understanding and reasoning with visual and textual information. Adapting these foundation models, trained for general usage, to specialized domains like biomedicine requires large-scale domain-specific instruction datasets. While existing works have explored curating such datasets automatically, the resultant datasets are not explicitly aligned with domain expertise. In this work, we propose a data-centric framework, Biomedical Visual Instruction Tuning with Clinician Preference Alignment (BioMed-VITAL), that incorporates clinician preferences into both stages of generating and selecting instruction data for tuning biomedical multimodal foundation models. First, during the generation stage, we prompt the GPT-4V generator with a diverse set of clinician-selected demonstrations to generate preference-aligned data candidates. Then, during the selection stage, we train a separate selection model, which explicitly distills clinician and policy-guided model preferences into a rating function, to select high-quality data for medical instruction tuning. Results show that the model tuned with the instruction data from our method demonstrates a significant improvement in open visual chat (18.5% relative improvement) and medical VQA (win rate up to 81.73%). Our instruction-following data, models, and code are available at https://BioMed-VITAL.github.io.
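To make the selection stage more concrete, below is a minimal sketch under assumptions not taken from the paper (an untrained scoring network, a hypothetical embedding size, and a simple top-k rule) of how a learned rating function can be used to keep only the highest-rated candidate instruction samples.

```python
import torch
import torch.nn as nn

EMB_DIM = 512  # hypothetical embedding size for a candidate instruction sample

# Rating model: maps a sample embedding to a scalar quality score. In the paper's
# framework it would be trained to distill clinician and model preferences; here
# it is left untrained and only illustrates how selection would be applied.
rater = nn.Sequential(nn.Linear(EMB_DIM, 128), nn.ReLU(), nn.Linear(128, 1))

def select_top_k(embeddings: torch.Tensor, k: int) -> torch.Tensor:
    """Return the indices of the k highest-rated candidate samples."""
    with torch.no_grad():
        scores = rater(embeddings).squeeze(-1)
    return torch.topk(scores, k).indices

# Usage: keep the 1,000 best of 10,000 candidate samples for instruction tuning.
candidates = torch.randn(10_000, EMB_DIM)
kept_indices = select_top_k(candidates, k=1_000)
```

The retained subset would then be used as the instruction-tuning corpus; how the rating model is trained to reflect clinician preferences is described in the paper itself and is not reproduced here.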