Teacher-Student


Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods

Akiyama, Shunta, Suzuki, Taiji

arXiv.org Machine Learning

While deep learning has outperformed other methods on various tasks, theoretical frameworks that explain why have not been fully established. To address this issue, we investigate the excess risk of two-layer ReLU neural networks in a teacher-student regression model, in which a student network learns an unknown teacher network through its outputs. In particular, we consider a student network that has the same width as the teacher network and is trained in two phases: first by noisy gradient descent and then by vanilla gradient descent. Our result shows that the student network provably reaches a near-globally optimal solution and outperforms any kernel method estimator (more generally, any linear estimator), including the neural tangent kernel approach, the random feature model, and other kernel methods, in the sense of the minimax optimal rate. The key property underlying this superiority is the non-convexity of the neural network model: even though the loss landscape is highly non-convex, the student network adaptively learns the teacher neurons.
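The two-phase training scheme described in the abstract can be illustrated with a minimal numpy sketch (not the paper's actual algorithm or analysis): a same-width student first runs noisy (Langevin-style) gradient descent, then switches to vanilla gradient descent. The fixed signed second layer, the noise temperature, and all problem sizes are simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

d, m, n = 5, 3, 200  # input dim, width (student width = teacher width), samples

# Unknown teacher network: f(x) = sum_j a_j * relu(<w_j, x>)
W_teacher = rng.normal(size=(m, d))
a = rng.choice([-1.0, 1.0], size=m)  # second layer fixed and shared (simplifying assumption)

X = rng.normal(size=(n, d))
y = relu(X @ W_teacher.T) @ a  # noiseless teacher outputs

W = rng.normal(size=(m, d)) * 0.5  # student first-layer weights

def loss_and_grad(W):
    err = relu(X @ W.T) @ a - y
    # d(loss)/d(w_j) = mean_i err_i * a_j * 1[<x_i, w_j> > 0] * x_i
    G = ((err[:, None] * (X @ W.T > 0)) * a).T @ X / n
    return 0.5 * np.mean(err ** 2), G

init_loss, _ = loss_and_grad(W)

# Phase 1: noisy gradient descent (Langevin-style exploration of the landscape)
eta, temp = 0.01, 1e-4
for _ in range(300):
    _, G = loss_and_grad(W)
    W = W - eta * G + np.sqrt(2 * eta * temp) * rng.normal(size=W.shape)

# Phase 2: vanilla gradient descent (local convergence within the reached basin)
for _ in range(1500):
    _, G = loss_and_grad(W)
    W = W - eta * G

final_loss, _ = loss_and_grad(W)
```

The injected Gaussian noise in phase 1 helps the iterates escape poor regions of the non-convex landscape; phase 2 then refines the solution locally.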


On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting

Akiyama, Shunta, Suzuki, Taiji

arXiv.org Machine Learning

Deep learning empirically achieves high performance in many applications, but its training dynamics have not been fully understood theoretically. In this paper, we present a theoretical analysis of training two-layer ReLU neural networks in a teacher-student regression model, in which a student network learns an unknown teacher network through its outputs. We show that, with a specific regularization and sufficient over-parameterization, the student network can identify the parameters of the teacher network with high probability via gradient descent with a norm-dependent stepsize, even though the objective function is highly non-convex. The key theoretical tools are a measure representation of neural networks and a novel application of a dual certificate argument from sparse estimation on a measure space. We analyze the global minima and the global convergence property in the measure space.
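The setting can be sketched as follows, assuming an over-parameterized student, a weight-norm regularizer, and a per-neuron stepsize scaled by the neuron's norm. This is an illustrative stand-in, not the paper's specific regularization or stepsize rule; the unit output weights and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

d, m_teacher, m_student, n = 4, 2, 20, 300  # student is over-parameterized

W_teacher = rng.normal(size=(m_teacher, d))
X = rng.normal(size=(n, d))
y = relu(X @ W_teacher.T).sum(axis=1)  # teacher with unit output weights (assumption)

W = rng.normal(size=(m_student, d)) * 0.1  # small initialization
lam = 1e-3  # weight-norm regularization strength (illustrative value)

def loss(W):
    return 0.5 * np.mean((relu(X @ W.T).sum(axis=1) - y) ** 2)

init_loss = loss(W)
for _ in range(2000):
    err = relu(X @ W.T).sum(axis=1) - y
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Data-fit gradient plus the gradient of lam * ||w_j|| per neuron
    G = (err[:, None] * (X @ W.T > 0)).T @ X / n + lam * W / (norms + 1e-12)
    # Norm-dependent stepsize: each neuron moves at a rate proportional to its norm
    W = W - 0.05 * norms * G
final_loss = loss(W)
```

The norm-proportional step makes the dynamics multiplicative in each neuron's scale: small, unhelpful neurons barely move while the norm penalty shrinks them, which loosely mirrors the sparse-estimation flavor of the analysis.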


Understanding Robustness in Teacher-Student Setting: A New Perspective

Yang, Zhuolin, Chen, Zhaoxi, Cai, Tiffany, Chen, Xinyun, Li, Bo, Tian, Yuandong

arXiv.org Artificial Intelligence

Adversarial examples have emerged as a ubiquitous property of machine learning models: a bounded adversarial perturbation can mislead a model into making arbitrarily incorrect predictions. Such examples provide a way to assess the robustness of machine learning models as well as a proxy for understanding the model training process. Many studies have tried to explain the existence of adversarial examples and to provide ways to improve model robustness (e.g., adversarial training). While they mostly focus on models trained on datasets with predefined labels, we leverage the teacher-student framework and assume a teacher model, or oracle, that provides the labels for given instances. We extend Tian (2019) to the case of low-rank input data and show that student specialization (a trained student neuron being highly correlated with some teacher neuron at the same layer) still happens within the input subspace, but the teacher and student nodes can differ wildly outside the data subspace, which we conjecture leads to adversarial examples. Extensive experiments show that student specialization correlates strongly with model robustness in different scenarios, including students trained via standard training, adversarial training, confidence-calibrated adversarial training, and training with a robust feature dataset. Our study sheds light on future exploration of adversarial examples and on enhancing model robustness via principled data augmentation.
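The in-subspace versus out-of-subspace comparison can be made concrete with a small sketch: project teacher and student weights onto the data subspace and its orthogonal complement, and measure per-neuron cosine similarity in each. The "specialized" student below is a hypothetical construction (it matches the teacher inside the subspace by design), used only to illustrate the measurement, not the paper's trained networks.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, m = 10, 3, 4  # ambient input dim, data-subspace dim, neurons per layer

# Orthonormal basis U for the low-rank input subspace (inputs lie in span(U))
U, _ = np.linalg.qr(rng.normal(size=(d, k)))
P_in = U @ U.T               # projector onto the data subspace
P_out = np.eye(d) - P_in     # projector onto its orthogonal complement

W_teacher = rng.normal(size=(m, d))
# Hypothetical "specialized" student: agrees with the teacher inside the data
# subspace but is arbitrary outside it
W_student = W_teacher @ P_in + rng.normal(size=(m, d)) @ P_out

def cosines(A, B):
    return np.sum(A * B, axis=1) / (
        np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1) + 1e-12)

corr_in = cosines(W_student @ P_in, W_teacher @ P_in)    # near 1: specialization
corr_out = cosines(W_student @ P_out, W_teacher @ P_out)  # scattered: mismatch
```

Since the inputs never probe the complement of the data subspace, training provides no signal there, and it is precisely these unconstrained out-of-subspace components that the abstract conjectures give rise to adversarial directions.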


An Advising Framework for Multiagent Reinforcement Learning Systems

Silva, Felipe Leno da (Escola Politécnica da Universidade de São Paulo) | Glatt, Ruben (Escola Politécnica da Universidade de São Paulo) | Costa, Anna Helena Reali (Escola Politécnica da Universidade de São Paulo)

AAAI Conferences

Reinforcement Learning has long been employed to solve sequential decision-making problems with minimal input data. However, the classical approach requires a long time to learn a suitable policy, especially in Multiagent Systems. The teacher-student framework mitigates this problem by integrating an advising procedure into the learning process, in which an experienced agent (human or not) advises a student to guide her exploration. However, the teacher is usually assumed to be an expert in the learning task. Here we propose an advising framework in which multiple agents advise each other while learning in a shared environment, and the advisor is not necessarily expected to act optimally. Our experiments in a simulated Robot Soccer environment show that the learning process is improved by incorporating this kind of advice.
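The advising idea can be sketched with two Q-learning agents on a toy chain MDP (a stand-in for the paper's Robot Soccer domain). A student asks a peer for advice in a state only when the peer is markedly more experienced there, judged by visit counts; the peer is itself still learning, so its advice need not be optimal. The environment, visit-count heuristic, and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_states, n_actions, goal = 6, 2, 5  # chain MDP: action 0 = left, 1 = right

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == goal), s2 == goal  # next state, reward, done

def greedy(q_row):
    # argmax with random tie-breaking, so untrained agents explore
    return int(rng.choice(np.flatnonzero(q_row == q_row.max())))

class Agent:
    def __init__(self):
        self.Q = np.zeros((n_states, n_actions))
        self.visits = np.zeros(n_states)

alpha, gamma, eps = 0.5, 0.9, 0.3
agents = [Agent(), Agent()]

def run_episode(me, peer=None):
    s = 0
    for _ in range(30):
        me.visits[s] += 1
        if peer is not None and peer.visits[s] > 2 * me.visits[s]:
            a = greedy(peer.Q[s])   # advice from a more experienced, possibly suboptimal peer
        elif rng.random() < eps:
            a = int(rng.integers(n_actions))
        else:
            a = greedy(me.Q[s])
        s2, r, done = step(s, a)
        me.Q[s, a] += alpha * (r + gamma * me.Q[s2].max() * (not done) - me.Q[s, a])
        s = s2
        if done:
            break

for _ in range(200):   # agent 0 gathers experience on its own first
    run_episode(agents[0])
for _ in range(100):   # agent 1 learns while asking agent 0 for advice
    run_episode(agents[1], peer=agents[0])

# Greedy rollout with the advised student's learned policy
s = 0
for _ in range(n_states):
    s, _, done = step(s, int(np.argmax(agents[1].Q[s])))
    if done:
        break
```

Gating advice on relative visit counts means an agent only defers where its peer has substantially more experience, which keeps imperfect advisors from overriding a student in states the student already knows well.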