AITopics | Xie, Pengtao

Collaborating Authors

Xie, Pengtao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning by Teaching, with Application to Neural Architecture Search

Sheth, Parth, Jiang, Yueyu, Xie, Pengtao

arXiv.org Artificial IntelligenceMar-11-2021

In human learning, an effective skill in improving learning outcomes is learning by teaching: a learner deepens his/her understanding of a topic by teaching this topic to others. In this paper, we aim to borrow this teaching-driven learning methodology from humans and leverage it to train more performant machine learning models, by proposing a novel ML framework referred to as learning by teaching (LBT). In the LBT framework, a teacher model improves itself by teaching a student model to learn well. Specifically, the teacher creates a pseudo-labeled dataset and uses it to train a student model. Based on how the student performs on a validation dataset, the teacher re-learns its model and re-teaches the student until the student achieves great validation performance. Our framework is based on three-level optimization which contains three stages: teacher learns; teacher teaches student; teacher re-learns based on how well the student performs. A simple but efficient algorithm is developed to solve the three-level optimization problem. We apply LBT to search neural architectures on CIFAR-10, CIFAR-100, and ImageNet. The efficacy of our method is demonstrated in various experiments.

instructional theory, neural network, student, (19 more...)

arXiv.org Artificial Intelligence

2103.07009

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning by Ignoring

Zhao, Xingchen, Xie, Pengtao

arXiv.org Artificial IntelligenceDec-28-2020

Learning by ignoring, which identifies less important things and excludes them from the learning process, is an effective learning technique in human learning. There has been psychological studies showing that learning to ignore certain things is a powerful tool for helping people focus. We are interested in investigating whether this powerful learning technique can be borrowed from humans to improve the learning abilities of machines. We propose a novel learning approach called learning by ignoring (LBI). Our approach automatically identifies pretraining data examples that have large domain shift from the target distribution by learning an ignoring variable for each example and excludes them from the pretraining process. We propose a three-level optimization framework to formulate LBI which involves three stages of learning: pretraining by minimizing the losses weighed by ignoring variables; finetuning; updating the ignoring variables by minimizing the validation loss. We develop an efficient algorithm to solve the LBI problem. Experiments on various datasets demonstrate the effectiveness of our method.

neural network, optimization problem, source example, (15 more...)

arXiv.org Artificial Intelligence

2012.14288

Country: North America > United States > California (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Small-Group Learning, with Application to Neural Architecture Search

Du, Xuefeng, Xie, Pengtao

arXiv.org Artificial IntelligenceDec-23-2020

In human learning, an effective learning methodology is small-group learning: a small group of students work together towards the same learning objective, where they express their understanding of a topic to their peers, compare their ideas, and help each other to troubleshoot problems. In this paper, we aim to investigate whether this human learning method can be borrowed to train better machine learning models, by developing a novel ML framework - small-group learning (SGL). In our framework, a group of learners (ML models) with different model architectures collaboratively help each other to learn by leveraging their complementary advantages. Specifically, each learner uses its intermediately trained model to generate a pseudo-labeled dataset and re-trains its model using pseudo-labeled datasets generated by other learners. SGL is formulated as a multi-level optimization framework consisting of three learning stages: each learner trains a model independently and uses this model to perform pseudo-labeling; each learner trains another model using datasets pseudo-labeled by other learners; learners improve their architectures by minimizing validation losses. An efficient algorithm is developed to solve the multi-level optimization problem. We apply SGL for neural architecture search. Results on CIFAR-100, CIFAR-10, and ImageNet demonstrate the effectiveness of our method.

architecture, artificial intelligence, neural network, (19 more...)

arXiv.org Artificial Intelligence

2012.12502

Country: North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry: Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning by Self-Explanation, with Application to Neural Architecture Search

Hosseini, Ramtin, Xie, Pengtao

arXiv.org Artificial IntelligenceDec-23-2020

Learning by self-explanation, where students explain a learned topic to themselves for deepening their understanding of this topic, is a broadly used methodology in human learning and shows great effectiveness in improving learning outcome. We are interested in investigating whether this powerful learning technique can be borrowed from humans to improve the learning abilities of machines. We propose a novel learning approach called learning by self-explanation (LeaSE). In our approach, an explainer model improves its learning ability by trying to clearly explain to an audience model regarding how a prediction outcome is made. We propose a multi-level optimization framework to formulate LeaSE which involves four stages of learning: explainer learns; explainer explains; audience learns; explainer and audience validate themselves. We develop an efficient algorithm to solve the LeaSE problem. We apply our approach to neural architecture search on CIFAR-100, CIFAR-10, and ImageNet. The results demonstrate the effectiveness of our method.

architecture, neural network, optimization problem, (19 more...)

arXiv.org Artificial Intelligence

2012.12899

Country: North America > United States > California (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback

Skillearn: Machine Learning Inspired by Humans' Learning Skills

Xie, Pengtao, Du, Xuefeng, Ban, Hao

arXiv.org Artificial IntelligenceDec-8-2020

Humans, as the most powerful learners on the planet, have accumulated a lot of learning skills, such as learning through tests, interleaving learning, self-explanation, active recalling, to name a few. These learning skills and methodologies enable humans to learn new topics more effectively and efficiently. We are interested in investigating whether humans' learning skills can be borrowed to help machines to learn better. Specifically, we aim to formalize these skills and leverage them to train better machine learning (ML) models. To achieve this goal, we develop a general framework -- Skillearn, which provides a principled way to represent humans' learning skills mathematically and use the formally-represented skills to improve the training of ML models. In two case studies, we apply Skillearn to formalize two learning skills of humans: learning by passing tests and interleaving learning, and use the formalized skills to improve neural architecture search. Experiments on various datasets show that trained using the skills formalized by Skillearn, ML models achieve significantly better performance.

focused education, learner, neural network, (15 more...)

arXiv.org Artificial Intelligence

2012.04863

Country: North America > United States > California (0.14)

Genre:

Instructional Material > Course Syllabus & Notes (0.93)
Research Report (0.64)

Industry: Education > Focused Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning by Passing Tests, with Application to Neural Architecture Search

Du, Xuefeng, Xie, Pengtao

arXiv.org Artificial IntelligenceNov-30-2020

Learning through tests is a broadly used methodology in human learning and shows great effectiveness in improving learning outcome: a sequence of tests are made with increasing levels of difficulty; the learner takes these tests to identify his/her weak points in learning and continuously addresses these weak points to successfully pass these tests. We are interested in investigating whether this powerful learning technique can be borrowed from humans to improve the learning abilities of machines. We propose a novel learning approach called learning by passing tests (LPT). In our approach, a tester model creates increasingly more-difficult tests to evaluate a learner model. The learner tries to continuously improve its learning ability so that it can successfully pass however difficult tests created by the tester. We propose a multi-level optimization framework to formulate LPT, where the tester learns to create difficult and meaningful tests and the learner learns to pass these tests. We develop an efficient algorithm to solve the LCT problem. Our method is applied for neural architecture search and achieves significant improvement over state-of-the-art baselines on CIFAR-100, CIFAR-10, and ImageNet.

artificial intelligence, neural network, tester, (17 more...)

arXiv.org Artificial Intelligence

2011.15102

Genre: Research Report (0.64)

Industry: Education (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

TreeGAN: Incorporating Class Hierarchy into Image Generation

Zhang, Ruisi, Mou, Luntian, Xie, Pengtao

arXiv.org Artificial IntelligenceSep-16-2020

Conditional image generation (CIG) is a widely studied problem in computer vision and machine learning. Given a class, CIG takes the name of this class as input and generates a set of images that belong to this class. In existing CIG works, for different classes, their corresponding images are generated independently, without considering the relationship among classes. In real-world applications, the classes are organized into a hierarchy and their hierarchical relationships are informative for generating high-fidelity images. In this paper, we aim to leverage the class hierarchy for conditional image generation. We propose two ways of incorporating class hierarchy: prior control and post constraint. In prior control, we first encode the class hierarchy, then feed it as a prior into the conditional generator to generate images. In post constraint, after the images are generated, we measure their consistency with the class hierarchy and use the consistency score to guide the training of the generator. Based on these two ideas, we propose a TreeGAN model which consists of three modules: (1) a class hierarchy encoder (CHE) which takes the hierarchical structure of classes and their textual names as inputs and learns an embedding for each class; the embedding captures the hierarchical relationship among classes; (2) a conditional image generator (CIG) which takes the CHE-generated embedding of a class as input and generates a set of images belonging to this class; (3) a consistency checker which performs hierarchical classification on the generated images and checks whether the generated images are compatible with the class hierarchy; the consistency score is used to guide the CIG to generate hierarchy-compatible images. Experiments on various datasets demonstrate the effectiveness of our method.

class hierarchy, health & medicine, immunology, (18 more...)

arXiv.org Artificial Intelligence

2009.07734

Genre: Research Report (0.40)

Industry:

Health & Medicine > Diagnostic Medicine (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Contrastive Self-supervised Learning for Graph Classification

Zeng, Jiaqi, Xie, Pengtao

arXiv.org Machine LearningSep-13-2020

Graph classification is a widely studied problem and has broad applications. In many real-world problems, the number of labeled graphs available for training classification models is limited, which renders these models prone to overfitting. To address this problem, we propose two approaches based on contrastive self-supervised learning (CSSL) to alleviate overfitting. In the first approach, we use CSSL to pretrain graph encoders on widely-available unlabeled graphs without relying on human-provided labels, then finetune the pretrained encoders on labeled graphs. In the second approach, we develop a regularizer based on CSSL, and solve the supervised classification task and the unsupervised CSSL task simultaneously. To perform CSSL on graphs, given a collection of original graphs, we perform data augmentation to create augmented graphs out of the original graphs. An augmented graph is created by consecutively applying a sequence of graph alteration operations. A contrastive loss is defined to learn graph encoders by judging whether two augmented graphs are from the same original graph. Experiments on various graph classification datasets demonstrate the effectiveness of our proposed methods.

graph, inductive learning, neural network, (18 more...)

arXiv.org Machine Learning

2009.05923

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Differentially-private Federated Neural Architecture Search

Singh, Ishika, Zhou, Haoyi, Yang, Kunlin, Ding, Meng, Lin, Bill, Xie, Pengtao

arXiv.org Machine LearningJun-22-2020

Neural architecture search, which aims to automatically search for architectures (e.g., convolution, max pooling) of neural networks that maximize validation performance, has achieved remarkable progress recently. In many application scenarios, several parties would like to collaboratively search for a shared neural architecture by leveraging data from all parties. However, due to privacy concerns, no party wants its data to be seen by other parties. To address this problem, we propose federated neural architecture search (FNAS), where different parties collectively search for a differentiable architecture by exchanging gradients of architecture variables without exposing their data to other parties. To further preserve privacy, we study differentially-private FNAS (DP-FNAS), which adds random noise to the gradients of architecture variables. We provide theoretical guarantees of DP-FNAS in achieving differential privacy. Experiments show that DP-FNAS can search highly-performant neural architectures while protecting the privacy of individual parties.

artificial intelligence, neural network, privacy, (17 more...)

arXiv.org Machine Learning

2006.10559

Country: North America > United States > California (0.15)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms

Yang, Xingyi, He, Xuehai, Liang, Yuxiao, Yang, Yue, Zhang, Shanghang, Xie, Pengtao

arXiv.org Machine LearningJun-19-2020

Pretraining has become a standard technique in computer vision and natural language processing, which usually helps to improve performance substantially. Previously, the most dominant pretraining method is transfer learning (TL), which uses labeled data to learn a good representation network. Recently, a new pretraining approach -- self-supervised learning (SSL) -- has demonstrated promising results on a wide range of applications. SSL does not require annotated labels. It is purely conducted on input data by solving auxiliary tasks defined on the input data examples. The current reported results show that in certain applications, SSL outperforms TL and the other way around in other applications. There has not been a clear understanding on what properties of data and tasks render one approach outperforms the other. Without an informed guideline, ML researchers have to try both methods to find out which one is better empirically. It is usually time-consuming to do so. In this work, we aim to address this problem. We perform a comprehensive comparative study between SSL and TL regarding which one works better under different properties of data and tasks, including domain difference between source and target tasks, the amount of pretraining data, class imbalance in source data, and usage of target data for additional pretraining, etc. The insights distilled from our comparative studies can help ML researchers decide which method to use based on the properties of their applications.

deep learning, neural network, target task, (22 more...)

arXiv.org Machine Learning

2007.04234

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback