AITopics | Yang, Enneng

Continual Learning From a Stream of APIs

Yang, Enneng, Wang, Zhenyi, Shen, Li, Yin, Nan, Liu, Tongliang, Guo, Guibing, Wang, Xingwei, Tao, Dacheng

arXiv.org Artificial Intelligence, Aug-31-2023

Continual learning (CL) aims to learn new tasks without forgetting previous tasks. However, existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks. Instead, stakeholders usually release pre-trained machine learning models as a service (MLaaS), which users can access via APIs. This paper considers two practical yet novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs), which achieve CL from a stream of APIs with partial or no raw data. Performing CL under these two new settings faces several challenges: unavailable full raw data, unknown model parameters, heterogeneous models of arbitrary architecture and scale, and catastrophic forgetting of previous APIs. To overcome these issues, we propose a novel data-free cooperative continual distillation learning framework that distills knowledge from a stream of APIs into a CL model by generating pseudo data, just by querying APIs. Specifically, our framework includes two cooperative generators and one CL model, whose training is formulated as an adversarial game. We first use the CL model and the current API as fixed discriminators to train the generators via a derivative-free method. The generators adversarially generate hard and diverse synthetic data to maximize the response gap between the CL model and the API. Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model. Furthermore, we propose a new regularization term based on network similarity to prevent catastrophic forgetting of previous APIs. Our method performs comparably to classic CL with full raw data on MNIST and SVHN in the DFCL-APIs setting. In the DECL-APIs setting, our method achieves 0.97x, 0.75x, and 0.69x the performance of classic CL on CIFAR10, CIFAR100, and MiniImageNet, respectively.
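
The alternating game described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation and rests on several assumptions: the API is a hypothetical one-dimensional function (`black_box_api`), the CL model is a linear student, a single generator (rather than two cooperative ones) is reduced to one parameter `mu` tuned by random-search hill climbing as a stand-in for the derivative-free method, the response gap is taken to be squared error, and the forgetting regularizer is omitted.

```python
import random

def black_box_api(x):
    """Hypothetical stand-in for a remote API: callers only see its outputs."""
    return 2.0 * x + 1.0

def clip(x, lo=-1.0, hi=1.0):
    return max(lo, min(hi, x))

def sample_batch(mu, rng, n=16, sigma=0.5):
    """Toy 'generator': synthetic inputs drawn around a single centre mu."""
    return [clip(mu + rng.gauss(0.0, sigma)) for _ in range(n)]

def mean_gap(student, xs):
    """Mean squared response gap between the student and the API on xs."""
    w, b = student
    return sum((w * x + b - black_box_api(x)) ** 2 for x in xs) / len(xs)

def train(rounds=1000, lr=0.1, seed=0):
    rng = random.Random(seed)
    student = [0.0, 0.0]  # white-box student: y = w * x + b
    mu = 0.0              # generator parameter (assumed form)
    for _ in range(rounds):
        # Generator step (derivative-free): propose a perturbed centre and
        # keep it only if it yields a larger gap, using API queries alone.
        candidate = clip(mu + rng.gauss(0.0, 0.2))
        gap_new = mean_gap(student, sample_batch(candidate, rng))
        gap_old = mean_gap(student, sample_batch(mu, rng))
        if gap_new > gap_old:
            mu = candidate
        # Student step: gradient descent on the gap. Gradients are taken
        # with respect to the student only; the API stays a black box.
        xs = sample_batch(mu, rng)
        w, b = student
        errs = [w * x + b - black_box_api(x) for x in xs]
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / len(xs)
        grad_b = 2.0 * sum(errs) / len(xs)
        student = [w - lr * grad_w, b - lr * grad_b]
    return student

w, b = train()
print(f"student after distillation: w={w:.3f}, b={b:.3f}")
```

In this toy setting the student's parameters should approach those hidden inside `black_box_api`, even though the training loop never reads them or any raw data.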

artificial intelligence, continual learning, machine learning, (2 more...)

arXiv.org Artificial Intelligence

2309.00023

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning

Wang, Zhenyi, Yang, Enneng, Shen, Li, Huang, Heng

arXiv.org Artificial Intelligence, Jul-23-2023

Forgetting refers to the loss or deterioration of previously acquired information or knowledge. While the existing surveys on forgetting have primarily focused on continual learning, forgetting is a prevalent phenomenon observed in various other research domains within deep learning. Forgetting manifests in research fields such as generative models, due to generator shifts, and federated learning, due to heterogeneous data distributions across clients. Addressing forgetting encompasses several challenges, including balancing the retention of old task knowledge with fast learning of new tasks, managing task interference with conflicting goals, and preventing privacy leakage. Moreover, most existing surveys on continual learning implicitly assume that forgetting is always harmful. In contrast, our survey argues that forgetting is a double-edged sword and can be beneficial and desirable in certain cases, such as privacy-preserving scenarios. By exploring forgetting in a broader context, we aim to present a more nuanced understanding of this phenomenon and highlight its potential advantages. Through this comprehensive survey, we aspire to uncover potential solutions by drawing upon ideas and approaches from various fields that have dealt with forgetting. By examining forgetting beyond its conventional boundaries, we hope to encourage future work on novel strategies for mitigating, harnessing, or even embracing forgetting in real applications. A comprehensive list of papers about forgetting in various research fields is available at https://github.com/EnnengYang/Awesome-Forgetting-in-Deep-Learning.

artificial intelligence, data mining, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2307.09218

Country: North America > United States > Maryland (0.27)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.67)
Law (0.67)
Education > Educational Setting > Online (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

Yang, Enneng, Pan, Junwei, Wang, Ximei, Yu, Haibin, Shen, Li, Chen, Xihua, Xiao, Lei, Jiang, Jie, Guo, Guibing

arXiv.org Artificial Intelligence, May-18-2023

Multi-task learning (MTL) models have demonstrated impressive results in computer vision, natural language processing, and recommender systems. Even though many approaches have been proposed, how well these approaches balance different tasks on each parameter remains unclear. In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter. Specifically, we compute the total updates by the exponentially decaying Average of the squared Updates (AU) on a parameter from the corresponding task. Based on this novel metric, we observe that many parameters in existing MTL methods, especially those in the higher shared layers, are still dominated by one or several tasks. The dominance of AU is mainly due to the dominance of accumulative gradients from one or several tasks. Motivated by this, we propose a Task-wise Adaptive learning rate approach, AdaTask in short, to separate the accumulative gradients, and hence the learning rate, of each task for each parameter in adaptive learning rate approaches (e.g., AdaGrad, RMSProp, and Adam). Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting in state-of-the-art (SOTA) average task-wise performance. Analysis on both synthetic and real-world datasets shows that AdaTask balances parameters in every shared layer well.
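
The contrast between a shared and a task-wise accumulator can be sketched for a single scalar parameter. This is a minimal illustration under assumptions, not the paper's code: it uses an RMSProp-style rule, the function names `rmsprop_updates` and `au_shares` are hypothetical, and the AU shares are normalized to sum to one purely for readability.

```python
import math

def rmsprop_updates(task_grads, v, lr=0.01, beta=0.9, eps=1e-8, task_wise=True):
    """Return each task's update to one shared parameter, mutating v in place.

    task_wise=True keeps one squared-gradient accumulator per task, so each
    task gets its own effective learning rate. task_wise=False uses a single
    accumulator (v[0]) on the summed gradient, as in plain RMSProp.
    """
    if task_wise:
        for k, g in enumerate(task_grads):
            v[k] = beta * v[k] + (1 - beta) * g * g
        return [lr * g / (math.sqrt(v[k]) + eps) for k, g in enumerate(task_grads)]
    total = sum(task_grads)
    v[0] = beta * v[0] + (1 - beta) * total * total
    scale = lr / (math.sqrt(v[0]) + eps)
    return [scale * g for g in task_grads]

def au_shares(grad_steps, task_wise, beta=0.9):
    """Exponentially decaying average of squared updates per task, normalized."""
    n_tasks = len(grad_steps[0])
    v = [0.0] * n_tasks
    au = [0.0] * n_tasks
    for grads in grad_steps:
        updates = rmsprop_updates(grads, v, beta=beta, task_wise=task_wise)
        au = [beta * a + (1 - beta) * u * u for a, u in zip(au, updates)]
    total = sum(au)
    return [a / total for a in au]

# Two tasks whose gradients on the shared parameter differ by a factor of 100.
steps = [[10.0, 0.1]] * 50
print("shared accumulator:   ", au_shares(steps, task_wise=False))
print("task-wise accumulator:", au_shares(steps, task_wise=True))
```

With a shared accumulator the large-gradient task accounts for almost all of the AU on this parameter, while separate accumulators normalize each task's gradient by its own history and bring the two shares close to equal.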

adatask, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2211.15055

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)
