Transfer Learning


Transfer Learning of Artist Group Factors to Musical Genre Classification

arXiv.org Machine Learning

The automated recognition of music genres from audio information is a challenging problem, as genre labels are subjective and noisy. Artist labels are less subjective and less noisy, while certain artists may relate more strongly to certain genres. At the same time, at prediction time, it is not guaranteed that artist labels are available for a given audio segment. Therefore, in this work, we propose to apply the transfer learning framework, learning artist-related information which will be used at inference time for genre classification. We consider different types of artist-related information, expressed through artist group factors, which will allow for more efficient learning and stronger robustness to potential label noise. Furthermore, we investigate how to achieve the highest validation accuracy on the given FMA dataset, by experimenting with various kinds of transfer methods, including single-task transfer, multi-task transfer and finally multi-task learning.


Taskonomy: Disentangling Task Transfer Learning

arXiv.org Artificial Intelligence

Do visual tasks have a relationship, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image? Intuition answers these questions positively, implying the existence of a structure among visual tasks. Knowing this structure has notable value; it is the concept underlying transfer learning and provides a principled way of identifying redundancies across tasks, e.g., to seamlessly reuse supervision among related tasks or solve many tasks in one system without piling up the complexity. We propose a fully computational approach for modeling the structure of the space of visual tasks. This is done by finding (first and higher-order) transfer learning dependencies across a dictionary of twenty-six 2D, 2.5D, 3D, and semantic tasks in a latent space. The product is a computational taxonomic map for task transfer learning. We study the consequences of this structure, e.g., nontrivial emerged relationships, and exploit them to reduce the demand for labeled data. For example, we show that the total number of labeled datapoints needed for solving a set of 10 tasks can be reduced by roughly 2/3 (compared to training independently) while keeping the performance nearly the same. We provide a set of tools for computing and probing this taxonomical structure, including a solver that users can employ to devise efficient supervision policies for their use cases.


Learning to Learn Deep Learning E-Learning

#artificialintelligence

Welcome to this e-learning course developed and produced by Dr Neil Thompson and hosted by Simpliv. Neil is a well-published author in the people professions field, an international conference speaker and sought-after consultant. The overall aim of this course is to help you broaden and deepen your understanding of what is involved in learning, what can prevent it from happening and what you can do to maximize your learning. Learning is part of everyday life and something we are very familiar with. But that does not mean that we are making the most of the learning opportunities we encounter. Indeed, it is fair to say that, despite the emphasis on the importance of learning, relatively few people achieve optimal learning.


Demis Hassabis: Transfer learning is key to AGI

#artificialintelligence

I think transfer learning is the key to general intelligence. And I think the key to doing transfer learning will be the acquisition of conceptual knowledge that is abstracted away from perceptual details of where you learned it from.


[D] What makes "Meta-SGD: Learning to Learn Quickly for Few-Shot Learning" work so well? • r/MachineLearning

@machinelearnbot

I'm interested in few-shot learning, so this paper is really intriguing to me too. I think I still don't get the paper (I'm not familiar with meta-learning), but the learning algorithm looks completely different from normal supervised learning. For the weight update they use the test set (which could also be part of the train set; I'm not sure of the proper name, but it would be clearer to call it train-test and the other one train-train). Do you see the difference? Why do they use this idea?


Transfer Learning for Traffic Speed Prediction: A Preliminary Study

AAAI Conferences

Traffic speed prediction can benefit a wide range of IoT applications in intelligent transportation and smart cities. Recent supervised machine learning approaches rely heavily on vast amounts of historical time-series data. Consequently, they degrade dramatically in areas where collecting large amounts of traffic data is not feasible. With the aim of predicting the traffic speed of such urban areas, we propose a transfer learning framework that exploits historical data from other, data-abundant areas by utilizing various spatio-temporal semantic features. Experimental results show that classic regression models and our proposed kernel regression model can achieve competitive performance compared to baseline methods that rely heavily on the historical data of target areas.


Incremental Learning-to-Learn with Statistical Guarantees

arXiv.org Machine Learning

In learning-to-learn, the goal is to infer a learning algorithm that works well on a class of tasks sampled from an unknown meta distribution. In contrast to previous work on batch learning-to-learn, we consider a scenario where tasks are presented sequentially and the algorithm needs to adapt incrementally to improve its performance on future tasks. Key to this setting is for the algorithm to rapidly incorporate new observations into the model as they arrive, without keeping them in memory. We focus on the case where the underlying algorithm is ridge regression parameterized by a positive semidefinite matrix. We propose to learn this matrix by applying a stochastic strategy to minimize the empirical error incurred by ridge regression on future tasks sampled from the meta distribution. We study the statistical properties of the proposed algorithm and prove non-asymptotic bounds on its excess transfer risk, that is, the generalization performance on new tasks from the same meta distribution. We compare our online learning-to-learn approach with a state-of-the-art batch method, both theoretically and empirically.
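
As a rough illustration of the underlying algorithm named in this abstract, the sketch below computes a ridge regression estimate whose regularizer is shaped by a matrix D. The objective, the closed form, the function name `ridge_with_matrix`, and the toy data are assumptions made for illustration and are not taken from the paper; the meta-learning step that would update D across tasks is not shown. With D equal to the identity, the estimate should reduce to ordinary ridge regression.

```python
import numpy as np

def ridge_with_matrix(X, y, D, lam):
    """Ridge regression regularized through a matrix D (assumed objective).

    For invertible D this minimizes
        (1/n) * ||X w - y||^2 + lam * w^T D^{-1} w
    via the equivalent form
        w = D X^T (X D X^T + n * lam * I)^{-1} y,
    which never inverts D explicitly.
    """
    n = X.shape[0]
    gram = X @ D @ X.T + n * lam * np.eye(n)
    return D @ X.T @ np.linalg.solve(gram, y)

# Toy data (hypothetical): 20 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = rng.normal(size=20)
lam = 0.1

# With D = I the estimate coincides with standard ridge regression.
w_identity = ridge_with_matrix(X, y, np.eye(5), lam)
w_standard = np.linalg.solve(X.T @ X + 20 * lam * np.eye(5), X.T @ y)

# A non-identity positive definite D changes which directions are penalized.
A = rng.normal(size=(5, 5))
D = A @ A.T + np.eye(5)
w_shaped = ridge_with_matrix(X, y, D, lam)
```

Working with the n-by-n system keeps D out of any inverse, so the same expression stays well defined when D is only positive semidefinite.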


Pseudo-task Augmentation: From Deep Multitask Learning to Intratask Sharing---and Back

arXiv.org Machine Learning

Deep multitask learning boosts performance by sharing learned structure across related tasks. This paper adapts ideas from deep multitask learning to the setting where only a single task is available. The method is formalized as pseudo-task augmentation, in which models are trained with multiple decoders for each task. Pseudo-tasks simulate the effect of training towards closely-related tasks drawn from the same universe. In a suite of experiments, pseudo-task augmentation is shown to improve performance on single-task learning problems. When combined with multitask learning, further improvements are achieved, including state-of-the-art performance on the CelebA dataset, showing that pseudo-task augmentation and multitask learning have complementary value. All in all, pseudo-task augmentation is a broadly applicable and efficient way to boost performance in deep learning systems.
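
The central idea, several decoders trained for the same task on top of one shared encoder, can be sketched as a single forward pass. This is a minimal sketch under assumptions: the linear layers, the tanh nonlinearity, the squared-error loss, the toy data, and the names `init_model` and `pseudo_task_loss` are all hypothetical and do not reproduce the paper's architecture or training procedure.

```python
import numpy as np

def init_model(rng, in_dim, hidden_dim, out_dim, num_decoders):
    """One shared encoder plus several independently initialized decoders."""
    encoder = rng.normal(scale=0.1, size=(in_dim, hidden_dim))
    decoders = [
        rng.normal(scale=0.1, size=(hidden_dim, out_dim))
        for _ in range(num_decoders)
    ]
    return encoder, decoders

def pseudo_task_loss(encoder, decoders, X, Y):
    """Average the loss of every decoder on the same task's data.

    Each decoder acts as a pseudo-task: all of them see the same inputs
    and targets, but they reach the shared encoder through different weights.
    """
    hidden = np.tanh(X @ encoder)  # shared representation
    per_decoder = [float(np.mean((hidden @ W - Y) ** 2)) for W in decoders]
    return float(np.mean(per_decoder)), per_decoder

# Toy regression data (hypothetical): 8 samples, 4 inputs, 2 outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
Y = rng.normal(size=(8, 2))

encoder, decoders = init_model(rng, in_dim=4, hidden_dim=6, out_dim=2, num_decoders=3)
total, per_decoder = pseudo_task_loss(encoder, decoders, X, Y)
```

In a training loop, the gradient of `total` with respect to `encoder` would combine signals from all decoders, which is the sense in which a single task is treated like several related ones.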


Learning and Transferring IDs Representation in E-commerce

arXiv.org Machine Learning

Many machine intelligence techniques are developed in E-commerce, and one of the most essential components is the representation of IDs, including user ID, item ID, product ID, store ID, brand ID, category ID, etc. The classical encoding-based methods (like one-hot encoding) are inefficient in that they suffer from sparsity problems due to their high dimensionality, and they cannot reflect the relationships among IDs, either homogeneous or heterogeneous ones. In this paper, we propose an embedding-based framework to learn and transfer the representation of IDs. As implicit user feedback, a tremendous amount of item ID sequences can be easily collected from the interactive sessions. By jointly using these informative sequences and the structural connections among IDs, all types of IDs can be embedded into one low-dimensional semantic space. Subsequently, the learned representations are utilized and transferred in four scenarios: (i) measuring the similarity between items, (ii) transferring from seen items to unseen items, (iii) transferring across different domains, (iv) transferring across different tasks. We deploy and evaluate the proposed approach in the Hema App and the results validate its effectiveness.
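
To make the contrast with one-hot encoding concrete, the sketch below compares cosine similarity under both representations. The three item IDs and the 2-dimensional embedding values are invented for illustration; they are not learned from any data and do not come from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical item IDs.
item_ids = ["item_a", "item_b", "item_c"]

# One-hot encoding: one dimension per ID, so distinct IDs are orthogonal.
one_hot = np.eye(len(item_ids))

# Hand-picked dense embeddings (assumed values, not learned).
embeddings = np.array([
    [0.9, 0.1],   # item_a
    [0.8, 0.2],   # item_b, placed near item_a by assumption
    [-0.7, 0.6],  # item_c
])

sim_one_hot = cosine(one_hot[0], one_hot[1])    # no relationship expressible
sim_ab = cosine(embeddings[0], embeddings[1])   # related items score high
sim_ac = cosine(embeddings[0], embeddings[2])   # unrelated items score low
```

Under one-hot encoding every pair of distinct IDs has the same similarity, so relatedness can only be expressed once IDs share a dense space.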


Not to Cry Wolf: Distantly Supervised Multitask Learning in Critical Care

arXiv.org Artificial Intelligence

Patients in the intensive care unit (ICU) require constant and close supervision. To assist clinical staff in this task, hospitals use monitoring systems that trigger audiovisual alarms if their algorithms indicate that a patient's condition may be worsening. However, current monitoring systems are extremely sensitive to movement artefacts and technical errors. As a result, they typically trigger hundreds to thousands of false alarms per patient per day - drowning the important alarms in noise and adding to the exhaustion of clinical staff. In this setting, data is abundantly available, but obtaining trustworthy annotations by experts is laborious and expensive. We frame the problem of false alarm reduction from multivariate time series as a machine-learning task and address it with a novel multitask network architecture that utilises distant supervision through multiple related auxiliary tasks in order to reduce the number of expensive labels required for training. We show that our approach leads to significant improvements over several state-of-the-art baselines on real-world ICU data and provide new insights on the importance of task selection and architectural choices in distantly supervised multitask learning.