AITopics | Ericsson, Linus

Collaborating Authors

Ericsson, Linus

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

einspace: Searching for Neural Architectures from Fundamental Operations

Ericsson, Linus, Espinosa, Miguel, Yang, Chenhongyi, Antoniou, Antreas, Storkey, Amos, Cohen, Shay B., McDonagh, Steven, Crowley, Elliot J.

arXiv.org Machine LearningMay-31-2024

Neural architecture search (NAS) finds high performing networks for a given task. Yet the results of NAS are fairly prosaic; they did not e.g. create a shift from convolutional structures to transformers. This is not least because the search spaces in NAS often aren't diverse enough to include such transformations a priori. Instead, for NAS to provide greater potential for fundamental design shifts, we need a novel expressive search space design which is built from more fundamental operations. To this end, we introduce einspace, a search space based on a parameterised probabilistic context-free grammar. Our space is versatile, supporting architectures of various sizes and complexities, while also containing diverse network operations which allow it to model convolutions, attention components and more. It contains many existing competitive architectures, and provides flexibility for discovering new ones. Using this search space, we perform experiments to find novel architectures as well as improvements on existing ones on the diverse Unseen NAS datasets. We show that competitive architectures can be obtained by searching from scratch, and we consistently find large improvements when initialising the search with strong baselines. We believe that this work is an important advancement towards a transformative NAS paradigm where search space expressivity and strategic search initialisation play key roles.

architecture, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2405.20838

Country: Europe (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Hyperparameter Selection in Continual Learning

Lee, Thomas L., Hellan, Sigrid Passano, Ericsson, Linus, Crowley, Elliot J., Storkey, Amos

arXiv.org Machine LearningApr-9-2024

In continual learning (CL) -- where a learner trains on a stream of data -- standard hyperparameter optimisation (HPO) cannot be applied, as a learner does not have access to all of the data at the same time. This has prompted the development of CL-specific HPO frameworks. The most popular way to tune hyperparameters in CL is to repeatedly train over the whole data stream with different hyperparameter settings. However, this end-of-training HPO is unrealistic as in practice a learner can only see the stream once. Hence, there is an open question: what HPO framework should a practitioner use for a CL problem in reality? This paper answers this question by evaluating several realistic HPO frameworks. We find that all the HPO frameworks considered, including end-of-training HPO, perform similarly. We therefore advocate using the realistic and most computationally efficient method: fitting the hyperparameters on the first task and then fixing them throughout training.

artificial intelligence, hpo framework, machine learning, (15 more...)

arXiv.org Machine Learning

2404.06466

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition

Yang, Chenhongyi, Chen, Zehui, Espinosa, Miguel, Ericsson, Linus, Wang, Zhenyu, Liu, Jiaming, Crowley, Elliot J.

arXiv.org Artificial IntelligenceMar-26-2024

We present PlainMamba: a simple non-hierarchical state space model (SSM) designed for general visual recognition. The recent Mamba model has shown how SSMs can be highly competitive with other architectures on sequential data and initial attempts have been made to apply it to images. In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information. Our architecture is designed to be easy to use and easy to scale, formed by stacking identical PlainMamba blocks, resulting in a model with constant width throughout all layers. The architecture is further simplified by removing the need for special tokens. We evaluate PlainMamba on a variety of visual recognition tasks including image classification, semantic segmentation, object detection, and instance segmentation. Our method achieves performance gains over previous non-hierarchical models and is competitive with hierarchical alternatives. For tasks requiring high-resolution inputs, in particular, PlainMamba requires much less computing while maintaining high performance.

artificial intelligence, deep learning, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2403.17695

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations

Eastwood, Cian, von Kügelgen, Julius, Ericsson, Linus, Bouchacourt, Diane, Vincent, Pascal, Schölkopf, Bernhard, Ibrahim, Mark

arXiv.org Machine LearningNov-15-2023

Self-supervised representation learning often uses data augmentations to induce some invariance to "style" attributes of the data. However, with downstream tasks generally unknown at training time, it is difficult to deduce a priori which attributes of the data are indeed "style" and can be safely discarded. To address this, we introduce a more principled approach that seeks to disentangle style features rather than discard them. The key idea is to add multiple style embedding spaces where: (i) each is invariant to all-but-one augmentation; and (ii) joint entropy is maximized. We formalize our structured data-augmentation procedure from a causal latent-variable-model perspective, and prove identifiability of both content and (multiple blocks of) style variables. We empirically demonstrate the benefits of our approach on synthetic datasets and then present promising but limited results on ImageNet.

artificial intelligence, international conference, machine learning, (11 more...)

arXiv.org Machine Learning

2311.08815

Country:

Europe > Germany (0.14)
North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.34)

Add feedback

Better Practices for Domain Adaptation

Ericsson, Linus, Li, Da, Hospedales, Timothy M.

arXiv.org Artificial IntelligenceSep-7-2023

Distribution shifts are all too common in real-world applications of machine learning. Domain adaptation (DA) aims to address this by providing various frameworks for adapting models to the deployment data without using labels. However, the domain shift scenario raises a second more subtle challenge: the difficulty of performing hyperparameter optimisation (HPO) for these adaptation algorithms without access to a labelled validation set. The unclear validation protocol for DA has led to bad practices in the literature, such as performing HPO using the target test labels when, in real-world scenarios, they are not available. This has resulted in over-optimism about DA research progress compared to reality. In this paper, we analyse the state of DA when using good evaluation practice, by benchmarking a suite of candidate validation criteria and using them to assess popular adaptation algorithms. We show that there are challenges across all three branches of domain adaptation methodology including Unsupervised Domain Adaptation (UDA), Source-Free Domain Adaptation (SFDA), and Test Time Adaptation (TTA). While the results show that realistically achievable performance is often worse than expected, they also show that using proper validation splits is beneficial, as well as showing that some previously unexplored validation metrics provide the best options to date. Altogether, our improved practices covering data, training, validation and hyperparameter optimisation form a new rigorous pipeline to improve benchmarking, and hence research progress, within this important field going forward.

artificial intelligence, domain adaptation, machine learning, (1 more...)

arXiv.org Artificial Intelligence

2309.03879

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity

Dutt, Raman, Ericsson, Linus, Sanchez, Pedro, Tsaftaris, Sotirios A., Hospedales, Timothy

arXiv.org Artificial IntelligenceMay-24-2023

We present a comprehensive evaluation of Parameter-Efficient Fine-Tuning (PEFT) techniques for diverse medical image analysis tasks. PEFT is increasingly exploited as a valuable approach for knowledge transfer from pre-trained models in natural language processing, vision, speech, and cross-modal tasks, such as vision-language and text-to-image generation. However, its application in medical image analysis remains relatively unexplored. As foundation models are increasingly exploited in the medical domain, it is crucial to investigate and comparatively assess various strategies for knowledge transfer that can bolster a range of downstream tasks. Our study, the first of its kind (to the best of our knowledge), evaluates 16 distinct PEFT methodologies proposed for convolutional and transformer-based networks, focusing on image classification and text-to-image generation tasks across six medical datasets ranging in size, modality, and complexity. Through a battery of more than 600 controlled experiments, we demonstrate performance gains of up to 22% under certain scenarios and demonstrate the efficacy of PEFT for medical text-to-image generation. Further, we reveal the instances where PEFT methods particularly dominate over conventional fine-tuning approaches by studying their relationship with downstream data volume.

artificial intelligence, fine-tuning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2305.08252

Country:

Asia (0.46)
Europe (0.28)

Genre: Research Report > Experimental Study (0.54)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Self-Supervised Representation Learning: Introduction, Advances and Challenges

Ericsson, Linus, Gouk, Henry, Loy, Chen Change, Hospedales, Timothy M.

arXiv.org Machine LearningOct-18-2021

Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets, thus alleviating the annotation bottleneck that is one of the main barriers to practical deployment of deep learning today. These methods have advanced rapidly in recent years, with their efficacy approaching and sometimes surpassing fully supervised pre-training alternatives across a variety of data modalities including image, video, sound, text and graphs. This article introduces this vibrant area including key concepts, the four main families of approach and associated state of the art, and how self-supervised methods are applied to diverse modalities of data. We further discuss practical considerations including workflows, representation transferability, and compute cost. Finally, we survey the major open challenges in the field that provide fertile ground for future work.

machine learning, teaching medhods, teaching method, (23 more...)

arXiv.org Machine Learning

2110.09327

Country: Europe (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)

Add feedback

Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights

Ericsson, Linus, Gouk, Henry, Hospedales, Timothy M.

arXiv.org Machine LearningJun-22-2020

In the absence of large labelled datasets, self-supervised learning techniques can boost performance by learning useful representations from unlabelled data, which is often more readily available. However, there is often a domain shift between the unlabelled collection and the downstream target problem data. We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy by prioritising the most useful instances. Additionally, we show that the training time can be reduced by discarding unnecessary datapoints. Our method, BetaDataWeighter is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon. We compare to related instance weighting schemes, both hand-designed heuristics and meta-learning, as well as conventional self-supervised learning. BetaDataWeighter achieves both the highest average accuracy and rank across datasets, and on STL-10 it prunes up to 78% of unlabelled images without significant loss in accuracy, corresponding to over 50% reduction in training time.

betadataweighter, deep learning, neural network, (21 more...)

arXiv.org Machine Learning

2006.1236

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback