AITopics | Fookes, Clinton

Fookes, Clinton

Diversity is Definitely Needed: Improving Model-Agnostic Zero-shot Classification via Stable Diffusion

Shipard, Jordan, Wiliem, Arnold, Thanh, Kien Nguyen, Xiang, Wei, Fookes, Clinton

arXiv.org Artificial Intelligence, Apr-16-2023

In this work, we investigate the problem of Model-Agnostic Zero-Shot Classification (MA-ZSC), which refers to training non-specific classification architectures (downstream models) to classify real images without using any real images during training. Recent research has demonstrated that generating synthetic training images using diffusion models provides a potential solution to address MA-ZSC. However, the performance of this approach currently falls short of that achieved by large-scale vision-language models. One possible explanation is a potentially significant domain gap between synthetic and real images. Our work offers a fresh perspective on the problem by providing initial insights that MA-ZSC performance can be improved by improving the diversity of images in the generated dataset. We propose a set of modifications to the text-to-image generation process using a pre-trained diffusion model to enhance diversity, which we refer to as our "bag of tricks". Our approach shows notable improvements in various classification architectures, with results comparable to state-of-the-art models such as CLIP. To validate our approach, we conduct experiments on CIFAR10, CIFAR100, and EuroSAT, which is particularly difficult for zero-shot classification due to its satellite image domain. We evaluate our approach with five classification architectures, including ResNet and ViT. Our findings provide initial insights into the problem of MA-ZSC using diffusion models. All code will be available on GitHub.

artificial intelligence, deep learning, machine learning, (16 more...)

2302.03298

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)

Towards Self-Explainability of Deep Neural Networks with Heatmap Captioning and Large-Language Models

Tursun, Osman, Denman, Simon, Sridharan, Sridha, Fookes, Clinton

arXiv.org Artificial Intelligence, Apr-4-2023

Heatmaps are widely used to interpret deep neural networks, particularly for computer vision tasks, and heatmap-based explainable AI (XAI) techniques are a well-researched topic. However, most studies concentrate on enhancing the quality of the generated heatmap or discovering alternate heatmap generation techniques, and little effort has been devoted to making heatmap-based XAI automatic, interactive, scalable, and accessible. To address this gap, we propose a framework that includes two modules: (1) context modelling and (2) reasoning. We propose a template-based image captioning approach for context modelling to create text-based contextual information from the heatmap and input data. The reasoning module leverages a large language model to provide explanations in combination with specialised knowledge. Our qualitative experiments demonstrate the effectiveness of our framework and heatmap captioning approach. The code for the proposed template-based heatmap captioning approach will be publicly available.
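
The context-modelling step described in this abstract, turning a heatmap into text, can be illustrated with a minimal sketch. The template wording, the coarse 3x3 grid, and the function names `dominant_region` and `caption_heatmap` are all assumptions made for illustration; the abstract does not specify the paper's actual templates or region logic.

```python
from typing import List, Tuple

# Hypothetical caption template; the paper's real templates are not given in the abstract.
TEMPLATE = (
    "The model predicted '{label}' and focused mostly on the {region} "
    "of the image (peak activation {peak:.2f})."
)


def dominant_region(heatmap: List[List[float]]) -> Tuple[str, float]:
    """Name the coarse 3x3 grid cell that holds the heatmap's peak value."""
    rows, cols = len(heatmap), len(heatmap[0])
    # max() over (value, row, col) tuples picks the highest activation.
    peak, peak_row, peak_col = max(
        (value, r, c)
        for r, row in enumerate(heatmap)
        for c, value in enumerate(row)
    )
    vertical = ["top", "middle", "bottom"][min(3 * peak_row // rows, 2)]
    horizontal = ["left", "centre", "right"][min(3 * peak_col // cols, 2)]
    if (vertical, horizontal) == ("middle", "centre"):
        return "centre", peak
    return f"{vertical} {horizontal}", peak


def caption_heatmap(heatmap: List[List[float]], label: str) -> str:
    """Fill the template with the predicted label and the peak region."""
    region, peak = dominant_region(heatmap)
    return TEMPLATE.format(label=label, region=region, peak=peak)
```

A caption produced this way could then be passed as context to a language model for the reasoning step, which is not sketched here.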

artificial intelligence, machine learning, natural language, (19 more...)

2304.02202

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Spectral Geometric Verification: Re-Ranking Point Cloud Retrieval for Metric Localization

Vidanapathirana, Kavisha, Moghadam, Peyman, Sridharan, Sridha, Fookes, Clinton

arXiv.org Artificial Intelligence, Mar-6-2023

In large-scale metric localization, an incorrect result during retrieval will lead to an incorrect pose estimate or loop closure. Re-ranking methods propose to take into account all the top retrieval candidates and re-order them to increase the likelihood of the top candidate being correct. However, state-of-the-art re-ranking methods are inefficient when re-ranking many potential candidates due to their need for resource intensive point cloud registration between the query and each candidate. In this work, we propose an efficient spectral method for geometric verification (named SpectralGV) that does not require registration. We demonstrate how the optimal inter-cluster score of the correspondence compatibility graph of two point clouds represents a robust fitness score measuring their spatial consistency. This score takes into account the subtle geometric differences between structurally similar point clouds and therefore can be used to identify the correct candidate among potential matches retrieved by global similarity search. SpectralGV is deterministic, robust to outlier correspondences, and can be computed in parallel for all potential candidates. We conduct extensive experiments on 5 large-scale datasets to demonstrate that SpectralGV outperforms other state-of-the-art re-ranking methods and show that it consistently improves the recall and pose estimation of 3 state-of-the-art metric localization architectures while having a negligible effect on their runtime. The open-source implementation and trained models are available at: https://github.com/csiro-robotics/SpectralGV.
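
The idea of scoring a candidate from the correspondence compatibility graph, without registration, can be sketched as follows. This is a generic spectral-matching illustration, not the SpectralGV implementation: the compatibility kernel, the `tolerance` parameter, and the use of the leading eigenvalue as the fitness score are assumptions, and the abstract's "optimal inter-cluster score" may be defined differently.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def compatibility_matrix(
    src: List[Point], dst: List[Point], tolerance: float = 0.5
) -> List[List[float]]:
    """Build pairwise compatibility for correspondences src[i] <-> dst[i].

    Two correspondences are compatible when they preserve the distance
    between their endpoints, as a rigid transform would.
    """
    n = len(src)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            gap = abs(math.dist(src[i], src[j]) - math.dist(dst[i], dst[j]))
            matrix[i][j] = max(0.0, 1.0 - (gap / tolerance) ** 2)
    return matrix


def spectral_score(matrix: List[List[float]], iterations: int = 100) -> float:
    """Estimate the leading eigenvalue by power iteration (assumed fitness score)."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0  # no mutually consistent correspondences
        vector = [x / norm for x in product]
    # Rayleigh quotient of the converged unit vector.
    product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
    return sum(v * p for v, p in zip(vector, product))
```

Under these assumptions, re-ranking would compute `spectral_score` for each retrieved candidate and sort candidates by descending score; each candidate's score is independent of the others, so the computation parallelises naturally.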

artificial intelligence, machine learning, point cloud, (15 more...)

doi: 10.1109/LRA.2023.3255560

2210.04432

Country: Oceania > Australia (0.28)

Genre: Research Report (0.40)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environments

Knights, Joshua, Vidanapathirana, Kavisha, Ramezani, Milad, Sridharan, Sridha, Fookes, Clinton, Moghadam, Peyman

arXiv.org Artificial Intelligence, Mar-2-2023

Many existing datasets for lidar place recognition are solely representative of structured urban environments, and have recently been saturated in performance by deep learning based approaches. Natural and unstructured environments present many additional challenges for the tasks of long-term localisation but these environments are not represented in currently available datasets. To address this we introduce Wild-Places, a challenging large-scale dataset for lidar place recognition in unstructured, natural environments. Wild-Places contains eight lidar sequences collected with a handheld sensor payload over the course of fourteen months, containing a total of 63K undistorted lidar submaps along with accurate 6DoF ground truth. Our dataset contains multiple revisits both within and between sequences, allowing for both intra-sequence (i.e. loop closure detection) and inter-sequence (i.e. re-localisation) place recognition. We also benchmark several state-of-the-art approaches to demonstrate the challenges that this dataset introduces, particularly the case of long-term place recognition due to natural environments changing over time. Our dataset and code will be available at https://csiro-robotics.github.io/Wild-Places.

artificial intelligence, machine learning, survey article, (15 more...)

2211.12732

Genre: Research Report > Promising Solution (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

FICE: Text-Conditioned Fashion Image Editing With Guided GAN Inversion

Pernuš, Martin, Fookes, Clinton, Štruc, Vitomir, Dobrišek, Simon

arXiv.org Artificial Intelligence, Jan-5-2023

Fashion-image editing represents a challenging computer vision task, where the goal is to incorporate selected apparel into a given input image. Most existing techniques, known as Virtual Try-On methods, deal with this task by first selecting an example image of the desired apparel and then transferring the clothing onto the target person. Conversely, in this paper, we consider editing fashion images with text descriptions. Such an approach has several advantages over example-based virtual try-on techniques, e.g.: (i) it does not require an image of the target fashion item, and (ii) it allows the expression of a wide variety of visual concepts through the use of natural language. Existing image-editing methods that work with language inputs are heavily constrained by their requirement for training sets with rich attribute annotations or they are only able to handle simple text descriptions. We address these constraints by proposing a novel text-conditioned editing model, called FICE (Fashion Image CLIP Editing), capable of handling a wide variety of diverse text descriptions to guide the editing procedure. Specifically with FICE, we augment the common GAN inversion process by including semantic, pose-related, and image-level constraints when generating images. We leverage the capabilities of the CLIP model to enforce the semantics, due to its impressive image-text association capabilities. We furthermore propose a latent-code regularization technique that provides the means to better control the fidelity of the synthesized images. We validate FICE through rigorous experiments on a combination of VITON images and Fashion-Gen text descriptions and in comparison with several state-of-the-art text-conditioned image editing approaches. Experimental results demonstrate FICE generates highly realistic fashion images and leads to stronger editing performance than existing competing approaches.

artificial intelligence, guided gan inversion, text-conditioned fashion image editing, (1 more...)

2301.0211

Genre: Research Report (0.69)

Industry: Media > Photography (1.00)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

InCloud: Incremental Learning for Point Cloud Place Recognition

Knights, Joshua, Moghadam, Peyman, Ramezani, Milad, Sridharan, Sridha, Fookes, Clinton

arXiv.org Artificial Intelligence, Nov-29-2022

Place recognition is a fundamental component of robotics, and has seen tremendous improvements through the use of deep learning models in recent years. Networks can experience significant drops in performance when deployed in unseen or highly dynamic environments, and require additional training on the collected data. However, naively fine-tuning on new training distributions can cause severe degradation of performance on previously visited domains, a phenomenon known as catastrophic forgetting. In this paper we address the problem of incremental learning for point cloud place recognition and introduce InCloud, a structure-aware distillation-based approach which preserves the higher-order structure of the network's embedding space. We introduce several challenging new benchmarks on four popular and large-scale LiDAR datasets (Oxford, MulRan, In-house and KITTI) showing broad improvements in point cloud place recognition performance over a variety of network architectures. To the best of our knowledge, this work is the first to effectively apply incremental learning for point cloud place recognition. Data pre-processing, training and evaluation code for this paper can be found at https://github.com/csiro-robotics/InCloud.
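
One way to picture a distillation term that preserves the structure of an embedding space, rather than the embeddings themselves, is to compare pairwise distances before and after fine-tuning. The sketch below uses a simple relational (pairwise-distance) loss as a stand-in; it is an assumption, not the loss used by InCloud, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Embedding = Sequence[float]


def normalised_distances(embeddings: List[Embedding]) -> List[List[float]]:
    """Pairwise Euclidean distances divided by their off-diagonal mean."""
    n = len(embeddings)
    dist = [
        [math.dist(embeddings[i], embeddings[j]) for j in range(n)]
        for i in range(n)
    ]
    total = sum(dist[i][j] for i in range(n) for j in range(n) if i != j)
    mean = total / (n * (n - 1)) if n > 1 and total > 0 else 1.0
    return [[d / mean for d in row] for row in dist]


def structure_distillation_loss(
    old: List[Embedding], new: List[Embedding]
) -> float:
    """Mean squared change in relative pairwise distances.

    `old` holds embeddings from the frozen pre-update network and `new`
    holds embeddings of the same inputs from the network being fine-tuned.
    """
    old_d = normalised_distances(old)
    new_d = normalised_distances(new)
    n = len(old)
    return sum(
        (old_d[i][j] - new_d[i][j]) ** 2 for i in range(n) for j in range(n)
    ) / (n * n)
```

Because the distances are normalised, the term is zero when the new embedding space is a shifted or uniformly scaled copy of the old one, and it grows only when the relative arrangement of samples changes. In training it would be added to the task loss with a weighting factor.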

artificial intelligence, machine learning, place recognition, (14 more...)

2203.00807

Country: Oceania > Australia > Queensland (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Continuous Human Action Recognition for Human-Machine Interaction: A Review

Gammulle, Harshala, Ahmedt-Aristizabal, David, Denman, Simon, Tychsen-Smith, Lachlan, Petersson, Lars, Fookes, Clinton

arXiv.org Artificial Intelligence, Feb-26-2022

With advances in data-driven machine learning research, a wide variety of prediction models have been proposed to capture spatio-temporal features for the analysis of video streams. Recognising actions and detecting action transitions within an input video are challenging but necessary tasks for applications that require real-time human-machine interaction. By reviewing a large body of recent related work in the literature, we thoroughly analyse, explain and compare action segmentation methods and provide details on the feature extraction and learning strategies that are used in most state-of-the-art methods. We cover the impact of the performance of object detection and tracking techniques on human action segmentation methodologies. We investigate the application of such models to real-world scenarios and discuss several limitations and key research directions towards improving interpretability, generalisation, optimisation and deployment.

data mining, machine learning, natural language, (15 more...)

doi: 10.1145/3587931

2202.13096

Country:

Europe (1.00)
South America > Brazil > Rio de Janeiro (0.14)
Oceania > Australia > Queensland (0.14)
(2 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

The State of Aerial Surveillance: A Survey

Nguyen, Kien, Fookes, Clinton, Sridharan, Sridha, Tian, Yingli, Liu, Feng, Liu, Xiaoming, Ross, Arun

arXiv.org Artificial Intelligence, Jan-12-2022

The rapid emergence of airborne platforms and imaging sensors is enabling new forms of aerial surveillance due to their unprecedented advantages in scale, mobility, deployment and covert observation capabilities. This paper provides a comprehensive overview of human-centric aerial surveillance tasks from a computer vision and pattern recognition perspective. It aims to provide readers with an in-depth systematic review and technical analysis of the current state of aerial surveillance tasks using drones, UAVs and other airborne platforms. The main object of interest is humans, where single or multiple subjects are to be detected, identified, tracked, re-identified and have their behavior analyzed. More specifically, for each of these four tasks, we first discuss unique challenges in performing these tasks in an aerial setting compared to a ground-based setting. We then review and analyze the aerial datasets publicly available for each task, delve deep into the approaches in the aerial literature, and investigate how they presently address the aerial challenges. We conclude the paper with a discussion of the gaps and open research questions to inform future research avenues.

artificial intelligence, dataset, survey article, (18 more...)

2201.0308

Country:

Asia (1.00)
North America > United States > New York (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Government (0.92)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Point Cloud Segmentation Using Sparse Temporal Local Attention

Knights, Joshua, Moghadam, Peyman, Fookes, Clinton, Sridharan, Sridha

arXiv.org Artificial Intelligence, Dec-2-2021

Point clouds are a key modality used for perception in autonomous vehicles, providing the means for a robust geometric understanding of the surrounding environment. However, despite the sensor outputs from autonomous vehicles being naturally temporal in nature, there is still limited exploration of exploiting point cloud sequences for 3D semantic segmentation. In this paper we propose a novel Sparse Temporal Local Attention (STELA) module which aggregates intermediate features from a local neighbourhood in previous point cloud frames to provide a rich temporal context to the decoder.

However, despite a number of successful recent approaches exploiting sequential data from 2D video streams for improved segmentation performance [Hu et al., 2020a; Li et al., 2018; Paul et al., 2020; Zhu et al., 2019; Jain et al., 2019], there has been limited exploration into leveraging temporal priors for point cloud segmentation. Existing approaches either calculate strict correspondences between point features across frames [Cao et al., 2020] or perform global attention [Shi et al., 2020] between whole point clouds. In the case of the former, a breakdown of nearest-point matching due to displacement between adjacent point clouds can result in the …
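
The core idea of this entry, aggregating features from a local neighbourhood in a previous point cloud frame via attention, can be sketched in a few lines. This is a simplified illustration rather than the STELA module: it assumes a fixed search `radius`, a brute-force neighbour search, and no learned query/key/value projections, so the previous-frame features act as both keys and values.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def local_temporal_attention(
    cur_xyz: List[Vector],
    cur_feat: List[Vector],
    prev_xyz: List[Vector],
    prev_feat: List[Vector],
    radius: float = 1.0,
) -> List[List[float]]:
    """For each current point, attend over previous-frame points within `radius`."""
    scale = math.sqrt(len(cur_feat[0]))
    output = []
    for position, query in zip(cur_xyz, cur_feat):
        # Restrict attention to a local neighbourhood in the previous frame.
        neighbours = [
            k for k, p in enumerate(prev_xyz) if math.dist(position, p) <= radius
        ]
        if not neighbours:
            output.append(list(query))  # no temporal context available
            continue
        # Scaled dot-product logits against neighbour features.
        logits = [
            sum(q * v for q, v in zip(query, prev_feat[k])) / scale
            for k in neighbours
        ]
        top = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(logit - top) for logit in logits]
        total = sum(weights)
        output.append([
            sum(w * prev_feat[k][d] for w, k in zip(weights, neighbours)) / total
            for d in range(len(query))
        ])
    return output
```

Limiting attention to nearby points avoids both failure modes named in the entry: it does not depend on a single strict nearest-point match, and it does not pay the cost of attending over the whole previous point cloud.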

artificial intelligence, ground transportation, machine learning, (16 more...)

2112.00289

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.69)

A Survey on Graph-Based Deep Learning for Computational Histopathology

Ahmedt-Aristizabal, David, Armin, Mohammad Ali, Denman, Simon, Fookes, Clinton, Petersson, Lars

arXiv.org Artificial Intelligence, Sep-27-2021

With the remarkable success of representation learning for prediction problems, we have witnessed a rapid expansion of the use of machine learning and deep learning for the analysis of digital pathology and biopsy image patches. However, learning over patch-wise features using convolutional neural networks limits the ability of the model to capture global contextual information and comprehensively model tissue composition. The phenotypical and topological distribution of constituent histological entities plays a critical role in tissue diagnosis. As such, graph data representations and deep learning have attracted significant attention for encoding tissue representations and capturing intra- and inter-entity-level interactions. In this review, we provide a conceptual grounding for graph analytics in digital pathology, including entity-graph construction and graph architectures, and present their current success for tumor localization and classification, tumor invasion and staging, image retrieval, and survival prediction. We provide an overview of these methods in a systematic manner organized by the graph representation of the input image, scale, and organ on which they operate. We also outline the limitations of existing techniques, and suggest potential future research directions in this domain.

artificial intelligence, machine learning, representation, (18 more...)

doi: 10.1016/j.compmedimag.2021.102027

2107.00272

Country: Oceania > Australia > Queensland (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Dermatology (0.93)
Health & Medicine > Therapeutic Area > Gastroenterology (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
