AITopics | Fookes, Clinton

Fookes, Clinton

Divide and Conquer: Rethinking the Training Paradigm of Neural Radiance Fields

Ma, Rongkai, Lebrat, Leo, Cruz, Rodrigo Santa, Avraham, Gil, Zuo, Yan, Fookes, Clinton, Salvado, Olivier

arXiv.org Artificial Intelligence, Jan-29-2024

Neural radiance fields (NeRFs) have exhibited potential in synthesizing high-fidelity views of 3D scenes, but the standard training paradigm of NeRF presupposes an equal importance for each image in the training set. This assumption poses a significant challenge for rendering specific views presenting intricate geometries, thereby resulting in suboptimal performance. In this paper, we take a closer look at the implications of the current training paradigm and redesign it for superior rendering quality by NeRFs. Dividing input views into multiple groups based on their visual similarities and training individual models on each of these groups enables each model to specialize on specific regions without sacrificing speed or efficiency. Subsequently, the knowledge of these specialized models is aggregated into a single entity via a teacher-student distillation paradigm, enabling spatial efficiency for online rendering. Empirically, we evaluate our novel training framework on two publicly available datasets, namely NeRF synthetic and Tanks&Temples. Our evaluation demonstrates that our DaC training pipeline enhances the rendering quality of a state-of-the-art baseline model while exhibiting convergence to a superior minimum.

artificial intelligence, machine learning, radiance field, (16 more...)

arXiv: 2401.16144

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.41)

Zoom-shot: Fast and Efficient Unsupervised Zero-Shot Transfer of CLIP to Vision Encoders with Multimodal Loss

Shipard, Jordan, Wiliem, Arnold, Thanh, Kien Nguyen, Xiang, Wei, Fookes, Clinton

arXiv.org Artificial Intelligence, Jan-21-2024

The fusion of vision and language has brought about a transformative shift in computer vision through the emergence of Vision-Language Models (VLMs). However, the resource-intensive nature of existing VLMs poses a significant challenge. We need an accessible method for developing the next generation of VLMs. To address this issue, we propose Zoom-shot, a novel method for transferring the zero-shot capabilities of CLIP to any pre-trained vision encoder. We do this by exploiting the multimodal information (i.e., text and image) present in the CLIP latent space through the use of specifically designed multimodal loss functions. These loss functions are (1) cycle-consistency loss and (2) our novel prompt-guided knowledge distillation loss (PG-KD). PG-KD combines the concept of knowledge distillation with CLIP's zero-shot classification to capture the interactions between text and image features. With our multimodal losses, we train a linear mapping between the CLIP latent space and the latent space of a pre-trained vision encoder, for only a single epoch. Furthermore, Zoom-shot is entirely unsupervised and is trained using unpaired data. We test the zero-shot capabilities of a range of vision encoders augmented as new VLMs, on coarse and fine-grained classification datasets, outperforming the previous state-of-the-art in this problem domain. In our ablations, we find Zoom-shot allows for a trade-off between data and compute during training; and our state-of-the-art results can be obtained by reducing training from 20% to 1% of the ImageNet training data with 20 epochs. All code and models are available on GitHub.
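
The abstract above does not give the PG-KD formula, so the following is only a minimal sketch of the two generic ingredients it names: zero-shot classification by cosine similarity between an image feature and text-prompt features, and a standard temperature-scaled knowledge distillation loss (KL divergence between teacher and student class distributions). The function names, the temperature value, and the toy 2-D features are all assumptions made for illustration; this is not the authors' loss.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_logits(image_feature, prompt_features):
    """One logit per class: similarity of the image to each class prompt."""
    return [cosine_similarity(image_feature, p) for p in prompt_features]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=0.5):
    """KL(teacher || student) over temperature-softened class distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical toy features: two class prompts and two image embeddings.
prompts = [[1.0, 0.0], [0.0, 1.0]]
teacher_logits = zero_shot_logits([1.0, 0.2], prompts)  # stands in for CLIP
student_logits = zero_shot_logits([0.2, 1.0], prompts)  # stands in for the mapped encoder

loss_mismatch = distillation_loss(teacher_logits, student_logits)
loss_match = distillation_loss(teacher_logits, teacher_logits)
```

In this sketch the loss is zero when the student reproduces the teacher's class distribution and positive otherwise, which is the signal a distillation objective of this general kind would minimize.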

large language model, machine learning, natural language, (19 more...)

arXiv: 2401.11633

Country: Oceania > Australia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pretraining

Mohamed, Shaheer, Haghighat, Maryam, Fernando, Tharindu, Sridharan, Sridha, Fookes, Clinton, Moghadam, Peyman

arXiv.org Artificial Intelligence, Jan-3-2024

Hyperspectral images (HSIs) contain rich spectral and spatial information. Motivated by the success of transformers in natural language processing and computer vision, where they have shown the ability to learn long-range dependencies within input data, recent research has focused on using transformers for HSIs. However, current state-of-the-art hyperspectral transformers only tokenize the input HSI sample along the spectral dimension, resulting in the under-utilization of spatial information. Moreover, transformers are known to be data-hungry and their performance relies heavily on large-scale pretraining, which is challenging due to limited annotated hyperspectral data. Therefore, the full potential of HSI transformers has not been realized. To overcome these limitations, we propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pretraining procedures, leading to significant improvements in performance. The factorization of the inputs allows the spectral and spatial transformers to better capture the interactions within the hyperspectral data cubes. Inspired by masked image modeling pretraining, we also devise efficient masking strategies for pretraining each of the spectral and spatial transformers. We conduct experiments on six publicly available datasets for the HSI classification task and demonstrate that our model achieves state-of-the-art performance on all the datasets. The code for our model will be made available at https://github.com/csiro-robotics/factoformer.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv: 2309.09431

Country:

North America > United States (0.28)
Asia > China (0.28)
Oceania > Australia > Queensland > Brisbane (0.14)

Genre: Research Report (0.64)

Industry: Energy (0.49)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in Large-scale Natural Environments

Vidanapathirana, Kavisha, Knights, Joshua, Hausler, Stephen, Cox, Mark, Ramezani, Milad, Jooste, Jason, Griffiths, Ethan, Mohamed, Shaheer, Sridharan, Sridha, Fookes, Clinton, Moghadam, Peyman

arXiv.org Artificial Intelligence, Dec-23-2023

Recent progress in semantic scene understanding has primarily been enabled by the availability of semantically annotated bi-modal (camera and lidar) datasets in urban environments. However, such annotated datasets are also needed for natural, unstructured environments to enable semantic perception for applications, including conservation, search and rescue, environment monitoring, and agricultural automation. Therefore, we introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale traversals in natural environments, including semantic annotations in high-resolution 2D images and dense 3D lidar point clouds, and accurate 6-DoF pose information. The data is (1) trajectory-centric with accurate localization and globally aligned point clouds, (2) calibrated and synchronized to support bi-modal inference, and (3) containing different natural environments over 6 months to support research on domain adaptation. Our 3D semantic labels are obtained via an efficient automated process that transfers the human-annotated 2D labels from multiple views into 3D point clouds, thus circumventing the need for expensive and time-consuming human annotation in 3D. We introduce benchmarks on 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques to demonstrate the challenges in semantic segmentation in natural environments. We propose train-val-test splits for standard benchmarks as well as domain adaptation benchmarks and utilize an automated split generation technique to ensure the balance of class label distributions. The data, evaluation scripts and pretrained models will be released upon acceptance at https://csiro-robotics.github.io/WildScenes.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv: 2312.15364

Country: Oceania > Australia > Queensland > Brisbane (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Multi-stage Learning for Radar Pulse Activity Segmentation

Huang, Zi, Pemasiri, Akila, Denman, Simon, Fookes, Clinton, Martin, Terrence

arXiv.org Artificial Intelligence, Dec-14-2023

Radio signal recognition is a crucial function in electronic warfare. Precise identification and localisation of radar pulse activities are required by electronic warfare systems to produce effective countermeasures. Despite the importance of these tasks, deep learning-based radar pulse activity recognition methods have remained largely underexplored. While deep learning for radar modulation recognition has been explored previously, classification tasks are generally limited to short and non-interleaved IQ signals, limiting their applicability to military applications. To address this gap, we introduce an end-to-end multi-stage learning approach to detect and localise pulse activities of interleaved radar signals across an extended time horizon. We propose a simple, yet highly effective multi-stage architecture for incrementally predicting fine-grained segmentation masks that localise radar pulse activities across multiple channels. We demonstrate the performance of our approach against several reference models on a novel radar dataset, while also providing a first-of-its-kind benchmark for radar pulse activity segmentation.

artificial intelligence, machine learning, segmentation, (18 more...)

arXiv: 2312.09489

Country:

Oceania > Australia (0.15)
Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Government > Military (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Piecewise Deterministic Markov Processes for Bayesian Neural Networks

Goan, Ethan, Perrin, Dimitri, Mengersen, Kerrie, Fookes, Clinton

arXiv.org Machine Learning, Oct-19-2023

Inference on modern Bayesian Neural Networks (BNNs) often relies on a variational inference treatment, which imposes assumptions on independence and the form of the posterior that are violated. Traditional MCMC approaches avoid these assumptions at the cost of increased computation due to their incompatibility with subsampling of the likelihood. New Piecewise Deterministic Markov Process (PDMP) samplers permit subsampling, though they introduce model-specific inhomogeneous Poisson processes (IPPs), which are difficult to sample from. This work introduces a new generic and adaptive thinning scheme for sampling from these IPPs, and demonstrates how this approach can accelerate the application of PDMPs for inference in BNNs. Experimentation illustrates how inference with these methods is computationally feasible and can improve predictive accuracy and MCMC mixing performance, and provide informative uncertainty measurements, when compared against other approximate inference schemes.
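
For readers unfamiliar with thinning, the sketch below shows the classical (non-adaptive) version of the idea for sampling event times from an inhomogeneous Poisson process: propose events from a homogeneous process whose constant rate bounds the true intensity, then accept each proposal with probability equal to the ratio of the true intensity to the bound. It is not the adaptive scheme proposed in the paper; the example intensity `1 + sin(t)`, the bound of 2, and the function names are assumptions chosen only to make the example runnable.

```python
import math
import random


def sample_ipp_by_thinning(rate, rate_bound, t_end, rng):
    """Event times on [0, t_end] for a Poisson process with intensity rate(t).

    Assumes 0 <= rate(t) <= rate_bound for every t in [0, t_end]; if the
    bound is violated, the samples no longer follow the target process.
    """
    events = []
    t = 0.0
    while True:
        # Propose the next event of a homogeneous process with rate rate_bound.
        t += rng.expovariate(rate_bound)
        if t > t_end:
            return events
        # Keep the proposal with probability rate(t) / rate_bound.
        if rng.random() * rate_bound < rate(t):
            events.append(t)


def example_rate(t):
    """Hypothetical intensity, bounded above by 2 for all t."""
    return 1.0 + math.sin(t)


rng = random.Random(0)
events = sample_ipp_by_thinning(example_rate, rate_bound=2.0, t_end=50.0, rng=rng)
```

A loose bound keeps the sampler correct but wastes proposals, since most are rejected; a tight bound accepts more proposals but is harder to obtain when the intensity depends on the model.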

artificial intelligence, machine learning, sampler, (20 more...)

arXiv: 2302.08724

Country: North America > Canada (0.14)

Genre: Research Report (1.00)

Physical Adversarial Attacks for Surveillance: A Survey

Nguyen, Kien, Fernando, Tharindu, Fookes, Clinton, Sridharan, Sridha

arXiv.org Artificial Intelligence, Oct-14-2023

Modern automated surveillance techniques are heavily reliant on deep learning methods. Despite their superior performance, these learning systems are inherently vulnerable to adversarial attacks: maliciously crafted inputs that are designed to mislead, or trick, models into making incorrect predictions. An adversary can physically change their appearance by wearing adversarial t-shirts, glasses, or hats, or by specific behavior, to potentially avoid various forms of detection, tracking and recognition by surveillance systems, and to obtain unauthorized access to secure properties and assets. This poses a severe threat to the security and safety of modern surveillance systems. This paper reviews recent attempts and findings in learning and designing physical adversarial attacks for surveillance applications. In particular, we propose a framework to analyze physical adversarial attacks and, under this framework, provide a comprehensive survey of physical adversarial attacks on four key surveillance tasks: detection, identification, tracking, and action recognition. Furthermore, we review and analyze strategies to defend against physical adversarial attacks and the methods for evaluating the strengths of the defense. The insights in this paper present an important step in building resilience within surveillance systems to physical adversarial attacks.

artificial intelligence, machine learning, survey article, (18 more...)

doi: 10.1109/TNNLS.2023.3321432

arXiv: 2305.01074

Country:

Europe (0.14)
Oceania > Australia (0.14)
North America > United States (0.14)
Asia (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Survey on Physics Informed Reinforcement Learning: Review and Open Problems

Banerjee, Chayan, Nguyen, Kien, Fookes, Clinton, Raissi, Maziar

arXiv.org Artificial Intelligence, Sep-4-2023

The inclusion of physical information in machine learning frameworks has revolutionized many application areas. This involves enhancing the learning process by incorporating physical constraints and adhering to physical laws. In this work we explore their utility for reinforcement learning applications. We present a thorough review of the literature on incorporating physics information, also known as physics priors, in reinforcement learning approaches, commonly referred to as physics-informed reinforcement learning (PIRL). We introduce a novel taxonomy with the reinforcement learning pipeline as the backbone to classify existing works, compare and contrast them, and derive crucial insights. Existing works are analyzed with regard to the representation/form of the governing physics modeled for integration, their specific contribution to the typical reinforcement learning architecture, and their connection to the underlying reinforcement learning pipeline stages. We also identify core learning architectures and physics incorporation biases (i.e., observational, inductive and learning) of existing PIRL approaches and use them to further categorize the works for better understanding and adaptation. By providing a comprehensive perspective on the implementation of the physics-informed capability, the taxonomy presents a cohesive approach to PIRL. It identifies the areas where this approach has been applied, as well as the gaps and opportunities that exist. Additionally, the taxonomy sheds light on unresolved issues and challenges, which can guide future research. This nascent field holds great potential for enhancing reinforcement learning algorithms by increasing their physical plausibility, precision, data efficiency, and applicability in real-world scenarios.

artificial intelligence, machine learning, physics informed reinforcement learning, (2 more...)

arXiv: 2309.01909

Genre:

Research Report (0.69)
Overview (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Uncertainty in Real-Time Semantic Segmentation on Embedded Systems

Goan, Ethan, Fookes, Clinton

arXiv.org Artificial Intelligence, Jul-31-2023

Applications of semantic segmentation models in areas such as autonomous vehicles and human-computer interaction require real-time predictive capabilities. The challenges of addressing real-time applications are amplified by the need to operate on resource-constrained hardware. Whilst development of real-time methods for these platforms has increased, these models are unable to sufficiently reason about the uncertainty present when applied on embedded real-time systems. This paper addresses this by combining deep feature extraction from pre-trained models with Bayesian regression and moment propagation for uncertainty-aware predictions. We demonstrate how the proposed method can yield meaningful epistemic uncertainty on embedded hardware in real time whilst maintaining predictive performance.
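
To illustrate how Bayesian regression on top of fixed features can yield an epistemic uncertainty estimate, here is a deliberately small sketch of conjugate Bayesian linear regression with a single scalar feature and a single weight. It is an assumption-laden toy, not the paper's method: the scalar feature stands in for a deep feature vector, the prior and noise precisions are made-up values, and moment propagation is not implemented.

```python
def fit_bayesian_linear(features, targets, prior_precision=1.0, noise_precision=25.0):
    """Posterior over one weight w for y = w * f + noise.

    Prior: w ~ N(0, 1 / prior_precision); noise variance is 1 / noise_precision.
    Returns the posterior mean and posterior precision of w.
    """
    precision = prior_precision + noise_precision * sum(f * f for f in features)
    mean = noise_precision * sum(f * y for f, y in zip(features, targets)) / precision
    return mean, precision


def predict(feature, mean, precision, noise_precision=25.0):
    """Predictive mean plus epistemic and aleatoric variance for one input."""
    predictive_mean = mean * feature
    epistemic_variance = feature * feature / precision  # from weight uncertainty
    aleatoric_variance = 1.0 / noise_precision          # from observation noise
    return predictive_mean, epistemic_variance, aleatoric_variance


# Hypothetical training data lying roughly on y = 2 * f.
posterior_mean, posterior_precision = fit_bayesian_linear(
    [1.0, 2.0, 3.0], [2.1, 3.9, 6.0]
)

near = predict(1.0, posterior_mean, posterior_precision)   # inside the training range
far = predict(10.0, posterior_mean, posterior_precision)   # far outside it
```

The epistemic term grows with the distance of the input from the training data while the aleatoric term stays constant, which is the qualitative behaviour one would want from an uncertainty-aware predictor.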

machine learning, real time system, segmentation, (20 more...)

arXiv: 2301.01201

Country:

Oceania > Australia (0.14)
North America > Canada (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Multi-task Learning for Radar Signal Characterisation

Huang, Zi, Pemasiri, Akila, Denman, Simon, Fookes, Clinton, Martin, Terrence

arXiv.org Artificial Intelligence, Jun-19-2023

Radio signal recognition is a crucial task in both civilian and military applications, as accurate and timely identification of unknown signals is an essential part of spectrum management and electronic warfare. The majority of research in this field has focused on applying deep learning for modulation classification, leaving the task of signal characterisation as an understudied area. This paper addresses this gap by presenting an approach for tackling radar signal classification and characterisation as a multi-task learning (MTL) problem. We propose the IQ Signal Transformer (IQST) among several reference architectures that allow for simultaneous optimisation of multiple regression and classification tasks. We demonstrate the performance of our proposed MTL model on a synthetic radar dataset, while also providing a first-of-its-kind benchmark for radar signal characterisation.

The application of convolutional neural networks (CNNs) to automatic modulation classification (AMC) was introduced by [8]. Their early works [9, 10] together with the release of several public datasets [11] initiated a wave of interest in DL-based RSR. Recently, several alternative DL approaches that adopt recurrent neural networks (RNNs) and hybrid architectures [12] were able to consistently achieve above 90% modulation classification accuracy in relatively high signal-to-noise ratio (SNR) settings. Despite the success of DNNs, many recent approaches still rely on handcrafted features to pre-process the complex-valued, in-phase and quadrature (IQ) data into image-based representations, such as spectrograms [12], prior to training. These approaches effectively transform RSR into an image classification problem, and thus limits the

artificial intelligence, deep learning, machine learning, (16 more...)

doi: 10.1109/ICASSPW59220.2023.10193318

arXiv: 2306.13105

Country: Oceania > Australia (0.15)

Genre: Research Report (0.50)

Industry: Government > Military (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
