Collaborating Authors: Moghadam, Peyman


Inductive Graph Few-shot Class Incremental Learning

arXiv.org Artificial Intelligence

Node classification with Graph Neural Networks (GNNs) under a fixed set of labels is well studied, in contrast to Graph Few-Shot Class Incremental Learning (GFSCIL), in which a GNN classifier must be learned as graph nodes and classes grow sporadically over time. We introduce inductive GFSCIL, which continually learns novel classes with newly emerging nodes while maintaining performance on old classes without accessing previous data. This addresses the practical concern of transductive GFSCIL, which requires storing the entire graph with historical data. Compared to the transductive setting, inductive GFSCIL exacerbates catastrophic forgetting because previous data is inaccessible during incremental training, in addition to the overfitting issue caused by label sparsity. Thus, we propose a novel method, called Topology-based class Augmentation and Prototype calibration (TAP). Specifically, it first introduces a triple-branch multi-topology class augmentation method to enhance the model's generalization ability. As each incremental session receives a disjoint subgraph with nodes of novel classes, the multi-topology class augmentation method replicates such a setting in the base session to boost backbone versatility. During incremental learning, given the limited number of novel-class samples, we propose an iterative prototype calibration method to improve the separation of class prototypes. Furthermore, as backbone fine-tuning causes feature distribution drift, prototypes of old classes degrade over time; we therefore propose a prototype shift method for old classes to compensate for the drift. We demonstrate the proposed method on four datasets.
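
The abstract above describes the components of TAP only at a high level. As a rough, hypothetical sketch of the prototype-shift idea (old-class prototypes are updated to follow the feature drift introduced by backbone fine-tuning), the snippet below estimates per-prototype drift from samples encoded by both the old and the fine-tuned backbone; the function name, similarity weighting, and temperature are assumptions, not details from the paper.

```python
# Hypothetical sketch of prototype shift compensation (not the paper's exact method).
import torch
import torch.nn.functional as F

def shift_old_prototypes(old_protos, feats_before, feats_after, temperature=0.1):
    """Shift old-class prototypes to follow feature drift after backbone fine-tuning.

    old_protos:   (C, D) prototypes computed with the pre-fine-tuning backbone.
    feats_before: (N, D) current-session features from the old backbone.
    feats_after:  (N, D) the same samples encoded by the fine-tuned backbone.
    """
    drift = feats_after - feats_before                       # per-sample drift vectors (N, D)
    # Weight each sample's drift by its similarity to the prototype, so nearby
    # samples dominate the estimate of how that region of feature space moved.
    sim = F.cosine_similarity(
        old_protos.unsqueeze(1), feats_before.unsqueeze(0), dim=-1
    )                                                         # (C, N)
    weights = F.softmax(sim / temperature, dim=1)             # (C, N)
    proto_drift = weights @ drift                             # (C, D)
    return old_protos + proto_drift

# Example usage with random tensors standing in for real features.
protos = torch.randn(5, 64)          # 5 old classes, 64-d features
f_old = torch.randn(20, 64)          # 20 novel-session samples, old backbone
f_new = f_old + 0.05 * torch.randn(20, 64)
print(shift_old_prototypes(protos, f_old, f_new).shape)  # torch.Size([5, 64])
```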


Spatioformer: A Geo-encoded Transformer for Large-Scale Plant Species Richness Prediction

arXiv.org Artificial Intelligence

Earth observation data have shown promise in predicting species richness of vascular plants ($\alpha$-diversity), but extending this approach to large spatial scales is challenging because geographically distant regions may exhibit different compositions of plant species ($\beta$-diversity), resulting in a location-dependent relationship between richness and spectral measurements. To handle this geolocation dependency, we propose Spatioformer, in which a novel geolocation encoder is coupled with the transformer model to encode geolocation context into remote sensing imagery. Spatioformer compares favourably to state-of-the-art models in richness prediction on a large-scale ground-truth richness dataset (HAVPlot) that consists of 68,170 in-situ richness samples covering diverse landscapes across Australia. The results demonstrate that geolocation information is advantageous in predicting species richness from satellite observations over large spatial scales. With Spatioformer, plant species richness maps over Australia are compiled from the Landsat archive for the years 2015 to 2023. The richness maps produced in this study reveal the spatiotemporal dynamics of plant species richness in Australia, providing supporting evidence to inform effective planning and policy development for plant diversity conservation. Regions with high richness prediction uncertainty are identified, highlighting the need for future in-situ surveys in these areas to enhance prediction accuracy.
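
The abstract does not specify the form of the geolocation encoder. The sketch below is one plausible, hypothetical realisation: latitude and longitude are mapped to sinusoidal features, projected to the token dimension, and added to the image patch tokens before the transformer; all module names and hyperparameters are illustrative assumptions rather than the paper's design.

```python
# Hypothetical geolocation encoding sketch (not the paper's exact encoder).
import math
import torch
import torch.nn as nn

class GeoEncoder(nn.Module):
    """Encode (lat, lon) into a token that can be added to image patch tokens."""

    def __init__(self, dim=256, n_freqs=16):
        super().__init__()
        # Geometric frequency ladder for sinusoidal encoding of coordinates.
        self.register_buffer("freqs", torch.exp(torch.linspace(0, math.log(1e4), n_freqs)))
        self.proj = nn.Linear(4 * n_freqs, dim)

    def forward(self, latlon):                        # latlon: (B, 2) in degrees
        rad = latlon * math.pi / 180.0                # convert to radians
        angles = rad.unsqueeze(-1) * self.freqs       # (B, 2, n_freqs)
        enc = torch.cat([angles.sin(), angles.cos()], dim=-1)   # (B, 2, 2*n_freqs)
        return self.proj(enc.flatten(1))              # (B, dim)

# Add the geolocation token to every patch token before the transformer encoder.
geo = GeoEncoder(dim=256)
patches = torch.randn(8, 196, 256)                              # (batch, tokens, dim)
geo_token = geo(torch.tensor([[-27.47, 153.03]] * 8))           # e.g. Brisbane coordinates
conditioned = patches + geo_token.unsqueeze(1)
print(conditioned.shape)  # torch.Size([8, 196, 256])
```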


Pair-VPR: Place-Aware Pre-training and Contrastive Pair Classification for Visual Place Recognition with Vision Transformers

arXiv.org Artificial Intelligence

In this work we propose a novel joint training method for Visual Place Recognition (VPR), which simultaneously learns a global descriptor and a pair classifier for re-ranking. The pair classifier predicts whether a given pair of images are from the same place or not. The network comprises only Vision Transformer components for both the encoder and the pair classifier, and both components are trained using their respective class tokens. In existing VPR methods, the network is typically initialized with pre-trained weights from a generic image dataset such as ImageNet. In this work we propose an alternative pre-training strategy that uses Siamese Masked Image Modelling as the pre-training task. We propose a place-aware image sampling procedure over a collection of large VPR datasets for pre-training our model, to learn visual features tuned specifically for VPR. By re-using the Masked Image Modelling encoder and decoder weights in the second stage of training, Pair-VPR achieves state-of-the-art VPR performance across five benchmark datasets with a ViT-B encoder, along with further improvements in localization recall with larger encoders. The Pair-VPR website is: https://csiro-robotics.github.io/Pair-VPR.
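
As a hedged illustration of the retrieve-then-re-rank structure described above (global descriptor retrieval followed by pair classification), the toy sketch below scores top-k database candidates with a stand-in pair scorer; the function names and the placeholder scorer are assumptions, not the released Pair-VPR implementation.

```python
# Hypothetical sketch of global retrieval followed by pair re-ranking
# (illustrative only; not the released Pair-VPR code).
import torch
import torch.nn.functional as F

def rerank_candidates(query_desc, db_descs, pair_scorer, query_img, db_imgs, top_k=5):
    """Retrieve top-k candidates by global descriptor, then re-rank with a pair classifier.

    query_desc: (D,) global descriptor of the query image.
    db_descs:   (N, D) global descriptors of the database images.
    pair_scorer: callable (img_a, img_b) -> same-place score.
    """
    sims = F.cosine_similarity(query_desc.unsqueeze(0), db_descs)    # (N,) coarse similarity
    cand = sims.topk(top_k).indices                                   # coarse retrieval
    scores = torch.stack([pair_scorer(query_img, db_imgs[i]) for i in cand])
    return cand[scores.argsort(descending=True)]                      # re-ranked database indices

# Toy example with a stand-in scorer in place of the ViT pair classifier.
pair_scorer = lambda a, b: -((a - b) ** 2).mean()
db_imgs = torch.randn(100, 3, 224, 224)
query_img = db_imgs[42] + 0.01 * torch.randn(3, 224, 224)
db_descs = torch.randn(100, 512)
query_desc = db_descs[42] + 0.01 * torch.randn(512)
print(rerank_candidates(query_desc, db_descs, pair_scorer, query_img, db_imgs))
```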


M2Distill: Multi-Modal Distillation for Lifelong Imitation Learning

arXiv.org Artificial Intelligence

Lifelong imitation learning for manipulation tasks poses significant challenges due to the distribution shifts that occur across incremental learning steps. Existing methods often focus on unsupervised skill discovery to construct an ever-growing skill library, or on distillation from multiple policies; this can lead to scalability issues as diverse manipulation tasks are continually introduced, and may fail to ensure a consistent latent space throughout the learning process, leading to catastrophic forgetting of previously learned skills. In this paper, we introduce M2Distill, a multi-modal distillation-based method for lifelong imitation learning that focuses on preserving a consistent latent space across vision, language, and action distributions throughout the learning process. By regulating the shifts in latent representations across different modalities from previous to current steps, and by reducing discrepancies in Gaussian Mixture Model (GMM) policies between consecutive learning steps, we ensure that the learned policy retains its ability to perform previously learned tasks while seamlessly integrating new skills. Extensive evaluations on the LIBERO lifelong imitation learning benchmark suites, including LIBERO-OBJECT, LIBERO-GOAL, and LIBERO-SPATIAL, demonstrate that our method consistently outperforms prior state-of-the-art methods across all evaluated metrics.
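
The abstract summarises the losses only in words. A minimal, hypothetical sketch of the two distillation ideas (penalising drift in per-modality latents and discrepancies between consecutive GMM policy heads) is given below; the loss forms, names, and shapes are assumptions and not the paper's exact formulation.

```python
# Hypothetical multi-modal latent distillation losses (illustrative, not the paper's exact losses).
import torch
import torch.nn.functional as F

def latent_distill_loss(old_latents, new_latents):
    """Penalise drift of per-modality latents between the frozen old policy and the current one.

    old_latents / new_latents: dicts with keys such as 'vision', 'language', 'action',
    each mapping to a (B, D) tensor.
    """
    return sum(F.mse_loss(new_latents[k], old_latents[k].detach()) for k in old_latents)

def gmm_discrepancy(old_means, new_means, old_logstd, new_logstd):
    """Crude discrepancy between GMM policy heads of consecutive steps
    (component means and log-stds, each of shape (B, K, A))."""
    return F.mse_loss(new_means, old_means.detach()) + \
           F.mse_loss(new_logstd, old_logstd.detach())

# Toy usage: total lifelong loss = imitation loss + distillation terms (imitation loss omitted).
old = {k: torch.randn(4, 128) for k in ("vision", "language", "action")}
new = {k: v + 0.1 * torch.randn_like(v) for k, v in old.items()}
loss = latent_distill_loss(old, new) + gmm_discrepancy(
    torch.randn(4, 5, 7), torch.randn(4, 5, 7), torch.zeros(4, 5, 7), torch.zeros(4, 5, 7)
)
print(loss.item())
```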


Object Registration in Neural Fields

arXiv.org Artificial Intelligence

Neural fields provide a continuous scene representation of 3D geometry and appearance in a way that holds great promise for robotics applications. One functionality that unlocks unique use-cases for neural fields in robotics is object 6-DoF registration. In this paper, we provide an expanded analysis of the recent Reg-NF neural field registration method and its use-cases within a robotics context. We showcase the scenario of determining the 6-DoF pose of known objects within a scene using scene and object neural field models. We show how this may be used to better represent objects within imperfectly modelled scenes and to generate new scenes by substituting object neural field models into the scene.
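
As a hedged illustration of the object-substitution use-case described above, the sketch below composes a scene SDF with an object SDF placed at a registered 6-DoF pose by taking the pointwise minimum of the two signed distances; the function names and the toy sphere SDFs are assumptions rather than the Reg-NF codebase.

```python
# Hypothetical sketch: substituting an object SDF into a scene SDF at a registered 6-DoF pose.
import torch

def composed_sdf(points, scene_sdf, object_sdf, T_scene_from_object):
    """Evaluate a scene with an object model inserted at a given pose.

    points: (N, 3) query points in the scene frame.
    scene_sdf / object_sdf: callables mapping (N, 3) points to (N,) signed distances.
    T_scene_from_object: (4, 4) homogeneous transform placing the object in the scene.
    """
    # Transform query points into the object's local frame.
    T_obj_from_scene = torch.linalg.inv(T_scene_from_object)
    homog = torch.cat([points, torch.ones(points.shape[0], 1)], dim=1)   # (N, 4)
    local = (homog @ T_obj_from_scene.T)[:, :3]
    # Union of the two implicit surfaces: take the minimum signed distance.
    return torch.minimum(scene_sdf(points), object_sdf(local))

# Toy example with spheres standing in for learned SDFs.
scene = lambda p: p.norm(dim=1) - 1.0                   # unit sphere at the origin
obj = lambda p: p.norm(dim=1) - 0.3                     # small sphere in its own frame
T = torch.eye(4); T[0, 3] = 1.5                         # place the object 1.5 m along x
pts = torch.tensor([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
print(composed_sdf(pts, scene, obj, T))                 # negative values are inside a surface
```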


Towards Long-term Robotics in the Wild

arXiv.org Artificial Intelligence

In this paper, we emphasise the critical importance of large-scale datasets for advancing field robotics capabilities, particularly in natural environments. While numerous datasets exist for urban and suburban settings, those tailored to natural environments are scarce. Our recent benchmarks WildPlaces and WildScenes address this gap by providing synchronised image, lidar, and semantic data together with accurate 6-DoF pose information in forest-type environments. We highlight the multi-modal nature of these datasets and demonstrate their utility in various downstream tasks, such as place recognition and 2D and 3D semantic segmentation.


Reg-NF: Efficient Registration of Implicit Surfaces within Neural Fields

arXiv.org Artificial Intelligence

Neural fields, coordinate-based neural networks, have recently gained popularity for implicitly representing a scene. In contrast to classical methods that rely on explicit representations such as point clouds, neural fields provide a continuous scene representation able to capture 3D geometry and appearance in a way which is compact and ideal for robotics applications. However, few prior methods have investigated registering multiple neural fields by directly utilising these continuous implicit representations. In this paper, we present Reg-NF, a neural-field-based registration method that optimises the relative 6-DoF transformation between two arbitrary neural fields, even if those two fields have different scale factors. Key components of Reg-NF include a bidirectional registration loss, multi-view surface sampling, and the utilisation of volumetric signed distance functions (SDFs). We showcase our approach on a new neural field dataset for evaluating registration problems. We provide an exhaustive set of experiments and ablation studies to assess the performance of our approach, while also discussing limitations to provide future direction to the research community on open challenges in utilising neural fields in unconstrained environments.
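
To make the bidirectional registration loss concrete, the heavily simplified, hypothetical sketch below optimises a translation so that surface samples from each SDF land on the other field's zero level set in both directions; the real method optimises a full 6-DoF transform (with scale) over learned neural-field SDFs, and all names here are illustrative.

```python
# Hypothetical, heavily simplified sketch of a bidirectional SDF registration objective
# (translation-only; the actual method recovers a full 6-DoF transform with scale).
import torch

def bidirectional_sdf_loss(t, sdf_a, sdf_b, surf_a, surf_b):
    """Surface points of A, moved by t, should lie on B's zero level set, and vice versa.

    t: (3,) translation from frame A to frame B (being optimised).
    sdf_a / sdf_b: callables mapping (N, 3) points to (N,) signed distances.
    surf_a / surf_b: (N, 3) points sampled on each field's surface.
    """
    loss_ab = sdf_b(surf_a + t).abs().mean()
    loss_ba = sdf_a(surf_b - t).abs().mean()
    return loss_ab + loss_ba

# Toy example: two unit spheres offset by (0.5, 0, 0); optimisation should recover the offset.
sdf_a = lambda p: p.norm(dim=1) - 1.0
sdf_b = lambda p: (p - torch.tensor([0.5, 0.0, 0.0])).norm(dim=1) - 1.0
surf_a = torch.nn.functional.normalize(torch.randn(256, 3), dim=1)
surf_b = torch.nn.functional.normalize(torch.randn(256, 3), dim=1) + torch.tensor([0.5, 0.0, 0.0])

t = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([t], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = bidirectional_sdf_loss(t, sdf_a, sdf_b, surf_a, surf_b)
    loss.backward()
    opt.step()
print(t.detach())   # should be close to (0.5, 0, 0)
```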


FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pretraining

arXiv.org Artificial Intelligence

Hyperspectral images (HSIs) contain rich spectral and spatial information. Motivated by the success of transformers in natural language processing and computer vision, where they have shown the ability to learn long-range dependencies within input data, recent research has focused on using transformers for HSIs. However, current state-of-the-art hyperspectral transformers only tokenize the input HSI sample along the spectral dimension, resulting in the under-utilization of spatial information. Moreover, transformers are known to be data-hungry and their performance relies heavily on large-scale pretraining, which is challenging due to limited annotated hyperspectral data. Therefore, the full potential of HSI transformers has not been realized. To overcome these limitations, we propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pretraining procedures, leading to significant improvements in performance. The factorization of the inputs allows the spectral and spatial transformers to better capture the interactions within the hyperspectral data cubes. Inspired by masked image modeling pretraining, we also devise efficient masking strategies for pretraining each of the spectral and spatial transformers. We conduct experiments on six publicly available datasets for the HSI classification task and demonstrate that our model achieves state-of-the-art performance on all of them. The code for our model will be made available at https://github.com/csiro-robotics/factoformer.
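
As a rough, hypothetical sketch of the factorised spectral-spatial design (the repository linked above contains the authoritative implementation), the snippet below tokenises an HSI patch once per band for the spectral branch and once per pixel for the spatial branch, processes each with its own transformer encoder, and fuses the pooled features; all sizes and module names are illustrative assumptions.

```python
# Hypothetical sketch of factorised spectral/spatial tokenisation for an HSI patch
# (illustrative only; see the repository for the actual FactoFormer implementation).
import torch
import torch.nn as nn

class FactorizedHSIEncoder(nn.Module):
    def __init__(self, bands=200, patch=9, dim=64, num_classes=16):
        super().__init__()
        self.spectral_embed = nn.Linear(patch * patch, dim)    # one token per band
        self.spatial_embed = nn.Linear(bands, dim)             # one token per pixel
        self.spectral_tf = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=2)
        self.spatial_tf = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=2)
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, x):                                      # x: (B, bands, patch, patch)
        spec_tokens = self.spectral_embed(x.flatten(2))                   # (B, bands, dim)
        spat_tokens = self.spatial_embed(x.flatten(2).transpose(1, 2))    # (B, patch*patch, dim)
        spec = self.spectral_tf(spec_tokens).mean(dim=1)       # pooled spectral feature
        spat = self.spatial_tf(spat_tokens).mean(dim=1)        # pooled spatial feature
        return self.head(torch.cat([spec, spat], dim=-1))      # fused classification logits

model = FactorizedHSIEncoder()
logits = model(torch.randn(2, 200, 9, 9))
print(logits.shape)  # torch.Size([2, 16])
```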


WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in Large-scale Natural Environments

arXiv.org Artificial Intelligence

Recent progress in semantic scene understanding has primarily been enabled by the availability of semantically annotated bi-modal (camera and lidar) datasets in urban environments. However, such annotated datasets are also needed for natural, unstructured environments to enable semantic perception for applications including conservation, search and rescue, environment monitoring, and agricultural automation. Therefore, we introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale traversals in natural environments, including semantic annotations in high-resolution 2D images and dense 3D lidar point clouds, and accurate 6-DoF pose information. The data is (1) trajectory-centric with accurate localization and globally aligned point clouds, (2) calibrated and synchronized to support bi-modal inference, and (3) collected in different natural environments over six months to support research on domain adaptation. Our 3D semantic labels are obtained via an efficient automated process that transfers human-annotated 2D labels from multiple views into the 3D point clouds, thus circumventing the need for expensive and time-consuming human annotation in 3D. We introduce benchmarks on 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques to demonstrate the challenges of semantic segmentation in natural environments. We propose train-val-test splits for standard benchmarks as well as domain adaptation benchmarks, and utilize an automated split generation technique to ensure balanced class label distributions. The data, evaluation scripts, and pretrained models will be released upon acceptance at https://csiro-robotics.github.io/WildScenes.
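
As a hedged, single-view illustration of the automated 2D-to-3D label transfer mentioned above, the sketch below projects lidar points into a labelled image with a pinhole model and copies the per-pixel class to each point; the actual pipeline aggregates labels over multiple views, and the function name and toy intrinsics are assumptions.

```python
# Hypothetical sketch of transferring 2D semantic labels to 3D lidar points via projection
# (single view, pinhole model; the released pipeline aggregates labels over many views).
import numpy as np

def transfer_labels(points_cam, label_image, K):
    """Assign each 3D point (already in the camera frame) the label of the pixel it projects to.

    points_cam:  (N, 3) points in camera coordinates (z forward).
    label_image: (H, W) integer class labels; -1 marks unlabelled pixels.
    K:           (3, 3) camera intrinsics.
    """
    H, W = label_image.shape
    labels = np.full(points_cam.shape[0], -1, dtype=np.int64)
    in_front = points_cam[:, 2] > 0.1                        # keep points in front of the camera
    uvw = points_cam[in_front] @ K.T
    u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H)          # keep points landing inside the image
    idx = np.flatnonzero(in_front)[valid]
    labels[idx] = label_image[v[valid], u[valid]]
    return labels

# Toy example: one point projecting inside the image, one behind the camera.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
pts = np.array([[0.0, 0.0, 5.0], [0.0, 0.0, -2.0]])
lbl_img = np.full((480, 640), 3)                             # every pixel labelled as class 3
print(transfer_labels(pts, lbl_img, K))                      # [3, -1]
```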


Pose-Graph Attentional Graph Neural Network for Lidar Place Recognition

arXiv.org Artificial Intelligence

This paper proposes a pose-graph attentional graph neural network, called P-GAT, which compares (key)nodes between sequential and non-sequential sub-graphs for place recognition, in contrast to the frame-to-frame retrieval formulation commonly implemented in state-of-the-art place recognition methods. P-GAT makes maximal use of the spatial and temporal information between neighbouring point cloud descriptors -- generated by an existing encoder -- utilising the concept of pose-graph SLAM. Leveraging intra- and inter-attention and a graph neural network, P-GAT relates point clouds captured in nearby locations in Euclidean space and their embeddings in feature space. Experimental results on large-scale publicly available datasets demonstrate the effectiveness of our approach in scenes lacking distinct features and when training and testing environments have different distributions (domain adaptation). Furthermore, an exhaustive comparison with the state-of-the-art shows consistent performance gains. Code is available at https://github.com/csiro-robotics/P-GAT.
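
As a hedged illustration of the intra- and inter-attention idea over sub-graphs of place descriptors, the sketch below refines node descriptors with self-attention within each sub-graph and cross-attention between them, then scores node-to-node matches; the module structure and names are assumptions rather than the released P-GAT code.

```python
# Hypothetical sketch of intra-/inter-attention over sub-graphs of place descriptors
# (illustrative only; see the repository for the released P-GAT code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubgraphAttention(nn.Module):
    """Refine per-node descriptors with self-attention inside a sub-graph (intra)
    and cross-attention against another sub-graph (inter)."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, query_nodes, db_nodes):        # (B, Nq, D), (B, Nd, D)
        q, _ = self.intra(query_nodes, query_nodes, query_nodes)   # intra-attention (query)
        d, _ = self.intra(db_nodes, db_nodes, db_nodes)            # intra-attention (database)
        q, _ = self.inter(q, d, d)                   # query sub-graph attends to database sub-graph
        d, _ = self.inter(d, q, q)                   # and vice versa
        return q, d

# Toy usage: score node-to-node place matches between two sub-graphs.
model = SubgraphAttention()
q_nodes = torch.randn(1, 8, 256)                     # 8 sequential scans in the query sub-graph
d_nodes = torch.randn(1, 12, 256)                    # 12 scans in a map sub-graph
q_ref, d_ref = model(q_nodes, d_nodes)
scores = F.cosine_similarity(q_ref.unsqueeze(2), d_ref.unsqueeze(1), dim=-1)   # (1, 8, 12)
print(scores.shape)
```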