Panoptic Segmentation Meets Remote Sensing

arXiv.org Artificial Intelligence

Panoptic segmentation combines instance and semantic predictions, allowing the detection of "things" and "stuff" simultaneously. Effectively approaching panoptic segmentation in remotely sensed data is promising for many challenging problems, since it allows continuous mapping and specific target counting. Several difficulties have prevented the growth of this task in remote sensing: (a) most algorithms are designed for traditional images, (b) image labelling must encompass "things" and "stuff" classes, and (c) the annotation format is complex. Thus, aiming to solve these difficulties and increase the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a novel data preparation pipeline for panoptic segmentation, (2) propose an annotation conversion software to generate panoptic annotations, (3) propose a novel dataset on urban areas, (4) modify Detectron2 for the task, and (5) evaluate difficulties of this task in the urban setting. We used an aerial image with a 0.24-meter spatial resolution considering 14 classes. Our pipeline considers three image inputs, and the proposed software uses point shapefiles for creating samples in the COCO format. Our study generated 3,400 samples with 512x512 pixel dimensions. We used the Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and the model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for panoptic segmentation and an extensive database for other researchers to use and deal with other data or related problems requiring a thorough scene understanding.
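
For readers unfamiliar with the PQ (panoptic quality) figure quoted in the abstract, the following is a minimal sketch of how the metric is commonly computed for a single class. It is not the authors' evaluation code: the function names and the representation of segments as sets of pixel indices are assumptions made for illustration. A predicted segment counts as a true positive when its IoU with a ground-truth segment exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5 FP + 0.5 FN.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(gt_segments, pred_segments):
    """Panoptic quality for one class.

    gt_segments, pred_segments: lists of sets of pixel indices
    (hypothetical representation chosen for this sketch).
    """
    matched_pred = set()  # indices of predictions already matched
    iou_sum = 0.0
    tp = 0
    for g in gt_segments:
        for j, p in enumerate(pred_segments):
            if j in matched_pred:
                continue
            score = iou(g, p)
            # With non-overlapping segments, IoU > 0.5 makes the match unique.
            if score > 0.5:
                matched_pred.add(j)
                iou_sum += score
                tp += 1
                break
    fp = len(pred_segments) - tp  # unmatched predictions
    fn = len(gt_segments) - tp    # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0
```

A dataset-level score is usually obtained by averaging this per-class value over all classes; that averaging step is omitted here.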


We mapped every large solar plant on the planet using satellites and machine learning

#artificialintelligence

An astonishing 82% decrease in the cost of solar photovoltaic (PV) energy since 2010 has given the world a fighting chance to build a zero-emissions energy system which might be less costly than the fossil-fuelled system it replaces. The International Energy Agency projects that PV solar generating capacity must grow ten-fold by 2040 if we are to meet the dual tasks of alleviating global poverty and constraining warming to well below 2 °C. Solar is "intermittent", since sunshine varies during the day and across seasons, so energy must be stored for when the sun doesn't shine. Policy must also be designed to ensure solar energy reaches the furthest corners of the world and places where it is most needed. And there will be inevitable trade-offs between solar energy and other uses for the same land, including conservation and biodiversity, agriculture and food systems, and community and indigenous uses. Colleagues and I have now published in the journal Nature the first global inventory of large solar energy generating facilities.


Teaching People by Justifying Tree Search Decisions: An Empirical Study in Curling

Journal of Artificial Intelligence Research

In this research note we show that a simple justification system can be used to teach humans non-trivial strategies of the Olympic sport of curling. This is achieved by justifying the decisions of Kernel Regression UCT (KR-UCT), a tree search algorithm that derives curling strategies by playing the game with itself. Given an action returned by KR-UCT and the expected outcome of that action, we use a decision tree to produce a counterfactual justification of KR-UCT's decision. The system samples other possible outcomes and selects for presentation the outcomes that are most similar to the expected outcome in terms of visual features and most different in terms of expected end-game value. A user study with 122 people shows that the participants who had access to the justifications produced by our system achieved much higher scores in a curling test than those who only observed the decision made by KR-UCT and those with access to the justifications of a baseline system. This is, to the best of our knowledge, the first work showing that a justification system is able to teach humans non-trivial strategies learned by an algorithm operating in self play.
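
The selection step described in the abstract (prefer sampled outcomes that look similar to the expected outcome but differ strongly in end-game value) can be illustrated with a small sketch. This is not the authors' system, which also involves a decision tree: the dictionary layout, the Euclidean distance over features, and the linear score `gap - weight * dist` are all assumptions introduced here for illustration.

```python
import math


def select_counterfactuals(expected, candidates, k=2, weight=1.0):
    """Rank sampled outcomes for presentation.

    expected, candidates: dicts with 'features' (list of floats) and
    'value' (float). Both field names are hypothetical.
    Returns the k candidates with the highest score, where the score
    rewards a large value gap and penalises visual dissimilarity.
    """
    def score(c):
        dist = math.dist(expected["features"], c["features"])
        gap = abs(expected["value"] - c["value"])
        return gap - weight * dist  # assumed combination rule

    return sorted(candidates, key=score, reverse=True)[:k]
```

The `weight` parameter trades off the two criteria; how the original system balances them is not stated in the abstract.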


Inference of time-ordered multibody interactions

arXiv.org Machine Learning

We introduce time-ordered multibody interactions to describe complex systems manifesting temporal as well as multibody dependencies. First, we show how the dynamics of multivariate Markov chains can be decomposed into ensembles of time-ordered multibody interactions. Then, we present an algorithm to extract combined interactions from data and a measure to characterize the complexity of interaction ensembles. Finally, we experimentally validate the robustness of our algorithm against statistical errors and its efficiency at obtaining simple interaction ensembles.


Weighing the Milky Way and Andromeda with Artificial Intelligence

arXiv.org Artificial Intelligence

We present new constraints on the masses of the halos hosting the Milky Way and Andromeda galaxies derived using graph neural networks. Our models, trained on thousands of state-of-the-art hydrodynamic simulations of the CAMELS project, only make use of the positions, velocities and stellar masses of the galaxies belonging to the halos, and are able to perform likelihood-free inference on halo masses while accounting for both cosmological and astrophysical uncertainties. Our constraints are in agreement with estimates from other traditional methods.


MetaFormer is Actually What You Need for Vision

arXiv.org Artificial Intelligence

Transformers have shown great potential in computer vision tasks. A common belief is that their attention-based token mixer module contributes most to their competence. However, recent works show that the attention-based module in transformers can be replaced by spatial MLPs and the resulting models still perform quite well. Based on this observation, we hypothesize that the general architecture of the transformers, instead of the specific token mixer module, is more essential to the model's performance. To verify this, we deliberately replace the attention module in transformers with an embarrassingly simple spatial pooling operator to conduct only the most basic token mixing. Surprisingly, we observe that the derived model, termed PoolFormer, achieves competitive performance on multiple computer vision tasks. For example, on ImageNet-1K, PoolFormer achieves 82.1% top-1 accuracy, surpassing well-tuned vision transformer/MLP-like baselines DeiT-B/ResMLP-B24 by 0.3%/1.1% accuracy with 35%/52% fewer parameters and 48%/60% fewer MACs. The effectiveness of PoolFormer verifies our hypothesis and urges us to initiate the concept of "MetaFormer", a general architecture abstracted from transformers without specifying the token mixer. Based on the extensive experiments, we argue that MetaFormer is the key player in achieving superior results for recent transformer and MLP-like models on vision tasks. This work calls for more future research dedicated to improving MetaFormer instead of focusing on the token mixer modules. Additionally, our proposed PoolFormer could serve as a starting baseline for future MetaFormer architecture design. Code is available at https://github.com/sail-sg/poolformer
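
To make the idea of a pooling-based token mixer concrete, here is a minimal pure-Python sketch for a single-channel grid of tokens. It is not the code from the linked repository. The 3x3 window, the stride of 1, the averaging over in-bounds neighbours only, and the subtraction of the input (which we assume is there because the surrounding block adds the input back through a residual connection) are assumptions not stated in the abstract.

```python
def pool_token_mixer(grid, pool_size=3):
    """Average-pool each token with its spatial neighbours, then subtract
    the token itself.

    grid: H x W list of lists of floats (one channel).
    pool_size: odd window size; 3 is an assumed default.
    """
    h, w = len(grid), len(grid[0])
    r = pool_size // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Collect in-bounds neighbours only (padding is not counted).
            vals = [
                grid[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(sum(vals) / len(vals) - grid[i][j])
        out.append(row)
    return out
```

The operator has no learnable parameters, which is what makes it a useful probe of how much the surrounding architecture, rather than the mixer, contributes.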


5 Best Practices for Testing AI Applications

#artificialintelligence

In light of the April 2021 announcement of the world's first legislative framework for regulating Artificial Intelligence (AI), the European Artificial Intelligence Act (EU AIA), now is an opportune time for developers to revisit their strategies for testing AI applications. Incoming regulations mean that the group of stakeholders who care about your testing results just got bigger and more involved. The stakes are high, not least because companies that violate the terms of the legislation could face fines higher than those levied under the General Data Protection Regulation (GDPR). For the purpose of transparency, certain types of AI also have to make their accuracy metrics available to users, which adds to the pressure to get functional testing right. Following on from Applause's step-by-step guide to training and testing your AI algorithm, this article summarizes how developers should be testing AI applications in anticipation of the new era of AI regulations.


How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey

arXiv.org Artificial Intelligence

Deepfake is content or material that is synthetically generated or manipulated using artificial intelligence (AI) methods to be passed off as real, and it can include audio, video, image, and text synthesis. This survey has been conducted with a different perspective compared to existing survey papers, which mostly focus on just video and image deepfakes. This survey not only evaluates generation and detection methods in the different deepfake categories, but mainly focuses on audio deepfakes, which are overlooked in most of the existing surveys. This paper critically analyzes and provides a unique source of audio deepfake research, mostly ranging from 2016 to 2020. To the best of our knowledge, this is the first survey focusing on audio deepfakes in English. This survey provides readers with a summary of 1) different deepfake categories, 2) how they could be created and detected, 3) the most recent trends in this domain and shortcomings in detection methods, and 4) audio deepfakes and how they are created and detected in more detail, which is the main focus of this paper. We found that Generative Adversarial Networks (GAN), Convolutional Neural Networks (CNN), and Deep Neural Networks (DNN) are common ways of creating and detecting deepfakes. In our evaluation of over 140 methods, we found that the majority of the focus is on video deepfakes, and in particular on the generation of video deepfakes. We found that for text deepfakes there are more generation methods but very few robust methods for detection, including fake news detection, which has become a controversial area of research because of the potential for heavy overlap with human generation of fake content. This paper is an abbreviated version of the full survey and reveals a clear need to research audio deepfakes, and particularly the detection of audio deepfakes.


Computational simulation and the search for a quantitative description of simple reinforcement schedules

arXiv.org Artificial Intelligence

We aim to discuss schedules of reinforcement in their theoretical and practical terms, pointing to practical limitations on implementing those schedules while discussing the advantages of computational simulation. In this paper, we present an R script named Beak, built to simulate rates of behavior interacting with schedules of reinforcement. Using Beak, we've simulated data that allow an assessment of different reinforcement feedback functions (RFF). This was done with unparalleled precision, since simulations provide huge samples of data and, more importantly, simulated behavior isn't changed by the reinforcement it produces. Therefore, we can vary it systematically. We've compared different RFF for RI schedules, using as criteria meaning, precision, parsimony and generality. Our results indicate that the best feedback function for the RI schedule was published by Baum (1981). We also propose that the model used by Killeen (1975) is a viable feedback function for the RDRL schedule. We argue that Beak paves the way for greater understanding of schedules of reinforcement, addressing still open questions about quantitative features of schedules. Also, such simulations could guide future experiments that use schedules as theoretical and methodological tools.
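
The kind of simulation the abstract describes can be sketched in a few lines. The example below is not Beak (which is an R script) and does not reproduce any of the feedback functions the authors compare. It is a hypothetical discrete-time model, assuming RI denotes a random-interval schedule: at each time step a reinforcer is set up with probability 1 / mean_interval and then held until the next response, while responses occur independently with a fixed probability per step. All names and parameters are assumptions made for illustration.

```python
import random


def simulate_ri(mean_interval, response_prob, steps, seed=0):
    """Simulate a random-interval schedule in discrete time.

    mean_interval: mean number of steps between reinforcer setups.
    response_prob: probability of a response on each step; it is fixed,
                   so behavior is not changed by the reinforcement.
    Returns counts of setups, responses and obtained reinforcers.
    """
    rng = random.Random(seed)
    armed = False  # True while a reinforcer is waiting to be collected
    setups = responses = reinforcers = 0
    for _ in range(steps):
        # The schedule can only arm a new reinforcer once the last is collected.
        if not armed and rng.random() < 1.0 / mean_interval:
            armed = True
            setups += 1
        if rng.random() < response_prob:
            responses += 1
            if armed:
                reinforcers += 1
                armed = False
    return {"setups": setups, "responses": responses, "reinforcers": reinforcers}
```

Running the function over a range of `response_prob` values and plotting obtained reinforcers against responses gives an empirical feedback curve against which candidate feedback functions could be compared.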


Cyclic Graph Attentive Match Encoder (CGAME): A Novel Neural Network For OD Estimation

arXiv.org Artificial Intelligence

Origin-Destination (OD) estimation plays an important role in traffic management and traffic simulation in the era of Intelligent Transportation Systems (ITS). Nevertheless, previous model-based approaches face the challenge of under-determination and therefore depend heavily on additional assumptions and extra data. Deep learning provides an ideal data-based method for connecting inputs and results by probabilistic distribution transformation. However, research applying deep learning to OD estimation remains limited owing to the challenges of transforming data across representation spaces, especially from dynamic spatial-temporal space to heterogeneous graphs. To address this, we propose Cyclic Graph Attentive Matching Encoder (C-GAME), based on a novel Graph Matcher with a double-layer attention mechanism. It realizes effective information exchange in the underlying feature space and establishes a coupling relationship across spaces. The proposed model achieves state-of-the-art results in experiments and offers a novel framework for inference tasks across spaces in future applications.