 South America


A Survey of Transformers

arXiv.org Artificial Intelligence

Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing, and have therefore attracted considerable interest from academic and industry researchers. A great variety of Transformer variants (a.k.a. X-formers) have been proposed to date; however, a systematic and comprehensive literature review of these variants is still missing. In this survey, we provide a comprehensive review of various X-formers. We first briefly introduce the vanilla Transformer and then propose a new taxonomy of X-formers. Next, we introduce the various X-formers from three perspectives: architectural modification, pre-training, and applications. Finally, we outline some potential directions for future research.


On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control

arXiv.org Machine Learning

Reinforcement learning is a framework for interactive decision-making with incentives sequentially revealed across time without a system dynamics model. Due to its scaling to continuous spaces, we focus on policy search where one iteratively improves a parameterized policy with stochastic policy gradient (PG) updates. In tabular Markov Decision Problems (MDPs), under persistent exploration and suitable parameterization, global optimality may be obtained. By contrast, in continuous space, the non-convexity poses a pathological challenge as evidenced by existing convergence results being mostly limited to stationarity or arbitrary local extrema. To close this gap, we step towards persistent exploration in continuous space through policy parameterizations defined by distributions of heavier tails defined by tail-index parameter alpha, which increases the likelihood of jumping in state space. Doing so invalidates smoothness conditions of the score function common to PG. Thus, we establish how the convergence rate to stationarity depends on the policy's tail index alpha, a Holder continuity parameter, integrability conditions, and an exploration tolerance parameter introduced here for the first time. Further, we characterize the dependence of the set of local maxima on the tail index through an exit and transition time analysis of a suitably defined Markov chain, identifying that policies associated with Levy Processes of a heavier tail converge to wider peaks. This phenomenon yields improved stability to perturbations in supervised learning, which we corroborate also manifests in improved performance of policy search, especially when myopic and farsighted incentives are misaligned.


ICDAR 2021 Competition on Components Segmentation Task of Document Photos

arXiv.org Artificial Intelligence

This paper describes the short-term competition on "Components Segmentation Task of Document Photos" that was prepared in the context of the "16th International Conference on Document Analysis and Recognition" (ICDAR 2021). This competition aims to bring together researchers working in the field of identification document image processing and to provide them with a suitable benchmark to compare their techniques on the component segmentation task of document images. Three challenge tasks were proposed, entailing different segmentation assignments to be performed on a provided dataset. The collected data are from several types of Brazilian ID documents, whose personal information was conveniently replaced. There were 16 participants, whose results for some or all of the three tasks show different rates for the adopted metrics, such as the "Dice Similarity Coefficient", which ranged from 0.06 to 0.99. The entrants applied different Deep Learning models with diverse strategies to achieve the best results in each of the tasks. The results show that the methods currently applied to one of the proposed tasks (document boundary detection) are already well established. However, for the other two challenge tasks (text zone and handwritten sign detection), research and development of more robust approaches are still required to achieve acceptable results.
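As a rough illustration of the metric named in the abstract, the Dice Similarity Coefficient between a predicted and a ground-truth binary mask is commonly computed as 2|A ∩ B| / (|A| + |B|). The sketch below is not taken from the competition's evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists.

    Hypothetical helper for illustration; not the competition's scoring script.
    """
    # Flatten both masks and coerce every pixel to a boolean.
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_coefficient(predicted, ground_truth))
```

A score near 1 means the predicted region almost exactly overlaps the annotated one, while a score near 0 means almost no overlap, which is how the reported 0.06 to 0.99 range can be read.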


Game of GANs: Game Theoretical Models for Generative Adversarial Networks

arXiv.org Artificial Intelligence

The Generative Adversarial Network (GAN), a promising research direction in the AI community, has recently attracted considerable attention due to its ability to generate high-quality realistic data. GANs are a competing game between two neural networks trained in an adversarial manner to reach a Nash equilibrium. Despite the improvements made to GANs in recent years, several issues remain to be solved, and how to tackle these issues and make advances has drawn rising research interest. This paper reviews literature that leverages game theory in GANs and addresses how game models can relieve specific challenges of generative models and improve GAN performance. In particular, we first review some preliminaries, including the basic GAN model and some game theory background. After that, we present our taxonomy, which summarizes the state-of-the-art solutions in three significant categories: modified game model, modified architecture, and modified learning method. The classification is based on the modifications that the proposed approaches make to the basic model from the game-theoretic perspective. We further classify each category into several subcategories. Following the proposed taxonomy, we explore the main objective of each class and review the recent work in each group. Finally, we discuss the remaining challenges in this field and present potential future research topics.


Detection of ripe flowers of the Alstroemeria genus Morado

#artificialintelligence

The authors of this blog are Stan Zwinkels & Ted de Vries Lentsch. This blog aims to present our attempt to create a detection algorithm for detecting ripe flowers of the Alstroemeria genus Morado. Throughout this blog, we explain our process to create a dataset and detection model that achieves an F1 score of more than 0.75. This blog is part of the course Seminar Computer Vision By Deep Learning (CS4245) 2021 from the Delft University of Technology. Creating the dataset has been carried out in collaboration with the company Hoogenboom Alstroemeria.


Artificial Intelligence Market Growing at a Significant Rate in the Forecast Period - The Manomet Current

#artificialintelligence

A new market study has been released on the Global "Artificial Intelligence Market 2021", with data tables for historical and forecast years represented with charts and graphs and an easy-to-understand detailed analysis. The report also sheds light on the present scenario and the upcoming trends and developments that are contributing to the growth of the market. In addition, it covers the key market boomers and opportunities driving market growth, with estimates for the Global Artificial Intelligence Market till 2027. The authors of the Artificial Intelligence Market report have compiled a detailed study of crucial market dynamics, including growth drivers, restraints, and opportunities. The Global Artificial Intelligence Market accounted for USD 16.14 billion in 2017 and is projected to grow at a CAGR of 37.3% over the forecast period of 2018 to 2025.
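For readers unfamiliar with the term, a compound annual growth rate (CAGR) applies the same percentage growth every year, so a value after n years is the starting value times (1 + rate)^n. The sketch below shows that arithmetic only. The function names are hypothetical, and treating 2017 as the base year with eight compounding years through 2025 is an assumption; the report excerpt does not state a projected end value.

```python
def project_value(start_value, cagr, years):
    """Compound start_value at a constant annual rate for the given number of years."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Inverse calculation: the constant annual rate linking two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    base_usd_billion = 16.14   # 2017 figure quoted in the report excerpt
    rate = 0.373               # 37.3% CAGR quoted in the report excerpt
    years = 2025 - 2017        # assumption: compounding from the 2017 base through 2025

    projected = project_value(base_usd_billion, rate, years)
    print(f"Projected 2025 market size: USD {projected:.2f} billion")
    print(f"Round-trip CAGR check: {implied_cagr(base_usd_billion, projected, years):.3f}")
```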


Artificial Intelligence is Monitoring Traces of Wildlife in the Falkland Islands

#artificialintelligence

Scientists at Duke University and the Wildlife Conservation Society (WCS) have developed a set of deep learning algorithms that can analyze more than 10,000 drone images of mixed colonies of seabirds in the Falkland Islands off Argentina's coast. The Falklands are home to the world's largest colonies of black-browed albatrosses (Thalassarche melanophris) and the second-largest colonies of southern rockhopper penguins (Eudyptes c. chrysocome). Hundreds of thousands of birds breed on the islands in densely interspersed groups. The deep-learning algorithm made by the scientists identified and counted the albatrosses with 97% accuracy and the penguins with 87% accuracy. Madeline C. Hayes, a remote sensing analyst at the Duke University Marine Lab who led the study, says that using drone surveys and deep learning gives them an alternative that is remarkably accurate, less disruptive, and significantly easier. One person, or a small team, can do it, and the equipment they need to do it isn't all that costly or complicated.


Preventing Extreme Polarization of Political Attitudes

arXiv.org Artificial Intelligence

Extreme polarization can undermine democracy by making compromise impossible and transforming politics into a zero-sum game. Ideological polarization - the extent to which political views are widely dispersed - is already strong among elites, but less so among the general public (McCarty, 2019, p. 50-68). Strong mutual distrust and hostility between Democrats and Republicans in the U.S., combined with the elites' already strong ideological polarization, could lead to increasing ideological polarization among the public. The paper addresses two questions: (1) Is there a level of ideological polarization above which polarization feeds upon itself to become a runaway process? (2) If so, what policy interventions could prevent such dangerous positive feedback loops? To explore these questions, we present an agent-based model of ideological polarization that differentiates between the tendency for two actors to interact (exposure) and how they respond when interactions occur, positing that interaction between similar actors reduces their difference while interaction between dissimilar actors increases their difference. Our analysis explores the effects on polarization of different levels of tolerance to other views, responsiveness to other views, exposure to dissimilar actors, multiple ideological dimensions, economic self-interest, and external shocks. The results suggest strategies for preventing, or at least slowing, the development of extreme polarization.
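The interaction rule described in the abstract (similar actors move closer, dissimilar actors move apart) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's model: the one-dimensional ideology scale on [-1, 1], the linear update, the uniform random pairing, and the names `interact` and `simulate` are all hypothetical, and the paper's exposure mechanism, multiple dimensions, self-interest, and external shocks are omitted.

```python
import random


def interact(x_i, x_j, tolerance, responsiveness):
    """Return actor i's new position after meeting actor j.

    Assumed rule: within `tolerance`, i moves toward j (attraction);
    beyond it, i moves away from j (repulsion). Positions stay in [-1, 1].
    """
    diff = x_j - x_i
    if abs(diff) <= tolerance:
        new_position = x_i + responsiveness * diff   # similar: reduce the difference
    else:
        new_position = x_i - responsiveness * diff   # dissimilar: increase the difference
    return max(-1.0, min(1.0, new_position))


def simulate(positions, tolerance, responsiveness, steps, seed=0):
    """Repeatedly pick two distinct actors at random and update the first one."""
    rng = random.Random(seed)
    xs = list(positions)
    for _ in range(steps):
        i, j = rng.sample(range(len(xs)), 2)
        xs[i] = interact(xs[i], xs[j], tolerance, responsiveness)
    return xs


if __name__ == "__main__":
    start = [-0.6, -0.2, 0.0, 0.3, 0.7]
    print(simulate(start, tolerance=0.5, responsiveness=0.25, steps=200))
```

Under this toy rule, varying `tolerance` gives a feel for the trade-off the abstract describes: a higher tolerance makes more encounters attractive, while a lower one makes more of them repulsive.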


Learning Deep Morphological Networks with Neural Architecture Search

arXiv.org Artificial Intelligence

Over the last decade, deep learning has made several breakthroughs and demonstrated successful applications in various fields (e.g. in computer vision Krizhevsky et al. [2012], Simonyan and Zisserman [2014a], He et al. [2016a], Huang et al. [2017], object detection Redmon et al. [2016], or NLP Dai et al. [2019], Radford et al. [2019]). This success is mainly attributable to the fact that it automates the feature engineering process. Rather than being manually designed, features are learned from data in an end-to-end process. The need for improved architecture has swiftly followed the advent of deep learning. Experts now place a premium on architecture engineering in lieu of feature engineering. Architecture engineering is concerned with determining the most appropriate operations for the network, their hyperparameters (e.g. the number of neurons for fully connected layers, or the number of filters or kernel size for convolutional layers), and the connectivity of all the operations. Generally, practitioners propose novel operations and validate them on various architectures and tasks in order to improve performance on specific tasks. As a result, developing a novel operation remains a time-consuming and costly process.


Sejong Face Database: A Multi-Modal Disguise Face Database

arXiv.org Artificial Intelligence

Commercial application of facial recognition demands robustness to a variety of challenges such as illumination, occlusion, spoofing, disguise, etc. Disguised face recognition is one of the emerging issues for access control systems, such as security checkpoints at the borders. However, the lack of face databases with a variety of disguise add-ons limits the development of academic research in the area. In this paper, we present a multimodal disguised face dataset to facilitate disguised face recognition research. The presented database contains 8 facial add-ons and 7 additional combinations of these add-ons to create a variety of disguised face images. Each facial image is captured in visible, visible plus infrared, infrared, and thermal spectra. Specifically, the database contains 100 subjects divided into subset-A (30 subjects, 1 image per modality) and subset-B (70 subjects, 5 plus images per modality). We also present baseline face detection results performed on the proposed database to provide reference results and compare the performance in different modalities. Qualitative and quantitative analysis is performed to evaluate the challenging nature of disguise add-ons. The dataset will be publicly available with the acceptance of the research article. The database is available at: https://github.com/usmancheema89/SejongFaceDatabase.