Collaborating Authors

Macêdo, David


Distinction Maximization Loss: Fast, Scalable, Turnkey, and Native Neural Networks Out-of-Distribution Detection simply by Replacing the SoftMax Loss

arXiv.org Machine Learning

Recently, many methods to reduce neural network uncertainty have been proposed. However, most of the techniques used in these solutions present severe drawbacks. In this paper, we argue that the low out-of-distribution detection performance of neural networks is mainly due to the anisotropy of the SoftMax loss. Therefore, we built an isotropic loss to reduce neural network uncertainty in a fast, scalable, turnkey, and native approach. Our experiments show that replacing SoftMax with the proposed loss does not affect classification accuracy. Moreover, our proposal typically outperforms ODIN by a large margin while usually producing competitive results against a state-of-the-art Mahalanobis method, despite avoiding its limitations. Hence, neural network uncertainty may be significantly reduced by a simple loss change without relying on special procedures such as data augmentation, adversarial training/validation, ensembles, or additional classification/regression models.


Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition

arXiv.org Machine Learning

Abstract--The recognition of sign language is a challenging task with an important role in society: facilitating the communication of deaf persons. We propose a new approach to sign language recognition, a Spatial-Temporal Graph Convolutional Network based on human skeletal movements. The method uses graphs to capture the dynamics of signs in two dimensions, spatial and temporal, considering the complex aspects of the language. Additionally, we present a new dataset of human skeletons for sign language, based on ASLLVD, to contribute to future related studies.

I. INTRODUCTION

Sign language is a visual communication skill that enables individuals with different types of hearing impairment to communicate in society. It is the language used by most deaf people in their daily lives and, moreover, it is the symbol of identification between the members of that community and the main force that unites them. Sign language has a very close relationship with the culture of a country or even of a region, and for this reason each nation has its own language [1]. According to the World Health Organization, the number of deaf people is about 466 million, and the organization estimates that by 2050 this number will exceed 900 million, which is equivalent to a forecast of 1 in 10 individuals around the world [2].


Additive Margin SincNet for Speaker Recognition

arXiv.org Machine Learning

Speaker recognition is a challenging task with essential applications such as authentication, automation, and security. SincNet is a new deep-learning-based model that has produced promising results on this task. When training deep learning systems, the loss function is essential to network performance. The Softmax loss function is widely used in deep learning methods, but it is not the best choice for all kinds of problems. For distance-based problems, a newer Softmax-based loss function called Additive Margin Softmax (AM-Softmax) is proving to be a better choice than the traditional Softmax. AM-Softmax introduces a margin of separation between the classes that forces samples from the same class to be closer to each other and also maximizes the distance between classes. In this paper, we propose a new approach for speaker recognition systems called AM-SincNet, which is based on SincNet but uses an improved AM-Softmax layer. The proposed method is evaluated on the TIMIT dataset and obtains an improvement of approximately 40% in the Frame Error Rate compared to SincNet.
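The margin idea described in this abstract can be illustrated with a minimal sketch of the commonly published AM-Softmax formulation for a single sample. This is not the authors' improved AM-Softmax layer, which the abstract does not specify; the function names and the `scale` and `margin` defaults are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def am_softmax_loss(feature, class_weights, target, scale=30.0, margin=0.35):
    """AM-Softmax loss for one sample (illustrative sketch).

    feature:       embedding of the sample
    class_weights: one weight vector per class
    target:        index of the true class
    scale, margin: hypothetical defaults, not values from the abstract
    """
    logits = []
    for j, weights in enumerate(class_weights):
        cos_theta = cosine(feature, weights)
        if j == target:
            # Subtract the additive margin from the true-class cosine only.
            cos_theta -= margin
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp, then cross-entropy on the true class.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


weights = [[1.0, 0.0], [0.0, 1.0]]
loss_without_margin = am_softmax_loss([1.0, 1.0], weights, target=0, margin=0.0)
loss_with_margin = am_softmax_loss([1.0, 1.0], weights, target=0)
```

With the margin enabled, the same sample incurs a larger loss, so training must push the true-class cosine further above the others before the loss becomes small.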


Squeezed Very Deep Convolutional Neural Networks for Text Classification

arXiv.org Machine Learning

Abstract--Most of the research in convolutional neural networks has focused on increasing network depth to improve accuracy, resulting in a massive number of parameters, which makes the trained networks hard to deploy on platforms with memory and processing constraints. We propose to modify the structure of the Very Deep Convolutional Neural Networks (VDCNN) model to fit the constraints of mobile platforms while keeping performance. In this paper, we evaluate the impact of Temporal Depthwise Separable Convolutions and Global Average Pooling on the network parameters, storage size, and latency. The squeezed model (SVDCNN) is between 10x and 20x smaller, depending on the network depth, maintaining a maximum size of 6MB. Regarding accuracy, the network experiences a loss between 0.4% and 1.3% and obtains lower latencies compared to the baseline model.

I. INTRODUCTION

The general trend in deep learning approaches has been to develop models with an increasing number of layers. Deep models can also learn hierarchical feature representations from images [1].
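To see why depthwise separable convolutions shrink a model, the following sketch compares weight counts for a one-dimensional (temporal) convolution layer. It assumes the usual factorization into a depthwise and a pointwise step and ignores biases; the layer sizes are hypothetical and not taken from the paper.

```python
def standard_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 1D convolution (biases ignored)."""
    return kernel_size * in_channels * out_channels


def separable_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable 1D convolution (biases ignored)."""
    depthwise = kernel_size * in_channels   # one filter per input channel
    pointwise = in_channels * out_channels  # 1x1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: kernel size 3, 64 input channels, 128 output channels.
standard = standard_conv1d_params(3, 64, 128)    # 24576 weights
separable = separable_conv1d_params(3, 64, 128)  # 8384 weights
reduction = standard / separable
```

For this hypothetical layer the separable variant needs roughly a third of the weights. The 10x to 20x overall reduction reported in the abstract also depends on Global Average Pooling and on the network depth, which this sketch does not model.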


Heartbeat Anomaly Detection using Adversarial Oversampling

arXiv.org Machine Learning

Cardiovascular diseases are among the most common causes of death in the world. Prevention, knowledge of previous cases in the family, and early detection are the best strategies to reduce this mortality. Different machine learning approaches to automatic diagnosis are being proposed for this task. As in most health problems, class imbalance is prevalent in this problem and affects the performance of the automated solution. In this paper, we address the classification of heartbeat images into different cardiovascular diseases. We propose a two-dimensional Convolutional Neural Network for classification after using an InfoGAN architecture to generate synthetic images for the unbalanced classes. We call this proposal Adversarial Oversampling and compare it with classical oversampling methods such as SMOTE, ADASYN, and RandomOversampling. The results show that the proposed approach improves the classifier performance for the minority classes without harming the performance in the balanced classes.
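For context, the simplest of the classical baselines named above, random oversampling, can be sketched as follows. This illustrates the baseline only, not the proposed InfoGAN-based Adversarial Oversampling; the function name, the seed parameter, and the toy data are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target_count = max(len(items) for items in by_class.values())
    balanced_samples, balanced_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extras = [rng.choice(items) for _ in range(target_count - len(items))]
        for sample in items + extras:
            balanced_samples.append(sample)
            balanced_labels.append(label)
    return balanced_samples, balanced_labels


# Toy data: four "normal" beats and one "abnormal" beat (hypothetical labels).
beats = ["n1", "n2", "n3", "n4", "a1"]
classes = ["normal", "normal", "normal", "normal", "abnormal"]
balanced_beats, balanced_classes = random_oversample(beats, classes)
class_counts = Counter(balanced_classes)
```

Because this baseline only repeats existing samples, it adds no new variation to the minority class, which is the gap that generating synthetic images is meant to address.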


Hierarchical Attentional Hybrid Neural Networks for Document Classification

arXiv.org Artificial Intelligence

Document classification is a challenging task with important applications. Deep learning approaches to the problem have gained much attention. Despite the progress, the proposed models do not efficiently incorporate knowledge of the document structure in the architecture and do not take into account the context-dependent importance of words and sentences. In this paper, we propose a new approach based on convolutional neural networks, gated recurrent units, and attention mechanisms for document classification tasks. The IMDB Movie Reviews and Yelp datasets were used in the experiments. The proposed method improves the results of current attention-based approaches