 South America


A Machine Learning alternative to placebo-controlled clinical trials upon new diseases: A primer

arXiv.org Machine Learning

The appearance of a new dangerous and contagious disease requires the development of a drug therapy faster than usual mechanisms allow. Many drug therapy developments consist of investigating, through different clinical trials, the effects of specific drug combinations by delivering them to a test group of ill patients, while a placebo treatment is delivered to the remaining ill patients, known as the control group. We compare this technique to a new technique in which every patient receives a different, reasonable combination of drugs and the outcomes are used to train a Neural Network. By averaging out fluctuations and recognizing different patient features, the Neural Network learns the pattern that connects the patients' initial state to the outcome of the treatments and can therefore predict the best drug therapy better than the placebo-controlled method. In contrast to many available works, we do not study any detail of drug composition or interaction, but instead pose and solve the problem from a phenomenological point of view, which allows us to compare both methods. Although the conclusion is reached through mathematical modeling and is stable under any reasonable model, this is a proof of concept that should be studied within other areas of expertise before confronting a real scenario. All calculations, tools and scripts have been made open source for the community to test, modify or expand. Finally, it should be mentioned that, although the results presented here are in the context of a new disease in medical sciences, they are useful for any field that requires an experimental technique with a control group.



Unsupervised Fuzzy eIX: Evolving Internal-eXternal Fuzzy Clustering

arXiv.org Artificial Intelligence

Time-varying classifiers, namely evolving classifiers, play an important role in scenarios in which information is available as a never-ending online data stream. We present a new unsupervised learning method for numerical data called the evolving Internal-eXternal Fuzzy clustering method (Fuzzy eIX). We develop the notion of double-boundary fuzzy granules and elaborate on its implications. Type 1 and type 2 fuzzy inference systems can be obtained from the projection of Fuzzy eIX granules. We apply the principle of balanced information granularity within Fuzzy eIX classifiers to achieve a higher level of model understandability. Internal and external granules are updated from a numerical data stream at the same time that the global granular structure of the classifier is autonomously evolved. A synthetic nonstationary problem called Rotation of Twin Gaussians shows the behavior of the classifier. The Fuzzy eIX classifier maintained its accuracy in a scenario in which offline-trained classifiers would suffer a drastic drop in accuracy.


AI Ethics: DNV GL Exec on Why Women Are Key to Ethics Research

#artificialintelligence

"If you look at the key names in the global debate on AI ethics, it is in fact dominated by women who have many different types of backgrounds, not only tech backgrounds." Artificial Intelligence (AI) is the game-changer in the industry, turbocharging new use cases in transportation, law enforcement, e-commerce, retail, healthcare, and entertainment. However, the quick pace of transformation and adoption is not accompanied by concrete industry standards on AI ethics and fairness in Machine Learning algorithms. While ethics in AI has been a dominant narrative for some time, Big Tech is still seeking ways to design a code of conduct for building ML algorithms. Some tech giants like Microsoft have laid down guidelines for responsible AI and have operationalized responsible AI at scale, while others are yet to follow suit.


An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem

arXiv.org Machine Learning

This paper considers the sparse generalized eigenvalue problem (SGEP), which aims to find the leading eigenvector with at most $k$ nonzero entries. SGEP naturally arises in many applications in machine learning, statistics, and scientific computing, for example, sparse principal component analysis (SPCA), sparse discriminant analysis (SDA), and sparse canonical correlation analysis (SCCA). In this paper, we focus on the development of a three-stage algorithm named the inverse-free truncated Rayleigh-Ritz method (IFTRR) to efficiently solve SGEP. In each iteration of IFTRR, only a small number of matrix-vector products is required. This makes IFTRR well-suited for large-scale problems. In particular, a new truncation strategy is proposed, which is able to find the support set of the leading eigenvector effectively. Theoretical results are developed to explain why IFTRR works well. Numerical simulations demonstrate the merits of IFTRR.
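To make the objects in this abstract concrete, the sketch below shows the two basic ingredients of a truncated method for SGEP: the generalized Rayleigh quotient x^T A x / x^T B x and a hard truncation that keeps the $k$ largest-magnitude entries of a vector. It is not the IFTRR algorithm, whose three stages and truncation strategy are not described in the abstract; the function names and the keep-largest-magnitude rule are assumptions made for illustration.

```python
from math import sqrt


def matvec(M, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def dot(x, y):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def rayleigh_quotient(A, B, x):
    """Generalized Rayleigh quotient x^T A x / x^T B x (B assumed positive definite)."""
    return dot(x, matvec(A, x)) / dot(x, matvec(B, x))


def truncate(x, k):
    """Keep the k largest-magnitude entries of x, zero the rest, and renormalize.

    Illustrative rule only; the paper proposes its own truncation strategy.
    """
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k])
    t = [x[i] if i in keep else 0.0 for i in range(len(x))]
    norm = sqrt(dot(t, t))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [v / norm for v in t]


# Hypothetical 3x3 example: A is diagonal and B is the identity.
A = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x_sparse = truncate([0.1, 0.5, -0.9], k=1)
value = rayleigh_quotient(A, B, x_sparse)
```

With these inputs the truncated vector is supported on the third coordinate only, so the quotient equals the third diagonal entry of A.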


Multi-Lead ECG Classification via an Information-Based Attention Convolutional Neural Network

arXiv.org Machine Learning

Objective: A novel structure based on a channel-wise attention mechanism is presented in this paper. Embedded with the proposed structure, an efficient classification model that accepts multi-lead electrocardiogram (ECG) signals as input is constructed. Methods: One-dimensional convolutional neural networks (CNN) have proven to be effective in pervasive classification tasks, enabling the automatic extraction of features while classifying targets. We implement the residual connection and design a structure which can learn the weights from the information contained in different channels of the input feature map during the training process. An indicator named mean square deviation is introduced to monitor the performance of a particular model segment in the classification task on two of the five ECG classes. The data in the MIT-BIH arrhythmia database is used and a series of control experiments is conducted. Results: Utilizing both leads of the ECG signals as input to the neural network classifier can achieve better classification results than using single-channel inputs in different application scenarios. Models embedded with the channel-wise attention structure always achieve better scores on sensitivity and precision than the plain ResNet models. The proposed model exceeds the performance of most of the state-of-the-art models in ventricular ectopic beats (VEB) classification, and achieves competitive scores for supraventricular ectopic beats (SVEB). Conclusion: Adopting more ECG leads as input can increase the dimensions of the input feature maps, helping to improve both the performance and generalization of the network model. Significance: Due to its end-to-end characteristics and its intrinsic extensibility to multi-lead heart disease diagnosis, the proposed model can be used for the real-time tracking of ECG waveforms for Holter or wearable devices.
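The results above are reported as sensitivity and precision for individual beat classes. As a reminder of what those scores measure, here is a minimal sketch that computes both for one class treated as positive against all others. The label strings and the sample data are hypothetical and are not taken from the paper or from the MIT-BIH database.

```python
def sensitivity_precision(y_true, y_pred, positive):
    """Return (sensitivity, precision) for one class in a one-vs-rest setting.

    sensitivity = TP / (TP + FN), precision = TP / (TP + FP).
    Returns 0.0 for a score whose denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Hypothetical labels: "V" stands in for a ventricular ectopic beat.
y_true = ["N", "V", "V", "S", "V", "N"]
y_pred = ["N", "V", "N", "S", "V", "V"]
sens, prec = sensitivity_precision(y_true, y_pred, positive="V")
```

In this toy example two of the three true "V" beats are found and two of the three predicted "V" beats are correct, so both scores are 2/3.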


Born-Again Tree Ensembles

arXiv.org Machine Learning

The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has rapidly grown in an attempt to better control and fix possible sources of mistakes and biases. Tree ensembles offer a good prediction quality in various domains, but the concurrent use of multiple trees reduces the interpretability of the ensemble. Against this background, we study born-again tree ensembles, i.e., the process of constructing a single decision tree of minimum size that reproduces the exact same behavior as a given tree ensemble. To find such a tree, we develop a dynamic-programming based algorithm that exploits sophisticated pruning and bounding rules to reduce the number of recursive calls. This algorithm generates optimal born-again trees for many datasets of practical interest, leading to classifiers which are typically simpler and more interpretable without any other form of compromise.


Learning to Play Soccer by Reinforcement and Applying Sim-to-Real to Compete in the Real World

arXiv.org Artificial Intelligence

This work presents an application of Reinforcement Learning (RL) for the complete control of real soccer robots of the IEEE Very Small Size Soccer (VSSS) [1], a traditional league in the Latin American Robotics Competition (LARC). In the VSSS league, two teams of three small robots play against each other. We propose a simulated environment in which continuous or discrete control policies can be trained, and a Sim-to-Real method that allows the obtained policies to control a robot in the real world. The results show that the learned policies display a broad repertoire of behaviors which are difficult to specify by hand. This approach, called VSSS-RL, was able to beat the human-designed policy for the striker of the team that placed 3rd in the 2018 LARC, in 1-vs-1 matches.


Data-driven models and computational tools for neurolinguistics: a language technology perspective

arXiv.org Machine Learning

In this paper, we focus on the connection between language technologies and research in neurolinguistics, and on the influence of the former on the latter. We present a review of brain imaging-based neurolinguistic studies with a focus on natural language representations, such as word embeddings and pre-trained language models. Mutual enrichment of neurolinguistics and language technologies leads to the development of brain-aware natural language representations. The importance of this research area is emphasized by medical applications.


Wise Sliding Window Segmentation: A classification-aided approach for trajectory segmentation

arXiv.org Machine Learning

Large amounts of mobility data are being generated from many different sources, and several data mining methods have been proposed for this data. One of the most critical steps in trajectory data mining is segmentation. This task can be seen as a pre-processing step in which a trajectory is divided into several meaningful consecutive sub-sequences. This process is necessary because trajectory patterns may not hold over the entire trajectory but only over parts of it. In this work, we propose a supervised trajectory segmentation algorithm, called Wise Sliding Window Segmentation (WS-II). It processes the trajectory coordinates to find behavioral changes in space and time, generating an error signal that is further used to train a binary classifier for segmenting trajectory data. This algorithm is flexible and can be used in different domains. We evaluate our method over three real datasets from different domains (meteorology, fishing, and individuals' movements), and compare it with four other trajectory segmentation algorithms: OWS, GRASP-UTS, CB-SMoT, and SPD. We observed that the proposed algorithm achieves the highest performance for all datasets with statistically significant differences in terms of the harmonic mean of purity and coverage.
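The idea of turning trajectory coordinates into an error signal can be illustrated with a small sliding-window sketch. For each interior point, the position is estimated as the midpoint of its two window neighbors, and the distance between the estimate and the observed point is recorded. This is only one plausible reading of the abstract: the midpoint interpolation, the `half_width` parameter, the purely spatial treatment (uniform sampling in time is assumed), and the function name are all assumptions, and the binary classifier that WS-II trains on the signal is not shown.

```python
from math import hypot


def error_signal(points, half_width=1):
    """Distance between each point and the midpoint of its window neighbors.

    points: list of (x, y) tuples, assumed uniformly sampled in time.
    Points too close to either end to have both neighbors get an error of 0.0.
    """
    n = len(points)
    errors = [0.0] * n
    for i in range(half_width, n - half_width):
        x0, y0 = points[i - half_width]
        x1, y1 = points[i + half_width]
        mid_x, mid_y = (x0 + x1) / 2, (y0 + y1) / 2
        errors[i] = hypot(points[i][0] - mid_x, points[i][1] - mid_y)
    return errors


# Hypothetical trajectory: straight along x, then a sharp turn along y.
track = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
errors = error_signal(track)
turn_index = errors.index(max(errors))
```

On this toy track the error is zero along both straight stretches and peaks at the corner point, which is the kind of behavioral change a downstream classifier could learn to flag as a segment boundary.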