South America
Graph Neural Networks for Motion Planning
Khan, Arbaaz, Ribeiro, Alejandro, Kumar, Vijay, Francis, Anthony G.
This paper investigates the feasibility of using Graph Neural Networks (GNNs) for classical motion planning problems. Planning algorithms that search through discrete spaces as well as continuous ones are studied. This paper proposes using GNNs to guide the search algorithm by exploiting the ability of GNNs to extract low-level information about the topology of a planning space. We present two techniques: GNNs over dense fixed graphs for low-dimensional problems and sampling-based GNNs for high-dimensional problems. We examine the ability of a GNN to tackle planning problems that are heavily dependent on the topology of the space, such as identifying critical nodes, learning a heuristic that guides exploration in $\text{A}^*$, and learning the sampling distribution in Rapidly-exploring Random Trees (RRT). We demonstrate that GNNs can offer better results when compared to traditional analytic methods as well as learning-based approaches that employ fully-connected networks or convolutional neural networks.
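As a rough illustration of where a learned heuristic would plug into $\text{A}^*$, the sketch below runs $\text{A}^*$ on a small 4-connected occupancy grid with a swappable `heuristic` callable. The grid encoding, the function names, and the Manhattan default are assumptions made for this example and are not taken from the paper; a trained model would be passed in place of `manhattan`.

```python
import heapq
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Node = Tuple[int, int]


def manhattan(a: Node, b: Node) -> float:
    """Analytic baseline heuristic (assumed here; not the paper's learned one)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(
    grid: Sequence[Sequence[int]],
    start: Node,
    goal: Node,
    heuristic: Callable[[Node, Node], float] = manhattan,
) -> Optional[List[Node]]:
    """A* on a 4-connected grid where 0 is free and 1 is an obstacle.

    `heuristic` is the hook where a learned estimator could be substituted.
    Note: the closed set below preserves optimality only for consistent
    heuristics; a learned heuristic may trade optimality for fewer expansions.
    """
    rows, cols = len(grid), len(grid[0])
    open_heap = [(heuristic(start, goal), 0, start)]  # entries are (f, g, node)
    g_cost: Dict[Node, float] = {start: 0}
    parent: Dict[Node, Optional[Node]] = {start: None}
    closed = set()

    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry
        closed.add(node)
        if node == goal:
            path = []
            current: Optional[Node] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    parent[nxt] = node
                    heapq.heappush(open_heap, (new_g + heuristic(nxt, goal), new_g, nxt))
    return None  # goal unreachable


example_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
example_path = a_star(example_grid, (0, 0), (2, 0))
```

Keeping the heuristic as a plain callable means the search code stays unchanged whether the estimate comes from a formula or from a model.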
Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
Qu, Guannan, Lin, Yiheng, Wierman, Adam, Li, Na
It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues because the sizes of the state and action spaces grow exponentially with the number of agents. In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner. Specifically, we propose a Scalable Actor-Critic (SAC) method that can learn a near-optimal localized policy for optimizing the average reward, with complexity that scales with the state-action space size of local neighborhoods rather than that of the entire network. Our result centers around identifying and exploiting an exponential decay property that ensures the effect of agents on each other decays exponentially fast in their graph distance.
How Interpretable and Trustworthy are GAMs?
Chang, Chun-Hao, Tan, Sarah, Lengerich, Ben, Goldenberg, Anna, Caruana, Rich
Generalized additive models (GAMs) have become a leading model class for data bias discovery and model auditing. However, there are a variety of algorithms for training GAMs, and these do not always learn the same things. Statisticians originally used splines to train GAMs, but more recently GAMs are being trained with boosted decision trees. It is unclear which GAM model(s) to believe, particularly when their explanations are contradictory. In this paper, we investigate a variety of different GAM algorithms both qualitatively and quantitatively on real and simulated datasets. Our results suggest that inductive bias plays a crucial role in model explanations and tree-based GAMs are to be recommended for the kinds of problems and dataset sizes we worked with.
Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously
Lounici, Karim, Meziani, Katia, Riu, Benjamin
Generalization is a central problem in Machine Learning. Most prediction methods require careful calibration of hyperparameters, carried out on a hold-out validation dataset, to achieve generalization. The main goal of this paper is to present a novel approach based on a new measure of risk that allows us to develop novel, fully automatic procedures for generalization. We illustrate the pertinence of this new framework in the regression problem. The main advantages of this new approach are: (i) it can simultaneously train the model and perform regularization in a single run of a gradient-based optimizer on all available data without any previous hyperparameter tuning; (ii) this framework can tackle several additional objectives simultaneously (correlation, sparsity, ...) via the introduction of regularization parameters. Notably, our approach transforms hyperparameter tuning as well as feature selection (a combinatorial discrete optimization problem) into a continuous optimization problem that is solvable via classical gradient-based methods; (iii) the computational complexity of our methods is $O(npK)$, where $n$, $p$, and $K$ denote respectively the number of observations, features, and iterations of the gradient descent algorithm. In our experiments, we observe a significantly smaller runtime for our methods compared to benchmark methods for an equivalent prediction score. Our procedures are implemented in PyTorch (code is available for replication).
SLIC-UAV: A Method for monitoring recovery in tropical restoration projects through identification of signature species using UAVs
Williams, Jonathan, Schönlieb, Carola-Bibiane, Swinfield, Tom, Irawan, Bambang, Achmad, Eva, Zudhi, Muhammad, Habibi, null, Gemita, Elva, Coomes, David A.
Logged forests cover four million square kilometres of the tropics and restoring these forests is essential if we are to avoid the worst impacts of climate change, yet monitoring recovery is challenging. Tracking the abundance of visually identifiable, early-successional species enables successional status and thereby restoration progress to be evaluated. Here we present a new pipeline, SLIC-UAV, for processing Unmanned Aerial Vehicle (UAV) imagery to map early-successional species in tropical forests. The pipeline is novel because it comprises: (a) a time-efficient approach for labelling crowns from UAV imagery; (b) machine learning of species based on spectral and textural features within individual tree crowns, and (c) automatic segmentation of orthomosaiced UAV imagery into 'superpixels', using Simple Linear Iterative Clustering (SLIC). Creating superpixels reduces the dataset's dimensionality and focuses prediction onto clusters of pixels, greatly improving accuracy. To demonstrate SLIC-UAV, support vector machines and random forests were used to predict the species of hand-labelled crowns in a restoration concession in Indonesia. Random forests were most accurate at discriminating species for whole crowns, with accuracy ranging from 79.3% when mapping five common species, to 90.5% when mapping the three most visually-distinctive species. In contrast, support vector machines proved better for labelling automatically segmented superpixels, with accuracy ranging from 74.3% to 91.7% for the same species. Models were extended to map species across 100 hectares of forest. The study demonstrates the power of SLIC-UAV for mapping characteristic early-successional tree species as an indicator of successional stage within tropical forest restoration areas. Continued effort is needed to develop easy-to-implement and low-cost technology to improve the affordability of project management.
CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training
Guo, Qipeng, Jin, Zhijing, Qiu, Xipeng, Zhang, Weinan, Wipf, David, Zhang, Zheng
Two important tasks at the intersection of knowledge graphs and natural language processing are graph-to-text (G2T) and text-to-graph (T2G) conversion. Due to the difficulty and high cost of data collection, the supervised data available in the two fields are usually on the order of tens of thousands of examples (for instance, 18K in the WebNLG dataset), which is far fewer than the millions of examples available for other tasks such as machine translation. Consequently, deep learning models in these two fields suffer largely from scarce training data. This work presents the first attempt at unsupervised learning of T2G and G2T via cycle training. We present CycleGT, an unsupervised training framework that can bootstrap from fully non-parallel graph and text datasets, iteratively back-translate between the two forms, and use a novel pretraining strategy. Experiments on the benchmark WebNLG dataset show that, impressively, our unsupervised model trained on the same amount of data can achieve performance on par with the supervised models. This validates our framework as an effective approach to overcome the data scarcity problem in the fields of G2T and T2G.
A multi-objective-based approach for Fair Principal Component Analysis
Pelegrina, Guilherme D., Brotto, Renan D. B., Duarte, Leonardo T., Romano, João M. T., Attux, Romis
In dimension reduction problems, the adopted technique may produce disparities between the representation errors of two or more different groups. For instance, in the projected space, a specific class can be better represented in comparison with the other ones. Depending on the situation, this unfair result may introduce ethical concerns. In this context, this paper investigates how a fairness measure can be considered when performing dimension reduction through principal component analysis. Since both reconstruction error and fairness measure must be taken into account, we propose a multi-objective-based approach to tackle the Fair Principal Component Analysis problem. The experiments attest that a fairer result can be achieved with a very small loss in the reconstruction error.
The global AI agenda: Latin America
This report is part of "The global AI agenda," a thought leadership program by MIT Technology Review Insights examining how organizations are using AI today and planning to do so in the future. Featuring a global survey of 1,004 AI experts conducted in January and February 2020, it explores AI adoption, leading use cases, benefits, and challenges, and seeks to understand how organizations might share data with each other to develop new business models, products, and services in the years ahead. The regional summary explores how executives in Latin America see AI: the opportunities, challenges, and the potential for data to be shared with third parties for mutual benefit.
Rank Reduction, Matrix Balancing, and Mean-Field Approximation on Statistical Manifold
Ghalamkari, Kazu, Sugiyama, Mahito
We present a unified view of three different problems: rank reduction of matrices, matrix balancing, and mean-field approximation, using information geometry. Our key idea is to treat each matrix as a probability distribution represented by a log-linear model on a partially ordered set (poset), which enables us to formulate rank reduction and balancing of a matrix as projection onto a statistical submanifold, which corresponds to the set of low-rank matrices or that of balanced matrices. Moreover, the process of rank-1 reduction coincides with the mean-field approximation in the sense that the expectation parameters can be decomposed into products, where the mean-field equation holds. Our observation leads to a new convex optimization formulation of rank reduction, which applies to any nonnegative matrix, while the Nyström method, one of the most popular rank reduction methods, is applicable only to kernel positive semidefinite matrices. We empirically show that our rank reduction method achieves better approximation of matrices produced by real-world data compared to the Nyström method.
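For readers unfamiliar with matrix balancing, the sketch below shows the classical Sinkhorn-Knopp iteration, which alternately rescales rows and columns of a strictly positive square matrix until both sum to one. This is the textbook baseline for the problem, not the information-geometric projection proposed in the paper; the function name, iteration cap, and tolerance are assumptions chosen for the example.

```python
from typing import List, Sequence


def sinkhorn_knopp(
    matrix: Sequence[Sequence[float]],
    max_iterations: int = 200,
    tol: float = 1e-9,
) -> List[List[float]]:
    """Balance a strictly positive square matrix to (approximately) doubly stochastic.

    Classical Sinkhorn-Knopp scaling, shown only as an illustration of the
    matrix balancing problem (assumed baseline, not the paper's method).
    """
    a = [[float(x) for x in row] for row in matrix]
    n_rows, n_cols = len(a), len(a[0])

    for _ in range(max_iterations):
        # Scale every row so that it sums to 1.
        for row in a:
            row_sum = sum(row)
            for j in range(n_cols):
                row[j] /= row_sum
        # Scale every column so that it sums to 1.
        for j in range(n_cols):
            col_sum = sum(a[i][j] for i in range(n_rows))
            for i in range(n_rows):
                a[i][j] /= col_sum
        # Columns are now exact; stop once rows are also within tolerance.
        if max(abs(sum(row) - 1.0) for row in a) < tol:
            break
    return a


balanced = sinkhorn_knopp([[1.0, 2.0], [3.0, 4.0]])
row_sums = [sum(row) for row in balanced]
col_sums = [sum(balanced[i][j] for i in range(2)) for j in range(2)]
```

After convergence, `row_sums` and `col_sums` should all be close to 1, which is the defining property of a balanced matrix.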
A Machine Learning Early Warning System: Multicenter Validation in Brazilian Hospitals
Kobylarz, Jhonatan, Santos, Henrique D. P. dos, Barletta, Felipe, da Silva, Mateus Cichelero, Vieira, Renata, Morales, Hugo M. P., Rocha, Cristian da Costa
Early recognition of clinical deterioration is one of the main steps for reducing inpatient morbidity and mortality. The challenge of identifying clinical deterioration in hospitals lies in the intense daily routines of healthcare practitioners, in the unconnected patient data stored in Electronic Health Records (EHRs), and in the use of low-accuracy scores. Since hospital wards are given less attention than the Intensive Care Unit (ICU), we hypothesized that connecting a platform to a stream of EHR data would drastically improve awareness of dangerous situations and thus assist the healthcare team. With the application of machine learning, the system is capable of considering each patient's full history, and through the use of high-performing predictive models, an intelligent early warning system is enabled. In this work we used 121,089 medical encounters from six different hospitals and 7,540,389 data points, and we compared popular ward protocols with six different scalable machine learning methods (three classic machine learning models, namely logistic and probabilistic-based models, and three gradient-boosted models). The results showed an advantage in AUC (Area Under the Receiver Operating Characteristic Curve) of 25 percentage points for the best machine learning model compared to the current state-of-the-art protocols. This is shown by the generalization of the algorithm with leave-one-group-out (AUC of 0.949) and the robustness through cross-validation (AUC of 0.961). We also perform experiments comparing several window sizes to justify the use of five patient timestamps. A sample dataset, experiments, and code are available for replicability purposes.
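The leave-one-group-out evaluation mentioned above can be sketched as follows, under the assumption that each group is one hospital (the abstract does not state the grouping explicitly). Each split holds out every record from one group for testing and trains on the rest, so the score reflects generalization to an unseen site. The function name and the hospital labels are hypothetical.

```python
from typing import Hashable, Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices) for each distinct group.

    `groups[i]` is the group label of record i, e.g. a hospital identifier
    (hypothetical labels; the paper's exact setup is not given in the abstract).
    """
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


# Six records from three hypothetical hospitals.
hospital_of_record = ["A", "A", "B", "C", "C", "C"]
splits = list(leave_one_group_out(hospital_of_record))
```

A model would be fitted on `train_idx` and scored on `test_idx` for each split, and the per-split scores then aggregated.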