Deep Distribution Regression Machine Learning

In recent years, a variety of machine learning methods, such as random forests, gradient boosting trees and neural networks, have gained popularity and been widely adopted. These methods are often flexible enough to uncover complex relationships in high-dimensional data without strong assumptions on the underlying data structure. Off-the-shelf software is available to put these algorithms into production [Pedregosa et al. (2011), Abadi et al. (2016) and Paszke et al. (2017)]. However, in regression and forecasting tasks, many machine learning methods provide only a point estimate, without any additional information regarding the uncertainty of the target quantity. Understanding uncertainty is often crucial in fields such as financial markets and risk analysis [Diebold et al. (1997), Timmermann (2000)], population and demographic studies [Wilson and Bell (2007)], transportation and traffic analysis [Zhu and Laptev (2017), Rodrigues and Pereira (2018)] and energy forecasting [Hong et al. (2016)].

Batch Reinforcement Learning for Smart Home Energy Management

AAAI Conferences

Smart grids enhance power grids by integrating electronic equipment, communication systems and computational tools. In a smart grid, consumers can insert energy into the power grid. We propose a new energy management system (called RLbEMS) that autonomously defines a policy for selling or storing energy surplus in smart homes. This policy is achieved through Batch Reinforcement Learning with historical data about energy prices, energy generation, consumer demand and characteristics of storage systems. In practical problems, RLbEMS has learned good energy selling policies quickly and effectively. We obtained maximum gains of 20.78% and 10.64%, when compared to a Naive-greedy policy, for smart homes located in Brazil and in the USA, respectively. Another important result achieved by RLbEMS was the reduction of about 30% of peak demand, a central desideratum for smart grids.
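
The abstract above does not say which batch reinforcement learning algorithm RLbEMS uses, so the following is only a minimal sketch of the general idea: learning a sell-or-store policy from a fixed log of past transitions, with no further interaction with the environment. The tabular Q-iteration, the two price levels, the rewards and the function names are all illustrative assumptions, not the authors' method or data.

```python
from collections import defaultdict

def batch_q_iteration(transitions, actions, gamma=0.9, sweeps=50):
    """Tabular batch Q-iteration over a fixed list of logged transitions.

    Each transition is (state, action, reward, next_state, done).
    """
    q = defaultdict(float)
    for _ in range(sweeps):
        new_q = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next, done in transitions:
            # Bootstrap from the previous sweep's estimates.
            best_next = 0.0 if done else max(q[(s_next, b)] for b in actions)
            target = r + gamma * best_next
            counts[(s, a)] += 1
            # Running mean of the targets seen for this (state, action) pair.
            new_q[(s, a)] += (target - new_q[(s, a)]) / counts[(s, a)]
        q = new_q
    return q

def greedy_policy(q, states, actions):
    """Pick the highest-valued action in each state."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}

# Hypothetical historical data: the state is the current price level.
# Selling ends the episode and pays the current price; storing pays
# nothing now and moves to the other price level.
ACTIONS = ["sell", "store"]
history = [
    ("low", "sell", 1.0, "low", True),
    ("high", "sell", 3.0, "high", True),
    ("low", "store", 0.0, "high", False),
    ("high", "store", 0.0, "low", False),
]

q = batch_q_iteration(history, ACTIONS)
policy = greedy_policy(q, ["low", "high"], ACTIONS)
print(policy)
```

In this toy setting the learned policy stores surplus when the price is low and sells when it is high, which is the kind of behavior a naive greedy policy (always sell immediately) cannot express.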

A machine learning approach for underwater gas leakage detection Machine Learning

Underwater gas reservoirs are used in many situations. In particular, Carbon Capture and Storage (CCS) facilities that are currently being developed intend to store greenhouse gases inside geological formations in the deep sea. In these formations, however, the gas might percolate, leaking back to the water and eventually to the atmosphere. The early detection of such leaks is therefore paramount to any underwater CCS project. In this work, we propose to use Passive Acoustic Monitoring (PAM) and a machine learning approach to design efficient detectors that can signal the presence of a leakage. We use data obtained from simulation experiments off the Brazilian shore, and show that detection based on classification algorithms achieves good performance. We also propose a smoothing strategy based on Hidden Markov Models in order to incorporate previous knowledge about the probabilities of leakage occurrences.
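
The abstract does not give the details of the Hidden Markov Model smoothing, so the sketch below only illustrates the general idea with a two-state (no leak / leak) forward filter applied to per-frame classifier scores. It assumes, for simplicity, that each score can be used directly as the relative evidence for the leak state; the function name, the "sticky" transition probability `p_stay` and the initial leak probability are hypothetical choices, not values from the paper.

```python
def hmm_filter(leak_scores, p_stay=0.95, p_leak_init=0.1):
    """Forward-filtered P(leak at t | scores up to t) for a two-state HMM.

    leak_scores: per-frame classifier outputs in (0, 1), treated here as
    relative evidence for the leak state (an assumption of this sketch).
    p_stay: probability of remaining in the same state between frames.
    """
    belief = p_leak_init
    filtered = []
    for score in leak_scores:
        # Predict: propagate the belief through the sticky transition model.
        prior = belief * p_stay + (1.0 - belief) * (1.0 - p_stay)
        # Update: weight each state by how well it explains the score.
        leak_evidence = prior * score
        no_leak_evidence = (1.0 - prior) * (1.0 - score)
        belief = leak_evidence / (leak_evidence + no_leak_evidence)
        filtered.append(belief)
    return filtered

# An isolated high score is damped; a sustained run of high scores is not.
spike = hmm_filter([0.2, 0.1, 0.9, 0.2, 0.1])
sustained = hmm_filter([0.9, 0.9, 0.9, 0.9, 0.9])
print([round(p, 3) for p in spike])
print([round(p, 3) for p in sustained])
```

The prior knowledge enters through `p_stay` and `p_leak_init`: because leaks are assumed rare and persistent, a single noisy frame is not enough to raise an alarm, while consistent evidence quickly is.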

Aboveground biomass mapping in French Guiana by combining remote sensing, forest inventories and environmental data Machine Learning

Mapping forest aboveground biomass (AGB) has become an important task, particularly for the reporting of carbon stocks and changes. AGB can be mapped using synthetic aperture radar data (SAR) or passive optical data. However, these data are insensitive to high AGB levels (>150 Mg/ha, and >300 Mg/ha for P-band), which are commonly found in tropical forests. Studies have mapped the rough variations in AGB by combining optical and environmental data at regional and global scales. Nevertheless, these maps cannot represent local variations in AGB in tropical forests. In this paper, we hypothesize that the problem of misrepresenting local variations in AGB and AGB estimation with good precision occurs because of both methodological limits (signal saturation or dilution bias) and a lack of adequate calibration data in this range of AGB values. We test this hypothesis by developing a calibrated regression model to predict variations in high AGB values (mean >300 Mg/ha) in French Guiana by a methodological approach for spatial extrapolation with data from the optical geoscience laser altimeter system (GLAS), forest inventories, radar, optics, and environmental variables for spatial inter- and extrapolation. Given their higher point count, GLAS data allow a wider coverage of AGB values. We find that the metrics from GLAS footprints are correlated with field AGB estimations ($R^2$ = 0.54, RMSE = 48.3 Mg/ha) with no bias for high values. First, predictive models, including remote-sensing, environmental variables and spatial correlation functions, allow us to obtain "wall-to-wall" AGB maps over French Guiana with an RMSE for the in situ AGB estimates of ~51 Mg/ha and $R^2$ = 0.48 at a 1-km grid size. We conclude that a calibrated regression model based on GLAS with dependent environmental data can produce good AGB predictions even for high AGB values if the calibration data fit the AGB range.
We also demonstrate that small temporal and spatial mismatches between field data and GLAS footprints are not a problem for regional and global calibrated regression models because field data aim to predict large and deep tendencies in AGB variations from environmental gradients and do not aim to represent high but stochastic and temporally limited variations from forest dynamics. Thus, we advocate including a greater variety of data, even if less precise and shifted, to better represent high AGB values in global models and to improve the fitting of these models for high values.
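
For readers unfamiliar with the two fit statistics quoted in this abstract, the sketch below shows how RMSE and $R^2$ are commonly computed from paired field and predicted AGB values. The abstract does not state which definition of $R^2$ was used; this sketch assumes the coefficient of determination (one minus the ratio of residual to total sum of squares). The AGB numbers are invented for illustration and are not data from the study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (here Mg/ha)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up field plots (Mg/ha) and model predictions for the same plots.
field_agb = [300.0, 350.0, 400.0, 450.0]
predicted_agb = [310.0, 340.0, 410.0, 440.0]

agb_rmse = rmse(field_agb, predicted_agb)
agb_r2 = r_squared(field_agb, predicted_agb)
print(agb_rmse, agb_r2)
```

RMSE expresses the typical error in Mg/ha, while $R^2$ expresses the share of the variance in field AGB that the predictions account for, which is why the abstract reports both.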

Opportunities at the Intersection of Synthetic Biology, Machine Learning, and Automation


Biology has changed radically in the past two decades, transitioning from a descriptive science into a design science. The discovery of DNA as the repository of genetic information, and of recombinant DNA as an effective way to modify it, led first to the development of genetic engineering and later to the field of synthetic biology. Synthetic biology(1) goes beyond the historical practice of biological research based on describing and cataloguing (e.g., Linnaean taxonomic classification or phylogenetic tree development), and aims to design biological systems to a given specification (e.g., production of a given amount of a medical drug or targeted invasion of a specific type of cancer cell). This transition into an industrialized synthetic biology is expected to affect most human activities, from improving human health to producing renewable biofuels to combat climate change.(2) Some examples commercially available now include synthetic leather and spider silk, renewable biodiesel that propels the Rio de Janeiro public bus system, vegan burgers with meat taste, and sustainable skin-rejuvenating cosmetics.