InfoBot: Transfer and Exploration via the Information Bottleneck

arXiv.org Machine Learning

A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.


Bidirectional Inference Networks: A Class of Deep Bayesian Networks for Health Profiling

arXiv.org Machine Learning

We consider the problem of inferring the values of an arbitrary set of variables (e.g., risk of diseases) given other observed variables (e.g., symptoms and diagnosed diseases) and high-dimensional signals (e.g., MRI images or EEG). This is a common problem in healthcare since variables of interest often differ for different patients. Existing methods, including Bayesian networks and structured prediction, either do not incorporate high-dimensional signals or fail to model conditional dependencies among variables. To address these issues, we propose bidirectional inference networks (BIN), which stitch together multiple probabilistic neural networks, each modeling a conditional dependency. Predictions are then made by iteratively updating variables using backpropagation (BP) to maximize the corresponding posterior probability. Furthermore, we extend BIN to composite BIN (CBIN), which involves the iterative prediction process in the training stage and improves both accuracy and computational efficiency by adaptively smoothing the optimization landscape. Experiments on synthetic and real-world datasets (a sleep study and a dermatology dataset) show that CBIN is a single model that can achieve state-of-the-art performance and obtain better accuracy in most inference tasks than multiple models each specifically trained for a different task.


Mol-CycleGAN - a generative model for molecular optimization

arXiv.org Machine Learning

Designing a molecule with desired properties is one of the biggest challenges in drug development, as it requires optimization of chemical compound structures with respect to many complex properties. To augment the compound design process we introduce Mol-CycleGAN - a CycleGAN-based model that generates optimized compounds with high structural similarity to the original ones. Namely, given a molecule our model generates a structurally similar one with an optimized value of the considered property. We evaluate the performance of the model on selected optimization objectives related to structural properties (presence of halogen groups, number of aromatic rings) and to a physicochemical property (penalized logP). In the task of optimization of penalized logP of drug-like molecules our model significantly outperforms previous results.


Complexity, Statistical Risk, and Metric Entropy of Deep Nets Using Total Path Variation

arXiv.org Machine Learning

For any ReLU network there is a representation in which the sum of the absolute values of the weights into each node is exactly $1$, and the input layer variables are multiplied by a value $V$ coinciding with the total variation of the path weights. Implications are given for Gaussian complexity, Rademacher complexity, statistical risk, and metric entropy, all of which are shown to be proportional to $V$. There is no dependence on the number of nodes per layer, except for the number of inputs $d$. For estimation with sub-Gaussian noise, the mean square generalization error bounds that can be obtained are of order $V \sqrt{L + \log d}/\sqrt{n}$, where $L$ is the number of layers and $n$ is the sample size.
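To make the scaling in this abstract concrete, the following is a minimal sketch, not the authors' code. It assumes that the total path variation $V$ is the sum, over all input-to-output paths, of the product of absolute weights along the path, and it evaluates the stated order $V \sqrt{L + \log d}/\sqrt{n}$ with all constants dropped. The function names and the toy weights are hypothetical.

```python
import math
import numpy as np

def total_path_variation(weights):
    """Sum over all input-to-output paths of the product of |weight| along the path.

    Assumption: this is how V is interpreted here. weights[k] has shape
    (fan_out, fan_in), listed from the first layer to the output layer.
    """
    v = np.ones(weights[0].shape[1])
    for w in weights:
        # Propagating a vector of ones through |W| accumulates path products.
        v = np.abs(w) @ v
    return float(v.sum())

def risk_bound_order(V, L, d, n):
    """Order of the bound V * sqrt(L + log d) / sqrt(n); constants are omitted."""
    return V * math.sqrt(L + math.log(d)) / math.sqrt(n)

# Toy two-layer network with d = 2 inputs (hypothetical weights).
toy_weights = [
    np.array([[1.0, -2.0], [0.5, 0.5]]),  # hidden layer, 2 nodes
    np.array([[2.0, -1.0]]),              # output layer, 1 node
]
V = total_path_variation(toy_weights)
order = risk_bound_order(V, L=len(toy_weights), d=2, n=100)
```

Note that the hidden-layer width never enters `risk_bound_order`, which mirrors the abstract's claim that the bounds do not depend on the number of nodes per layer.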


Regularizing Generative Models Using Knowledge of Feature Dependence

arXiv.org Machine Learning

Generative modeling is a fundamental problem in machine learning with many potential applications. Efficient learning of generative models requires available prior knowledge to be exploited as much as possible. In this paper, we propose a method to exploit prior knowledge of relative dependence between features for learning generative models. Such knowledge is available, for example, when side-information on features is present. We incorporate the prior knowledge by forcing marginals of the learned generative model to follow a prescribed relative feature dependence. To this end, we formulate a regularization term using a kernel-based dependence criterion. The proposed method can be incorporated straightforwardly into many optimization-based learning schemes of generative models, including variational autoencoders and generative adversarial networks. We show the effectiveness of the proposed method in experiments with multiple types of datasets and models.


Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

arXiv.org Machine Learning

Reinforcement learning (RL) algorithms have been around for decades and have been employed to solve various sequential decision-making problems. These algorithms, however, have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to drive optimal policies for sophisticated and capable agents, which can perform efficiently in these challenging environments. This paper addresses an important aspect of deep RL related to situations that require multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems related to multi-agent deep RL (MADRL) is presented, including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, and multi-agent transfer learning. The merits and demerits of the reviewed methods will be analyzed and discussed, with their corresponding applications explored. It is envisaged that this review provides insights about various MADRL methods and can lead to future development of more robust and highly useful multi-agent learning methods for solving real-world problems.


An Estimation of Personnel Food Demand Quantity for Businesses by Using Artificial Neural Networks

arXiv.org Machine Learning

Today, many public or private institutions provide professional food service for personnel working in their own organizations. Planning this service is complicated by the fact that the number of personnel working in the institutions is generally high and that personnel are sometimes away from the institution for personal or institutional reasons. Because of this, it is difficult to determine the daily food demand, and this causes cost, time and labor loss for the institutions. Statistical or heuristic methods are used to remove or at least minimize these losses. In this study, an artificial intelligence model was proposed, which estimates the daily food demand quantity using artificial neural networks for businesses. The data are obtained from a refectory database of a private institution with a capacity of 110 people serving daily meals and serving at different levels, covering the last two years (2016-2018). The model was created using the MATLAB package program. The performance of the model was determined by the regression (R) values, the Mean Absolute Percentage Error (MAPE) and the Mean Squared Error (MSE). In the training of the ANN model, a feed-forward backpropagation network architecture is used. The best model obtained as a result of the experiments is a multi-layer (8-10-10-1) structure with a training R ratio of 0.9948, a testing R ratio of 0.9830 and an error rate of 0.003783. Experimental results demonstrated that the model has a low error rate and high performance, and that using artificial neural networks has a positive effect on demand estimation.
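The two error measures named in this abstract follow their standard definitions and can be sketched as below. This is an illustration only: the study itself used MATLAB, and the function names and the meal counts are hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if np.any(actual == 0):
        # MAPE divides by the actual values, so zeros make it undefined.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def mse(actual, predicted):
    """Mean Squared Error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

# Hypothetical daily meal counts: served versus forecast.
served = [100, 200]
forecast = [110, 180]
forecast_mape = mape(served, forecast)
forecast_mse = mse(served, forecast)
```

MAPE is scale-free, which makes it convenient for comparing days with very different demand, while MSE penalizes large misses more heavily.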


Robust Matrix Completion State Estimation in Distribution Systems

arXiv.org Machine Learning

Due to insufficient measurements in distribution system state estimation (DSSE), full observability and redundant measurements are difficult to achieve without using pseudo measurements. The matrix completion state estimation (MCSE) combines matrix completion and the power system model to estimate voltage by exploiting the low-rank characteristics of the matrix. This paper proposes a robust matrix completion state estimation (RMCSE) to estimate the voltage in a distribution system under a low-observability condition. The traditional weighted least squares (WLS) state estimation method requires full observability to calculate the states and needs redundant measurements to perform bad data detection. The proposed method improves the robustness of the MCSE to bad data by minimizing the rank of the matrix and the measurement residuals with different weights. It can estimate the system state in a low-observability system and produces robust estimates without a bad data detection process, even in the presence of multiple bad data. The method is numerically evaluated on the IEEE 33-node radial distribution system. The estimation performance and robustness of RMCSE are compared with the WLS with the largest normalized residual bad data identification (WLS-LNR), and the MCSE.
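The observability requirement of the WLS baseline can be seen in a small sketch. It assumes a linear measurement model z = Hx + e, which is a simplification (practical distribution system state estimation is nonlinear and solved iteratively), and it does not implement the proposed RMCSE. The function name and the toy measurements are hypothetical.

```python
import numpy as np

def wls_estimate(H, z, weights):
    """Weighted least squares estimate for a linear model z = H x + e.

    Solves (H^T W H) x = H^T W z, with W the diagonal matrix of weights.
    Returns the estimate and the measurement residuals.
    """
    H = np.asarray(H, dtype=float)
    z = np.asarray(z, dtype=float)
    W = np.diag(np.asarray(weights, dtype=float))
    gain = H.T @ W @ H
    if np.linalg.matrix_rank(gain) < H.shape[1]:
        # Too few independent measurements: the states cannot all be determined.
        raise ValueError("system is not observable: gain matrix is rank-deficient")
    x_hat = np.linalg.solve(gain, H.T @ W @ z)
    residuals = z - H @ x_hat
    return x_hat, residuals

# Three measurements of two states: observable, with one redundant measurement.
x_hat, residuals = wls_estimate(
    H=[[1, 0], [0, 1], [1, 1]],
    z=[1.0, 2.0, 3.0],
    weights=[1, 1, 1],
)
```

With a single measurement of the two states, the same call raises `ValueError`, which is the low-observability situation that the matrix completion approach is meant to handle.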


Some machine learning schemes for high-dimensional nonlinear PDEs

arXiv.org Machine Learning

We propose new machine learning schemes for solving high-dimensional nonlinear partial differential equations (PDEs). Relying on the classical backward stochastic differential equation (BSDE) representation of PDEs, our algorithms simultaneously estimate the solution and its gradient by deep neural networks. These approximations are performed at each time step from the minimization of loss functions defined recursively by backward induction. The methodology is extended to variational inequalities arising in optimal stopping problems. We analyze the convergence of the deep learning schemes and provide error estimates in terms of the universal approximation of neural networks. Numerical results show that our algorithms give very good results up to dimension 50 (and certainly above), for both PDEs and variational inequality problems. For the PDEs resolution, our results are very similar to those obtained by the recent method in \cite{weinan2017deep} when the latter converges to the right solution or does not diverge. Numerical tests indicate that the proposed methods are not stuck in poor local minima, as can be the case with the algorithm designed in \cite{weinan2017deep}, and no divergence is experienced. The only limitation seems to be due to the inability of the considered deep neural networks to represent a solution with a too complex structure in high dimension.


Dynamic Real-time Multimodal Routing with Hierarchical Hybrid Planning

arXiv.org Artificial Intelligence

We introduce the problem of Dynamic Real-time Multimodal Routing (DREAMR), which requires planning and executing routes under uncertainty for an autonomous agent. The agent has access to a time-varying transit vehicle network in which it can use multiple modes of transportation. For instance, a drone can either fly or ride on terrain vehicles for segments of their routes. DREAMR is a difficult problem of sequential decision making under uncertainty with both discrete and continuous variables. We design a novel hierarchical hybrid planning framework to solve the DREAMR problem that exploits its structural decomposability. Our framework consists of a global open-loop planning layer that invokes and monitors a local closed-loop execution layer. Additional abstractions allow efficient and seamless interleaving of planning and execution. We create a large-scale simulation for DREAMR problems, with each scenario having hundreds of transportation routes and thousands of connection points. Our algorithmic framework significantly outperforms a receding horizon control baseline, in terms of elapsed time to reach the destination and energy expended by the agent.