Goto

Collaborating Authors

Results


A review of machine learning applications in wildfire science and management

arXiv.org Machine Learning

Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exists opportunities to apply more current ML methods (e.g., deep learning and agent based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high quality data for use by practitioners of ML methods.


Artificial Intelligence for Social Good: A Survey

arXiv.org Artificial Intelligence

Its impact is drastic and real: Youtube's AIdriven recommendation system would present sports videos for days if one happens to watch a live baseball game on the platform [1]; email writing becomes much faster with machine learning (ML) based auto-completion [2]; many businesses have adopted natural language processing based chatbots as part of their customer services [3]. AI has also greatly advanced human capabilities in complex decision-making processes ranging from determining how to allocate security resources to protect airports [4] to games such as poker [5] and Go [6]. All such tangible and stunning progress suggests that an "AI summer" is happening. As some put it, "AI is the new electricity" [7]. Meanwhile, in the past decade, an emerging theme in the AI research community is the so-called "AI for social good" (AI4SG): researchers aim at developing AI methods and tools to address problems at the societal level and improve the wellbeing of the society.


Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing

arXiv.org Machine Learning

Various researchers have studied posterior inference of parameters in Bayesian mixture models [24, 42, 23], so that the statistical behavior of such models is relatively well-understood. In contrast, much less is known about the efficiency of different algorithms for sampling from the posterior distributions that arise from Bayesian mixture models. A standard approach for doing so is via some form of Markov Chain Monte Carlo (MCMC). Many different types of MCMC algorithms have been introduced for various types of Bayesian mixture models, including finite Bayesian mixture models [21, 49, 50, 26, 40], Dirichlet process mixture models [37, 41, 25, 28], and hierarchical and nested Dirichlet process models [52, 47]. Despite the plethora of possible MCMC methods, upper bounds on their mixing times are often challenging to establish. We refer the reader to the papers [27, 3, 55, 48, 57] for non-asymptotic upper bounds on mixing times for certain types of Bayesian models, different from those studied in this paper. In recent years, it has been increasingly common in the Bayesian literature to make use of a fractional likelihood--meaning an ordinary likelihood raised to some fractional power. Combining such a fractional likelihood with a prior distribution in the usual way leads to a class of posteriors known as power posterior or fractional posterior distributions. The power posterior distributions have been shown to have attractive properties in terms of robustness to mis-specification in Bayesian mixture models [39], and have been used in various applications 1 arXiv:1912.05153v1


Imputing Missing Events in Continuous-Time Event Streams

arXiv.org Machine Learning

Events in the world may be caused by other, unobserved events. We consider sequences of events in continuous time. Given a probability model of complete sequences, we propose particle smoothing---a form of sequential importance sampling---to impute the missing events in an incomplete sequence. We develop a trainable family of proposal distributions based on a type of bidirectional continuous-time LSTM: Bidirectionality lets the proposals condition on future observations, not just on the past as in particle filtering. Our method can sample an ensemble of possible complete sequences (particles), from which we form a single consensus prediction that has low Bayes risk under our chosen loss metric. We experiment in multiple synthetic and real domains, using different missingness mechanisms, and modeling the complete sequences in each domain with a neural Hawkes process (Mei & Eisner 2017). On held-out incomplete sequences, our method is effective at inferring the ground-truth unobserved events, with particle smoothing consistently improving upon particle filtering.


Thirty Years of Machine Learning:The Road to Pareto-Optimal Next-Generation Wireless Networks

arXiv.org Machine Learning

Next-generation wireless networks (NGWN) have a substantial potential in terms of supporting a broad range of complex compelling applications both in military and civilian fields, where the users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning algorithms have great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of machine learning by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning, respectively. Furthermore, we investigate their employment in the compelling applications of NGWNs, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims for assisting the readers in clarifying the motivation and methodology of the various machine learning algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.


Variational Bayes Inference in Digital Receivers

arXiv.org Machine Learning

The digital telecommunications receiver is an important context for inference methodology, the key objective being to minimize the expected loss function in recovering the transmitted information. For that criterion, the optimal decision is the Bayesian minimum-risk estimator. However, the computational load of the Bayesian estimator is often prohibitive and, hence, efficient computational schemes are required. The design of novel schemes, striking new balances between accuracy and computational load, is the primary concern of this thesis. Two popular techniques, one exact and one approximate, will be studied. The exact scheme is a recursive one, namely the generalized distributive law (GDL), whose purpose is to distribute all operators across the conditionally independent (CI) factors of the joint model, so as to reduce the total number of operators required. In a novel theorem derived in this thesis, GDL, if applicable, will be shown to guarantee such a reduction in all cases. An associated lemma also quantifies this reduction. For practical use, two novel algorithms, namely the no-longer-needed (NLN) algorithm and the generalized form of the Markovian Forward-Backward (FB) algorithm, recursively factorizes and computes the CI factors of an arbitrary model, respectively. The approximate scheme is an iterative one, namely the Variational Bayes (VB) approximation, whose purpose is to find the independent (i.e. zero-order Markov) model closest to the true joint model in the minimum Kullback-Leibler divergence (KLD) sense. Despite being computationally efficient, this naive mean field approximation confers only modest performance for highly correlated models. A novel approximation, namely Transformed Variational Bayes (TVB), will be designed in the thesis in order to relax the zero-order constraint in the VB approximation, further reducing the KLD of the optimal approximation.



Automatic Parameter Tying: A New Approach for Regularized Parameter Learning in Markov Networks

AAAI Conferences

Parameter tying is a regularization method in which parameters (weights) of a machine learning model are partitioned into groups by leveraging prior knowledge and all parameters in each group are constrained to take the same value. In this paper, we consider the problem of parameter learning in Markov networks and propose a novel approach called automatic parameter tying (APT) that uses automatic instead of a priori and soft instead of hard parameter tying as a regularization method to alleviate overfitting. The key idea behind APT is to set up the learning problem as the task of finding parameters and groupings of parameters such that the likelihood plus a regularization term is maximized. The regularization term penalizes models where parameter values deviate from their group mean parameter value. We propose and use a block coordinate ascent algorithm to solve the optimization task. We analyze the sample complexity of our new learning algorithm and show that it yields optimal parameters with high probability when the groups are well separated. Experimentally, we show that our method improves upon L 2 regularization and suggest several pragmatic techniques for good practical performance.


"I can assure you [$\ldots$] that it's going to be all right" -- A definition, case for, and survey of algorithmic assurances in human-autonomy trust relationships

arXiv.org Machine Learning

As technology become more advanced, those who design, use and are otherwise affected by it want to know that it will perform correctly, and understand why it does what it does, and how to use it appropriately. In essence they want to be able to trust the systems that are being designed. In this survey we present assurances that are the method by which users can understand how to trust this technology. Trust between humans and autonomy is reviewed, and the implications for the design of assurances are highlighted. A survey of research that has been performed with respect to assurances is presented, and several key ideas are extracted in order to refine the definition of assurances. Several directions for future research are identified and discussed.


Poisson--Gamma Dynamical Systems

arXiv.org Machine Learning

We introduce a new dynamical system for sequentially observed multivariate count data. This model is based on the gamma--Poisson construction---a natural choice for count data---and relies on a novel Bayesian nonparametric prior that ties and shrinks the model parameters, thus avoiding overfitting. We present an efficient MCMC inference algorithm that advances recent work on augmentation schemes for inference in negative binomial models. Finally, we demonstrate the model's inductive bias using a variety of real-world data sets, showing that it exhibits superior predictive performance over other models and infers highly interpretable latent structure.