A review of machine learning applications in wildfire science and management

arXiv.org Machine Learning

Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then, the field has progressed rapidly alongside the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as to illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, in which the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exist opportunities to apply more current ML methods (e.g., deep learning and agent-based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high-quality data for use by practitioners of ML methods.


A Loss Function Analysis for Classification Methods in Text Categorization

AAAI Conferences

A large number of statistical classification methods have been applied to this problem, including linear regression, logistic regression (LR), neural networks (NNet), Naive Bayes (NB), k-nearest neighbor (kNN), Rocchio-style, Support Vector Machine (SVM) and other approaches (Yang & Liu, 1999; Yang, 1999; Joachims, 1998; McCallum & Nigam; Zhang & Oles, 2001; Lewis et al., 2003). As more methods are published, we need a sound theoretical framework for cross-method comparison. Recent work in machine learning focusing on the regularization of classification methods and on the analysis of their loss functions is a step in this direction. Vapnik (1995) defined the objective function in SVM as minimizing the expected risk on test examples, and decomposed that risk into two components: the empirical risk, which reflects the training-set errors of the classifier, and the inverse of the margin width, which reflects how far the positive and negative training examples of a category are separated by the decision surface. Thus, both the minimization of training-set errors and the maximization of the margin width are the criteria used in the optimization of SVM. Balancing the two criteria has been referred to as the regularization of a classifier; the degree of regularization is often controlled by a parameter in that method (section 2). SVMs have been extremely successful in text categorization, often resulting in the best performance in benchmark evaluations (Joachims, 1998; Yang & Liu, 1999; Lewis et al., 2003). Hastie et al. (2001) presented a more general framework for estimating the potential of a model to make classification errors, and used slightly different terminology: loss or generalization error corresponds to the expected risk, training-set loss corresponds to the empirical risk, and model complexity corresponds to the margin-related risk in SVM. Using this framework, they compared alternative ways to penalize model complexity, including the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Minimum Description Length (MDL) criterion.
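The decomposition described above (training-set loss plus a complexity penalty, balanced by a regularization parameter) can be illustrated with a minimal sketch. The function names, the L2 penalty on the weights, and the parameter `lam` are assumptions chosen for illustration; they are not taken from the paper, whose exact formulations are not given in this excerpt. For a linear SVM, a smaller squared weight norm corresponds to a wider margin, which is why the squared norm stands in for the margin-related term here.

```python
import math

def hinge_loss(margin):
    """SVM-style loss: zero once an example is classified with margin >= 1."""
    return max(0.0, 1.0 - margin)

def logistic_loss(margin):
    """Logistic-regression-style loss: smooth and never exactly zero."""
    return math.log(1.0 + math.exp(-margin))

def regularized_objective(weights, examples, loss, lam):
    """Empirical risk plus lam times a complexity penalty (squared L2 norm).

    examples: list of (features, label) pairs with label in {-1, +1}.
    lam: hypothetical regularization parameter balancing the two terms.
    """
    empirical_risk = sum(
        loss(label * sum(w * x for w, x in zip(weights, features)))
        for features, label in examples
    ) / len(examples)
    complexity = sum(w * w for w in weights)
    return empirical_risk + lam * complexity

# Toy data: one correctly classified example and one misclassified example.
toy_examples = [([2.0, 0.0], 1), ([0.5, 0.0], -1)]
toy_weights = [1.0, 0.0]
svm_value = regularized_objective(toy_weights, toy_examples, hinge_loss, lam=0.1)
lr_value = regularized_objective(toy_weights, toy_examples, logistic_loss, lam=0.1)
```

Swapping the `loss` argument while keeping the penalty fixed is one simple way to compare methods within a single framework, which is the kind of cross-method comparison the abstract motivates.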


Local Model Feature Transformations

arXiv.org Machine Learning

Local learning methods are a popular class of machine learning algorithms. The basic idea for the entire cadre is to choose some non-local model family, to train many of them on small sections of neighboring data, and then to `stitch' the resulting models together in some way. Due to the limits of constraining a training dataset to a small neighborhood, research on locally-learned models has largely been restricted to simple model families. Also, since simple model families have no complex structure by design, this has limited use of the individual local models to predictive tasks. We hypothesize that, using a sufficiently complex local model family, various properties of the individual local models, such as their learned parameters, can be used as features for further learning. This dissertation improves upon the current state of research and works toward establishing this hypothesis by investigating algorithms for localization of more complex model families and by studying their applications beyond predictions as a feature extraction mechanism. We summarize this generic technique of using local models as a feature extraction step with the term ``local model feature transformations.'' In this document, we extend the local modeling paradigm to Gaussian processes, orthogonal quadric models and word embedding models, and extend the existing theory for localized linear classifiers. We then demonstrate applications of local model feature transformations to epileptic event classification from EEG readings, activity monitoring via chest accelerometry, 3D surface reconstruction, 3D point cloud segmentation, handwritten digit classification and event detection from Twitter feeds.
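As a rough illustration of the idea of using local models as a feature extraction step, the sketch below fits a small least-squares line to the nearest neighbors of each point and returns the learned parameters (slope and intercept) as that point's features. This is a deliberately simple stand-in under stated assumptions: the dissertation works with richer model families such as Gaussian processes and quadric models, and the function names, the one-dimensional setting, and the neighborhood size `k` are all hypothetical.

```python
def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        # All x values coincide; fall back to a constant model.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def local_model_features(points, k=3):
    """For each point, fit a line to its k nearest neighbors (by x distance,
    including the point itself) and use the fitted parameters as features."""
    features = []
    for x0, _ in points:
        neighbors = sorted(points, key=lambda p: abs(p[0] - x0))[:k]
        features.append(fit_line(neighbors))
    return features

# Toy signal: rising for x <= 2, flat afterwards.
signal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 2.0), (4.0, 2.0), (5.0, 2.0)]
local_features = local_model_features(signal, k=3)
```

The resulting list of (slope, intercept) pairs could then be passed to any downstream learner; for the toy signal, the local slope separates the rising segment from the flat one.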


Automatic Language Identification in Texts: A Survey

Journal of Artificial Intelligence Research

Language identification (“LI”) is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in this area has recently been especially active. This article provides a brief history of LI research, and an extensive survey of the features and methods used in the LI literature. We describe the features and methods using a unified notation, to make the relationships between methods clearer. We discuss evaluation methods, applications of LI, and off-the-shelf LI systems that do not require training by the end user. Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.


An Analysis of Hierarchical Text Classification Using Word Embeddings

arXiv.org Artificial Intelligence

Efficient distributed numerical word representation models (word embeddings) combined with modern machine learning algorithms have recently yielded considerable improvement on automatic document classification tasks. However, the effectiveness of such techniques has not yet been assessed for hierarchical text classification (HTC). This study investigates the application of those models and algorithms to this specific problem by means of experimentation and analysis. We trained classification models with prominent machine learning algorithm implementations (fastText, XGBoost, SVM, and Keras' CNN) and notable word embedding generation methods (GloVe, word2vec, and fastText) on publicly available data and evaluated them with measures specifically appropriate for the hierarchical context. FastText achieved an ${}_{LCA}F_1$ of 0.893 on a single-labeled version of the RCV1 dataset. An analysis indicates that using word embeddings and their flavors is a very promising approach for HTC.