Goto

Collaborating Authors

 Energy


Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

arXiv.org Artificial Intelligence

The implementation of robust, stable, and user-centered data analytics and machine learning models is confronted by numerous challenges in production and manufacturing. Therefore, a systematic approach is required to develop, evaluate, and deploy such models. The data-driven knowledge discovery framework provides an orderly partition of the data-mining processes to ensure the practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data-- and model-development--related issues. These issues should be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework; in our case, this takes the form of the cross-industry standard process for data mining (CRISP-DM). This framework is designed to ensure active cooperation between different phases to adequately address data- and model-related issues. In this paper, we review several extensions of CRISP-DM models and various data-robustness-- and model-robustness--related problems in machine learning, which currently lacks proper cooperation between data experts and business experts because of the limitations of data-driven knowledge discovery models.


Deep-Learning based Inverse Modeling Approaches: A Subsurface Flow Example

arXiv.org Machine Learning

Corresponding author: Email address: changhaibin@pku.edu.cn Key Points: Two categories of innovative deep-learning based inverse modeling methods are proposed and compared. The deep-learning surrogate-based inversion methods can accelerate the inversion process significantly. Abstract Deep-learning has achieved good performance and shown great potential for solving forward and inverse problems. In this work, two categories of innovative deep-learning based inverse modeling methods are proposed and compared. The first category is deep-learning surrogate-based inversion methods, in which the Theory-guided Neural Network (TgNN) is constructed as a deep-learning surrogate for problems with uncertain model parameters. By incorporating physical laws and other constraints, the TgNN surrogate can be constructed with limited simulation runs and accelerate the inversion process significantly. Three TgNN surrogate-based inversion methods are proposed, including the gradient method, the iterative ensemble smoother (IES), and the training method. In TgNN-geo, two neural networks are introduced to approximate the respective random model parameters and the solution. Since the prior geostatistical information can be incorporated, the direct-inversion method based on TgNN-geo works well, even in cases with sparse spatial measurements or imprecise prior statistics. Although the proposed deep-learning based inverse modeling methods are general in nature, and thus applicable to a wide variety of problems, they are tested with several subsurface flow problems. It is found that satisfactory results are obtained with a high efficiency.


Artificial Intelligence Platform Detects Power Grid Flaws And Wildfire Dangers Better And Faster Than Humans

#artificialintelligence

StartX startup Buzz Solutions out of Stanford, California just introduced its AI solution to help utilities quickly spot powerline and grid faults so repairs can be made before wildfires start. AI and machine vision technology, like that from Buzz Solutions, can take data from sources like ... [ ] this drone, helicopters, and aircraft to determine characteristics like powerline and grid faults so repairs can be made before wildfires start. Their unique platform uses AI and machine vision technology to analyze millions of images of powerlines and towers from drones, helicopters, and aircraft to find dangerous faults and flaws as well as overgrown vegetation, in and around the grid infrastructure to help utilities identify problem areas and repair them before a fire starts. This system can do the analysis at half the cost and in a fraction of the time compared to humans, hours to days not months to years. The California Department of Forestry and Fire Protection determined that PG&E's transmission lines were at fault for the huge Kincade Fire in Sonoma, California last year.


Learning Compositional Neural Programs for Continuous Control

arXiv.org Artificial Intelligence

We propose a novel solution to challenging sparse-reward, continuous control problems that require hierarchical planning at multiple levels of abstraction. Our solution, dubbed AlphaNPI-X, involves three separate stages of learning. First, we use off-policy reinforcement learning algorithms with experience replay to learn a set of atomic goal-conditioned policies, which can be easily repurposed for many tasks. Second, we learn self-models describing the effect of the atomic policies on the environment. Third, the self-models are harnessed to learn recursive compositional programs with multiple levels of abstraction. The key insight is that the self-models enable planning by imagination, obviating the need for interaction with the world when learning higher-level compositional programs. To accomplish the third stage of learning, we extend the AlphaNPI algorithm, which applies AlphaZero to learn recursive neural programmer-interpreters. We empirically show that AlphaNPI-X can effectively learn to tackle challenging sparse manipulation tasks, such as stacking multiple blocks, where powerful model-free baselines fail.


The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry

arXiv.org Machine Learning

The Fisher information matrix (FIM) is fundamental for understanding the trainability of deep neural networks (DNN) since it describes the local metric of the parameter space. We investigate the spectral distribution of the FIM given a single input by focusing on fully-connected networks achieving dynamical isometry. Then, while dynamical isometry is known to keep specific backpropagated signals independent of the depth, we find that the parameter space's local metric depends on the depth. In particular, we obtain an exact expression of the spectrum of the FIM given a single input and reveal that it concentrates around the depth point. Here, considering random initialization and the wide limit, we construct an algebraic methodology to examine the spectrum based on free probability theory, which is the algebraic wrapper of random matrix theory. As a byproduct, we provide the solvable spectral distribution in the two-hidden-layer case. Lastly, we empirically confirm that the spectrum of FIM with small batch-size has the same property as the single-input version. An experimental result shows that FIM's dependence on the depth determines the appropriate size of the learning rate for convergence at the initial phase of the online training of DNNs.


A Review on Computational Intelligence Techniques in Cloud and Edge Computing

arXiv.org Artificial Intelligence

Cloud computing (CC) is a centralized computing paradigm that accumulates resources centrally and provides these resources to users through Internet. Although CC holds a large number of resources, it may not be acceptable by real-time mobile applications, as it is usually far away from users geographically. On the other hand, edge computing (EC), which distributes resources to the network edge, enjoys increasing popularity in the applications with low-latency and high-reliability requirements. EC provides resources in a decentralized manner, which can respond to users' requirements faster than the normal CC, but with limited computing capacities. As both CC and EC are resource-sensitive, several big issues arise, such as how to conduct job scheduling, resource allocation, and task offloading, which significantly influence the performance of the whole system. To tackle these issues, many optimization problems have been formulated. These optimization problems usually have complex properties, such as non-convexity and NP-hardness, which may not be addressed by the traditional convex optimization-based solutions. Computational intelligence (CI), consisting of a set of nature-inspired computational approaches, recently exhibits great potential in addressing these optimization problems in CC and EC. This paper provides an overview of research problems in CC and EC and recent progresses in addressing them with the help of CI techniques. Informative discussions and future research trends are also presented, with the aim of offering insights to the readers and motivating new research directions.


Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning

arXiv.org Artificial Intelligence

The goal of neural-symbolic computation is to integrate the connectionist and symbolist paradigms. Prior methods learn the neural-symbolic models using reinforcement learning (RL) approaches, which ignore the error propagation in the symbolic reasoning module and thus converge slowly with sparse rewards. In this paper, we address these issues and close the loop of neural-symbolic learning by (1) introducing the \textbf{grammar} model as a \textit{symbolic prior} to bridge neural perception and symbolic reasoning, and (2) proposing a novel \textbf{back-search} algorithm which mimics the top-down human-like learning procedure to propagate the error through the symbolic reasoning module efficiently. We further interpret the proposed learning framework as maximum likelihood estimation using Markov chain Monte Carlo sampling and the back-search algorithm as a Metropolis-Hastings sampler. The experiments are conducted on two weakly-supervised neural-symbolic tasks: (1) handwritten formula recognition on the newly introduced HWF dataset; (2) visual question answering on the CLEVR dataset. The results show that our approach significantly outperforms the RL methods in terms of performance, converging speed, and data efficiency. Our code and data are released at \url{https://liqing-ustc.github.io/NGS}.


Artificial Intelligence Technologies Driving Four Key Technology Market Applications

#artificialintelligence

Artificial intelligence (AI) is disrupting technologies, markets and applications everywhere. But what does it really mean? Professors Andreas Kaplan and Michael Haenlein defined AI as "a system's ability to correctly interpret external data, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation." AI's ascendancy has been made possible by exponential advances in computing power and the proliferation of devices capturing quantum of data. During this pandemic, we have a new appreciation for the international exchange of information about the spread of COVID-19, and the artificial intelligence technologies that are powering models about the spread and how to control it are critical to human survival.


GPT-3 Creative Fiction

#artificialintelligence

What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: โ€ฆ" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.


Fully Bayesian Analysis of the Relevance Vector Machine Classification for Imbalanced Data

arXiv.org Machine Learning

Relevance Vector Machine (RVM) is a supervised learning algorithm extended from Support Vector Machine (SVM) based on the Bayesian sparsity model. Compared with the regression problem, RVM classification is difficult to be conducted because there is no closed-form solution for the weight parameter posterior. Original RVM classification algorithm used Newton's method in optimization to obtain the mode of weight parameter posterior then approximated it by a Gaussian distribution in Laplace's method. It would work but just applied the frequency methods in a Bayesian framework. This paper proposes a Generic Bayesian approach for the RVM classification. We conjecture that our algorithm achieves convergent estimates of the quantities of interest compared with the nonconvergent estimates of the original RVM classification algorithm. Furthermore, a Fully Bayesian approach with the hierarchical hyperprior structure for RVM classification is proposed, which improves the classification performance, especially in the imbalanced data problem. By the numeric studies, our proposed algorithms obtain high classification accuracy rates. The Fully Bayesian hierarchical hyperprior method outperforms the Generic one for the imbalanced data classification.