Bayesian Inference
A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective
Zhao, Yu, Du, Huaming, Li, Qing, Zhuang, Fuzhen, Liu, Ji, Kou, Gang
Enterprise financial risk analysis aims at predicting the future financial risk of enterprises. Due to its wide and significant application, enterprise financial risk analysis has always been a core research topic in the fields of Finance and Management. Based on advanced computer science and artificial intelligence technologies, enterprise risk analysis research is developing rapidly and making significant progress. Therefore, it is both necessary and challenging to comprehensively review the relevant studies. Although there are already some valuable and impressive surveys on enterprise risk analysis from the perspective of Finance and Management, these surveys introduce approaches in a relatively isolated way and lack recent advances in enterprise financial risk analysis. In contrast, this paper attempts to provide a systematic literature survey of enterprise risk analysis approaches from a Big Data perspective, reviewing more than 250 representative articles published over almost 50 years (from 1968 to 2023). To the best of our knowledge, this is the first and only survey work on enterprise financial risk from a Big Data perspective. Specifically, this survey connects and systematizes the existing enterprise financial risk studies, summarizing and interpreting the problems, methods, and spotlights in a comprehensive way. In particular, we first introduce the issues of enterprise financial risks in terms of their types, granularity, intelligence, and evaluation metrics, and summarize the corresponding representative works. Then, we compare the analysis methods used to learn enterprise financial risk, and finally summarize the spotlights of the most representative works. Our goal is to clarify current cutting-edge research and its possible future directions to model enterprise risk, aiming to fully understand the mechanisms of enterprise risk generation and contagion.
Materials Informatics: An Algorithmic Design Rule
We have investigated open questions about organic semiconductors through a materials informatics approach. By applying diverse neural network topologies, logical axioms, and inference methods from information science, we have developed data-driven procedures for discovering novel organic semiconductors for the semiconductor industry and for extracting knowledge for the materials science community. We have reviewed and compared various algorithms for neural network design topologies on materials informatics datasets; Figure 1 shows a generalized neural network topology. We have used four chemical compound space databases for model training and validation in this research notebook. The first is QM9, a general collection of quantum chemistry structures and properties that provides computed geometric, energetic, electronic, and thermodynamic properties for 134 thousand stable small organic molecules made up of C, H, O, N, and F, intended for the design of new drugs and materials.
Variational Nonlinear Kalman Filtering with Unknown Process Noise Covariance
Lan, Hua, Hu, Jinjie, Wang, Zengfu, Cheng, Qiang
Motivated by maneuvering target tracking with sensors such as radar and sonar, this paper considers the joint and recursive estimation of the dynamic state and the time-varying process noise covariance in nonlinear state space models. Due to the nonlinearity of the models and the non-conjugate prior, the state estimation problem is generally intractable: it involves integrals of general nonlinear functions and an unknown process noise covariance, so the posterior probability distribution functions lack closed-form solutions. This paper presents a recursive solution for joint nonlinear state estimation and model parameter identification based on the approximate Bayesian inference principle. Stochastic search variational inference is adopted to offer a flexible, accurate, and effective approximation of the posterior distributions. We make two contributions compared to existing variational inference-based noise adaptive filtering methods. First, we introduce an auxiliary latent variable to decouple the latent variables of dynamic state and process noise covariance, thereby improving the flexibility of the posterior inference. Second, we split the variational lower bound optimization into conjugate and non-conjugate parts: the conjugate terms, which admit a closed-form solution, are optimized directly, and the non-conjugate terms are optimized by natural gradients, achieving a trade-off between inference speed and accuracy. The performance of the proposed method is verified on radar target tracking applications with both simulated and real-world data.
Mixtures of Gaussian process experts based on kernel stick-breaking processes
Mixtures of Gaussian process experts are a class of models that can simultaneously address two of the key limitations inherent in standard Gaussian processes: scalability and predictive performance. In particular, models that use Dirichlet processes as gating functions permit straightforward interpretation and automatic selection of the number of experts in a mixture. While the existing models are intuitive and capable of capturing non-stationarity, multi-modality and heteroskedasticity, the simplicity of their gating functions may limit the predictive performance when applied to complex data-generating processes. Capitalising on recent advances in the dependent Dirichlet processes literature, we propose a new mixture model of Gaussian process experts based on kernel stick-breaking processes. Our model maintains the intuitive appeal of the existing models yet improves on their performance. To make it practical, we design a sampler for posterior computation based on slice sampling. The model behaviour and improved predictive performance are demonstrated in experiments using six datasets.
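As a rough illustration of the stick-breaking idea behind such gating functions, the sketch below draws truncated stick-breaking weights and an input-dependent, kernel-modulated variant. It is not the authors' model or sampler: the Gaussian kernel, the truncation level, the scalar input, and all names (`stick_breaking_weights`, `kernel_stick_breaking_weights`, `alpha`, `bandwidth`) are assumptions made purely for illustration.

```python
import math
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Truncated stick-breaking weights with breaks V_k ~ Beta(1, alpha)."""
    weights, remaining = [], 1.0
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)   # take a fraction v of what is left
        remaining *= 1.0 - v            # the rest of the stick carries over
    return weights


def kernel_stick_breaking_weights(x, breaks, locations, bandwidth):
    """Input-dependent weights: each break V_k is scaled by a kernel K(x, loc_k).

    The Gaussian kernel is an assumption; it takes values in (0, 1], so the
    scaled break V_k * K(x, loc_k) stays a valid fraction of the stick.
    """
    weights, remaining = [], 1.0
    for v, loc in zip(breaks, locations):
        k = math.exp(-((x - loc) ** 2) / (2.0 * bandwidth ** 2))
        weights.append(v * k * remaining)
        remaining *= 1.0 - v * k
    return weights


rng = random.Random(0)
breaks = [rng.betavariate(1.0, 2.0) for _ in range(10)]
locations = [rng.uniform(-3.0, 3.0) for _ in range(10)]
# The same breaks give different mixture weights at different inputs.
w_left = kernel_stick_breaking_weights(-2.0, breaks, locations, bandwidth=1.0)
w_right = kernel_stick_breaking_weights(2.0, breaks, locations, bandwidth=1.0)
```

Because the weights depend on the input `x`, components located near `x` receive more mass there, which is the sense in which a kernel-based gate can be more flexible than input-independent stick-breaking weights.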
Sparsifying Bayesian neural networks with latent binary variables and normalizing flows
Skaaret-Lund, Lars, Storvik, Geir, Hubin, Aliaksandr
Artificial neural networks (ANNs) are powerful machine learning methods used in many modern applications such as facial recognition, machine translation, and cancer diagnostics. A common issue with ANNs is that they usually have millions or billions of trainable parameters, and therefore tend to overfit to the training data. This is especially problematic in applications where it is important to have reliable uncertainty estimates. Bayesian neural networks (BNN) can improve on this, since they incorporate parameter uncertainty. In addition, latent binary Bayesian neural networks (LBBNN) also take into account structural uncertainty by allowing the weights to be turned on or off, enabling inference in the joint space of weights and structures. In this paper, we will consider two extensions to the LBBNN method: Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm. More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian. Experimental results show that this improves predictive power compared to the LBBNN method, while also obtaining more sparse networks. We perform two simulation studies. In the first study, we consider variable selection in a logistic regression setting, where the more flexible variational distribution leads to improved results. In the second study, we compare predictive uncertainty based on data generated from two-dimensional Gaussian distributions. Here, we argue that our Bayesian methods lead to more realistic estimates of predictive uncertainty.
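The local reparametrization trick mentioned above can be sketched for a single layer with independent Gaussian weights. This is a minimal illustration only: it omits the latent binary inclusion variables and the normalizing flows of the paper, and the function name, layer sizes, and parameter values are assumptions.

```python
import math
import random


def sample_preactivations(x, mu, sigma, rng):
    """Sample pre-activations directly instead of sampling every weight.

    With independent weights w[i][j] ~ N(mu[i][j], sigma[i][j]**2), the
    pre-activation b[j] = sum_i x[i] * w[i][j] is Gaussian with
    mean sum_i x[i] * mu[i][j] and variance sum_i x[i]**2 * sigma[i][j]**2.
    """
    num_in, num_out = len(x), len(mu[0])
    samples = []
    for j in range(num_out):
        mean = sum(x[i] * mu[i][j] for i in range(num_in))
        var = sum(x[i] ** 2 * sigma[i][j] ** 2 for i in range(num_in))
        samples.append(rng.gauss(mean, math.sqrt(var)))
    return samples


rng = random.Random(0)
x = [1.0, -2.0, 0.5]                             # one input with 3 features
mu = [[0.1, 0.0], [0.3, -0.2], [0.0, 0.5]]       # weight means, shape 3 x 2
sigma = [[0.2, 0.1], [0.1, 0.3], [0.4, 0.2]]     # weight std devs, shape 3 x 2
hidden = sample_preactivations(x, mu, sigma, rng)
```

Only one Gaussian draw per hidden unit is needed, rather than one per weight, which is where the computational saving comes from.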
Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration
Jiang, Chentian, Ke, Nan Rosemary, van Hasselt, Hado
To generalize across tasks, an agent should acquire knowledge from past tasks that facilitates adaptation and exploration in future tasks. We focus on the problem of in-context adaptation and exploration, where an agent relies only on context, i.e., the history of states, actions and/or rewards, rather than gradient-based updates. Posterior sampling (an extension of Thompson sampling) is a promising approach, but it requires Bayesian inference and dynamic programming, which often involve unknowns (e.g., a prior) and costly computations. To address these difficulties, we use a transformer to learn an inference process from training tasks and consider a hypothesis space of partial models, represented as small Markov decision processes that are cheap for dynamic programming. In our version of the Symbolic Alchemy benchmark, our method's adaptation speed and exploration-exploitation balance approach those of an exact posterior sampling oracle. We also show that even though partial models exclude relevant information from the environment, they can nevertheless lead to good policies.
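For readers unfamiliar with posterior sampling, a minimal Thompson sampling sketch on a Bernoulli bandit follows. It only illustrates the loop of sampling a hypothesis from the posterior and then acting greedily under it; it does not involve the transformer, the partial MDPs, or the Symbolic Alchemy benchmark. The Beta(1, 1) prior, the arm probabilities, and the function name are assumptions.

```python
import random


def thompson_sampling(true_probs, num_steps, seed=0):
    """Thompson sampling on a Bernoulli bandit with Beta(1, 1) priors."""
    rng = random.Random(seed)
    num_arms = len(true_probs)
    successes = [0] * num_arms
    failures = [0] * num_arms
    pulls = [0] * num_arms
    for _ in range(num_steps):
        # Sample one plausible mean reward per arm from its Beta posterior.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(num_arms)
        ]
        # Act greedily with respect to the sampled hypothesis.
        arm = max(range(num_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate posterior update for the pulled arm.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], num_steps=2000)
```

Here the posterior update is a cheap conjugate count update; in richer task families both the inference and the planning step become expensive, which is the difficulty the abstract describes.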
Machine Learning Benchmarks for the Classification of Equivalent Circuit Models from Electrochemical Impedance Spectra
Schaeffer, Joachim, Gasper, Paul, Garcia-Tamayo, Esteban, Gasper, Raymond, Adachi, Masaki, Gaviria-Cardona, Juan Pablo, Montoya-Bedoya, Simon, Bhutani, Anoushka, Schiek, Andrew, Goodall, Rhys, Findeisen, Rolf, Braatz, Richard D., Engelke, Simon
Analysis of Electrochemical Impedance Spectroscopy (EIS) data for electrochemical systems often consists of defining an Equivalent Circuit Model (ECM) using expert knowledge and then optimizing the model parameters to deconvolute various resistance, capacitive, inductive, or diffusion responses. For small data sets, this procedure can be conducted manually; however, it is not feasible to manually define a proper ECM for extensive data sets with a wide range of EIS responses. Automatic identification of an ECM would substantially accelerate the analysis of large sets of EIS data. We showcase machine learning methods to classify the ECMs of 9,300 impedance spectra provided by QuantumScape for the BatteryDEV hackathon. The best-performing approach is a gradient-boosted tree model utilizing a library to automatically generate features, followed by a random forest model using the raw spectral data. A convolutional neural network using boolean images of Nyquist representations is presented as an alternative, although it achieves a lower accuracy. We publish the data and open source the associated code. The approaches described in this article can serve as benchmarks for further studies. A key remaining challenge is the identifiability of the labels, underlined by the model performances and the comparison of misclassified spectra.
Bayesian Reinforcement Learning with Limited Cognitive Load
Arumugam, Dilip, Ho, Mark K., Goodman, Noah D., Van Roy, Benjamin
Cognitive science aims to identify the principles and mechanisms that underlie adaptive behavior. An important part of this endeavor is the development of unifying, normative theories that specify "design principles" that guide or constrain how intelligent systems respond to their environment [Marr, 1982, Anderson, 1990, Lewis et al., 2014, Griffiths et al., 2015, Gershman et al., 2015]. For example, accounts of learning, cognition, and decision-making often posit a function that an organism is optimizing--e.g., maximizing long-term reward or minimizing prediction error--and test plausible algorithms that achieve this--e.g., a particular learning rule or inference process. Historically, normative theories in cognitive science have been developed in tandem with new formal approaches in computer science and statistics. This partnership has been fruitful even given differences in scientific goals (e.g., engineering artificial intelligence versus reverse-engineering biological intelligence). Normative theories play a key role in facilitating cross-talk between different disciplines by providing a shared set of mathematical, analytical, and conceptual tools for describing computational problems and how to solve them [Ho and Griffiths, 2022]. This paper is written in the spirit of such cross-disciplinary fertilization. Here, we review recent work in computer science [Arumugam and Van Roy, 2021a, 2022] that develops a novel approach for unifying three distinct mathematical frameworks that will be familiar to many cognitive scientists (Figure 1).
Tuning Traditional Language Processing Approaches for Pashto Text Classification
Baktash, Jawid Ahmad, Dawodi, Mursal, Joya, Mohammad Zarif, Hassanzada, Nematullah
Text classification has become a critical task for many purposes. Consequently, several studies have developed automatic text classification for national and international languages. However, automatic text categorization systems for local languages are still needed. The main aim of this study is to establish a Pashto automatic text classification system. Because no public datasets of Pashto text documents were available, we built a Pashto corpus, which is a collection of Pashto documents. In addition, this study compares several models, covering both statistical and neural network machine learning techniques, including Multilayer Perceptron (MLP), Support Vector Machine (SVM), K Nearest Neighbor (KNN), decision tree, Gaussian naïve Bayes, multinomial naïve Bayes, random forest, and logistic regression, to discover the most effective approach. Moreover, this investigation evaluates two feature extraction methods: unigram and Term Frequency Inverse Document Frequency (TFIDF). The study obtained an average testing accuracy of 94% using the MLP classification algorithm with the TFIDF feature extraction method.
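To make the TFIDF feature extraction concrete, here is a small pure-Python sketch on a toy English corpus. It is not the study's pipeline or its Pashto corpus: the whitespace tokenizer, the unsmoothed weighting tf = count / document length and idf = log(N / df), and the example documents are all assumptions (several TFIDF variants exist).

```python
import math
from collections import Counter


def tfidf(documents):
    """Unigram TF-IDF with tf = count / doc length and idf = log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    num_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(num_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["sports news today", "economy news today", "sports match result"]
vectors = tfidf(docs)
```

Terms that occur in few documents (such as "economy") receive larger weights than terms shared across documents (such as "news"), and the resulting sparse vectors are the kind of input a classifier such as an MLP would consume.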
Multiresolution kernel matrix algebra
Harbrecht, H., Multerer, M., Schenk, O., Schwab, Ch.
We propose a sparse algebra for samplet compressed kernel matrices, to enable efficient scattered data analysis. We show that the compression of kernel matrices by means of samplets produces optimally sparse matrices in a certain S-format. It can be performed in cost and memory that scale essentially linearly with the matrix size $N$, for kernels of finite differentiability, along with addition and multiplication of S-formatted matrices. We prove and exploit the fact that the inverse of a kernel matrix (if it exists) is compressible in the S-format as well. Selected inversion allows the entries in the corresponding sparsity pattern to be computed directly. The S-formatted matrix operations enable the efficient, approximate computation of more complicated matrix functions such as ${\bm A}^\alpha$ or $\exp({\bm A})$. The matrix algebra is justified mathematically by pseudo differential calculus. As an application, efficient Gaussian process learning algorithms for spatial statistics are considered. Numerical results are presented to illustrate and quantify our findings.