Model-Based Reasoning
COMET: An Application of Model-Based Reasoning to Accounting Systems
An important problem faced by auditors is gauging how much reliance can be placed on the accounting systems that process millions of transactions to produce the numbers summarized in a company's financial statements. Accounting systems contain internal controls, procedures designed to detect and correct errors and irregularities that can occur in the processing of transactions. In a complex accounting system, it can be an extremely difficult task for the auditor to anticipate the possible errors that can occur and evaluate the effectiveness of the controls at detecting them. An accurate analysis must take into account the unique features of each company's business processes. To cope with this complexity and variability, the COMET system applies a model-based reasoning approach to the analysis of accounting systems and their controls.
Using Mechanism Design to Prevent False-Name Manipulations
The basic notion of false-name-proofness allows for useful mechanisms under certain circumstances, but in general there are impossibility results that show that false-name-proof mechanisms have severe limitations. One may react to these impossibility results by saying that, since false-name-proof mechanisms are unsatisfactory, we should not run any important mechanisms in highly anonymous settings--unless, perhaps, we can find some methodology that directly prevents false-name manipulation even in such settings, so that we are back in a more typical mechanism design context. However, it seems unlikely that the phenomenon of false-name manipulation will disappear anytime soon. Because the Internet is so attractive as a platform for running certain types of mechanisms, it seems unlikely that the organizations running these mechanisms will take them offline. Moreover, because a goal of these organizations is often to get as many users to participate as possible, they will be reluctant to use high-overhead solutions that discourage users from participating. As a result, perhaps the most promising approaches at this point are those that combine techniques from mechanism design with other techniques discussed in this article.
The Scheduling Job-Set Optimization Problem: A Model-Based Diagnosis Approach
Rodler, Patrick, Teppan, Erich
A common issue for companies is that the volume of product orders may at times exceed the production capacity. We formally introduce two novel problems dealing with the question of which orders to discard or postpone in order to meet certain (timeliness) goals, and try to approach them by means of model-based diagnosis. In thorough analyses, we identify many similarities of the introduced problems to diagnosis problems, but also reveal crucial idiosyncrasies and outline ways to handle or leverage them. Finally, a proof-of-concept evaluation on industrial-scale problem instances from a well-known scheduling benchmark suite demonstrates that one of the two formalized problems can be well attacked by out-of-the-box model-based diagnosis tools.
A round-up of topology-based papers at ICML 2020
With this year's International Conference on Machine Learning (ICML) being over, it is time to have another instalment of this series. Similar to last year's post, I shall cover several papers that caught my attention because of their use of topological concepts--however, unlike last year, I shall not restrict the selection to papers using topological data analysis (TDA). Caveat lector: I might have missed some promising papers. Any suggestions for additions are more than welcome! Please reach out to me via Twitter or e-mail.
Learning Compact Physics-Aware Delayed Photocurrent Models Using Dynamic Mode Decomposition
Hanson, Joshua, Bochev, Pavel, Paskaleva, Biliana
Radiation-induced photocurrent in semiconductor devices can be simulated using complex physics-based models, which are accurate but computationally expensive. This presents a challenge for implementing device characteristics in high-level circuit simulations, where it is computationally infeasible to evaluate detailed models for multiple individual circuit elements. In this work we demonstrate a procedure for learning compact delayed photocurrent models that are efficient enough to implement in large-scale circuit simulations, but remain faithful to the underlying physics. Our approach utilizes Dynamic Mode Decomposition (DMD), a system identification technique, based on singular value decomposition, for learning reduced-order discrete-time dynamical systems from time series data. To obtain physics-aware device models, we simulate the excess carrier density induced by radiation pulses by numerically solving the Ambipolar Diffusion Equation, then use the simulated internal state as training data for the DMD algorithm. Our results show that the significantly reduced-order delayed photocurrent models obtained via this method accurately approximate the dynamics of the internal excess carrier density -- which can be used to calculate the induced current at the device boundaries -- while remaining compact enough to incorporate into larger circuit simulations.
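As a rough illustration of the DMD step described in the abstract above, the following is a minimal sketch of standard SVD-based DMD in NumPy. It is not the authors' implementation: the function names `fit_dmd` and `rollout`, the chosen rank, and the synthetic two-mode snapshot data are all assumptions made for this example, and the toy data stand in for simulated carrier-density snapshots.

```python
import numpy as np

def fit_dmd(snapshots, rank):
    """Fit a reduced linear model z_{k+1} = A_r z_k from snapshot columns.

    snapshots: array of shape (n_states, n_times), one state per column.
    Returns the truncated left singular vectors U and the reduced operator A_r.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]        # time-shifted pairs
    U, s, Vh = np.linalg.svd(X, full_matrices=False)  # X = U diag(s) Vh
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]    # keep leading modes
    # A_r = U^H Y V S^{-1}; dividing by s scales the columns by 1/s_j.
    A_r = (U.conj().T @ Y @ Vh.conj().T) / s
    return U, A_r

def rollout(U, A_r, x0, steps):
    """Advance the reduced state and lift each step back to full dimension."""
    z = U.conj().T @ x0
    states = [U @ z]
    for _ in range(steps):
        z = A_r @ z
        states.append(U @ z)
    return np.stack(states, axis=1)

# Hypothetical toy data: two decaying modes embedded in a 10-dimensional state.
rng = np.random.default_rng(0)
modes = rng.standard_normal((10, 2))
rates = np.array([0.9, 0.5])
snapshots = modes @ (rates[:, None] ** np.arange(20))

U, A_r = fit_dmd(snapshots, rank=2)
eigs = np.sort(np.linalg.eigvals(A_r).real)   # estimated discrete-time rates
pred = rollout(U, A_r, snapshots[:, 0], steps=19)
```

The reduced operator here is 2 x 2 regardless of the full state dimension, which is the sense in which such a model can stay compact enough for a larger simulation.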
[R] Artificial Intelligence is stupid and causal reasoning won't fix it
If an ML system uses gender information in credit scoring, then gender information is probably relevant for credit scoring. We all know that women, for example, are more risk averse than men on average and that there are more men with very low IQs; and more men take part in dangerous activities that can maim them. All those things contribute to credit risk. I looked at some actuarial motorcycle accident data from a Swedish insurance company a couple of years ago, and the accident rate of young men (18-25 maybe) was something like 40 times higher than that of women in the same age interval. Of course, EU law requires us to offer the same rate to men and women, so we have to ignore this; and thus the women pay more than they should if things were fair.
A physics-based method that can predict imminent large solar flares
The sudden release of magnetic energy on the Sun drives powerful solar flares, which are difficult to predict. Kusano et al. derived physics-based thresholds for the onset of large solar flares and showed how they can be predicted from routine solar observations (see the Perspective by Veronig). They tested their method using observations of the Sun from 2008 to 2019. In most cases, the method correctly identifies which regions will produce large flares within the next 20 hours, although there are some false positives and false negatives. The method also provides the exact location where each flare will begin and limits on how powerful it will be. Accurate predictions of solar flares could improve forecasts of space weather conditions around Earth. Science, this issue p. 587 (doi:10.1126/science.aaz2511); see also p. 504 (doi:10.1126/science.abb6150). Solar flares are highly energetic events in the Sun’s corona that affect Earth’s space weather. The mechanism that drives the onset of solar flares is unknown, hampering efforts to forecast them, which mostly rely on empirical methods. We present the κ-scheme, a physics-based model to predict large solar flares through a critical condition of magnetohydrodynamic instability, triggered by magnetic reconnection. Analysis of the largest (X-class) flares from 2008 to 2019 (during solar cycle 24) shows that the κ-scheme predicts most imminent large solar flares, with a small number of exceptions for confined flares. We conclude that magnetic twist flux density, close to a magnetic polarity inversion line on the solar surface, determines when and where solar flares may occur and how large they can be.
GINNs: Graph-Informed Neural Networks for Multiscale Physics
Hall, Eric J., Taverniers, Søren, Katsoulakis, Markos A., Tartakovsky, Daniel M.
Typically this requires casting the original deterministic physics-based model into a probabilistic framework where inputs or control variables (CVs) are treated as random variables with probability distributions derived from available experimental data, manufacturing constraints, design criteria, expert judgment, and/or other domain knowledge (e.g., see [1]). Running the physics-based model with CVs sampled according to these distributions yields corresponding realizations of the system response as characterized by quantities of interest (QoIs). Analysis of the uncertainty propagation from the CVs to the QoIs informs decision-making, e.g., it informs engineering decisions aimed at improving the quality and reliability of designed products and helps identify potential risks at early stages in the design and manufacturing process. Quantitatively assessing uncertainty propagation presents a fundamental challenge due to the computational cost of the underlying physics-based model. Even for a low number of CVs and QoIs, uncertainty quantification (UQ) for, e.g., accelerating the simulation-aided design of multiscale systems and data-centric engineering tasks more generally ([2]), requires a large number of repeated observations of QoIs to achieve a high degree of confidence in such an analysis. The sampling cost is further exacerbated in real-world applications where distributions on QoIs are typically non-Gaussian, skewed, and/or mutually correlated, and therefore need to be characterized by their full probability density function (PDF) rather than through summary statistics such as mean and variance. The computational cost of nonparametric methods to estimate these densities can become prohibitively high when using a fully-featured physics-based model to compute each sample. 
One approach to alleviate the computational burden is to derive a cheaper-to-compute surrogate for the physics-based model's response, enabling much faster generation of output data and thus overcoming computational bottlenecks.
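The workflow described above (sample the CVs, evaluate the model, estimate the full PDF of a QoI, and replace the costly model with a surrogate) can be sketched in a few lines. This is only an illustration under stated assumptions: `expensive_model` is a hypothetical stand-in for a physics-based model, the surrogate is a plain least-squares polynomial rather than the graph-informed neural network of the paper, and the density estimate is a hand-written Gaussian kernel density estimator with Silverman's rule-of-thumb bandwidth.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_model(cv):
    """Hypothetical stand-in for a costly physics-based model (CV -> QoI)."""
    return np.exp(0.5 * cv) + 0.1 * cv**2

# A small number of "expensive" evaluations at assumed design points.
cv_train = np.linspace(-4.0, 4.0, 30)
qoi_train = expensive_model(cv_train)

# Cheap surrogate: degree-5 polynomial fitted by least squares.
surrogate = np.poly1d(np.polyfit(cv_train, qoi_train, deg=5))

# Propagate many CV samples through the surrogate instead of the model.
cv_samples = rng.normal(0.0, 1.0, size=5000)
qoi_samples = surrogate(cv_samples)

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth for a Gaussian kernel."""
    return 1.06 * samples.std() * samples.size ** (-1 / 5)

def gaussian_kde_pdf(samples, grid, h):
    """Nonparametric density estimate of the samples, evaluated on grid."""
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (samples.size * h * np.sqrt(2 * np.pi))

h = silverman_bandwidth(qoi_samples)
# Pad the grid by a few bandwidths so the kernel tails are not cut off.
grid = np.linspace(qoi_samples.min() - 5 * h, qoi_samples.max() + 5 * h, 400)
pdf = gaussian_kde_pdf(qoi_samples, grid, h)
```

Because the toy QoI is skewed, its estimated density is poorly summarized by mean and variance alone, which mirrors the motivation given above for estimating the full PDF.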
Realistic Physics Based Character Controller
Over the course of the last several years there has been strong interest in applying modern optimal control techniques to the field of character animation. This interest was fueled by the introduction of efficient learning-based algorithms for policy optimization, growth in computation power, and game engine improvements. It was shown that it is possible to generate natural-looking control of a character by using two ingredients. First, the simulated agent must adhere to a motion capture dataset. Second, the character aims to track the control input from the user. The paper aims at closing the gap between researchers and users by introducing an open source implementation of physics-based character control in the Unity framework that has a low entry barrier and a steep learning curve.
Scientific Machine Learning Paves Way for Rapid Rocket Engine Design - Liwaiwai
"It's not rocket science" may be a tired cliché, but that doesn't mean designing rockets is any less complicated. Time, cost and safety prohibit testing the stability of a test rocket using a physical build "trial and error" approach. But even computational simulations are extremely time consuming. A single analysis of an entire SpaceX Merlin rocket engine, for example, could take weeks, even months, for a supercomputer to provide satisfactory predictions. One group of researchers at The University of Texas at Austin is developing new "scientific machine learning" methods to address this challenge.