Goto

Collaborating Authors

 Böttcher, Lucas


Organizational Selection of Innovation

arXiv.org Artificial Intelligence

Budgetary constraints force organizations to pursue only a subset of possible innovation projects. Identifying which subset is most promising is an error-prone exercise, and involving multiple decision makers may be prudent. This raises the question of how to most effectively aggregate their collective nous. Our model of organizational portfolio selection provides some first answers. We show that portfolio performance can vary widely. Delegating evaluation makes sense when organizations employ the relevant experts and can assign projects to them. In most other settings, aggregating the impressions of multiple agents leads to better performance than delegation. In particular, letting agents rank projects often outperforms alternative aggregation rules -- including averaging agents' project scores as well as counting their approval votes -- especially when organizations have tight budgets and can select only a few project alternatives out of many.
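
The aggregation rules compared here are easy to make concrete. Below is a minimal sketch, not the paper's model: it assumes synthetic noisy scores, a Borda-style rule as the ranking mechanism, and illustrative sizes (n_projects, n_agents, budget, and k are all hypothetical choices).

    import numpy as np

    rng = np.random.default_rng(0)
    n_projects, n_agents, budget = 20, 5, 3

    # Each agent observes the true project values with idiosyncratic noise.
    true_value = rng.normal(size=n_projects)
    scores = true_value[None, :] + rng.normal(scale=1.0, size=(n_agents, n_projects))

    # Rule 1: average the agents' raw scores.
    avg_rank = np.argsort(-scores.mean(axis=0))

    # Rule 2: approval voting -- each agent approves its top-k projects.
    k = 5
    approvals = np.zeros(n_projects)
    for s in scores:
        approvals[np.argsort(-s)[:k]] += 1
    approval_rank = np.argsort(-approvals)

    # Rule 3: Borda-style rank aggregation -- sum each project's per-agent rank.
    ranks = np.argsort(np.argsort(-scores, axis=1), axis=1)
    borda_rank = np.argsort(ranks.sum(axis=0))

    for name, order in [("averaging", avg_rank), ("approval", approval_rank),
                        ("ranking", borda_rank)]:
        portfolio = order[:budget]
        print(name, "selects total value", round(true_value[portfolio].sum(), 2))

Sweeping such simulations over noise levels and budgets is one way to reproduce the kind of qualitative comparison described above.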


Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications

arXiv.org Artificial Intelligence

The fields of neuroscience and artificial neural networks (ANNs) have mutually influenced each other, drawing from and contributing to many concepts initially developed in statistical mechanics. Notably, Hopfield networks and Boltzmann machines are versions of the Ising model, a model extensively studied in statistical mechanics for over a century. In the first part of this chapter, we provide an overview of the principles, models, and applications of ANNs, highlighting their connections to statistical mechanics and statistical learning theory. Artificial neural networks can be seen as high-dimensional mathematical functions, and understanding the geometric properties of their loss landscapes (i.e., the high-dimensional space on which one wishes to find extrema or saddles) can provide valuable insights into their optimization behavior, generalization abilities, and overall performance. Visualizing these loss functions can help us design better optimization methods and improve generalization. Thus, the second part of this chapter focuses on quantifying geometric properties and visualizing the loss functions associated with deep ANNs.
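
The Hopfield-Ising connection mentioned above can be made concrete in a few lines: patterns are stored in symmetric couplings via a Hebbian rule, and binary spins relax to minimize an Ising-type energy. A minimal NumPy sketch (network size, pattern count, and corruption level are illustrative choices):

    import numpy as np

    rng = np.random.default_rng(1)
    n, n_patterns = 100, 3
    patterns = rng.choice([-1, 1], size=(n_patterns, n))

    # Hebbian couplings J_ij = (1/n) * sum_mu xi_i^mu xi_j^mu, zero diagonal.
    J = patterns.T @ patterns / n
    np.fill_diagonal(J, 0.0)

    def energy(s):
        # Ising-type energy E(s) = -(1/2) s^T J s.
        return -0.5 * s @ J @ s

    # Start from a corrupted copy of pattern 0 and relax asynchronously.
    s = patterns[0] * rng.choice([-1, 1], size=n, p=[0.2, 0.8])
    for _ in range(5):
        for i in rng.permutation(n):
            s[i] = 1 if J[i] @ s >= 0 else -1

    print("energy:", energy(s), "overlap with pattern 0:", (s @ patterns[0]) / n)

The energy never increases under these asynchronous updates, which is why stored patterns act as attractors of the dynamics.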


Control of Medical Digital Twins with Artificial Neural Networks

arXiv.org Artificial Intelligence

The objective of personalized medicine is to tailor interventions to an individual patient's unique characteristics. A key technology for this purpose is the medical digital twin, a computational model of human biology that can be personalized and dynamically updated to incorporate patient-specific data collected over time. Certain aspects of human biology, such as the immune system, are not easily captured with physics-based models such as differential equations; instead, the corresponding models are often multi-scale, stochastic, and hybrid. This poses a challenge to existing model-based control and optimization approaches, which cannot be readily applied to such models. Recent advances in automatic differentiation and neural-network control methods hold promise for addressing complex control problems, but the application of these approaches to biomedical systems is still in its early stages. This work introduces dynamics-informed neural-network controllers as an alternative approach to the control of medical digital twins. As a first use case for this method, the focus is on agent-based models, a versatile and increasingly common modeling platform in biomedicine. The effectiveness of the proposed neural-network control method is illustrated and benchmarked against other methods with two widely used agent-based model types. The relevance of the method introduced here extends beyond medical digital twins to other complex dynamical systems.
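
To convey the idea of a dynamics-informed neural-network controller, here is a deliberately small sketch. It is not the paper's method: the "digital twin" is a toy logistic-growth model rather than an agent-based model, and a finite-difference gradient stands in for automatic differentiation; all sizes and coefficients are hypothetical.

    import numpy as np

    rng = np.random.default_rng(2)

    def simulate(theta, x0=0.9, target=0.2, T=50, dt=0.1):
        # Tiny one-hidden-layer controller u(x); parameters packed in theta.
        W1, b1, W2 = theta[:8], theta[8:16], theta[16:24]
        x, cost = x0, 0.0
        for _ in range(T):
            h = np.tanh(W1 * x + b1)
            u = max(0.0, float(W2 @ h))          # non-negative "dose"
            x += dt * (x * (1 - x) - u * x)      # logistic growth with treatment
            cost += dt * ((x - target) ** 2 + 0.01 * u ** 2)
        return cost

    # Finite-difference gradient descent (a stand-in for automatic differentiation).
    theta, eps, lr = 0.1 * rng.normal(size=24), 1e-4, 0.2
    for step in range(200):
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            e = np.zeros_like(theta); e[i] = eps
            grad[i] = (simulate(theta + e) - simulate(theta - e)) / (2 * eps)
        theta -= lr * grad

    print("final control cost:", simulate(theta))

The essential structural point survives the simplification: the controller is trained through the model dynamics, so the learned policy is informed by how the state responds to interventions.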


Visualizing high-dimensional loss landscapes with Hessian directions

arXiv.org Machine Learning

Analyzing geometric properties of high-dimensional loss functions, such as local curvature and the existence of other optima around a certain point in loss space, can help provide a better understanding of the interplay between neural network structure, implementation attributes, and learning performance. In this work, we combine concepts from high-dimensional probability and differential geometry to study how curvature properties in lower-dimensional loss representations depend on those in the original loss space. We show that saddle points in the original space are rarely correctly identified as such in expected lower-dimensional representations if random projections are used. The principal curvature in the expected lower-dimensional representation is proportional to the mean curvature in the original loss space; hence, the mean curvature in the original loss space determines whether saddle points appear, on average, as minima, maxima, or almost flat regions. We use this connection between expected curvature in random projections and mean curvature in the original space (i.e., the normalized Hessian trace) to compute Hutchinson-type trace estimates without calculating Hessian-vector products, which the original Hutchinson method requires. Because random projections are not suitable for correctly identifying saddle information, we propose studying projections along dominant Hessian directions, which are associated with the largest and smallest principal curvatures. We connect our findings to the ongoing debate on loss landscape flatness and generalizability. Finally, for several common image classifiers and a function approximator with up to about $7\times 10^6$ parameters, we show and compare random and Hessian projections of the corresponding loss landscapes.
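
The classical Hutchinson estimator referenced above rests on the identity tr(H) = E[v^T H v] for random vectors v with E[v v^T] = I. A minimal sketch, with an explicit symmetric matrix standing in for a loss Hessian (in a real network, H @ v would be a Hessian-vector product):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 500

    # A symmetric matrix with a known trace stands in for a loss Hessian.
    A = rng.normal(size=(n, n))
    H = (A + A.T) / 2

    # Hutchinson estimate: tr(H) = E[v^T H v] for Rademacher vectors v.
    n_samples = 1000
    estimates = []
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=n)
        estimates.append(v @ H @ v)   # H @ v is the Hessian-vector product step

    print("true trace:      ", round(np.trace(H), 3))
    print("Hutchinson (mean):", round(float(np.mean(estimates)), 3))

The paper's contribution is to obtain such trace estimates from curvature statistics of random projections instead of explicit Hessian-vector products; the sketch shows only the baseline estimator being replaced.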


Gradient-free training of neural ODEs for system identification and control using ensemble Kalman inversion

arXiv.org Artificial Intelligence

Ensemble Kalman inversion (EKI) is a sequential Monte Carlo method for solving inverse problems within a Bayesian framework. Unlike backpropagation, EKI is a gradient-free optimization method that requires only forward evaluations of the artificial neural network. In this study, we examine the effectiveness of EKI in training neural ordinary differential equations (neural ODEs) for system identification and control tasks. To apply EKI to optimal control problems, we formulate inverse problems that incorporate a Tikhonov-type regularization term. Our numerical results demonstrate that EKI is an efficient method for training neural ODEs in system identification and optimal control problems, with runtimes and solution quality that are competitive with commonly used gradient-based optimizers.
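
A single EKI iteration can be written with ensemble statistics alone. The sketch below uses the perturbed-observation variant on a linear forward model, which stands in for a neural ODE's forward pass; the model G, ensemble size, noise level, and step count are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)

    d, p, J = 20, 5, 100                      # data dim, parameter dim, ensemble size
    M = rng.normal(size=(d, p))

    def G(theta):
        # Forward model (illustrative): a fixed linear map, evaluated forward only.
        return M @ theta

    theta_true = rng.normal(size=p)
    Gamma = 0.01 * np.eye(d)                  # observation-noise covariance
    y = G(theta_true) + rng.multivariate_normal(np.zeros(d), Gamma)

    ensemble = rng.normal(size=(J, p))        # initial ensemble of parameter vectors
    for _ in range(20):
        outputs = np.array([G(th) for th in ensemble])            # forward passes only
        th_mean, out_mean = ensemble.mean(0), outputs.mean(0)
        C_tg = (ensemble - th_mean).T @ (outputs - out_mean) / J  # cross-covariance
        C_gg = (outputs - out_mean).T @ (outputs - out_mean) / J
        K = C_tg @ np.linalg.solve(C_gg + Gamma, np.eye(d))       # Kalman-type gain
        # Perturbed-observation update: each member moves toward its own noisy copy of y.
        for j in range(J):
            y_j = y + rng.multivariate_normal(np.zeros(d), Gamma)
            ensemble[j] = ensemble[j] + K @ (y_j - outputs[j])

    print("parameter error:", round(float(np.linalg.norm(ensemble.mean(0) - theta_true)), 4))

Because the update needs only the ensemble of forward evaluations, no derivatives of G are ever formed, which is exactly what makes the method attractive for models that are hard to backpropagate through.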


Control of Dual-Sourcing Inventory Systems using Recurrent Neural Networks

arXiv.org Artificial Intelligence

A key challenge in inventory management is to identify policies that optimally replenish inventory from multiple suppliers. To solve such optimization problems, inventory managers need to decide what quantities to order from each supplier, given the net inventory and outstanding orders, so that the expected backlogging, holding, and sourcing costs are jointly minimized. Inventory management problems have been studied extensively for over 60 years, and yet even basic dual-sourcing problems, in which orders from an expensive supplier arrive faster than orders from a regular supplier, remain intractable in their general form. In addition, there is an emerging need for proactive, scalable optimization algorithms that can adjust their recommendations to dynamic demand shifts in a timely fashion. In this work, we approach dual sourcing through a neural network-based optimization lens and incorporate information on inventory dynamics and its replenishment (i.e., control) policies into the design of recurrent neural networks. We show that the proposed neural network controllers (NNCs) are able to learn near-optimal policies for commonly used problem instances within a few minutes of CPU time on a regular personal computer. To demonstrate the versatility of NNCs, we also show that they can control inventory dynamics with empirical, non-stationary demand distributions that are challenging to tackle effectively using alternative, state-of-the-art approaches. Our work shows that high-quality solutions of complex inventory management problems with non-stationary demand can be obtained with deep neural-network optimization approaches that directly account for inventory dynamics in their optimization process. As such, our research opens up new ways of efficiently managing complex, high-dimensional inventory dynamics.
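
The cost structure of the dual-sourcing problem is easy to state in code. The sketch below simulates a simple dual-base-stock heuristic, not the proposed NNC policy; the lead times, cost parameters, Poisson demand, and base-stock levels are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(5)

    l_r, l_e = 3, 1            # regular and expedited lead times (periods)
    c_r, c_e = 0.0, 20.0       # regular cost normalized to 0; c_e is the premium
    h, b = 5.0, 95.0           # holding and backlogging costs per unit
    T = 10000

    def simulate(z_e, z_r):
        """Simulate a dual-base-stock policy (an illustrative heuristic)."""
        inv, cost = 0.0, 0.0
        pipe_r = [0.0] * l_r   # orders in transit from the regular supplier
        pipe_e = [0.0] * l_e   # orders in transit from the expedited supplier
        for _ in range(T):
            demand = rng.poisson(5.0)
            inv += pipe_r.pop(0) + pipe_e.pop(0) - demand
            pos_e = inv + sum(pipe_e)            # expedited inventory position
            pos_r = pos_e + sum(pipe_r)          # overall inventory position
            q_e = max(0.0, z_e - pos_e)          # expedite up to base-stock z_e
            q_r = max(0.0, z_r - pos_r - q_e)    # fill the rest from the regular supplier
            pipe_e.append(q_e); pipe_r.append(q_r)
            cost += h * max(inv, 0) + b * max(-inv, 0) + c_e * q_e + c_r * q_r
        return cost / T

    print("avg cost per period:", round(simulate(z_e=6, z_r=25), 2))

An NNC replaces the two fixed base-stock levels with a recurrent network that maps the net inventory and pipeline state to order quantities, which is what allows it to adapt to non-stationary demand.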


Spectrally Adapted Physics-Informed Neural Networks for Solving Unbounded Domain Problems

arXiv.org Artificial Intelligence

Solving analytically intractable partial differential equations (PDEs) that involve at least one variable defined on an unbounded domain arises in numerous physical applications. Accurately solving unbounded domain PDEs requires efficient numerical methods that can resolve the dependence of the PDE on the unbounded variable over at least several orders of magnitude. We propose a solution to such problems by combining two classes of numerical methods: (i) adaptive spectral methods and (ii) physics-informed neural networks (PINNs). The numerical approach that we develop takes advantage of the ability of physics-informed neural networks to easily implement high-order numerical schemes to efficiently solve PDEs and extrapolate numerical solutions at any point in space and time. We then show how recently introduced adaptive techniques for spectral methods can be integrated into PINN-based PDE solvers to obtain numerical solutions of unbounded domain problems that cannot be efficiently approximated by standard PINNs. Through a number of examples, we demonstrate the advantages of the proposed spectrally adapted PINNs in solving PDEs and estimating model parameters from noisy observations in unbounded domains.

Keywords: Physics-informed neural networks, PDE models, spectral methods, adaptive methods, unbounded domains
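
The core physics-informed idea of minimizing a PDE residual combines naturally with a decaying spectral basis on an unbounded domain. The sketch below is a linear least-squares stand-in, not the paper's PINN: it solves u'(x) + u(x) = 0 on [0, inf) with u(0) = 1 (exact solution exp(-x)) using basis functions exp(-beta*x) * x^k, where the scaling factor beta plays the role of the adaptive tuning parameter; beta, the basis size, and the collocation grid are illustrative choices.

    import numpy as np

    beta, K = 0.7, 8
    x = np.linspace(0.0, 10.0, 200)            # collocation points

    # Basis phi_k(x) = exp(-beta x) x^k and its derivative.
    phi = np.array([np.exp(-beta * x) * x**k for k in range(K)]).T
    dphi = np.array([np.exp(-beta * x) * (k * x**np.maximum(k - 1, 0) - beta * x**k)
                     for k in range(K)]).T

    # Stack the residual rows (u' + u = 0) with a strongly weighted boundary row u(0) = 1.
    A = np.vstack([dphi + phi, phi[:1] * 100.0])
    rhs = np.concatenate([np.zeros(len(x)), [100.0]])
    c, *_ = np.linalg.lstsq(A, rhs, rcond=None)

    u = phi @ c
    print("max error vs exp(-x):", np.abs(u - np.exp(-x)).max())

In the paper's setting, the linear expansion is replaced by a neural network and the residual is minimized by gradient descent; the sketch only isolates how a decaying, tunably scaled basis lets the residual be enforced far into the unbounded direction.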


NNC: Neural-Network Control of Dynamical Systems on Graphs

arXiv.org Machine Learning

We study the ability of neural networks to steer or control trajectories of dynamical systems on graphs. In particular, we introduce a neural-network control (NNC) framework, which represents dynamical systems by neural ordinary differential equations (neural ODEs), and find that NNC can learn control signals that drive networked dynamical systems into desired target states. To identify the influence of different target states on the NNC performance, we study two types of control: (i) microscopic control, which minimizes the L2 norm between the current and target state, and (ii) macroscopic control, which minimizes the corresponding Wasserstein distance. We find that the proposed NNC framework produces low-energy control signals that are highly correlated with those of optimal control. Our results are robust for a wide range of graph structures and (non-)linear dynamical systems.
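
For linear dynamics, the microscopic control objective above has a closed-form benchmark: with x_{k+1} = A x_k + u_k, the final state is linear in the stacked control signal U, so minimizing ||x_N - x*||^2 + lam * ||U||^2 is ridge regression, and its minimizer is the kind of low-energy optimal control that learned NNC signals are compared against. A minimal sketch on a ring graph (the graph, horizon, and lam are illustrative choices):

    import numpy as np

    n, N, lam, eps = 10, 30, 1e-3, 0.1

    # Ring-graph Laplacian and discretized diffusion dynamics x_{k+1} = A x_k + u_k.
    L = 2 * np.eye(n) - np.roll(np.eye(n), 1, axis=1) - np.roll(np.eye(n), -1, axis=1)
    A = np.eye(n) - eps * L

    x0 = np.zeros(n); x0[0] = 1.0          # initial state: all mass on one node
    x_target = np.full(n, 1.0 / n)         # target state: uniform across nodes

    # x_N = A^N x0 + M @ U, with M = [A^{N-1}, ..., A, I] and U stacking u_0..u_{N-1}.
    powers = [np.linalg.matrix_power(A, k) for k in range(N)]
    M = np.hstack(powers[::-1])

    # Minimize ||x_N - x_target||^2 + lam * ||U||^2  (ridge regression in U).
    b = x_target - powers[-1] @ A @ x0     # = x_target - A^N x0
    U = np.linalg.solve(M.T @ M + lam * np.eye(n * N), M.T @ b)

    # Roll the dynamics forward under the computed control signal.
    x = x0.copy()
    for k in range(N):
        x = A @ x + U[k * n:(k + 1) * n]
    print("final L2 error:", round(float(np.linalg.norm(x - x_target)), 6),
          "control energy:", round(float(U @ U), 4))

The regularization weight lam trades terminal accuracy against control energy, mirroring the low-energy control signals reported above; NNC extends this picture to nonlinear dynamics, where no such closed form exists.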