Researchers at ETH Zurich and the Frankfurt School have developed an artificial neural network that can solve challenging control problems. The self-learning system can be used for the optimization of supply chains and production processes as well as for smart grids or traffic control systems. Power cuts, financial network failures and supply chain disruptions are just some of the many problems typically encountered in complex systems that are very difficult or even impossible to control using existing methods. Control systems based on artificial intelligence (AI) can help to optimize complex processes--and can also be used to develop new business models. Together with Professor Lucas Böttcher from the Frankfurt School of Finance and Management, ETH researchers Nino Antulov-Fantulin and Thomas Asikis--both from the Chair of Computational Social Science--have developed a versatile AI-based control system called AI Pontryagin, which is designed to steer complex systems and networks towards desired target states.
Optimal control problems can be solved by first applying the Pontryagin maximum principle and then computing a solution of the corresponding unconstrained Hamiltonian dynamical system. In this paper, to achieve a balance between robustness and efficiency, we learn a reduced Hamiltonian of the unconstrained Hamiltonian dynamical system. This reduced Hamiltonian is learned by going backward in time and minimizing the loss function that results from applying the conditions of the Pontryagin maximum principle. The robustness of our learning process is then further improved by progressively learning a posterior distribution of reduced Hamiltonians, which leads to more efficient sampling of the generalized coordinates (position, velocity) of our phase space. Our solution framework applies not only to optimal control problems with finite-dimensional phase (state) spaces but also to the infinite-dimensional case.
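To illustrate the classical pipeline this abstract builds on (not the paper's learned-Hamiltonian method itself), the following minimal sketch solves a scalar optimal control problem via the Pontryagin maximum principle and costate shooting. The problem, horizon, and step size are illustrative assumptions.

```python
# Minimal illustration (not the paper's method): solve
#   minimize J = \int_0^1 (x^2 + u^2) dt,  dx/dt = u,  x(0) = 1
# via Pontryagin's maximum principle. Minimizing the Hamiltonian
# H = x^2 + u^2 + lam*u over u gives u* = -lam/2, and the costate obeys
# dlam/dt = -dH/dx = -2x with the transversality condition lam(1) = 0.

def shoot(lam0, T=1.0, dt=1e-3):
    """Integrate the unconstrained Hamiltonian system forward in time
    from a guessed initial costate; return the terminal costate."""
    x, lam = 1.0, lam0
    for _ in range(int(T / dt)):
        x += dt * (-lam / 2.0)   # dx/dt = u* = -lam/2
        lam += dt * (-2.0 * x)   # dlam/dt = -2x
    return lam

# Shooting: bisect on the unknown initial costate so that lam(T) = 0.
lo, hi = 0.0, 4.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if shoot(mid) > 0.0:
        hi = mid
    else:
        lo = mid
lam0 = 0.5 * (lo + hi)

print(lam0)  # analytic answer for this toy problem: 2*tanh(1) ~ 1.5232
```

The learned reduced Hamiltonian described in the abstract would replace the hand-derived right-hand sides inside `shoot`; the shooting structure around it stays the same.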
In this paper we analyze mathematically how human factors can be effectively incorporated into the analysis and control of complex systems. As an example, we focus our discussion on one of the key problems in Intelligent Transportation Systems (ITS) theory and practice: the problem of speed control, considered here as a decision-making process with limited information available. The problem is cast mathematically in the general framework of control problems and is treated in the context of dynamically changing environments where control is coupled to human-centered automation. Since in this case control may not be limited to a small number of control settings, as is often assumed in the control literature, serious difficulties arise in solving the problem. We demonstrate that the problem can be reduced to a set of Hamilton-Jacobi-Bellman equations in which human factors are incorporated via estimates of the system Hamiltonian. In the ITS context, these estimates can be obtained with on-board equipment such as sensors, receivers, actuators, and in-vehicle communication devices. The proposed methodology provides a way to integrate human factors into the solution process for models of other complex dynamic systems.
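The Hamilton-Jacobi-Bellman reduction mentioned above has a simple discrete counterpart: a backward Bellman recursion. The toy sketch below applies it to a speed-control problem; the speed grid, quadratic costs, and target speed are all illustrative assumptions, not taken from the paper.

```python
# Toy illustration (assumptions: discrete speed grid 0..4, quadratic
# tracking/effort costs, target speed v_ref = 2): a backward Bellman
# recursion, the discrete-time counterpart of a Hamilton-Jacobi-Bellman
# equation, applied to speed control.

V_REF = 2             # desired speed (illustrative)
SPEEDS = range(5)     # admissible speeds 0..4
ACTIONS = (-1, 0, 1)  # decelerate / hold / accelerate
N = 10                # planning horizon (steps)

def stage_cost(v, a):
    return (v - V_REF) ** 2 + 0.1 * a ** 2  # tracking error + control effort

V = [[0.0] * len(SPEEDS) for _ in range(N + 1)]  # V[N] = 0: terminal cost
policy = [[0] * len(SPEEDS) for _ in range(N)]

for k in range(N - 1, -1, -1):  # sweep backward in time
    for v in SPEEDS:
        best_cost, best_a = float("inf"), 0
        for a in ACTIONS:
            v_next = min(max(v + a, 0), 4)       # clipped dynamics
            c = stage_cost(v, a) + V[k + 1][v_next]
            if c < best_cost:
                best_cost, best_a = c, a
        V[k][v] = best_cost
        policy[k][v] = best_a

print(policy[0])  # accelerates below v_ref, decelerates above it
```

In the paper's setting, the human factors would enter through the Hamiltonian estimate, i.e. through `stage_cost` and the dynamics, which here are fixed by hand.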
Deep learning achieves state-of-the-art results in many areas. However, recent works have shown that deep networks can be vulnerable to adversarial perturbations that slightly change the input but lead to incorrect predictions. Adversarial training is an effective way of improving robustness to adversarial examples, and it is typically formulated as a robust optimization problem for network training. To solve it, previous works run gradient descent directly on the "adversarial loss", i.e., they replace the input data with the corresponding adversaries. A major drawback of this approach is the computational overhead of adversary generation, which is much larger than that of network updating and makes adversarial defense inconvenient. To address this issue, we fully exploit the structure of deep neural networks and propose a novel strategy to decouple the adversary update from gradient backpropagation. To achieve this goal, we follow the line of research that treats the training of deep neural networks as an optimal control problem. We formulate the robust optimization as a differential game, which allows us to derive the necessary conditions for optimality. In this way, we train the neural network by solving the conditions of Pontryagin's Maximum Principle (PMP). In the PMP formulation, the adversary is coupled only with the first-layer weights, which inspires us to split the adversary computation from the backpropagation gradient computation. As a result, our proposed YOPO (You Only Propagate Once) avoids repeatedly forward- and back-propagating the data within one iteration and restricts the core descent-direction computation to the first layer of the network, thus speeding up every iteration significantly. For adversarial example defense, our experiments show that YOPO achieves comparable defense accuracy using around 1/5 of the GPU time of the original projected gradient descent training.
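For context on the cost YOPO targets, the sketch below shows the PGD inner maximization of the "adversarial loss" on a fixed linear classifier. This is not YOPO itself; in a deep network each of these ascent steps would require a full forward and backward pass, which is exactly the repeated propagation YOPO avoids. The weights, input, and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not YOPO): the PGD inner maximization that standard
# adversarial training runs at every outer step. Here the "network" is
# a fixed linear classifier with logistic loss, so gradients are cheap;
# in a deep net each ascent step costs a full forward/backward pass.

w = np.array([1.0, -2.0, 0.5])   # fixed classifier weights (assumption)
x = np.array([0.3, 0.1, -0.2])   # one clean input example (assumption)
y = 1.0                          # its label in {-1, +1}

def loss(x_):
    return np.log1p(np.exp(-y * (w @ x_)))  # logistic loss

def grad_x(x_):
    s = 1.0 / (1.0 + np.exp(y * (w @ x_)))  # sigmoid(-y w.x)
    return -y * s * w                       # d(loss)/d(input)

eps, alpha, steps = 0.1, 0.02, 10
x_adv = x.copy()
for _ in range(steps):
    x_adv = x_adv + alpha * np.sign(grad_x(x_adv))  # gradient-ascent step
    x_adv = x + np.clip(x_adv - x, -eps, eps)       # project to L_inf ball

print(loss(x), loss(x_adv))  # the adversarial loss is never below the clean loss
```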
In this paper, we discuss online estimation strategies that model the optimal value function of a typical optimal control problem. We present a general strategy that uses local corridor solutions obtained via dynamic programming to provide locally optimal control sequences as training data for a neural architecture model of the optimal value function.
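The general idea of fitting a value-function model to dynamic-programming output can be sketched on a toy problem. Here a scalar LQR Riccati recursion stands in for the paper's corridor dynamic programming, and a least-squares quadratic fit stands in for the neural architecture; both substitutions are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the paper's setup): dynamic programming
# produces value-function samples, which then serve as training data
# for a parametric model of the optimal value function.
# Toy problem: x_{k+1} = x_k + u_k, cost sum(x^2 + u^2). Its exact DP
# solution is V_k(x) = p_k * x^2 with p_k given by a Riccati recursion.

p = 0.0
for _ in range(50):                   # backward DP (Riccati) recursion
    p = 1.0 + p - p ** 2 / (1.0 + p)  # p_k from p_{k+1} (a=b=q=r=1)
# p converges to the fixed point (1 + sqrt(5)) / 2 ~ 1.618

# "Training data": sampled states with their DP value labels.
xs = np.linspace(-2.0, 2.0, 21)
vs = p * xs ** 2

# Fit a quadratic value model by least squares (a stand-in for the
# neural value-function model used in the paper).
coeffs = np.polyfit(xs, vs, deg=2)
print(coeffs[0])  # leading coefficient recovers p
```

In the paper's setting the labels would come from local corridor solutions rather than a closed-form recursion, and the fitted model would be queried online where no DP solution has been computed.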