Less Is More -- On the Importance of Sparsification for Transformers and Graph Neural Networks for TSP
Lischka, Attila, Wu, Jiaming, Basso, Rafael, Chehreghani, Morteza Haghir, Kulcsár, Balázs
Most recent studies tackling routing problems like the Traveling Salesman Problem (TSP) with machine learning use a transformer or Graph Neural Network (GNN) based encoder architecture. However, many of them apply these encoders naively, allowing them to aggregate information over entire TSP instances. In contrast, we propose a data preprocessing method that lets the encoders focus only on the most relevant parts of a TSP instance. In particular, we propose graph sparsification for TSP graph representations passed to GNNs, and attention masking for TSP instances passed to transformers, where the masks correspond to the adjacency matrices of the sparse TSP graph representations. Furthermore, we propose ensembles of different sparsification levels, which allow the models to focus on the most promising parts of an instance while still permitting information flow between all of its nodes. Our experimental studies show that, for GNNs, appropriate sparsification and ensembles of different sparsification levels lead to substantial performance increases of the overall architecture. We also design a new, state-of-the-art transformer encoder with ensembles of attention masking. These transformers improve model performance, reducing the optimality gap from $0.16\%$ to $0.10\%$ for TSP instances of size 100 and from $0.02\%$ to $0.00\%$ for TSP instances of size 50.
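As a rough illustration of the masking idea, the sketch below builds a k-nearest-neighbour graph from 2D city coordinates and uses its adjacency matrix as an attention mask; a dense mask stands in for the unsparsified member of an ensemble. The function names (knn_adjacency, masked_attention), the choice of k-nearest-neighbour sparsification, and the single-head attention are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: graph sparsification of a TSP instance and the matching attention mask.
import torch

def knn_adjacency(coords: torch.Tensor, k: int) -> torch.Tensor:
    """Boolean adjacency matrix of the k-nearest-neighbour graph (symmetrised)."""
    n = coords.shape[0]
    dists = torch.cdist(coords, coords)                 # (n, n) pairwise Euclidean distances
    knn_idx = dists.topk(k + 1, largest=False).indices  # +1 because each node is its own nearest neighbour
    adj = torch.zeros(n, n, dtype=torch.bool)
    adj[torch.arange(n).unsqueeze(1), knn_idx] = True
    return adj | adj.T                                  # keep an edge if either endpoint selects it

def masked_attention(q, k_mat, v, adj):
    """Single-head dot-product attention restricted to edges of the sparse graph."""
    scores = q @ k_mat.T / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~adj, float("-inf"))    # non-neighbours receive zero attention weight
    return torch.softmax(scores, dim=-1) @ v

# Toy usage: a random 20-city instance with two sparsification levels for an "ensemble".
coords = torch.rand(20, 2)
q = k_mat = v = torch.randn(20, 16)
out_sparse = masked_attention(q, k_mat, v, knn_adjacency(coords, k=5))
out_dense = masked_attention(q, k_mat, v, knn_adjacency(coords, k=19))  # effectively unmasked
```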
Critical Zones for Comfortable Collision Avoidance with a Leading Vehicle
Kovaceva, Jordanka, Murgovski, Nikolce, Kulcsár, Balázs, Wymeersch, Henk, Bärgman, Jonas
This paper provides a general framework for efficiently obtaining the appropriate intervention time for collision avoidance systems to just avoid a rear-end crash. The proposed framework incorporates a driver comfort model and a vehicle model. We show that there is a relationship between driver steering manoeuvres, described in terms of acceleration and jerk, and the corresponding steering angle and steering angle rate profiles. We investigate how four different vehicle models influence the time at which steering needs to be initiated to avoid a rear-end collision. The models assessed are a dynamic bicycle model (DM), a steady-state cornering model (SSCM), a kinematic model (KM) and a point mass model (PMM). We show that all models can be described by a parameter-varying linear system. We provide three algorithms that exploit this linear-system description to compute the steering intervention time efficiently for all four vehicle models. Two of the algorithms use backward reachability simulation and one uses forward simulation. Results show that the SSCM, KM and PMM do not accurately estimate the intervention time for a certain set of vehicle conditions. Due to its fast computation time, the DM with a backward reachability algorithm can be used for rapid offline safety benefit assessment, while the DM with a forward simulation algorithm is better suited for online, real-time use.
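The forward-simulation idea can be sketched for the point mass model (PMM): search for the latest steering onset at which a comfort-bounded, jerk-limited lateral manoeuvre still produces the required lateral offset before the gap to the lead vehicle closes. The comfort limits, geometry, and function names below are illustrative assumptions, not the paper's parameters or algorithms.

```python
# Minimal sketch: latest comfortable steering onset for a point-mass model via forward simulation.

def lateral_offset(t, a_lat_max, jerk_max):
    """Lateral displacement of a point mass under a jerk-limited, comfort-bounded steering input."""
    t_ramp = a_lat_max / jerk_max                 # time to reach the comfort acceleration limit
    if t <= t_ramp:
        return jerk_max * t ** 3 / 6.0            # pure jerk phase
    y_ramp = jerk_max * t_ramp ** 3 / 6.0         # displacement accumulated during the jerk phase
    v_ramp = jerk_max * t_ramp ** 2 / 2.0         # lateral speed at the end of the jerk phase
    dt = t - t_ramp
    return y_ramp + v_ramp * dt + 0.5 * a_lat_max * dt ** 2

def latest_intervention_time(gap, closing_speed, offset_needed,
                             a_lat_max=3.0, jerk_max=2.0, dt=0.01):
    """Latest steering onset (seconds from now) that still achieves the required lateral offset."""
    t_collision = gap / closing_speed             # constant closing speed assumed
    t_start = t_collision
    while t_start > 0.0:
        t_available = t_collision - t_start       # manoeuvre time left if steering starts at t_start
        if lateral_offset(t_available, a_lat_max, jerk_max) >= offset_needed:
            return t_start
        t_start -= dt
    return 0.0                                    # steering must start immediately

# Toy usage: 30 m gap, closing at 10 m/s, 1.8 m of lateral clearance required.
print(f"latest steering onset: {latest_intervention_time(30.0, 10.0, 1.8):.2f} s from now")
```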
Controlled Descent Training
Andersson, Viktor, Varga, Balázs, Szolnoky, Vincent, Syrén, Andreas, Jörnsten, Rebecka, Kulcsár, Balázs
In this work, a novel, model-based artificial neural network (ANN) training method supported by optimal control theory is developed. The method augments training labels in order to robustly guarantee training loss convergence and to improve the training convergence rate. Dynamic label augmentation is proposed within the framework of gradient descent training, where the convergence of the training loss is controlled. First, we capture the training behavior with the help of empirical Neural Tangent Kernels (NTK) and borrow tools from systems and control theory to analyze both the local and global training dynamics (e.g., stability, reachability). Second, we propose to dynamically alter the gradient descent training mechanism via fictitious labels as control inputs and an optimal state feedback policy. In this way, we enforce locally $\mathcal{H}_2$-optimal and convergent training behavior. The novel algorithm, \textit{Controlled Descent Training} (CDT), guarantees local convergence. CDT opens up new potential in the analysis, interpretation, and design of ANN architectures. The applicability of the method is demonstrated on standard regression and classification problems.
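A minimal sketch of the linear-systems view: under a linearised (NTK) model trained by gradient descent on a squared loss, the residual $r = f(x) - y$ evolves as $r_{t+1} = (I - \eta K) r_t + \eta K u_t$, where $K$ is the empirical NTK and $u_t$ is a fictitious label correction acting as a control input. The proportional feedback below is only a placeholder for the paper's $\mathcal{H}_2$-optimal state feedback, and the network, step size, and gain are illustrative assumptions.

```python
# Minimal sketch: empirical NTK of a tiny network and label-augmented residual dynamics.
import torch

torch.manual_seed(0)
X, y = torch.randn(8, 3), torch.randn(8)
net = torch.nn.Sequential(torch.nn.Linear(3, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

def jacobian_rows(net, X):
    """Per-sample gradients of the scalar network output w.r.t. all parameters (rows of J)."""
    rows = []
    for xi in X:
        net.zero_grad()
        net(xi).sum().backward()
        rows.append(torch.cat([p.grad.flatten() for p in net.parameters()]))
    return torch.stack(rows)

J = jacobian_rows(net, X)
K = J @ J.T                                      # empirical NTK, shape (n, n)
eta = 0.01
A, B = torch.eye(len(X)) - eta * K, eta * K      # residual dynamics: r+ = A r + B u

r = (net(X).squeeze(-1) - y).detach()
F = 0.5 * torch.eye(len(X))                      # illustrative proportional feedback, u = -F r
for _ in range(50):
    u = -F @ r                                   # fictitious label correction acting as control input
    r = A @ r + B @ u                            # closed-loop linearised training dynamics
print("final residual norm:", r.norm().item())
```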
Short-term traffic prediction using physics-aware neural networks
Pereira, Mike, Lang, Annika, Kulcsár, Balázs
In this work, we propose an algorithm that performs short-term predictions of the flux of vehicles on a stretch of road, using past measurements of the flux. The algorithm is based on a physics-aware recurrent neural network. A discretization of a macroscopic traffic flow model (using the so-called Traffic Reaction Model) is embedded in the architecture of the network and yields flux predictions based on estimated and predicted space-time dependent traffic parameters. These parameters are themselves obtained using a succession of LSTM and simple recurrent neural networks. In addition to the predictions, the algorithm yields a smoothing of its inputs, which is also physically constrained by the macroscopic traffic flow model. The algorithm is tested on raw flux measurements obtained from loop detectors.
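A hedged sketch of the physics-aware recurrence: an LSTM cell estimates space-time dependent speed parameters that feed a conservation-law density update with a mass-action-style interface flux in the spirit of the Traffic Reaction Model. The exact flux form, boundary handling, layer sizes, and class name (PhysicsAwareCell) are assumptions, not the paper's discretization.

```python
# Minimal sketch: a recurrent cell with an embedded conservation-law traffic update.
import torch

class PhysicsAwareCell(torch.nn.Module):
    """One prediction step: LSTM-estimated parameters feed a conservation-law density update."""
    def __init__(self, n_cells, rho_max=1.0, dt_over_dx=0.5):
        super().__init__()
        self.rho_max, self.dt_over_dx = rho_max, dt_over_dx
        self.lstm = torch.nn.LSTMCell(n_cells, 32)        # estimates traffic parameters from the current state
        self.to_speed = torch.nn.Linear(32, n_cells + 1)  # one speed parameter per cell interface

    def forward(self, rho, state=None):
        h, c = self.lstm(rho, state)
        v = torch.sigmoid(self.to_speed(h))               # bounded, space-time dependent speeds
        rho_pad = torch.nn.functional.pad(rho, (1, 1))    # zero-density boundaries (an assumption)
        # mass-action-style interface flux: outflow of a cell limited by free space downstream
        flux = v * rho_pad[:, :-1] * (1.0 - rho_pad[:, 1:] / self.rho_max)
        rho_next = rho + self.dt_over_dx * (flux[:, :-1] - flux[:, 1:])
        return rho_next.clamp(0.0, self.rho_max), flux, (h, c)

# Toy usage: roll the cell forward for five steps on a 10-cell road stretch.
cell = PhysicsAwareCell(n_cells=10)
rho, state = torch.rand(1, 10) * 0.5, None
for _ in range(5):
    rho, flux, state = cell(rho, state)
```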