-- Compute and memory constraints have historically prevented traffic simulation software users from fully utilizing the predictive models underlying them. When calibrating car-following models, particularly, accommodations have included 1) using sensitivity analysis to limit the number of parameters to be calibrated, and 2) identifying only one set of parameter values using data collected from multiple car-following instances across multiple drivers. Shortcuts are further motivated by insufficient data set sizes, for which a driver may have too few instances to fully account for the variation in their driving behavior . In this paper, we demonstrate that recent technological advances can enable transportation researchers and engineers to overcome these constraints and produce calibration results that 1) outperform industry standard approaches, and 2) allow for a unique set of parameters to be estimated for each driver in a data set, even given a small amount of data. We propose a novel calibration procedure for car-following models based on Bayesian machine learning and probabilistic programming, and apply it to real-world data from a naturalistic driving study. We also discuss how this combination of mathematical and software tools can offer additional benefits such as more informative model validation and the incorporation of true-to-data uncertainty into simulation traces. Traffic simulation software packages are widely used in transportation engineering to estimate the impacts of potential changes to a roadway network and forecast system performance under future scenarios. These packages are underpinned by math-and physics-based models, which are designed to describe behavior at an aggregate (macroscopic) level or at the level of individual drivers (microscopic).
We introduce a Bayesian solution for the problem in forensic speaker recognition, where there may be very little background material for estimating score calibration parameters. We work within the Bayesian paradigm of evidence reporting and develop a principled probabilistic treatment of the problem, which results in a Bayesian likelihood-ratio as the vehicle for reporting weight of evidence. We show in contrast, that reporting a likelihood-ratio distribution does not solve this problem. Our solution is experimentally exercised on a simulated forensic scenario, using NIST SRE'12 scores, which demonstrates a clear advantage for the proposed method compared to the traditional plugin calibration recipe.
Deep Neural Networks (DNNs) have achieved state-of-the-art accuracy performance in many tasks. However, recent works have pointed out that the outputs provided by these models are not well-calibrated, seriously limiting their use in critical decision scenarios. In this work, we propose to use a decoupled Bayesian stage, implemented with a Bayesian Neural Network (BNN), to map the uncalibrated probabilities provided by a DNN to calibrated ones, consistently improving calibration. Our results evidence that incorporating uncertainty provides more reliable probabilistic models, a critical condition for achieving good calibration. We report a generous collection of experimental results using high-accuracy DNNs in standardized image classification benchmarks, showing the good performance, flexibility and robust behavior of our approach with respect to several state-of-the-art calibration methods. Code for reproducibility is provided.
The inference of parameters of expensive computer models from data is a classical problem in Statistics (Sacks et al., 1989). Such a problem is often referred to as calibration (Kennedy and O'Hagan, 2001), and the results of calibration are often of interest to draw conclusions on parameters that may have a direct interpretation of real physical quantities. There are many fundamental difficulties in calibrating expensive computer models, which we can distinguish between computational and statistical. Computational issues arise from the fact that traditional optimization and inference techniques require running the expensive computer model many times for different values of the parameters, which might be unfeasible within a given computational budget. Statistical limitations, instead, arise from the fact that computer models are abstractions of real processes, which might be inaccurate.
Score calibration enables automatic speaker recognizers to make cost-effective accept / reject decisions. Traditional calibration requires supervised data, which is an expensive resource. We propose a 2-component GMM for unsupervised calibration and demonstrate good performance relative to a supervised baseline on NIST SRE'10 and SRE'12. A Bayesian analysis demonstrates that the uncertainty associated with the unsupervised calibration parameter estimates is surprisingly small.