Banerjee, Subho S.
BayesPerf: Minimizing Performance Monitoring Errors Using Bayesian Statistics
Banerjee, Subho S., Jha, Saurabh, Kalbarczyk, Zbigniew T., Iyer, Ravishankar K.
Hardware performance counters (HPCs) that measure low-level architectural and microarchitectural events provide dynamic contextual information about the state of the system. However, HPC measurements are error-prone due to nondeterminism (e.g., undercounting due to event multiplexing, or OS interrupt-handling behaviors). In this paper, we present BayesPerf, a system for quantifying uncertainty in HPC measurements by using a domain-driven Bayesian model that captures microarchitectural relationships between HPCs to jointly infer their values as probability distributions. We provide the design and implementation of an accelerator that allows for low-latency and low-power inference of the BayesPerf model for x86 and ppc64 CPUs. BayesPerf reduces the average error in HPC measurements from 40.1% to 7.6% when events are being multiplexed. The value of BayesPerf in real-time decision-making is illustrated with a simple example of scheduling of PCIe transfers.
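The idea of treating a multiplexed counter reading as a distribution, and tightening it with a known relationship between events, can be illustrated with a toy sketch. This is not the BayesPerf model: the function names, the Gaussian noise assumption, the "misses = accesses - hits" relationship, and all numbers below are hypothetical and chosen only for illustration.

```python
def scale_multiplexed(raw_count, time_enabled, time_running):
    """Linearly extrapolate a counter that was scheduled only part of the time.

    This is the usual enabled/running correction for multiplexed events; it
    assumes the event rate was uniform, which is one source of error.
    """
    if time_running <= 0:
        raise ValueError("counter was never scheduled")
    return raw_count * time_enabled / time_running


def fuse_gaussians(mean_a, var_a, mean_b, var_b):
    """Combine two independent Gaussian estimates of the same quantity.

    Returns the posterior mean and variance (precision-weighted average).
    """
    precision = 1.0 / var_a + 1.0 / var_b
    mean = (mean_a / var_a + mean_b / var_b) / precision
    return mean, 1.0 / precision


# Hypothetical readings: the miss counter ran for 25% of the interval.
misses_direct = scale_multiplexed(raw_count=2_500, time_enabled=1.0, time_running=0.25)

# Hypothetical relationship between events: misses = accesses - hits.
accesses, hits = 100_000.0, 91_000.0
misses_derived = accesses - hits

# Assumed standard deviations: 2000 for the extrapolated reading,
# 1000 for the value derived from the other two counters.
mean, var = fuse_gaussians(misses_direct, 2000.0**2, misses_derived, 1000.0**2)
print(f"posterior misses: mean={mean:.0f}, std={var ** 0.5:.0f}")
```

The fused estimate has a smaller variance than either input, which is the intuition behind inferring related counters jointly rather than correcting each one in isolation.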
ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection
Jha, Saurabh, Banerjee, Subho S., Tsai, Timothy, Hari, Siva K. S., Sullivan, Michael B., Kalbarczyk, Zbigniew T., Keckler, Stephen W., Iyer, Ravishankar K.
... intelligence (AI) and machine learning (ML) to integrate mechanical, electronic, and computing technologies to make real-time driving decisions. AI enables AVs to navigate through complex environments while maintaining a safety envelope [1], [2] that is continuously measured and quantified by onboard sensors (e.g., camera, LiDAR, RADAR) [3]-[5]. Clearly, the safety and resilience of AVs are of significant concern, as exemplified by several headline-making AV crashes [6], [7], as well as prior work characterizing AV resilience during road tests [8]. Hence there is a compelling need for a comprehensive assessment of AV technology.
Items (a), (b), and (c) are integrated into a Bayesian network (BN). BNs provide a favorable formalism in which to model the propagation of faults across AV system components with an interpretable model. The model, together with fault injection results, can be used to design and assess the safety of AVs. Further, BNs enable rapid probabilistic inference, which allows DriveFI to quickly find safety-critical faults. The Bayesian FI framework can be extended to other safety-critical systems (e.g., surgical robots). The framework requires specification of the safety constraints and the system software architecture to model causal relationship between ...