Accuracy
Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data
Zhu, Shixiang, Bukharin, Alexander, Xie, Liyan, Yang, Shihao, Keskinocak, Pinar, Xie, Yao
Recently, the Centers for Disease Control and Prevention (CDC) has worked with other federal agencies to identify counties with increasing coronavirus disease 2019 (COVID-19) incidence (hotspots) and has offered support to local health departments to limit the spread of the disease. Understanding the spatio-temporal dynamics of hotspot events is of great importance to support policy decisions and prevent large-scale outbreaks. This paper presents a spatio-temporal Bayesian framework for early detection of COVID-19 hotspots (at the county level) in the United States. We assume both the observed number of cases and hotspots depend on a class of latent random variables, which encode the underlying spatio-temporal dynamics of the transmission of COVID-19. Such latent variables follow a zero-mean Gaussian process, whose covariance is specified by a non-stationary kernel function. The most salient feature of our kernel function is that deep neural networks are introduced to enhance the model's representative power while still enjoying the interpretability of the kernel. We derive a sparse model and fit the model using a variational learning strategy to circumvent the computational intractability for large data sets. Our model demonstrates better interpretability and superior hotspot-detection performance compared to other baseline methods.
Rawlsian Fair Adaptation of Deep Learning Classifiers
Shah, Kulin, Gupta, Pooja, Deshpande, Amit, Bhattacharyya, Chiranjib
Group-fairness in classification aims for equality of a predictive utility across different sensitive sub-populations, e.g., race or gender. Equality or near-equality constraints in group-fairness often worsen not only the aggregate utility but also the utility for the least advantaged sub-population. In this paper, we apply the principles of Pareto-efficiency and least-difference to the utility being accuracy, as an illustrative example, and arrive at the Rawls classifier that minimizes the error rate on the worst-off sensitive sub-population. Our mathematical characterization shows that the Rawls classifier uniformly applies a threshold to an ideal score of features, in the spirit of fair equality of opportunity. In practice, such a score or a feature representation is often computed by a black-box model that has been useful but unfair. Our second contribution is practical Rawlsian fair adaptation of any given black-box deep learning model, without changing the score or feature representation it computes. Given any score function or feature representation and only its second-order statistics on the sensitive sub-populations, we seek a threshold classifier on the given score or a linear threshold classifier on the given feature representation that achieves the Rawls error rate restricted to this hypothesis class. Our technical contribution is to formulate the above problems using ambiguous chance constraints, and to provide efficient algorithms for Rawlsian fair adaptation, along with provable upper bounds on the Rawls error rate. Our empirical results show significant improvement over state-of-the-art group-fair algorithms, even without retraining for fairness.
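As an illustration of the Rawls objective described above (a single threshold on a given score, chosen to minimize the error rate of the worst-off group), here is a minimal Python sketch. It is a brute-force search over empirical data, not the paper's algorithm based on ambiguous chance constraints and second-order statistics; the function names and the toy data layout are assumptions made for this example.

```python
def group_error(scores, labels, threshold):
    """Error rate of the rule 'predict 1 iff score >= threshold' on one group."""
    wrong = sum((s >= threshold) != bool(y) for s, y in zip(scores, labels))
    return wrong / len(scores)


def rawls_threshold(groups):
    """Pick the threshold that minimizes the worst group's empirical error.

    groups: dict mapping a group name to a (scores, labels) pair.
    Returns (threshold, worst_group_error). Hypothetical helper, not from the paper.
    """
    # Only thresholds at observed scores (plus one above the maximum) can
    # change the predictions, so it is enough to try those.
    candidates = sorted({s for scores, _ in groups.values() for s in scores})
    candidates.append(candidates[-1] + 1.0)  # threshold that predicts 0 everywhere
    best = None
    for t in candidates:
        worst = max(group_error(s, y, t) for s, y in groups.values())
        if best is None or worst < best[1]:
            best = (t, worst)
    return best


toy_groups = {
    "a": ([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1]),
    "b": ([0.2, 0.3, 0.5, 0.7], [0, 1, 1, 1]),
}
threshold, worst_error = rawls_threshold(toy_groups)
```

The same threshold is applied to every group, which mirrors the abstract's statement that the Rawls classifier thresholds the score uniformly.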
Fast, Accurate and Interpretable Time Series Classification Through Randomization
Cabello, Nestor, Naghizade, Elham, Qi, Jianzhong, Kulik, Lars
Time series classification (TSC) aims to predict the class label of a given time series, which is critical to a rich set of application areas such as economics and medicine. State-of-the-art TSC methods have mostly focused on classification accuracy and efficiency, without considering the interpretability of their classifications, which is an important property required by modern applications such as appliance modeling and legislation such as the European General Data Protection Regulation. To address this gap, we propose a novel TSC method - the Randomized-Supervised Time Series Forest (r-STSF). r-STSF is highly efficient, achieves state-of-the-art classification accuracy and enables interpretability. r-STSF takes an efficient interval-based approach to classify time series according to aggregate values of discriminatory sub-series (intervals). To achieve state-of-the-art accuracy, r-STSF builds an ensemble of randomized trees using the discriminatory sub-series. It uses four time series representations, nine aggregation functions and a supervised binary-inspired search combined with a feature ranking metric to identify highly discriminatory sub-series. The discriminatory sub-series enable interpretable classifications. Experiments on extensive datasets show that r-STSF achieves state-of-the-art accuracy while being orders of magnitude faster than most existing TSC methods. It is the only classifier from the state-of-the-art group that enables interpretability. Our findings also highlight that r-STSF is the best TSC method when classifying complex time series datasets.
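To make the interval-based idea concrete, the following Python sketch computes a few aggregate values over one sub-series. The abstract does not list r-STSF's nine aggregation functions or its four representations, so the four statistics below and the function name are assumptions chosen only for illustration.

```python
import statistics


def interval_features(series, start, end):
    """Aggregate values of the sub-series series[start:end].

    Illustrative only: the aggregates here are not claimed to match
    the ones r-STSF uses.
    """
    window = series[start:end]
    if not window:
        raise ValueError("interval must contain at least one point")
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),  # population standard deviation
        "min": min(window),
        "max": max(window),
    }


features = interval_features([1, 2, 3, 4, 5, 6], 1, 4)  # sub-series [2, 3, 4]
```

Because each feature is tied to a named interval and a named aggregate, a tree split on it can be read back as a statement about a specific region of the series.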
Calibrating sufficiently
Binary classification, in the first place, deals with decision tools (classifiers) that facilitate the prediction of the classes of instances on the basis of the so-called features of the instances. Accordingly, the simplest classifiers are crisp (or discrete) in the sense of having the set {0, 1} as output range: 1 for 'predict positive class', 0 for 'predict negative class'. Scoring (or soft) classifiers provide output in a continuous range, usually with the interpretation that high values indicate high likelihood of the instance belonging to the positive class, while low values suggest that membership of the negative class is more likely. In many applications of classification, there is a need for 'calibrated' probabilistic classifiers which reflect the likelihood of the positive class given the features of an instance in a frequentist statistical sense (Platt, 2000; Zadrozny and Elkan, 2002; Cohen and Goldszmidt, 2004; Kull et al., 2017). How to best achieve good calibration and how to measure it are active research areas (Böken, 2021; Roelofs et al., 2020).
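One common way to measure calibration in the frequentist sense described above is a binned expected calibration error: group predictions by predicted probability and compare each bin's mean prediction with its observed positive rate. The Python sketch below is one such measure under the assumption of equal-width bins; it is not taken from the text, and the function name is hypothetical.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned calibration error for predicted positive-class probabilities.

    probs: predicted probabilities in [0, 1]; labels: 0/1 outcomes.
    Equal-width bins are an assumption of this sketch.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's gap by the share of instances it holds.
        ece += len(members) / total * abs(mean_prob - positive_rate)
    return ece


perfect = expected_calibration_error([0.25] * 4, [0, 0, 0, 1], n_bins=4)
underconfident = expected_calibration_error([0.75] * 4, [1, 1, 1, 1], n_bins=4)
```

In the first call one in four instances scored 0.25 is positive, so the gap is zero; in the second, every instance scored 0.75 is positive, so the gap is 0.25.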
Improving Entropic Out-of-Distribution Detection using Isometric Distances and the Minimum Distance Score
Macêdo, David, Ludermir, Teresa
Current out-of-distribution detection approaches usually present special requirements (e.g., collecting outlier data and hyperparameter validation) and produce side effects (classification accuracy drop and slow/inefficient inferences). Recently, entropic out-of-distribution detection has been proposed as a seamless approach (i.e., a solution that avoids all the previously mentioned drawbacks). The entropic out-of-distribution detection solution comprises the IsoMax loss for training and the entropic score for out-of-distribution detection. The IsoMax loss works as a SoftMax loss drop-in replacement because swapping the SoftMax loss with the IsoMax loss requires no changes in the model's architecture or training procedures/hyperparameters. In this paper, we propose to perform what we call an isometrization of the distances used in the IsoMax loss. Additionally, we propose to replace the entropic score with the minimum distance score. Our experiments showed that these simple modifications increase out-of-distribution detection performance while keeping the solution seamless.
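The idea of a minimum distance score can be sketched as follows: score an input by the distance from its feature embedding to the nearest class prototype, and treat large values as evidence of an out-of-distribution input. This Python sketch assumes plain Euclidean distance and fixed prototypes; it does not attempt the isometrization step, and the names are hypothetical rather than taken from the authors' code.

```python
import math


def minimum_distance_score(embedding, prototypes):
    """Distance from an embedding to its nearest class prototype.

    Assumption of this sketch: Euclidean distance, with larger scores
    read as 'more likely out-of-distribution'.
    """
    return min(math.dist(embedding, p) for p in prototypes)


class_prototypes = [(0.0, 0.0), (3.0, 4.0)]
near = minimum_distance_score((0.0, 0.0), class_prototypes)
far = minimum_distance_score((6.0, 8.0), class_prototypes)
```

A detector would then compare the score with a threshold chosen on in-distribution data.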
The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
Aarrestad, T., van Beekveld, M., Bona, M., Boveia, A., Caron, S., Davies, J., De Simone, A., Doglioni, C., Duarte, J. M., Farbin, A., Gupta, H., Hendriks, L., Heinrich, L., Howarth, J., Jawahar, P., Jueid, A., Lastow, J., Leinweber, A., Mamuzic, J., Merényi, E., Morandini, A., Moskvitina, P., Nellist, C., Ngadiuba, J., Ostdiek, B., Pierini, M., Ravina, B., de Austri, R. Ruiz, Sekmen, S., Touranakou, M., Vaškevičiūte, M., Vilalta, R., Vlimant, J. R., Verheyen, R., White, M., Wulff, E., Wallin, E., Wozniak, K. A., Zhang, Z.
We describe the outcome of a data challenge conducted as part of the Dark Machines Initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims at detecting signals of new physics at the LHC using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to $10~\rm{fb}^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Robust Regularization with Adversarial Labelling of Perturbed Samples
Guo, Xiaohui, Zhang, Richong, Zheng, Yaowei, Mao, Yongyi
Recent research has suggested that the predictive accuracy of a neural network may be at odds with its adversarial robustness. This presents challenges in designing effective regularization schemes that also provide strong adversarial robustness. Revisiting Vicinal Risk Minimization (VRM) as a unifying regularization principle, we propose Adversarial Labelling of Perturbed Samples (ALPS) as a regularization scheme that aims at improving the generalization ability and adversarial robustness of the trained model. ALPS trains neural networks with synthetic samples formed by perturbing each authentic input sample towards another one along with an adversarially assigned label. The ALPS regularization objective is formulated as a min-max problem, in which the outer problem is minimizing an upper bound of the VRM loss, and the inner problem is L$_1$-ball constrained adversarial labelling of the perturbed samples. The analytic solution to the induced inner maximization problem is derived in closed form, which enables computational efficiency. Experiments on the SVHN, CIFAR-10, CIFAR-100 and Tiny-ImageNet datasets show that ALPS achieves state-of-the-art regularization performance while also serving as an effective adversarial training scheme.
Detecting Adversarial Examples with Bayesian Neural Network
Li, Yao, Tang, Tongyi, Hsieh, Cho-Jui, Lee, Thomas C. M.
In this paper, we propose a new framework for detecting adversarial examples, motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate the output distribution of a deep neural network. Building on these observations, we propose a novel Bayesian adversarial example detector, BATer for short, to improve the performance of adversarial example detection. Specifically, we study the distributional difference of hidden layer outputs between natural and adversarial examples, and propose to use the randomness of a Bayesian neural network (BNN) to simulate the hidden layer output distribution and leverage the distribution dispersion to detect adversarial examples. The advantage of a BNN is that its output is stochastic, whereas neural networks without random components do not have this characteristic. Empirical results on several benchmark datasets against popular attacks show that the proposed BATer outperforms the state-of-the-art detectors in adversarial example detection.
California County Hopes Artificial Intelligence Can Mitigate Wildfire Risk
At this time of year, periodic rain showers on the north coast of California give way to months of daily sunshine and a wildfire risk that grows in severity until the next fall rains arrive. In Sonoma County, a new set of eyes is watching over the forest. Those eyes will be able to tap into an artificial intelligence program to make sure emergency dispatchers are alerted to actual fires instead of mist rising off the forest floor or steam from the region's numerous natural geysers. The county has entered into a $300,000 contract with South Korean technology firm Alchera to provide artificial intelligence software that can alert fire dispatchers to the precise location of flames or smoke. The two-year pilot project is funded through $3 million in hazard mitigation grants that the Federal Emergency Management Agency awarded to the county.
Characterizing the SLOPE Trade-off: A Variational Perspective and the Donoho-Tanner Limit
Bu, Zhiqi, Klusowski, Jason, Rush, Cynthia, Su, Weijie J.
Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this paper, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I error and power. Assuming a regime of linear sparsity and working under Gaussian random designs, we obtain an upper bound on the optimal trade-off for SLOPE, showing its capability of breaking the Donoho-Tanner power limit. To put it into perspective, this limit is the highest possible power that the Lasso, which is perhaps the most popular l1-based method, can achieve even with arbitrarily strong effect sizes. Next, we derive a tight lower bound that delineates the fundamental limit of sorted l1 regularization in optimally trading the FDP off for the TPP. Finally, we show that on any problem instance, SLOPE with a certain regularization sequence outperforms the Lasso, in the sense of having a smaller FDP, larger TPP and smaller l2 estimation risk simultaneously. Our proofs are based on a novel technique that reduces a variational calculus problem to a class of infinite-dimensional convex optimization problems and a very recent result from approximate message passing theory.
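For readers unfamiliar with the quantities named above, the following Python sketch computes the sorted l1 penalty and the FDP and TPP of a fitted coefficient vector, using their standard definitions. It does not solve the SLOPE optimization problem or reproduce any result of the paper; the function names and toy vectors are assumptions for illustration.

```python
def sorted_l1_norm(beta, lambdas):
    """Sorted l1 penalty: largest |beta| is paired with the largest lambda."""
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    weights = sorted(lambdas, reverse=True)
    return sum(w * m for w, m in zip(weights, magnitudes))


def fdp_tpp(beta_hat, beta_true):
    """False discovery proportion and true positive proportion of a selection.

    A variable counts as selected when its estimate is nonzero and as a
    true signal when its true coefficient is nonzero.
    """
    selected = [b != 0 for b in beta_hat]
    signal = [b != 0 for b in beta_true]
    true_discoveries = sum(s and t for s, t in zip(selected, signal))
    false_discoveries = sum(s and not t for s, t in zip(selected, signal))
    fdp = false_discoveries / max(sum(selected), 1)
    tpp = true_discoveries / max(sum(signal), 1)
    return fdp, tpp


penalty = sorted_l1_norm([1.0, -3.0, 2.0], [3.0, 2.0, 1.0])
fdp, tpp = fdp_tpp([0.0, 1.5, 0.0, -2.0, 0.3], [0.0, 2.0, 1.0, -2.0, 0.0])
```

When all entries of `lambdas` are equal, the penalty reduces to a scaled ordinary l1 norm, which is the Lasso case that the abstract compares against.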