Supplement to "Estimating Riemannian Metric with Noise-Contaminated Intrinsic Distance"

Neural Information Processing Systems

Unlike distance metric learning, where the focus is usually on downstream tasks that use the estimated distance metric, our proposal focuses on the estimated metric itself as a characterization of the geometric structure of the data space. Despite the illustrated taxi and MNIST examples, finding more compelling applications that target the data-space geometry remains an open question. Interpreting mathematical concepts such as the Riemannian metric and geodesics in the context of potential applications (e.g., cognition and perception research, where similarity measures are common) could be inspiring. Our proposal requires sufficiently dense data, which can be demanding, especially for high-dimensional data due to the curse of dimensionality. Dimension reduction (e.g., manifold embedding, as in the MNIST example) can substantially alleviate the curse of dimensionality, making the dense-data requirement more likely to hold.
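As a rough illustration of the dimension-reduction step mentioned above, the sketch below embeds high-dimensional observations into low-dimensional manifold coordinates before any distance or metric estimation; the use of scikit-learn's Isomap and the digits dataset is our own illustrative choice, not the paper's pipeline.

```python
# Hypothetical preprocessing sketch: reduce dimensionality via a manifold
# embedding before estimating a Riemannian metric from noisy pairwise distances.
# Isomap and the digits data are illustrative choices, not the paper's setup.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap

X, _ = load_digits(return_X_y=True)            # 64-dimensional raw features
embedding = Isomap(n_neighbors=10, n_components=2)
Z = embedding.fit_transform(X)                 # 2-D manifold coordinates

# In the low-dimensional coordinates, local neighborhoods are denser,
# so noise-contaminated intrinsic distances can be collected more reliably.
print(Z.shape)
```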


Supplementary Material Cal-DETR: Calibrated Detection Transformer

Neural Information Processing Systems

Then, we present error bar plots with the mean D-ECE and standard deviation. The error for a particular detection is counted when it satisfies the false-positive criteria. We report D-ECE on these challenging out-of-domain scenarios (Figure 1) and show bar plots depicting the mean D-ECE with the respective standard deviations.
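The reporting described above reduces per-run D-ECE scores to a mean and standard deviation per method and renders them as error bars. A minimal sketch of that aggregation follows; the run-level scores and method names are placeholders, not reported results.

```python
# Minimal sketch: aggregate per-run D-ECE scores into mean +/- std error bars.
# The scores and method names below are placeholders, not reported results.
import numpy as np
import matplotlib.pyplot as plt

dece_runs = {                       # hypothetical D-ECE (%) over repeated runs
    "Baseline": [12.3, 12.9, 12.5],
    "Cal-DETR": [8.1, 8.4, 7.9],
}
names = list(dece_runs)
means = [np.mean(dece_runs[n]) for n in names]
stds = [np.std(dece_runs[n]) for n in names]

plt.bar(names, means, yerr=stds, capsize=4)
plt.ylabel("D-ECE (%)")
plt.title("Mean D-ECE with standard deviation")
plt.savefig("dece_error_bars.png")
```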




Causes and Effects of Unanticipated Numerical Deviations in Neural Network Inference Frameworks

Neural Information Processing Systems

Hardware-specific optimizations in machine learning (ML) frameworks can cause numerical deviations of inference results. Quite surprisingly, despite using a fixed trained model and fixed input data, inference results are not consistent across platforms and are sometimes not even deterministic on the same platform. We study the causes of these numerical deviations for convolutional neural networks (CNN) on realistic end-to-end inference pipelines and in isolated experiments. Results from 75 distinct platforms suggest that the main causes of deviations are differences in SIMD use on CPUs and the runtime selection of convolution algorithms on GPUs. We link the causes and propagation effects to properties of the ML model and evaluate potential mitigations. We make our research code publicly available.
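One simple way to observe such deviations, in the spirit of the setup described above, is to run a fixed trained model on fixed input data more than once (or on different platforms) and compare outputs element-wise. The sketch below uses PyTorch and a torchvision model as an illustrative stand-in for the paper's pipelines, not its actual measurement harness.

```python
# Illustrative sketch: measure numerical deviation of CNN inference outputs
# between repeated runs on the same platform. Comparing outputs saved on
# different platforms works the same way. Not the paper's harness.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()   # fixed (randomly initialized) model
x = torch.randn(1, 3, 224, 224)                # fixed input

with torch.no_grad():
    out_a = model(x)
    out_b = model(x)                           # repeated run, same platform

max_abs_dev = (out_a - out_b).abs().max().item()
print(f"max absolute deviation between runs: {max_abs_dev:.3e}")
```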


Supplementary Material

Neural Information Processing Systems

The supplementary material is organized as follows. We give details of the definitions and notation in Section B.1. Then, we provide the technical details of the lower bound (Lemma 3.3). In Section D.4 we provide insights into auto-labeling; these suggest that, in such settings, auto-labeling using active learning followed by selective classification is expected to work well. This idea is captured by Chow's excess risk. Nevertheless, it would be interesting future work to explore the connections between auto-labeling and active learning with abstention.
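To make the selective-classification idea above concrete, here is a minimal sketch of Chow's rule: predict only when the model's confidence clears a threshold tied to an assumed abstention cost, and abstain otherwise. The confidence scores and cost below are illustrative and not tied to the paper's experiments.

```python
# Minimal sketch of Chow's rule for selective classification with abstention.
# Predict when confidence max_y p(y|x) >= 1 - c, where c is the abstention
# cost; otherwise abstain. Scores below are made up for illustration.
import numpy as np

def chow_rule(probs: np.ndarray, cost: float) -> np.ndarray:
    """Return predicted labels, with -1 marking abstention."""
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    labels[confidence < 1.0 - cost] = -1      # abstain on low-confidence points
    return labels

probs = np.array([[0.95, 0.05],   # confident  -> auto-label class 0
                  [0.55, 0.45],   # uncertain  -> abstain
                  [0.20, 0.80]])  # confident  -> auto-label class 1
print(chow_rule(probs, cost=0.2))  # -> [ 0 -1  1]
```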



Supplementary Materials

Neural Information Processing Systems

Finally, the data was subsampled by a factor of 2.

Data augmentation
TX features were augmented by adding two types of artificial noise. Each session day has its own affine transform layer.

RNN training hyperparameters
The hyperparameters for RNN training are listed in Table 1.

Table 1: RNN training hyperparameters
  Description                           Hyperparameter
  Learning rate                         0.01
  Batch size                            48
  Number of training batches            20000
  Number of hidden units in the GRU     512
  Number of GRU layers                  2
  Dropout rate in the GRU               0.4
  Optimizer                             Adam
  Learning rate decay schedule          Linear
  L2 weight regularization              1e-5
  Maximum gradient norm for clipping    10

1.2 Language model training details
Out-of-vocabulary words were mapped to a special token. In our case, T contains all 26 English letters, 5 punctuation marks, and the CTC blank symbol.
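As a rough sketch of how the hyperparameters in Table 1 map onto an RNN definition, the PyTorch snippet below instantiates a 2-layer GRU with 512 hidden units, 0.4 dropout, Adam at learning rate 0.01, L2 weight regularization of 1e-5, and gradient-norm clipping at 10. The input size, number of output classes, sequence length, and loss are placeholders, not the paper's code.

```python
# Sketch of an RNN matching Table 1's hyperparameters (PyTorch).
# Input size and number of output classes are illustrative placeholders.
import torch
import torch.nn as nn

input_size, num_classes = 256, 32          # hypothetical feature dim / token count
rnn = nn.GRU(input_size, hidden_size=512, num_layers=2,
             dropout=0.4, batch_first=True)
readout = nn.Linear(512, num_classes)

params = list(rnn.parameters()) + list(readout.parameters())
optimizer = torch.optim.Adam(params, lr=0.01, weight_decay=1e-5)  # L2 = 1e-5

x = torch.randn(48, 100, input_size)       # batch size 48, 100 time steps
hidden, _ = rnn(x)
logits = readout(hidden)                   # per-step logits (e.g., for a CTC loss)
loss = logits.pow(2).mean()                # placeholder loss for the sketch
loss.backward()
torch.nn.utils.clip_grad_norm_(params, max_norm=10)  # gradient clipping at 10
optimizer.step()
```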