
Collaborating Authors

 Tomasi, Carlo


Training Over a Distribution of Hyperparameters for Enhanced Performance and Adaptability on Imbalanced Classification

arXiv.org Artificial Intelligence

Although binary classification is a well-studied problem, training reliable classifiers under severe class imbalance remains a challenge. Recent techniques mitigate the ill effects of imbalance on training by modifying the loss functions or optimization methods. We observe that different hyperparameter values for these loss functions perform better at different recall values. We propose to exploit this fact by training one model over a distribution of hyperparameter values, instead of a single value, via Loss Conditional Training (LCT). Experiments show that training over a distribution of hyperparameters not only approximates the performance of several models but actually improves the overall performance of models on both CIFAR and real medical imaging applications, such as melanoma and diabetic retinopathy detection. Furthermore, training models with LCT is more efficient because some hyperparameter tuning can be conducted after training to meet individual needs, without retraining from scratch.

Consider a classifier that takes images of skin lesions and predicts whether they are melanoma or benign (Rotemberg et al., 2020). Such a system could be especially valuable in underdeveloped countries where expert resources for screening are scarce (Cassidy et al., 2022). The dataset for this problem, like those of many other practical problems, is inherently imbalanced (i.e., there are far more benign samples than melanoma samples). Furthermore, the costs of misclassifying the two classes are uneven: predicting a benign lesion as melanoma incurs the cost of a biopsy, while predicting a melanoma lesion as benign could allow the melanoma to spread before the patient receives appropriate treatment. Unfortunately, the exact difference in misclassification costs may not be known a priori and may even change after deployment. For example, the costs may change depending on the amount of biopsy resources available, or the prior may change depending on the age and condition of the patient. Thus, a good classifier for this problem should (a) perform well across a wide range of precision-recall tradeoffs and (b) be able to adapt to changes in the prior or misclassification costs.
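As a rough sketch of the idea, one LCT-style training step might sample a hyperparameter value per batch, condition the model on it, and weigh the loss accordingly. Everything below (the uniform sampling range, the model's extra conditioning input, and the lambda-weighted cross-entropy) is an illustrative assumption, not the authors' exact recipe:

```python
import torch
import torch.nn.functional as F

def sample_lambda(batch_size):
    # Assumption: lambda drawn uniformly from [0, 1]; the paper's
    # actual sampling distribution and range may differ.
    return torch.rand(batch_size, 1)

def lct_training_step(model, optimizer, images, labels):
    """One LCT-style step: condition the model on a sampled
    hyperparameter and weigh the loss with the same value."""
    lam = sample_lambda(images.size(0))
    # The model receives lambda as an extra conditioning input
    # (e.g., via FiLM-style layers); this signature is hypothetical.
    logits = model(images, lam)
    # Hypothetical lambda-parameterized loss: a class-weighted
    # cross-entropy where lambda trades off the two classes.
    weights = torch.where(labels == 1, lam.squeeze(1), 1.0 - lam.squeeze(1))
    loss = (weights * F.cross_entropy(logits, labels, reduction="none")).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time, the same trained model can then be queried with different lambda values to move along the precision-recall tradeoff without retraining.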


Optimizing for ROC Curves on Class-Imbalanced Data by Training over a Family of Loss Functions

arXiv.org Artificial Intelligence

Although binary classification is a well-studied problem in computer vision, training reliable classifiers under severe class imbalance remains a challenging problem. Recent work has proposed techniques that mitigate the effects of training under imbalance by modifying the loss functions or optimization methods. While this work has led to significant improvements in overall accuracy in the multi-class case, we observe that slight changes in the hyperparameter values of these methods can result in highly variable performance in terms of Receiver Operating Characteristic (ROC) curves on binary problems with severe imbalance. To reduce the sensitivity to hyperparameter choices and train more general models, we propose training over a family of loss functions, instead of a single loss function. We develop a method for applying Loss Conditional Training (LCT) to an imbalanced classification problem.

[Figure 1: Distribution of Area Under the ROC Curve (AUC) values obtained by training the same model on the SIIM-ISIC Melanoma classification dataset with 48 different combinations of hyperparameters on VS loss. Results are shown at three different imbalance ratios. As the imbalance becomes more severe, model performance drops and the variance in performance drastically increases.]
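For context, the VS loss mentioned in Figure 1 adjusts logits with per-class multiplicative and additive terms controlled by hyperparameters gamma and tau; these are the values whose settings the abstract reports as highly sensitive. The sketch below follows the parameterization of Kini et al. (2021) and may differ in detail from the configuration used in this paper:

```python
import torch
import torch.nn.functional as F

def vs_loss(logits, labels, class_counts, gamma=0.2, tau=1.0):
    """Sketch of the VS (vector-scaling) loss: logits get a per-class
    multiplicative factor Delta and an additive offset iota before the
    usual cross-entropy is applied."""
    counts = torch.as_tensor(class_counts, dtype=torch.float32)
    delta = (counts / counts.max()) ** gamma          # multiplicative scaling
    iota = tau * torch.log(counts / counts.sum())     # additive adjustment
    adjusted = logits * delta + iota
    return F.cross_entropy(adjusted, labels)
```

Training over a family of such losses then amounts to sampling (gamma, tau) during training rather than fixing one combination.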


UFD-PRiME: Unsupervised Joint Learning of Optical Flow and Stereo Depth through Pixel-Level Rigid Motion Estimation

arXiv.org Artificial Intelligence

Both optical flow and stereo disparities are image matches and can therefore benefit from joint training. Depth and 3D motion provide geometric rather than photometric information and can further improve optical flow. Accordingly, we design a first network that estimates flow and disparity jointly and is trained without supervision. A second network, trained with optical flow from the first as pseudo-labels, takes disparities from the first network, estimates 3D rigid motion at every pixel, and reconstructs optical flow again. A final stage fuses the outputs of the two networks. In contrast with previous methods that only consider camera motion, our method also estimates the rigid motions of dynamic objects, which are of key interest in applications. As a result, it produces better optical flow with visibly more detailed occlusions and object boundaries. Our unsupervised pipeline achieves 7.36% optical flow error on the KITTI-2015 benchmark, outperforming the previous state of the art of 9.38% by a wide margin. It also achieves slightly better or comparable stereo depth results. Code will be made available.
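The core geometric step, reconstructing optical flow from stereo depth and a per-pixel rigid motion, can be sketched as follows; the tensor shapes, conventions, and function name are illustrative assumptions rather than the paper's actual implementation:

```python
import torch

def flow_from_rigid_motion(depth, R, t, K):
    """Sketch: recover optical flow from per-pixel rigid motion.
    depth: (H, W) stereo depth; R: (H, W, 3, 3) per-pixel rotations;
    t: (H, W, 3) per-pixel translations; K: (3, 3) camera intrinsics."""
    H, W = depth.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).float()  # (H, W, 3)
    # Back-project pixels to 3D points using depth.
    pts = depth.unsqueeze(-1) * (torch.linalg.inv(K) @ pix.unsqueeze(-1)).squeeze(-1)
    # Apply the per-pixel rigid motion, then project back to the image.
    moved = (R @ pts.unsqueeze(-1)).squeeze(-1) + t
    proj = (K @ moved.unsqueeze(-1)).squeeze(-1)
    proj = proj[..., :2] / proj[..., 2:3].clamp(min=1e-6)
    return proj - pix[..., :2]  # optical flow, (H, W, 2)
```

Because R and t are estimated per pixel rather than once per frame, this formulation covers independently moving objects as well as camera motion.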


SemARFlow: Injecting Semantics into Unsupervised Optical Flow Estimation for Autonomous Driving

arXiv.org Artificial Intelligence

Unsupervised optical flow estimation is especially hard near occlusions and motion boundaries and in low-texture regions. We show that additional information such as semantics and domain knowledge can help better constrain this problem. We introduce SemARFlow, an unsupervised optical flow network designed for autonomous driving data that takes estimated semantic segmentation masks as additional inputs. This additional information is injected into the encoder and into a learned upsampler that refines the flow output. In addition, a simple yet effective semantic augmentation module provides self-supervision when learning flow and its boundaries for vehicles, poles, and sky. Together, these injections of semantic information improve the KITTI-2015 optical flow test error rate from 11.80% to 8.38%. We also show visible improvements around object boundaries as well as a greater ability to generalize across datasets. Code is available at https://github.com/duke-vision/semantic-unsup-flow-release.
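A minimal sketch of one way to inject segmentation masks into an encoder stage, by one-hot encoding and concatenation, is shown below; the actual SemARFlow architecture (where and how masks enter the encoder and the learned upsampler) may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticEncoderStage(nn.Module):
    """Illustrative encoder stage that fuses image features with an
    estimated semantic segmentation mask via concatenation."""

    def __init__(self, in_ch, num_classes, out_ch):
        super().__init__()
        self.num_classes = num_classes
        self.conv = nn.Conv2d(in_ch + num_classes, out_ch, 3, stride=2, padding=1)

    def forward(self, feats, seg_labels):
        # One-hot encode the predicted segmentation and resize it to
        # the current feature resolution before concatenating.
        onehot = F.one_hot(seg_labels, self.num_classes).permute(0, 3, 1, 2).float()
        onehot = F.interpolate(onehot, size=feats.shape[-2:], mode="nearest")
        return torch.relu(self.conv(torch.cat([feats, onehot], dim=1)))
```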


Identity Connections in Residual Nets Improve Noise Stability

arXiv.org Machine Learning

Residual Neural Networks (ResNets) achieve state-of-the-art performance in many computer vision problems. Compared to plain networks without residual connections (PlnNets), ResNets train faster, generalize better, and suffer less from the so-called degradation problem. We introduce simplified (but still nonlinear) versions of ResNets and PlnNets for which these discrepancies still hold, although to a lesser degree. We establish a one-to-one mapping between simplified ResNets and simplified PlnNets and show that they are exactly equivalent in expressive power at the same computational complexity. We conjecture that ResNets generalize better because they have better noise stability, and we support this conjecture empirically for both simplified and fully-fledged networks.
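One simple way to probe noise stability empirically is to measure how much a network's output moves when Gaussian noise is added to its input, comparing a plain block with a matched residual block. The metric and the simplified blocks below are illustrative assumptions, not the paper's exact construction:

```python
import torch
import torch.nn as nn

def noise_stability(net, x, sigma=0.1, trials=10):
    """Relative output drift under Gaussian input noise; lower means
    more noise-stable. An illustrative probe, not the paper's metric."""
    net.eval()
    with torch.no_grad():
        y = net(x)
        drift = 0.0
        for _ in range(trials):
            y_noisy = net(x + sigma * torch.randn_like(x))
            drift += (y_noisy - y).norm() / y.norm()
    return drift / trials

# Matched blocks: identical weights, with and without the identity skip.
layer = nn.Linear(64, 64)
plain = nn.Sequential(layer, nn.ReLU())

class Residual(nn.Module):
    def forward(self, x):
        return torch.relu(layer(x)) + x  # identity connection

x = torch.randn(32, 64)
print(noise_stability(plain, x), noise_stability(Residual(), x))
```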


Distance Minimization for Reward Learning from Scored Trajectories

AAAI Conferences

Many planning methods rely on an immediate reward function as a portable and succinct representation of desired behavior. Rewards are often inferred from demonstrated behavior that is assumed to be near-optimal. We examine a framework, Distance Minimization IRL (DM-IRL), for learning reward functions from scores an expert assigns to possibly suboptimal demonstrations. By changing the expert's role from a demonstrator to a judge, DM-IRL relaxes some of the assumptions of standard IRL, enabling learning from scores of arbitrary demonstration trajectories even when transition functions are unknown. DM-IRL complements existing IRL approaches by addressing different assumptions about the expert. We show that DM-IRL is robust to expert scoring error and prove that finding a policy that produces maximally informative trajectories for an expert to score is strongly NP-hard. Experimentally, we demonstrate that the reward function DM-IRL learns from an MDP with an unknown transition model can transfer to an agent with known characteristics in a novel environment, and that learning succeeds even with limited training data.
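At its core, DM-IRL can be read as a regression problem: if the expert's score of a trajectory approximates its discounted return under a linear reward r(s) = w · phi(s), then w can be fit so that predicted returns match the scores. The sketch below makes that concrete; the feature map phi and the least-squares objective are illustrative assumptions:

```python
import numpy as np

def dm_irl_fit(trajectories, scores, phi, gamma=0.95):
    """Fit linear reward weights w so that discounted feature counts,
    weighted by w, approximate the expert's trajectory scores."""
    # Discounted feature counts: one row per scored trajectory.
    feats = np.stack([
        sum(gamma ** t * phi(s) for t, s in enumerate(traj))
        for traj in trajectories
    ])
    w, *_ = np.linalg.lstsq(feats, np.asarray(scores, dtype=float), rcond=None)
    return w  # learned reward weights; r(s) = w @ phi(s)
```

Because the fit uses only scored trajectories and state features, no transition model is required, which is what lets the learned reward transfer to a new agent in a novel environment.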