 Performance Analysis



Weighted ROC Curve in Cost Space: Extending AUC to Cost-Sensitive Learning

Neural Information Processing Systems

Receiver Operating Characteristics (ROC) is a popular tool to describe the trade-off between the True Positive Rate (TPR) and False Positive Rate (FPR) of a scoring function.
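As a rough illustration of the TPR/FPR trade-off described in this abstract, the sketch below sweeps a decision threshold over a scoring function's outputs and collects the resulting (FPR, TPR) points. It shows only the standard ROC construction and trapezoidal AUC, not the weighted cost-space variant the paper proposes; the function names and the toy scores and labels are assumptions made up for the example.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs from sweeping a threshold over the scores.

    labels are 1 for positive and 0 for negative; an example is predicted
    positive when its score is >= the current threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): higher score should mean "more likely positive".
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
print(curve)
print(round(area, 4))
```

Lowering the threshold moves along the curve toward (1, 1): more positives are caught, at the price of more false alarms.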



Towards Automated Circuit Discovery for Mechanistic Interpretability

Neural Information Processing Systems

Through considerable effort and intuition, several recent works have reverse-engineered nontrivial behaviors of transformer models. This paper systematizes the mechanistic interpretability process they followed. First, researchers choose a metric and dataset that elicit the desired model behavior.


Expert load matters: operating networks at high accuracy and low manual effort

Neural Information Processing Systems

In human-AI collaboration systems for critical applications, in order to ensure minimal error, users should set an operating point based on model confidence to determine when the decision should be delegated to human experts.
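The idea of an operating point on model confidence can be sketched as a simple threshold rule: predictions below the threshold go to a human expert, the rest stay with the model. This is a minimal sketch under that assumption only, not the method of the paper; the function name, the threshold value, and the toy data are hypothetical.

```python
def defer_by_confidence(confidences, correct, threshold):
    """Split predictions at a confidence threshold.

    confidences: model confidence per example, in [0, 1]
    correct:     1 if the model's prediction was right, else 0
    Returns (expert_load, model_accuracy), where expert_load is the fraction
    of examples delegated to experts and model_accuracy is measured only on
    the examples the model keeps (None if it keeps none).
    """
    kept = [c for conf, c in zip(confidences, correct) if conf >= threshold]
    deferred = len(confidences) - len(kept)
    expert_load = deferred / len(confidences)
    model_accuracy = sum(kept) / len(kept) if kept else None
    return expert_load, model_accuracy


# Toy data (hypothetical).
confidences = [0.95, 0.90, 0.60, 0.55, 0.80]
correct = [1, 1, 0, 1, 1]

load, acc = defer_by_confidence(confidences, correct, threshold=0.7)
print(load, acc)
```

Raising the threshold typically raises accuracy on the retained examples while increasing the share of work sent to experts, which is the tension the title refers to.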



Overleaf Example

Neural Information Processing Systems

Experiments show that the proposed ReBalanced Adversarial Training (ReBAT) can attain good robustness and does not suffer from robust overfitting even after very long training.