 terrier


1 Details for Dataset Partitioning Here we provide the dataset partitioning results for ImageNet [

Neural Information Processing Systems

Novel categories names: ['High_Jump', 'Front_Crawl', 'Pole_Vault', 'Hammer_Throw', All experiments are conducted under the 16-shot setting. An incremental bayesian approach tested on 101 object categories. Conditional prompt learning for vision-language models.


Benchmarking Robustness to Adversarial Image Obfuscations

Neural Information Processing Systems

Advances in computer vision have led to classifiers that nearly match human performance in many applications. However, while the human visual system is remarkably versatile in extracting semantic meaning from even degraded and heavily obfuscated images, today's visual classifiers lag significantly behind in emulating the same robustness, and often yield incorrect outputs in the presence of natural and adversarial degradations.


Automated Classification of Model Errors on ImageNet

Neural Information Processing Systems

While the ImageNet dataset has been driving computer vision research over the past decade, significant label noise and ambiguity have made top-1 accuracy an insufficient measure of further progress.


Is the Rat War Over?

The New Yorker

Is the Rat War Over? In New York, a rat czar and new methods have brought down complaints. We may even be ready to appreciate the creatures. Rats were leaving Manhattan, hurrying across the bridges in single-file lines. Some went to Westchester, some to Brooklyn. It was the pandemic, and the rats, which had been living off the nourishing trash of New York's densest borough for generations, were as panicked about the closure of restaurants as we were. People were eating three meals a day at home, and the rats were hungry. At least that was the story going around.






Terrier: A Deep Learning Repeat Classifier

arXiv.org Artificial Intelligence

Repetitive DNA sequences underpin genome architecture and evolutionary processes, yet they remain challenging to classify accurately. Terrier is a deep learning model designed to overcome these challenges by classifying repetitive DNA sequences using a publicly available, curated repeat sequence library trained under the RepeatMasker schema. Existing tools often struggle to classify divergent taxa due to biases in reference libraries, limiting our understanding of repeat evolution and function. Terrier overcomes these challenges by leveraging deep learning for improved accuracy. Trained on RepBase, which includes over 100,000 repeat families -- four times more than Dfam -- Terrier maps 97.1% of RepBase sequences to RepeatMasker categories, offering the most comprehensive classification system available. When benchmarked against DeepTE, TERL, and TEclass2 in model organisms (rice and fruit flies), Terrier achieved superior accuracy while classifying a broader range of sequences. Further validation in non-model amphibian and flatworm genomes highlights its effectiveness in improving classification in non-model species, facilitating research on repeat-driven evolution, genomic instability, and phenotypic variation.


Benchmarking Robustness to Adversarial Image Obfuscations

arXiv.org Artificial Intelligence

Automated content filtering and moderation is an important tool that allows online platforms to build thriving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violates platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy-violating images (e.g., by overlaying harmful images with carefully selected benign images or visual patterns) to prevent machine learning models from reaching the correct decision. In this paper, we invite researchers to tackle this specific issue and present a new image benchmark. This benchmark, based on ImageNet, simulates the type of obfuscations created by malicious actors. It goes beyond ImageNet-$\textrm{C}$ and ImageNet-$\bar{\textrm{C}}$ by proposing general, drastic, adversarial modifications that preserve the original content intent. It aims to tackle a more common adversarial threat than the one considered by $\ell_p$-norm bounded adversaries. We evaluate 33 pretrained models on the benchmark and train models with different augmentations, architectures and training methods on subsets of the obfuscations to measure generalization. We hope this benchmark will encourage researchers to test their models and methods and try to find new approaches that are more robust to these obfuscations.
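To make the overlay style of obfuscation mentioned in this abstract concrete, here is a minimal sketch in plain Python. It is not the benchmark's implementation: the function name `overlay`, the `alpha` parameter, and the representation of images as nested lists of grayscale values in [0, 1] are all assumptions chosen for illustration.

```python
def overlay(base, pattern, alpha=0.5):
    """Alpha-blend a benign pattern onto a base image (hypothetical helper).

    Both images are lists of rows of grayscale floats in [0, 1].
    alpha=0 returns the base unchanged; alpha=1 returns the pattern.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    same_height = len(base) == len(pattern)
    same_widths = all(len(b) == len(p) for b, p in zip(base, pattern))
    if not (same_height and same_widths):
        raise ValueError("images must have the same shape")
    # Per-pixel convex combination of the two images.
    return [
        [(1 - alpha) * b + alpha * p for b, p in zip(base_row, pattern_row)]
        for base_row, pattern_row in zip(base, pattern)
    ]


# A 1x2 "image" blended half-and-half with a white pattern.
blended = overlay([[0.0, 1.0]], [[1.0, 1.0]], alpha=0.5)
```

A classifier that is robust in the sense the abstract describes would ideally assign the blended image the same label as the base image, since the original content is still recognizable to a human viewer.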


Automated Classification of Model Errors on ImageNet

arXiv.org Artificial Intelligence

While the ImageNet dataset has been driving computer vision research over the past decade, significant label noise and ambiguity have made top-1 accuracy an insufficient measure of further progress. To address this, new label-sets and evaluation protocols have been proposed for ImageNet, showing that state-of-the-art models already achieve over 95% accuracy and shifting the focus to investigating why the remaining errors persist. Recent work in this direction employed a panel of experts to manually categorize all remaining classification errors for two selected models. However, this process is time-consuming, prone to inconsistencies, and requires trained experts, making it unsuitable for regular model evaluation and thus limiting its utility. To overcome these limitations, we propose the first automated error classification framework, a valuable tool to study how modeling choices affect error distributions. We use our framework to comprehensively evaluate the error distribution of over 900 models. Perhaps surprisingly, we find that across model architectures, scales, and pre-training corpora, top-1 accuracy is a strong predictor for the portion of all error types. In particular, we observe that the portion of severe errors drops significantly with top-1 accuracy, indicating that, while it underreports a model's true performance, it remains a valuable performance metric.
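The gap between top-1 accuracy and accuracy under a revised label-set can be sketched as follows. This is an illustrative assumption about how such metrics are commonly computed, not the paper's framework; the function names and the toy labels are hypothetical.

```python
def top1_accuracy(predictions, labels):
    """Fraction of predictions that equal the single reference label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def multilabel_accuracy(predictions, label_sets):
    """Fraction of predictions that fall within a set of acceptable labels."""
    correct = sum(p in s for p, s in zip(predictions, label_sets))
    return correct / len(label_sets)


# Toy data (made up): the second image plausibly shows either class.
predictions = ["tabby", "laptop", "terrier", "banana"]
labels = ["tabby", "notebook", "terrier", "monitor"]
label_sets = [{"tabby"}, {"notebook", "laptop"}, {"terrier"}, {"monitor", "desk"}]

strict = top1_accuracy(predictions, labels)             # 0.5
relaxed = multilabel_accuracy(predictions, label_sets)  # 0.75
```

In this toy case the strict metric counts the "laptop" prediction as an error even though it is an acceptable label, which is the sense in which top-1 accuracy can underreport a model's performance.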