Collaborating Authors

Do ImageNet Classifiers Generalize to ImageNet? Machine Learning

We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.

Artificial Intelligence for Social Good: A Survey Artificial Intelligence

Its impact is drastic and real: Youtube's AIdriven recommendation system would present sports videos for days if one happens to watch a live baseball game on the platform [1]; email writing becomes much faster with machine learning (ML) based auto-completion [2]; many businesses have adopted natural language processing based chatbots as part of their customer services [3]. AI has also greatly advanced human capabilities in complex decision-making processes ranging from determining how to allocate security resources to protect airports [4] to games such as poker [5] and Go [6]. All such tangible and stunning progress suggests that an "AI summer" is happening. As some put it, "AI is the new electricity" [7]. Meanwhile, in the past decade, an emerging theme in the AI research community is the so-called "AI for social good" (AI4SG): researchers aim at developing AI methods and tools to address problems at the societal level and improve the wellbeing of the society.

GPT-3 Creative Fiction


What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: …" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.

Revisiting the Importance of Individual Units in CNNs via Ablation Artificial Intelligence

We revisit the importance of the individual units in Convolutional Neural Networks (CNNs) for visual recognition. By conducting unit ablation experiments on CNNs trained on large scale image datasets, we demonstrate that, though ablating any individual unit does not hurt overall classification accuracy, it does lead to significant damage on the accuracy of specific classes. This result shows that an individual unit is specialized to encode information relevant to a subset of classes. We compute the correlation between the accuracy drop under unit ablation and various attributes of an individual unit such as class selectivity and weight L1 norm. We confirm that unit attributes such as class selectivity are a poor predictor for impact on overall accuracy as found previously in recent work \cite{morcos2018importance}. However, our results show that class selectivity along with other attributes are good predictors of the importance of one unit to individual classes. We evaluate the impact of random rotation, batch normalization, and dropout to the importance of units to specific classes. Our results show that units with high selectivity play an important role in network classification power at the individual class level. Understanding and interpreting the behavior of these units is necessary and meaningful.