Goto

Collaborating Authors

 supervised learning


Using Imperfect Surrogates for Downstream Inference: Design-based Supervised Learning for Social Science Applications of Large Language Models

Neural Information Processing Systems

In computational social science (CSS), researchers analyze documents to explain social and political phenomena. In most scenarios, CSS researchers first obtain labels for documents and then explain labels using interpretable regression analyses in the second step. One increasingly common way to annotate documents cheaply at scale is through large language models (LLMs). However, like other scalable ways of producing annotations, such surrogate labels are often imperfect and biased. We present a new algorithm for using imperfect annotation surrogates for downstream statistical analyses while guaranteeing statistical properties--like asymptotic unbiasedness and proper uncertainty quantification--which are fundamental to CSS research.



Learnability of Linear Thresholds from Label Proportions

Neural Information Processing Systems

We study the problem of properly learning linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this, the learning is on a collection of bags of feature-vectors with only the proportion of labels available for each bag. First, we provide an algorithm that, given a collection of such bags each of size at most two whose label proportions are consistent with (i.e., the bags are satisfied by) an unknown LTF, efficiently produces an LTF that satisfies at least (2/5)-fraction of the bags. If all the bags are non-monochromatic (i.e., bags of size two with differently labeled feature-vectors) the algorithm satisfies at least (1/2)-fraction of them. For the special case of OR over the d-dimensional boolean vectors, we give an algorithm which computes an LTF achieving an additional Ω(1/d) in accuracy for the two cases.




What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights

Neural Information Processing Systems

Severe data imbalance naturally exists among web-scale vision-language datasets. Despite this, we find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning, and demonstrates significant effectiveness in learning generalizable representations. With an aim to investigate the reasons behind this finding, we conduct controlled experiments to study various underlying factors, and reveal that CLIP's pretext task forms a dynamic classification problem wherein only a subset of classes is present in training. This isolates the bias from dominant classes and implicitly balances the learning signal. Furthermore, the robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts, which are inaccessible to supervised learning. Our study not only uncovers the mechanisms behind CLIP's generalizability beyond data imbalance but also provides transferable insights for the research community. The findings are validated in both supervised and self-supervised learning, enabling models trained on imbalanced data to achieve CLIP-level performance on diverse recognition tasks.


Neural Universal Discrete Denoiser

Neural Information Processing Systems

We present a new framework of applying deep neural networks (DNN) to devise a universal discrete denoiser. Unlike other approaches that utilize supervised learning for denoising, we do not require any additional training data. In such setting, while the ground-truth label, i.e., the clean data, is not available, we devise ``pseudo-labels'' and a novel objective function such that DNN can be trained in a same way as supervised learning to become a discrete denoiser. We experimentally show that our resulting algorithm, dubbed as Neural DUDE, significantly outperforms the previous state-of-the-art in several applications with a systematic rule of choosing the hyperparameter, which is an attractive feature in practice.


The robots who predict the future

MIT Technology Review

Three books unpack our infatuation with prediction, and what we lose when we outsource this task to machines. To be human is, fundamentally, to be a forecaster. Trying to see the future, whether through the lens of past experience or the logic of cause and effect, has helped us hunt, avoid hunted, plant crops, forge social bonds, and in general survive in a world that does not prioritize our survival. Indeed, as the tools of divination have changed over the centuries, from tea leaves to data sets, our conviction that the future can be known (and therefore controlled) has only grown stronger. Today, we are awash in a sea of predictions so vast and unrelenting that most of us barely even register them. As I write this sentence, algorithms on some remote server are busy trying to guess my next word based on those I have already typed.



Consistency-based Semi-supervised Learning for Object detection

Neural Information Processing Systems

While the object detection task requires a huge number of annotated samples to guarantee its performance, placing bounding boxes for every object in each sample is time-consuming and costs alot.