Collaborating Authors

 Lopes, Raphael Gontijo


Language Model Cascades

arXiv.org Artificial Intelligence

Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow implementing disparate model structures and inference strategies in a unified language. We formalize several ...

In this position paper, we argue that a useful unifying framework for understanding and extending this disparate body of work is in terms of probabilistic programming languages (PPL) extended to work with strings, instead of more atomic data types like integers and floats. That is, we use a PPL to define a joint probability model on string-valued random variables, parameterized using LMs, and then condition this model on string-valued observations in order to compute a posterior over string-valued unknowns, which we can then infer. We call such a probabilistic program a language model cascade. We show that this framework captures many recent approaches, and also allows us to tackle more complex multi-step reasoning problems.
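
The idea of a joint model over string-valued variables that is then conditioned on observed strings can be illustrated with a toy example. The sketch below is not the paper's implementation: the two lookup-table functions are hypothetical stand-ins for language models, and the question, thoughts, answers, and probabilities are invented for illustration. It defines a chain question → thought → answer, marginalizes the thought to get a distribution over answers, and applies Bayes' rule to get a posterior over thoughts given an observed answer.

```python
from collections import defaultdict

# Hypothetical "thought" strings; a real cascade would sample these from an LM.
GOOD = "2 + 2 is 4, then multiply by 3"
BAD = "2 + 2 is 5, then multiply by 3"


def p_thought(question):
    """Toy stand-in for p(thought | question)."""
    return {GOOD: 0.7, BAD: 0.3}


def p_answer(question, thought):
    """Toy stand-in for p(answer | question, thought)."""
    if thought == GOOD:
        return {"12": 0.9, "15": 0.1}
    return {"15": 0.8, "12": 0.2}


def answer_posterior(question):
    """p(answer | question), obtained by summing out the latent thought."""
    posterior = defaultdict(float)
    for thought, pt in p_thought(question).items():
        for answer, pa in p_answer(question, thought).items():
            posterior[answer] += pt * pa
    return dict(posterior)


def thought_posterior(question, answer):
    """p(thought | question, answer), via Bayes' rule over the enumerated thoughts."""
    joint = {
        thought: pt * p_answer(question, thought).get(answer, 0.0)
        for thought, pt in p_thought(question).items()
    }
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("observed answer has zero probability under the model")
    return {thought: value / evidence for thought, value in joint.items()}


question = "What is (2 + 2) * 3?"
print(answer_posterior(question))
print(thought_posterior(question, "12"))
```

Exact enumeration only works here because the toy distributions have two outcomes each; with real language models the space of strings is far too large to enumerate, so inference would have to be approximate, for example by sampling.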


A Fourier Perspective on Model Robustness in Computer Vision

arXiv.org Machine Learning

Achieving robustness to distributional shift is a longstanding and challenging goal of computer vision. Data augmentation is a commonly used approach for improving robustness; however, robustness gains are typically not uniform across corruption types. Indeed, increasing performance in the presence of random noise is often met with reduced performance on other corruptions such as contrast change. Understanding when and why these sorts of trade-offs occur is a crucial step towards mitigating them. Towards this end, we investigate recently observed trade-offs caused by Gaussian data augmentation and adversarial training. We find that both methods improve robustness to corruptions that are concentrated in the high-frequency domain while reducing robustness to corruptions that are concentrated in the low-frequency domain. This suggests that one way to mitigate these trade-offs via data augmentation is to use a more diverse set of augmentations. Towards this end, we observe that AutoAugment, a recently proposed data augmentation policy optimized for clean accuracy, achieves state-of-the-art robustness on the CIFAR-10-C and ImageNet-C benchmarks.


Improving Robustness Without Sacrificing Accuracy with Patch Gaussian Augmentation

arXiv.org Machine Learning

Deploying machine learning systems in the real world requires both high accuracy on clean data and robustness to naturally occurring corruptions. While architectural advances have led to improved accuracy, building robust models remains challenging. Prior work has argued that there is an inherent trade-off between robustness and accuracy, which is exemplified by standard data augmentation techniques such as Cutout, which improves clean accuracy but not robustness, and additive Gaussian noise, which improves robustness but hurts accuracy. To overcome this trade-off, we introduce Patch Gaussian, a simple augmentation scheme that adds noise to randomly selected patches in an input image. Models trained with Patch Gaussian achieve state of the art on the CIFAR-10 and ImageNet Common Corruptions benchmarks while also improving accuracy on clean data. We find that this augmentation leads to reduced sensitivity to high-frequency noise (similar to Gaussian) while retaining the ability to take advantage of relevant high-frequency information in the image (similar to Cutout). Finally, we show that Patch Gaussian can be used in conjunction with other regularization methods and data augmentation policies such as AutoAugment, and improves performance on the COCO object detection benchmark.
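
The abstract describes Patch Gaussian as adding noise to randomly selected patches in an input image. A minimal sketch of that idea follows, under several assumptions that are not taken from the abstract: a grayscale image stored as nested lists of floats in [0, 1], a single square patch per image whose center is sampled uniformly, and clipping of the noisy pixels back to the valid range. The function name and the `patch_size` and `sigma` parameters are illustrative.

```python
import random


def patch_gaussian(image, patch_size, sigma, rng=None):
    """Add Gaussian noise to one randomly placed square patch of an image.

    Returns the augmented copy and the patch bounds (top, bottom, left, right),
    where bottom and right are exclusive.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])

    # Sample the patch center anywhere in the image; the patch is clipped
    # at the borders, so it can be smaller than patch_size near an edge.
    cy, cx = rng.randrange(height), rng.randrange(width)
    half = patch_size // 2
    top, bottom = max(0, cy - half), min(height, cy + half + 1)
    left, right = max(0, cx - half), min(width, cx + half + 1)

    out = [row[:] for row in image]  # copy so the input is left untouched
    for y in range(top, bottom):
        for x in range(left, right):
            noisy = out[y][x] + rng.gauss(0.0, sigma)
            out[y][x] = min(1.0, max(0.0, noisy))  # keep pixels in [0, 1]
    return out, (top, bottom, left, right)


image = [[0.5] * 8 for _ in range(8)]
augmented, box = patch_gaussian(image, patch_size=3, sigma=0.1, rng=random.Random(0))
print(box)
```

Pixels outside the returned box are identical to the input, which is the sense in which such a scheme leaves most of the image's high-frequency content available to the model while still exposing it to noise.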


A Learned Representation for Scalable Vector Graphics

arXiv.org Machine Learning

Dramatic advances in generative models have resulted in near-photographic quality for artificially rendered faces, animals and other objects in the natural world. In spite of such advances, a higher-level understanding of vision and imagery does not arise from exhaustively modeling an object, but instead from identifying higher-level attributes that best summarize the aspects of an object. In this work we attempt to model the drawing process of fonts by building sequential generative models of vector graphics. This model has the benefit of providing a scale-invariant representation for imagery whose latent representation may be systematically manipulated and exploited to perform style propagation. We demonstrate these results on a large dataset of fonts and highlight how such a model captures the statistical dependencies and richness of this dataset. We envision that our model can find use as a tool for graphic designers to facilitate font design.