Supplement to "Drawing out of Distribution with Symbolic Generative Models"

A Details

All dataset images are scaled to 50x50 in grayscale, with dataset-specific configurations listed below. For one-shot classification (Section 3.2), we use the original task-split. It has 20 episodes, each a 20-way, 1-shot, within-alphabet classification task. Bézier curves are parametric curves commonly used in computer graphics to define smooth, continuous curves. As a side effect of this rasterization procedure, pixel intensities can be arbitrarily large.

B.3 Neural Network Configurations

DooD and AIR share the same overall neural components in our experiments.
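As a hedged illustration of the Bézier curves mentioned above (this is not the supplement's actual renderer, just a standard construction), De Casteljau's algorithm evaluates a curve at parameter t by repeatedly interpolating between control points:

```python
import numpy as np

def bezier_point(control_points, t):
    """Evaluate a Bézier curve at parameter t via De Casteljau's algorithm."""
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        # Linearly interpolate between consecutive control points.
        pts = (1 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# A cubic Bézier defined by four (illustrative) control points.
ctrl = [(0, 0), (0, 1), (1, 1), (1, 0)]
start = bezier_point(ctrl, 0.0)  # coincides with the first control point
mid = bezier_point(ctrl, 0.5)    # (0.5, 0.75) for these control points
end = bezier_point(ctrl, 1.0)    # coincides with the last control point
```

Rasterizing a stroke amounts to sampling many such points along t and accumulating them onto the pixel grid, which is why intensities can grow without bound where samples pile up.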
The in-context inductive biases of vision-language models differ across modalities
Allen, Kelsey, Dasgupta, Ishita, Kosoy, Eliza, Lampinen, Andrew K.
Inductive biases are what allow learners to make guesses in the absence of conclusive evidence. These biases have often been studied in cognitive science using concepts or categories -- e.g. by testing how humans generalize a new category from a few examples that leave the category boundary ambiguous. We use these approaches to study generalization in foundation models during in-context learning. Modern foundation models can condition on both vision and text, and differences in how they interpret and learn from these different modalities are an emerging area of study. Here, we study how their generalizations vary by the modality in which stimuli are presented, and the way the stimuli are described in text. We study these biases with three different experimental paradigms, across three different vision-language models. We find that the models generally show some bias towards generalizing according to shape over color. This shape bias tends to be amplified when the examples are presented visually. By contrast, when examples are presented in text, the ordering of adjectives affects generalization. However, the extent of these effects varies across models and paradigms. These results help to reveal how vision-language models represent different types of inputs in context, and may have practical implications for the use of vision-language models.
The Impact of Depth and Width on Transformer Language Model Generalization
Petty, Jackson, van Steenkiste, Sjoerd, Dasgupta, Ishita, Sha, Fei, Garrette, Dan, Linzen, Tal
To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hypothesis, motivated by recent theoretical and empirical work, that transformers generalize more compositionally when they are deeper (have more layers). Because simply adding layers increases the total number of parameters, confounding depth and size, we construct three classes of models which trade off depth for width such that the total number of parameters is kept constant (41M, 134M and 374M parameters). We pretrain all models as LMs and fine-tune them on tasks that test for compositional generalization. We report three main conclusions: (1) after fine-tuning, deeper models generalize better out-of-distribution than shallower models do, but the relative benefit of additional layers diminishes rapidly; (2) within each family, deeper models show better language modeling performance, but returns are similarly diminishing; (3) the benefits of depth for compositional generalization cannot be attributed solely to better performance on language modeling or on in-distribution data.
Data Augmentation: Transforming Your Training Data from Meh to Marvelous
Today, we're going to talk about one of my favorite topics: Data Augmentation. Yes, I know, it may sound a bit dry and technical at first, but trust me, this is one of the most exciting and creative aspects of deep learning. So buckle up and let's dive in! First things first, let's define our terms. Data augmentation is a technique used in deep learning to increase the amount of training data by creating new examples from the existing ones.
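To make "creating new examples from the existing ones" concrete, here is a minimal NumPy sketch; the specific transforms (flip, rotation, additive noise) are common illustrative choices, not ones prescribed by this post:

```python
import numpy as np

def augment(image, rng):
    """Generate simple augmented variants of one grayscale image."""
    variants = [image]
    variants.append(np.fliplr(image))                  # horizontal flip
    variants.append(np.rot90(image))                   # 90-degree rotation
    noisy = image + rng.normal(0, 0.05, image.shape)   # small Gaussian noise
    variants.append(np.clip(noisy, 0.0, 1.0))
    return variants

rng = np.random.default_rng(0)
img = rng.random((32, 32))       # a stand-in "training image"
augmented = augment(img, rng)    # 1 original -> 4 training examples
```

Each transform should preserve the label: a flipped cat is still a cat. That label-invariance requirement is what separates useful augmentation from noise.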
Demystifying the Random Forest. Deconstructing and Understanding this…
In classical Machine Learning, Random Forests have been a silver-bullet type of model. In this post, I want to better understand the components that make up a Random Forest. To accomplish this, I am going to deconstruct the Random Forest into its most basic components and explain what is going on at each level of computation. By the end, we will have a much deeper understanding of how Random Forests work and how to work with them more intuitively. The examples we will use will be focused on classification, but many of the principles apply to regression scenarios as well. Let's start by invoking a classic Random Forest pattern.
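A sketch of that classic pattern using scikit-learn (the post doesn't pin down a library or dataset, so both are assumptions here), including a peek at the individual trees the deconstruction will rely on:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# A synthetic binary classification problem as a stand-in dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# A Random Forest is just a collection of decision trees, each trained on
# a bootstrap sample of the data; every tree is available for inspection.
first_tree = forest.estimators_[0]
votes = [tree.predict(X[:1])[0] for tree in forest.estimators_]
majority = max(set(votes), key=votes.count)  # most trees' answer for row 0
```

Pulling out `forest.estimators_` like this is exactly the "deconstruction" move: once you can query each tree separately, the ensemble's aggregation step stops being a black box.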
Detection of Surface Cracks in Concrete Structures using Deep Learning
We used Adam as the optimizer and trained the model for 6 epochs. We use transfer learning to train the model on the training data set while measuring loss and accuracy on the validation set. As shown by the loss and accuracy numbers below, the model trains very quickly. After the 1st epoch, train accuracy is 87% and validation accuracy is 97%! This is the power of transfer learning. Our final model has a validation accuracy of 98.4%.