Supplement to "Drawing out of Distribution with Symbolic Generative Models"

A Details

All dataset images are scaled to 50x50 in grayscale, with dataset-specific configurations listed below. For one-shot classification (Section 3.2), we use the original task-split. It has 20 episodes, each a 20-way, 1-shot, within-alphabet classification task. Bézier curves are parametric curves commonly used in computer graphics to define smooth, continuous curves. As a side effect of this rasterization procedure, pixel intensities can be arbitrarily large.

B.3 Neural Network Configurations

DooD and AIR share the same overall neural components in our experiments.
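As a hedged illustration of the Bézier curves mentioned above (this is not the supplement's actual renderer, just a standard construction), De Casteljau's algorithm evaluates a curve at parameter t by repeatedly interpolating between control points:

```python
import numpy as np

def bezier_point(control_points, t):
    """Evaluate a Bézier curve at parameter t via De Casteljau's algorithm."""
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        # Linearly interpolate between consecutive control points.
        pts = (1 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# A cubic Bézier defined by four (illustrative) control points.
ctrl = [(0, 0), (0, 1), (1, 1), (1, 0)]
start = bezier_point(ctrl, 0.0)  # coincides with the first control point
mid = bezier_point(ctrl, 0.5)    # (0.5, 0.75) for these control points
end = bezier_point(ctrl, 1.0)    # coincides with the last control point
```

Rasterizing a stroke amounts to sampling many such points along t and accumulating them onto the pixel grid, which is why intensities can grow without bound where samples pile up.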
The in-context inductive biases of vision-language models differ across modalities
Allen, Kelsey, Dasgupta, Ishita, Kosoy, Eliza, Lampinen, Andrew K.
Inductive biases are what allow learners to make guesses in the absence of conclusive evidence. These biases have often been studied in cognitive science using concepts or categories -- e.g. by testing how humans generalize a new category from a few examples that leave the category boundary ambiguous. We use these approaches to study generalization in foundation models during in-context learning. Modern foundation models can condition on both vision and text, and differences in how they interpret and learn from these different modalities are an emerging area of study. Here, we study how their generalizations vary by the modality in which stimuli are presented, and the way the stimuli are described in text. We study these biases with three different experimental paradigms, across three different vision-language models. We find that the models generally show some bias towards generalizing according to shape over color. This shape bias tends to be amplified when the examples are presented visually. By contrast, when examples are presented in text, the ordering of adjectives affects generalization. However, the extent of these effects varies across models and paradigms. These results help to reveal how vision-language models represent different types of inputs in context, and may have practical implications for the use of vision-language models.
The Impact of Depth and Width on Transformer Language Model Generalization
Petty, Jackson, van Steenkiste, Sjoerd, Dasgupta, Ishita, Sha, Fei, Garrette, Dan, Linzen, Tal
To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hypothesis, motivated by recent theoretical and empirical work, that transformers generalize more compositionally when they are deeper (have more layers). Because simply adding layers increases the total number of parameters, confounding depth and size, we construct three classes of models which trade off depth for width such that the total number of parameters is kept constant (41M, 134M and 374M parameters). We pretrain all models as LMs and fine-tune them on tasks that test for compositional generalization. We report three main conclusions: (1) after fine-tuning, deeper models generalize better out-of-distribution than shallower models do, but the relative benefit of additional layers diminishes rapidly; (2) within each family, deeper models show better language modeling performance, but returns are similarly diminishing; (3) the benefits of depth for compositional generalization cannot be attributed solely to better performance on language modeling or on in-distribution data.
Data Augmentation: Transforming Your Training Data from Meh to Marvelous
Today, we're going to talk about one of my favorite topics: Data Augmentation. Yes, I know, it may sound a bit dry and technical at first, but trust me, this is one of the most exciting and creative aspects of deep learning. So buckle up and let's dive in! First things first, let's define our terms. Data augmentation is a technique used in deep learning to increase the amount of training data by creating new examples from the existing ones.
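To make "creating new examples from the existing ones" concrete, here is a minimal NumPy sketch; the specific transforms (flip, rotation, additive noise) are common illustrative choices, not ones prescribed by this post:

```python
import numpy as np

def augment(image, rng):
    """Generate simple augmented variants of one grayscale image."""
    variants = [image]
    variants.append(np.fliplr(image))                  # horizontal flip
    variants.append(np.rot90(image))                   # 90-degree rotation
    noisy = image + rng.normal(0, 0.05, image.shape)   # small Gaussian noise
    variants.append(np.clip(noisy, 0.0, 1.0))
    return variants

rng = np.random.default_rng(0)
img = rng.random((32, 32))       # a stand-in "training image"
augmented = augment(img, rng)    # 1 original -> 4 training examples
```

Each transform should preserve the label: a flipped cat is still a cat. That label-invariance requirement is what separates useful augmentation from noise.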
Demystifying the Random Forest. Deconstructing and Understanding this…
In classical Machine Learning, Random Forests have been a silver-bullet type of model. In this post, I want to better understand the components that make up a Random Forest. To accomplish this, I am going to deconstruct the Random Forest into its most basic components and explain what is going on at each level of computation. By the end, we will have a much deeper understanding of how Random Forests work and how to work with them more intuitively. The examples we will use will be focused on classification, but many of the principles apply to regression scenarios as well. Let's start by invoking a classic Random Forest pattern.
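A sketch of that classic pattern using scikit-learn (the post doesn't pin down a library or dataset, so both are assumptions here), including a peek at the individual trees the deconstruction will rely on:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# A synthetic binary classification problem as a stand-in dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# A Random Forest is just a collection of decision trees, each trained on
# a bootstrap sample of the data; every tree is available for inspection.
first_tree = forest.estimators_[0]
votes = [tree.predict(X[:1])[0] for tree in forest.estimators_]
majority = max(set(votes), key=votes.count)  # most trees' answer for row 0
```

Pulling out `forest.estimators_` like this is exactly the "deconstruction" move: once you can query each tree separately, the ensemble's aggregation step stops being a black box.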
Detection of Surface Cracks in Concrete Structures using Deep Learning
We used Adam as the optimizer and trained the model for 6 epochs. We use transfer learning to train the model on the training data set while measuring loss and accuracy on the validation set. As shown by the loss and accuracy numbers below, the model trains very quickly. After the 1st epoch, train accuracy is 87% and validation accuracy is 97%! This is the power of transfer learning. Our final model has a validation accuracy of 98.4%.