AITopics | critical period

critical period

One Period to Rule Them All: Identifying Critical Learning Periods in Deep Networks

Fukase, Vinicius Yuiti, Gama, Heitor, Bueno, Barbara, Libanio, Lucas, Costa, Anna Helena Reali, Jordao, Artur

arXiv.org Artificial Intelligence, Nov-11-2025

Critical learning periods are an important phenomenon in deep learning, in which early epochs play a decisive role in the success of many training recipes, such as data augmentation. Existing works confirm the existence of this phenomenon and provide useful insights. However, the literature lacks efforts to precisely identify when critical periods occur. In this work, we fill this gap by introducing a systematic approach for identifying critical periods during the training of deep neural networks, focusing on eliminating computationally intensive regularization techniques and effectively applying mechanisms for reducing computational costs, such as data pruning. Our method leverages generalization prediction mechanisms to pinpoint critical phases where training recipes yield maximum benefits to the predictive ability of models. By halting resource-intensive recipes beyond these periods, we significantly accelerate the learning phase and achieve reductions in training time, energy consumption, and CO$_2$ emissions. Experiments on standard architectures and benchmarks confirm the effectiveness of our method. Specifically, we reduce the training time of popular architectures by up to 59.67%, leading to a 59.47% decrease in CO$_2$ emissions and a 60% reduction in financial costs, without compromising performance. Our work enhances understanding of training dynamics and paves the way for more sustainable and efficient deep learning practices, particularly in resource-constrained environments. In the era of the race for foundation models, we believe our method emerges as a valuable framework. The repository is available at https://github.com/baunilhamarga/critical-periods
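The idea of halting a costly recipe once the critical period has passed can be illustrated with a minimal sketch. It is not the authors' method: the abstract says the period is found with generalization prediction mechanisms, whereas here the end of the period (`critical_end`) is simply assumed to be known, and the per-epoch costs (`plain_cost`, `recipe_cost`) are hypothetical numbers chosen for illustration.

```python
def recipe_schedule(total_epochs, critical_end):
    """Return one boolean per epoch: True while the costly recipe is active.

    critical_end is an assumed, externally supplied epoch index; the sketch
    does not try to detect the critical period itself.
    """
    return [epoch < critical_end for epoch in range(total_epochs)]


def relative_time_saved(total_epochs, critical_end, plain_cost=1.0, recipe_cost=2.0):
    """Fraction of training time saved versus using the recipe in every epoch.

    plain_cost and recipe_cost are hypothetical per-epoch costs
    (for example, seconds per epoch without and with data augmentation).
    """
    full = total_epochs * recipe_cost
    truncated = critical_end * recipe_cost + (total_epochs - critical_end) * plain_cost
    return 1.0 - truncated / full


if __name__ == "__main__":
    schedule = recipe_schedule(total_epochs=5, critical_end=2)
    print(schedule)
    print(relative_time_saved(total_epochs=100, critical_end=20))
```

Under these assumptions the saving depends only on how early the recipe is stopped and on how much more expensive an epoch with the recipe is than a plain one.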

artificial intelligence, deep learning, machine learning, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.procs.2025.07.138

2506.15954

Country: South America > Brazil (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Energy (1.00)
Education > Instructional Theory (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

Neural Information Processing Systems, Oct-3-2025, 04:02:14 GMT

Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges.

artificial intelligence, machine learning, regularization, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.48)

Industry: Education > Educational Setting > Preschool (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

How Syntax Specialization Emerges in Language Models

Duan, Xufeng, Yao, Zhaoqian, Zhang, Yunhao, Wang, Shaonan, Cai, Zhenguang G.

arXiv.org Artificial Intelligence, May-27-2025

Large language models (LLMs) have been found to develop surprising internal specializations: individual neurons, attention heads, and circuits become selectively sensitive to syntactic structure, reflecting patterns observed in the human brain. While this specialization is well-documented, how it emerges during training and what influences its development remain largely unknown. In this work, we tap into the black box of specialization by tracking its formation over time. By quantifying internal syntactic consistency across minimal pairs from various syntactic phenomena, we identify a clear developmental trajectory: syntactic sensitivity emerges gradually, concentrates in specific layers, and exhibits a 'critical period' of rapid internal specialization. This process is consistent across architectures and initialization parameters (e.g., random seeds), and is influenced by model scale and training data. We therefore reveal not only where syntax arises in LLMs but also how some models internalize it during training. To support future research, we will release the code, models, and training checkpoints upon acceptance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.19548

Country:

Europe (0.28)
Asia > Middle East > UAE (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition

Mita, Masato, Yoshida, Ryo, Oseki, Yohei

arXiv.org Artificial Intelligence, Feb-16-2025

Large language models possess general linguistic abilities but acquire language less efficiently than humans. This study proposes a method for integrating the developmental characteristics of working memory during the critical period, a stage when human language acquisition is particularly efficient, into the training process of language models. The proposed method introduces a mechanism that initially constrains working memory during the early stages of training and gradually relaxes this constraint in an exponential manner as learning progresses. Targeted syntactic evaluation shows that the proposed method outperforms conventional methods without memory constraints or with static memory constraints. These findings not only provide new directions for designing data-efficient language models but also offer indirect evidence supporting the role of the developmental characteristics of working memory as the underlying mechanism of the critical period in language acquisition.
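A constraint that starts tight and relaxes exponentially can be sketched as follows. The abstract does not say how the working-memory limit is implemented, so this sketch assumes one possible reading: a causal attention window whose width grows exponentially with training progress. The function names and the `min_window` and `max_window` defaults are hypothetical.

```python
def memory_window(step, total_steps, min_window=2, max_window=512):
    """Window width that grows exponentially from min_window to max_window.

    Assumed schedule: width = min_window * (max_window / min_window) ** progress,
    where progress runs from 0 (start of training) to 1 (end of training).
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return round(min_window * (max_window / min_window) ** progress)


def windowed_causal_mask(seq_len, window):
    """Boolean mask: position i may attend to position j only if
    j is not in the future (j <= i) and lies within the last `window` tokens."""
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step, memory_window(step, total_steps=100))
    for row in windowed_causal_mask(seq_len=3, window=2):
        print(row)
```

A static constraint would correspond to keeping `window` fixed for all steps, and the unconstrained baseline to a window at least as long as the sequence.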

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.04795

Country:

North America > United States (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning

Chen, Xiaodan, Pitti, Alexandre, Quoy, Mathias, Chen, Nancy F

arXiv.org Artificial Intelligence, Dec-23-2024

Understanding how infants perceive speech sounds and language structures is still an open problem. Previous research in artificial neural networks has mainly focused on large dataset-dependent generative models, aiming to replicate language-related phenomena such as "perceptual narrowing". In this paper, we propose a novel approach using a small-sized generative neural network equipped with a continual learning mechanism based on predictive coding for mono- and bilingual speech sound learning (referred to as language sound acquisition during the "critical period") and a compositional optimization mechanism for generation where no learning is involved (later infancy sound imitation). Our model prioritizes interpretability and demonstrates the advantages of online learning: unlike deep networks requiring substantial offline training, our model continuously updates with new data, making it adaptable and responsive to changing inputs. Through experiments, we demonstrate that if second language acquisition occurs during later infancy, the challenges associated with learning a foreign language after the critical period amplify, replicating the perceptual narrowing effect.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-72350-6_2

2412.17456

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Education (0.66)
Law > Litigation (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Do Language Models Have a Critical Period for Language Acquisition?

Constantinescu, Ionut, Pimentel, Tiago, Cotterell, Ryan, Warstadt, Alex

arXiv.org Artificial Intelligence, Jul-27-2024

Humans appear to have a critical period (CP) for language acquisition: Second language (L2) acquisition becomes harder after early childhood, and ceasing exposure to a first language (L1) after this period (but not before) typically does not lead to substantial loss of L1 proficiency. It is unknown whether these CP effects result from innately determined brain maturation or as a stabilization of neural connections naturally induced by experience. In this study, we use language models (LMs) to test the extent to which these phenomena are peculiar to humans, or shared by a broader class of language learners. We vary the age of exposure by training LMs on language pairs in various experimental conditions, and find that LMs, which lack any direct analog to innate maturational stages, do not show CP effects when trained sequentially on L1 and L2. Our results contradict the claim that CP effects are an inevitable result of learning in statistical learners, and they are consistent with an innate mechanism for CP effects. We show that we can reverse-engineer the CP by introducing a regularizer partway through training to simulate a maturational decrease in plasticity. All in all, our results suggest that L1 learning on its own may not be enough to induce a CP, and additional engineering is necessary to make language models more cognitively plausible.
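The effect of introducing a regularizer partway through training can be illustrated with a toy sketch. The abstract does not name the regularizer, so this example assumes a simple quadratic penalty that pulls a single weight back toward its value at the end of the first phase; the one-dimensional "tasks", the targets, and the `strength` value are hypothetical stand-ins for L1 and L2 training.

```python
def train(w, target, steps, lr=0.1, anchor=None, strength=0.0):
    """Gradient descent on (w - target)**2, optionally adding the assumed
    plasticity penalty strength * (w - anchor)**2."""
    for _ in range(steps):
        grad = 2.0 * (w - target)
        if anchor is not None:
            grad += 2.0 * strength * (w - anchor)
        w -= lr * grad
    return w


# Phase 1: learn the first "language" (toy target +1.0).
w_l1 = train(0.0, target=1.0, steps=200)

# Phase 2 without a penalty: the weight moves fully to the second target,
# so the first solution is forgotten and no critical-period effect appears.
w_plastic = train(w_l1, target=-1.0, steps=200)

# Phase 2 with the penalty switched on: the weight settles between the
# two targets, so the second "language" is learned less completely.
w_rigid = train(w_l1, target=-1.0, steps=200, anchor=w_l1, strength=1.0)

if __name__ == "__main__":
    print(w_l1, w_plastic, w_rigid)
```

With equal weighting of the task loss and the penalty, the second phase ends midway between the two targets, which is the toy analogue of reduced plasticity making later learning harder while protecting what was learned first.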

computational linguistic, language acquisition, plasticity, (15 more...)

arXiv.org Artificial Intelligence

2407.19325

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

The Deep learning model of upstream and downstream brain regions Based on Memory Generation-Consolidation-Loss, Synaptic Strength Rebalance and mnemonic spiral

Tao, Jun-Bo, Sun, Bai-Qing, Zhu, Wei-Dong, Qu, Shi-You, Chen, Ling-Kun, Li, Jia-Qiang, Li, Guo-Qi, Wu, Chong, Xiong, Yu, Zhou, Jiaxuan

arXiv.org Machine Learning, Oct-15-2023

In addition to the shared weights of the synaptic connections, our new neural network includes the synaptic effective range weights for both the forward and back propagation. We try to simulate the functions of the prefrontal lobe, amygdala, and hippocampus with the Deep learning model of upstream and downstream brain regions (DLMOUADBR). Along the forward propagation, the negative memory gradually increases. Along the back propagation, the optimization order increases. Memory flow may be considered to be the transmission of the rate of change of the architecture, so that the nth cortex is the nth derivative of brain plasticity. Astrocytic cortex memory persistence factor and astrocytes phagocytose synapses inhibit local synaptic accumulation, and the model inspires experiments. The memory Generation-Consolidation-Loss model tries to explain 15 phenomena of Alzheimer's disease based on the DLMOUADBR and reverse turbulence. We consider the Heart-Brain model with reference to non-classical quantum entanglement experiments, and the turbulent movement of brain regions through the mnemonic spiral. The study first showed that mnemonic architecture formula-logarithmic spiral, turbulent movement in brain regions is only energy loss and memory engrams are approximate. This explains the dynamic cause of shaping in the geometry of the brain, related to the turbulent movement of the logarithmic spiral of the brain. In simulation, it is possible that thicker cortices and more diverse individuals within the brain could have high IQ, but the thickest cortices and most diverse individuals may have low IQ, and the simulation tries to give the mechanism of cognitive impairment.

artificial intelligence, effective range, machine learning, (15 more...)

arXiv.org Machine Learning

2203.1174

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Critical Learning Periods for Multisensory Integration in Deep Networks

Kleinman, Michael, Achille, Alessandro, Soatto, Stefano

arXiv.org Artificial Intelligence, Sep-14-2023

We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training. Interfering with the learning process during this initial stage can permanently impair the development of a skill, both in artificial and biological systems, where the phenomenon is known as a critical learning period. We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations. This evidence challenges the view, engendered by analysis of wide and shallow networks, that early learning dynamics of neural networks are simple, akin to those of a linear model. Indeed, we show that even deep linear networks exhibit critical learning periods for multi-source integration, while shallow networks do not. To better understand how the internal representations change according to disturbances or sensory deficits, we introduce a new measure of source sensitivity, which allows us to track the inhibition and integration of sources during training. Our analysis of inhibition suggests cross-source reconstruction as a natural auxiliary training objective, and indeed we show that architectures trained with cross-sensor reconstruction objectives are remarkably more resilient to critical periods. Our findings suggest that the recent success in self-supervised multi-modal training compared to previous supervised efforts may be in part due to more robust learning dynamics and not solely due to better architectures and/or more data.

deficit, information, unit number, (17 more...)

arXiv.org Artificial Intelligence

2210.04643

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.93)
Education > Instructional Theory (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Critical Learning Periods Emerge Even in Deep Linear Networks

Kleinman, Michael, Achille, Alessandro, Soatto, Stefano

arXiv.org Artificial Intelligence, Aug-23-2023

Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations. Despite the radical differences between biological and artificial networks, critical learning periods have been empirically observed in both systems. This suggests that critical periods may be fundamental to learning and not an accident of biology. Yet, why exactly critical periods emerge in deep networks is still an open question, and in particular it is unclear whether the critical periods observed in both systems depend on particular architectural or optimization details. To isolate the key underlying factors, we focus on deep linear network models, and show that, surprisingly, such networks also display much of the behavior seen in biological and artificial networks, while being amenable to analytical treatment. We show that critical periods depend on the depth of the model and the structure of the data distribution. We also show analytically and in simulations that the learning of features is tied to competition between sources. Finally, we extend our analysis to multi-task learning to show that pre-training on certain tasks can damage the transfer performance on new tasks, and show how this depends on the relationship between tasks and the duration of the pre-training stage. To the best of our knowledge, our work provides the first analytically tractable model that sheds light on why critical learning periods emerge in biological and artificial networks.

artificial intelligence, deficit, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.12221

Country:

North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.50)

Industry:

Education > Instructional Theory (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Psychedelic drugs may reopen critical learning periods in the brain

New Scientist, Nov-17-2022, 20:45:08 GMT

Psychedelic drugs have been used in adult mice to reopen so-called "critical periods", which are crucial windows of development and learning in the brain that usually happen in adolescence. During critical periods, the brain is highly plastic and capable of learning specific skills such as language. Once this window closes, it's nearly impossible to acquire certain abilities. For instance, children who aren't exposed to language in their first year of life may never fully grasp sentence structure.

brain, psychedelic drug, reopen critical learning period, (1 more...)

New Scientist

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.71)
Health & Medicine > Pharmaceuticals & Biotechnology (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)
