AITopics | ancestry model

Collaborating Authors

ancestry model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Cluster-Learngene: InheritingAdaptiveClustersfor VisionTransformers

Neural Information Processing SystemsFeb-10-2026, 12:15:00 GMT

Recent studies [42,56]have visualized the mean attention distance of ViTs, offering deeper insights into weight redundancy among attention heads across different layers.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Cluster-Learngene: Inheriting Adaptive Clusters for Vision Transformers

Neural Information Processing SystemsDec-24-2025, 18:31:03 GMT

In recent years, the merging of vast datasets with powerful computational resources has led to the emergence of large pre-trained models in the field of deep learning.

artificial intelligence, cluster-learngene, machine learning, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)

Add feedback

2e53c02ea028cbf603f4b6b47fef3d97-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:13:51 GMT

attention head, centroid, descendant model, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(4 more...)

Add feedback

Cluster-Learngene: Inheriting Adaptive Clusters for Vision Transformers

Neural Information Processing SystemsMay-26-2025, 20:17:48 GMT

In recent years, the merging of vast datasets with powerful computational resources has led to the emergence of large pre-trained models in the field of deep learning. To mitigate this issue, we propose \textbf{Cluster-Learngene}, which effectively clusters critical internal modules from a large ancestry model and then inherits them to initialize descendant models of elastic scales. Specifically, based on the density characteristics of attention heads, our method adaptively clusters attention heads of each layer and position-wise feed-forward networks (FFNs) in the ancestry model as the learngene. Moreover, we introduce priority weight-sharing and learnable parameter transformations that expand the learngene to initialize descendant models of elastic scales. Through extensive experimentation, we demonstrate that Cluster-Learngene not only is more efficient compared to other initialization methods but also customizes models of elastic scales according to downstream task resources.

artificial intelligence, cluster-learngene, machine learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)

Add feedback

Building Variable-sized Models via Learngene Pool

Shi, Boyu, Xia, Shiyu, Yang, Xu, Chen, Haokun, Kou, Zhiqiang, Geng, Xin

arXiv.org Artificial IntelligenceDec-11-2023

Recently, Stitchable Neural Networks (SN-Net) is proposed to stitch some pre-trained networks for quickly building numerous networks with different complexity and performance trade-offs. In this way, the burdens of designing or training the variable-sized networks, which can be used in application scenarios with diverse resource constraints, are alleviated. However, SN-Net still faces a few challenges. 1) Stitching from multiple independently pre-trained anchors introduces high storage resource consumption. 2) SN-Net faces challenges to build smaller models for low resource constraints. 3). SN-Net uses an unlearned initialization method for stitch layers, limiting the final performance. To overcome these challenges, motivated by the recently proposed Learngene framework, we propose a novel method called Learngene Pool. Briefly, Learngene distills the critical knowledge from a large pre-trained model into a small part (termed as learngene) and then expands this small part into a few variable-sized models. In our proposed method, we distill one pretrained large model into multiple small models whose network blocks are used as learngene instances to construct the learngene pool. Since only one large model is used, we do not need to store more large models as SN-Net and after distilling, smaller learngene instances can be created to build small models to satisfy low resource constraints. We also insert learnable transformation matrices between the instances to stitch them into variable-sized models to improve the performance of these models. Exhaustive experiments have been implemented and the results validate the effectiveness of the proposed Learngene Pool compared with SN-Net.

ancestry model, auxiliary model, learngene pool, (12 more...)

arXiv.org Artificial Intelligence

2312.05743

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant Models

Wang, Qiufeng, Yang, Xu, Lin, Shuxia, Wang, Jing, Geng, Xin

arXiv.org Artificial IntelligenceJun-29-2023

During the continuous evolution of one organism's ancestry, its genes accumulate extensive experiences and knowledge, enabling newborn descendants to rapidly adapt to their specific environments. Motivated by this observation, we propose a novel machine learning paradigm Learngene to enable learning models to incorporate three key characteristics of genes. (i) Accumulating: the knowledge is accumulated during the continuous learning of an ancestry model. (ii) Condensing: the extensive accumulated knowledge is condensed into a much more compact information piece, i.e., learngene. (iii) Inheriting: the condensed learngene is inherited to make it easier for descendant models to adapt to new environments. Since accumulating has been studied in well-established paradigms like large-scale pre-training and lifelong learning, we focus on condensing and inheriting, which induces three key issues and we provide the preliminary solutions to these issues in this paper: (i) Learngene Form: the learngene is set to a few integral layers that can preserve significance. (ii) Learngene Condensing: we identify which layers among the ancestry model have the most similarity as one pseudo descendant model. (iii) Learngene Inheriting: to construct distinct descendant models for the specific downstream tasks, we stack some randomly initialized layers to the learngene layers. Extensive experiments across various settings, including using different network architectures like Vision Transformer (ViT) and Convolutional Neural Networks (CNNs) on different datasets, are carried out to confirm four advantages of Learngene: it makes the descendant models 1) converge more quickly, 2) exhibit less sensitivity to hyperparameters, 3) perform better, and 4) require fewer training samples to converge.

artificial intelligence, descendant model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.02279

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Oceania > Australia (0.04)
North America > Canada > Quebec > Montreal (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Continuing Education (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback