Searching for Efficient Transformers for Language Modeling

Neural Information Processing Systems

Large Transformer models have been central to recent advances in natural language processing. The training and inference costs of these models, however, have grown rapidly and become prohibitively expensive. Here we aim to reduce the costs of Transformers by searching for a more efficient variant. Compared to previous approaches, our search is performed at a lower level, over the primitives that define a Transformer TensorFlow program. We identify an architecture, named Primer, that has a smaller training cost than the original Transformer and other variants for auto-regressive language modeling.


A Appendix

Neural Information Processing Systems

A.1 TensorFlow Primitives Vocabulary. Each vocabulary entry lists a Name, the TF Function it maps to, and an Argument Mapping over the fields Input 1, Input 2, Constant, and Dim Size (for example, ADD maps to tf.math.add). "Name" is the name of the operation in our search vocabulary; "TF Function" is the TensorFlow function that the name is mapped to when a DNA instruction is executed; "Argument Mapping" describes how the values in a DNA's argument set are mapped to the corresponding TensorFlow function arguments. TensorFlow graphs are built from DNA programs as described in Section 2 of the main text. The vocabulary for relative dimensions is [1, 2, 4, 8, 12, 16, 24, 32, 48, 64]; this vocabulary was not tuned.
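The mapping above can be illustrated with a minimal sketch of a DNA-program interpreter. This is an assumption-laden toy, not the authors' implementation: it uses plain Python callables in place of TensorFlow functions such as tf.math.add, and the Instruction fields and execute semantics are invented for illustration only.

```python
# Toy sketch: mapping DNA instructions to primitives from a small vocabulary.
# Stand-ins for TF functions (the paper's "TF Function" column); names and
# program semantics here are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, Dict, List

VOCAB: Dict[str, Callable[[float, float], float]] = {
    "ADD": lambda a, b: a + b,   # would be tf.math.add in the paper's setting
    "MUL": lambda a, b: a * b,
    "MAX": max,
}

# Relative-dimension vocabulary quoted in the appendix (stated as not tuned).
RELATIVE_DIMS = [1, 2, 4, 8, 12, 16, 24, 32, 48, 64]

@dataclass
class Instruction:
    op: str       # "Name" column: which primitive to apply
    input1: int   # index of the first operand in the value bank
    input2: int   # index of the second operand

def execute(program: List[Instruction], inputs: List[float]) -> float:
    """Run a DNA-style program: each instruction looks up its primitive
    ("TF Function") and appends the result to a growing value bank."""
    bank = list(inputs)
    for ins in program:
        fn = VOCAB[ins.op]
        bank.append(fn(bank[ins.input1], bank[ins.input2]))
    return bank[-1]
```

For example, the two-instruction program [ADD(0, 1), MUL(2, 0)] on inputs [3.0, 4.0] first appends 7.0, then multiplies it by the first input to yield 21.0.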




A Primer on Causal and Statistical Dataset Biases for Fair and Robust Image Analysis

Jones, Charles, Glocker, Ben

arXiv.org Machine Learning

Machine learning methods often fail when deployed in the real world. Worse still, they fail in high-stakes situations and across socially sensitive lines. These issues have a chilling effect on the adoption of machine learning methods in settings such as medical diagnosis, where they are arguably best-placed to provide benefits if safely deployed. In this primer, we introduce the causal and statistical structures which induce failure in machine learning methods for image analysis. We highlight two previously overlooked problems, which we call the \textit{no fair lunch} problem and the \textit{subgroup separability} problem. We elucidate why today's fair representation learning methods fail to adequately solve them and propose potential paths forward for the field.


Primer C-VAE: An interpretable deep learning primer design method to detect emerging virus variants

Wang, Hanyu, Tsinda, Emmanuel K., Dunn, Anthony J., Chikweto, Francis, Zemkoho, Alain B.

arXiv.org Artificial Intelligence

Motivation: PCR is more economical and quicker than Next Generation Sequencing for detecting target organisms, with primer design being a critical step. In epidemiology with rapidly mutating viruses, designing effective primers is challenging. Traditional methods require substantial manual intervention and struggle to ensure effective primer design across different strains. For organisms with large, similar genomes like Escherichia coli and Shigella flexneri, differentiating between species is also difficult but crucial. Results: We developed Primer C-VAE, a model based on a Variational Auto-Encoder framework with Convolutional Neural Networks to identify variants and generate specific primers. On SARS-CoV-2 sequences, our model classified variants (alpha, beta, gamma, delta, omicron) with 98% accuracy and generated variant-specific primers. These primers appeared with >95% frequency in target variants and <5% in others, showing good performance in in-silico PCR tests. For Alpha, Delta, and Omicron, our primer pairs produced fragments <200 bp, suitable for qPCR detection. The model also generated effective primers for organisms with longer gene sequences like E. coli and S. flexneri. Conclusion: Primer C-VAE is an interpretable deep learning approach for developing specific primer pairs for target organisms. This flexible, semi-automated and reliable tool works regardless of sequence completeness and length, allowing for qPCR applications, and can be applied to organisms with large and highly similar genomes.
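The specificity criterion in the abstract (>95% frequency in target variants, <5% in others) can be sketched as a simple substring-frequency filter. The function names, exact-match test, and thresholds here are illustrative assumptions; the paper's pipeline generates and validates primers with a C-VAE and in-silico PCR, not this filter.

```python
# Hedged sketch of a variant-specificity filter for candidate primers.
# Exact substring matching stands in for whatever matching the paper uses.
from typing import List

def appearance_frequency(primer: str, sequences: List[str]) -> float:
    """Fraction of sequences that contain the primer as an exact substring."""
    if not sequences:
        return 0.0
    return sum(primer in seq for seq in sequences) / len(sequences)

def is_variant_specific(primer: str,
                        target_seqs: List[str],
                        other_seqs: List[str],
                        min_target: float = 0.95,
                        max_other: float = 0.05) -> bool:
    """Keep a primer only if it is common in the target variant
    and rare everywhere else (thresholds from the abstract)."""
    return (appearance_frequency(primer, target_seqs) > min_target
            and appearance_frequency(primer, other_seqs) < max_other)
```

A primer present in every target sequence and no non-target sequence passes; one absent from the targets is rejected regardless of the other set.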


A Primer on Large Language Models and their Limitations

Johnson, Sandra, Hyland-Wood, David

arXiv.org Artificial Intelligence

The world of artificial intelligence (AI) is increasingly penetrating all aspects of our personal and professional lives. This proliferation of AI tools and applications is being met with a mixture of excitement, scepticism and even dread [78]: excitement at the seemingly endless potential of AI applications such as LLMs, especially when they are integrated "within broader systems" [13]; scepticism as the realisation dawns that LLMs are in fact fallible, as evidenced by hallucinations, and hence not the golden bullet that can solve all problems [19, 21]; and a feeling of dread for those who believe that LLMs and AI have the potential to detrimentally impact our lives and make people redundant [78]. The ability of some LLMs to pass Theory of Mind (ToM) [64][32] and Turing Tests [7][42] suggests support for the Computational Theory of Mind (CTM), that cognition may be substrate independent. These findings challenge biological essentialism and open new avenues for creating sophisticated AI systems capable of human-like reasoning and interaction.


Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition

Ramu, Pritika, Goswami, Koustava, Saxena, Apoorv, Srinivasan, Balaji Vasan

arXiv.org Artificial Intelligence

Accurately attributing answer text to its source document is crucial for developing a reliable question-answering system. However, attribution for long documents remains largely unexplored. Post-hoc attribution systems are designed to map answer text back to the source document, yet the granularity of this mapping has not been addressed. Furthermore, a critical question arises: What exactly should be attributed? This involves identifying the specific information units within an answer that require grounding. In this paper, we propose and investigate a novel approach to the factual decomposition of generated answers for attribution, employing template-based in-context learning. To accomplish this, we utilize the question and integrate negative sampling during few-shot in-context learning for decomposition. This approach enhances the semantic understanding of both abstractive and extractive answers. We examine the impact of answer decomposition by providing a thorough examination of various attribution approaches, ranging from retrieval-based techniques to LLM-based attributors.
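The template-based few-shot decomposition with negative sampling that the abstract describes can be sketched as a prompt builder. Everything here is an assumption for illustration: the template wording, the representation of a negative sample as an example with no supported facts, and the function name are not taken from the paper.

```python
# Hedged sketch: assembling a few-shot prompt for factual answer
# decomposition, with negative samples included among the demonstrations.
from typing import List, Optional, Tuple

# Each example is (question, answer, facts); facts=None marks a negative
# sample whose expected output is an empty decomposition.
Example = Tuple[str, str, Optional[List[str]]]

def build_decomposition_prompt(question: str, answer: str,
                               examples: List[Example]) -> str:
    parts = ["Decompose the answer into atomic facts grounded in the question.\n"]
    for q, a, facts in examples:
        if facts:
            target = "\n".join(f"- {f}" for f in facts)
        else:
            target = "- (no supported facts)"  # negative sample
        parts.append(f"Question: {q}\nAnswer: {a}\nFacts:\n{target}\n")
    # The query instance ends the prompt, leaving the model to fill in Facts.
    parts.append(f"Question: {question}\nAnswer: {answer}\nFacts:")
    return "\n".join(parts)
```

Each decomposed fact could then be attributed independently, by retrieval over the source document or by an LLM-based attributor, as the paper's comparison suggests.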

