AITopics | subcomponent

subcomponent

f13ceb1b94145aad0e54186373cc86d7-Paper-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 19:51:23 GMT

constraint, generative model, subcomponent, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.70)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.70)

Decomposition of Small Transformer Models

Christensen, Casper L., Riggs, Logan

arXiv.org Artificial Intelligence, Dec-10-2025

Recent work in mechanistic interpretability has shown that decomposing models in parameter space may yield clean handles for analysis and intervention. Previous methods have demonstrated successful applications on a wide range of toy models, but the gap to "real models" has not yet been bridged. In this work, we extend Stochastic Parameter Decomposition (SPD) to Transformer models, proposing an updated causal importance function suited for sequential data and a new loss function. We demonstrate that SPD can successfully decompose a toy induction-head model and recover the expected 2-step circuit. We also show that applying SPD to GPT-2-small can successfully locate subcomponents corresponding to interpretable concepts like "golf" and "basketball". These results take the first step in the direction of extending SPD to modern models, and show that we can use the method to surface interpretable parameter-space mechanisms.

artificial intelligence, machine learning, subcomponent, (16 more...)

arXiv.org Artificial Intelligence

2511.08854

Country: North America (0.28)

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Stochastic Parameter Decomposition

Bushnaq, Lucius, Braun, Dan, Sharkey, Lee

arXiv.org Artificial Intelligence, Sep-5-2025

A key step in reverse engineering neural networks is to decompose them into simpler parts that can be studied in relative isolation. Linear parameter decomposition, a framework that has been proposed to resolve several issues with current decomposition methods, decomposes neural network parameters into a sum of sparsely used vectors in parameter space. However, the current main method in this framework, Attribution-based Parameter Decomposition (APD), is impractical on account of its computational cost and sensitivity to hyperparameters. In this work, we introduce Stochastic Parameter Decomposition (SPD), a method that is more scalable and robust to hyperparameters than APD, which we demonstrate by decomposing models that are slightly larger and more complex than was possible to decompose with APD. We also show that SPD avoids other issues, such as shrinkage of the learned parameters, and better identifies ground truth mechanisms in toy models. By bridging causal mediation analysis and network decomposition methods, this demonstration opens up new research possibilities in mechanistic interpretability by removing barriers to scaling linear parameter decomposition methods to larger models. We release a library for running SPD and reproducing our experiments at https://github.com/goodfire-ai/spd/tree/spd-paper.
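To make the idea of "a sum of sparsely used vectors in parameter space" concrete, here is a minimal toy sketch. It is not the SPD algorithm and does not use the released library: the rank-one components, the hand-set mask, and all function names are assumptions chosen for illustration. It only shows that a weight matrix written as a sum of components gives the same output when a component unused by a given input is masked out.

```python
# Toy illustration only: hand-built components and masks, not learned by SPD.

def outer(u, v):
    """Rank-one matrix u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def masked_weights(components, mask):
    """Sum the components, each scaled by its mask value in [0, 1]."""
    rows, cols = len(components[0]), len(components[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for m, comp in zip(mask, components):
        for i in range(rows):
            for j in range(cols):
                total[i][j] += m * comp[i][j]
    return total

def matvec(w, x):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

# Two hypothetical components: one reads input coordinate 0, the other coordinate 1.
components = [
    outer([1.0, 0.0], [2.0, 0.0]),
    outer([0.0, 1.0], [0.0, 3.0]),
]

full = masked_weights(components, [1.0, 1.0])     # original weights
ablated = masked_weights(components, [1.0, 0.0])  # second component removed

x = [1.0, 0.0]  # this input only activates the first component
out_full = matvec(full, x)
out_ablated = matvec(ablated, x)
```

In this toy, ablating the second component leaves the output for `x` unchanged, which is the sense in which that component is "unused" on this input.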

artificial intelligence, machine learning, subcomponent, (18 more...)

arXiv.org Artificial Intelligence

2506.20790

Country: North America > United States (0.28)

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

f13ceb1b94145aad0e54186373cc86d7-Paper-Conference.pdf

Neural Information Processing Systems, Aug-19-2025, 18:54:00 GMT

constraint, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.70)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Music Genre Classification: Ensemble Learning with Subcomponents-level Attention

Liu, Yichen, Dasgupta, Abhijit, He, Qiwei

arXiv.org Artificial Intelligence, Dec-20-2024

Music genre classification is one of the most popular topics in the fields of Music Information Retrieval (MIR) and digital signal processing. Deep learning has emerged as the top performer among the various methods for classifying music genres. This letter introduces a novel approach that combines ensemble learning with attention to subcomponents, aiming to improve the accuracy of music genre identification. The core innovation of our work is to classify the subcomponents of each music piece separately, allowing our model to capture distinct characteristics from those subcomponents. By applying ensemble learning techniques to these individual classifications, we make the final decision on the genre of the piece. The proposed method achieves higher accuracy than other state-of-the-art techniques trained and tested on the GTZAN dataset.
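The step of combining per-subcomponent classifications into one genre decision can be sketched as soft voting. This is an assumption for illustration: the abstract does not say which ensemble rule or attention weighting the authors use, and the function name and probability values below are hypothetical.

```python
from collections import defaultdict

def soft_vote(segment_probs):
    """Average per-subcomponent class probabilities and pick the top genre.

    segment_probs: list of dicts mapping genre -> probability, one dict per
    subcomponent of the piece (hypothetical classifier outputs).
    """
    if not segment_probs:
        raise ValueError("need at least one subcomponent prediction")
    totals = defaultdict(float)
    for probs in segment_probs:
        for genre, p in probs.items():
            totals[genre] += p
    n = len(segment_probs)
    averaged = {genre: total / n for genre, total in totals.items()}
    return max(averaged, key=averaged.get), averaged

# Made-up predictions for three subcomponents of one piece.
segments = [
    {"rock": 0.6, "jazz": 0.4},
    {"rock": 0.3, "jazz": 0.7},
    {"rock": 0.8, "jazz": 0.2},
]
label, averaged = soft_vote(segments)
```

Here the second subcomponent alone would be labelled jazz, but the averaged vote selects rock for the whole piece.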

artificial intelligence, classification, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2412.15602

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification

Guo, Yuxuan, Peng, Shaohui, Guo, Jiaming, Huang, Di, Zhang, Xishan, Zhang, Rui, Hao, Yifan, Li, Ling, Tian, Zikang, Gao, Mingju, Li, Yutai, Gan, Yiming, Liang, Shuai, Zhang, Zihao, Du, Zidong, Guo, Qi, Hu, Xing, Chen, Yunji

arXiv.org Artificial Intelligence, May-24-2024

Building open agents has always been the ultimate goal in AI research, and creative agents are even more enticing. Existing LLM agents excel at long-horizon tasks with well-defined goals (e.g., "mine diamonds" in Minecraft). However, they encounter difficulties on creative tasks with open goals and abstract criteria because they cannot bridge the gap between them, and thus lack feedback for self-improvement in solving the task. In this work, we introduce autonomous embodied verification techniques for agents to fill the gap, laying the groundwork for creative tasks. Specifically, we propose the Luban agent, which targets creative building tasks in Minecraft and is equipped with two-level autonomous embodied verification inspired by human design practices: (1) visual verification of 3D structural speculates, which comes from agent synthesized CAD modeling programs; (2) pragmatic verification of the creation by generating and verifying environment-relevant functionality programs based on the abstract criteria. Extensive multi-dimensional human studies and Elo ratings show that Luban completes diverse creative building tasks in our proposed benchmark and outperforms other baselines (33% to 100%) in both visualization and pragmatism. Additional demos on a real-world robotic arm show the creation potential of Luban in the physical world.

annotation, subcomponent, verification, (16 more...)

arXiv.org Artificial Intelligence

2405.15414

Country:

North America > United States (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (0.73)
Leisure & Entertainment > Games > Chess (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Robots (0.87)
(2 more...)

Undesirable Biases in NLP: Addressing Challenges of Measurement

van der Wal, Oskar, Bachmann, Dominik, Leidinger, Alina, van Maanen, Leendert, Zuidema, Willem, Schulz, Katrin

arXiv.org Artificial Intelligence, Jan-14-2024

As Large Language Models and Natural Language Processing (NLP) technology rapidly develop and spread into daily life, it becomes crucial to anticipate how their use could harm people. One problem that has received a lot of attention in recent years is that this technology has displayed harmful biases, from generating derogatory stereotypes to producing disparate outcomes for different social groups. Although a lot of effort has been invested in assessing and mitigating these biases, our methods of measuring the biases of NLP models have serious problems and it is often unclear what they actually measure. In this paper, we provide an interdisciplinary approach to discussing the issue of NLP model bias by adopting the lens of psychometrics -- a field specialized in the measurement of concepts like bias that are not directly observable. In particular, we will explore two central notions from psychometrics, the construct validity and the reliability of measurement tools, and discuss how they can be applied in the context of measuring model bias. Our goal is to provide NLP practitioners with methodological tools for designing better bias measures, and to inspire them more generally to explore tools from psychometrics when working on bias measurement tools.

bias measure, computational linguistic, validity, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1.15195

2211.13709

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(11 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)
Education > Assessment & Standards (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Distributed Latency Profiling through Critical Path Tracing

Communications of the ACM, Dec-21-2022, 02:25:49 GMT

The collection of ProducerModules creates a graph of subcomponents that are executed by the framework to process the request. For this example, the framework knows which of A2, B1, and B2 was the last to block execution to produce A1's output. Since the framework is aware of the subcomponent dependencies, it can record the critical path. For Google Search, subcomponent-level traces are collected from several software frameworks in multiple programming languages. Framework-level implementation is essential for scalability, since it allows relatively small teams of developers to provide detailed critical path traces for code written by thousands of other people.
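The idea of recording the critical path from known subcomponent dependencies can be sketched as follows. This is a simplified illustration, not the framework described in the article: the dependency map, the finish times, the extra node C1, and the function name are all assumptions. Starting from the output node, it repeatedly follows the dependency that finished last, since that is the one that last blocked execution.

```python
def critical_path(root, deps, finish_time):
    """Walk from root through the last-finishing dependency at each step.

    deps: node -> list of nodes it waits on (hypothetical graph).
    finish_time: node -> completion timestamp in milliseconds (made-up values).
    """
    path = [root]
    node = root
    while deps.get(node):
        # The dependency that finished last is the one that blocked this node.
        node = max(deps[node], key=lambda d: finish_time[d])
        path.append(node)
    return path

# A1 waits on A2, B1 and B2, mirroring the excerpt; C1 is an invented extra hop.
deps = {"A1": ["A2", "B1", "B2"], "B1": ["C1"]}
finish_ms = {"A1": 50.0, "A2": 20.0, "B1": 40.0, "B2": 30.0, "C1": 15.0}

path = critical_path("A1", deps, finish_ms)
```

With these made-up timings, B1 is the last input to become available for A1, so the path runs through B1 and then its own dependency C1.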

critical path, latency, subcomponent, (15 more...)

Communications of the ACM

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence (1.00)
Information Technology > Information Management > Search (0.69)

Learning Causal Graphs in Manufacturing Domains using Structural Equation Models

Kertel, Maximilian, Harmeling, Stefan, Pauly, Markus

arXiv.org Artificial Intelligence, Oct-26-2022

Many production processes are characterized by numerous and complex cause-and-effect relationships. Since they are only partially known, they pose a challenge to effective process control. In this work, we present how Structural Equation Models can be used for deriving cause-and-effect relationships from the combination of prior knowledge and process data in the manufacturing domain. Compared to existing applications, we do not assume linear relationships, which leads to more informative results.

artificial intelligence, knowledge, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2210.14573

Country:

Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Industry:

Energy > Energy Storage (0.47)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.48)

Accelerating the Genetic Algorithm for Large-scale Traveling Salesman Problems by Cooperative Coevolutionary Pointer Network with Reinforcement Learning

Zhong, Rui, Zhang, Enzhi, Munetomo, Masaharu

arXiv.org Artificial Intelligence, Sep-26-2022

In this paper, we propose a two-stage optimization strategy, named CCPNRL-GA, for solving Large-scale Traveling Salesman Problems (LSTSPs). First, we hypothesize that the participation of a well-performing individual as an elite can accelerate the convergence of optimization. Based on this hypothesis, in the first stage, we cluster the cities and decompose the LSTSPs into multiple subcomponents, and each subcomponent is optimized with a reusable Pointer Network (PtrNet). After optimizing the subcomponents, we combine all sub-tours to form a valid solution, and this solution joins the second stage of optimization with GA. We validate the performance of our proposal on 10 LSTSPs and compare it with traditional EAs. Experimental results show that the participation of an elite individual can greatly accelerate the optimization of LSTSPs, and our proposal has broad prospects for dealing with LSTSPs.
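The first stage described above (cluster the cities, solve each subcomponent, join the sub-tours into one valid tour) can be sketched roughly as below. This is not CCPNRL-GA: grid bucketing stands in for the clustering step, a nearest-neighbour heuristic stands in for the Pointer Network, and the coordinates and function names are hypothetical.

```python
import math

def nearest_neighbor_tour(cities, coords):
    """Order one cluster greedily (stand-in for the PtrNet sub-solver)."""
    remaining = list(cities)
    tour = [remaining.pop(0)]
    while remaining:
        last = coords[tour[-1]]
        nxt = min(remaining, key=lambda c: math.dist(last, coords[c]))
        remaining.remove(nxt)
        tour.append(nxt)
    return tour

def decompose_and_merge(coords, cell=1.0):
    """Bucket cities into grid cells, solve each cell, concatenate sub-tours."""
    clusters = {}
    for city, (x, y) in enumerate(coords):
        clusters.setdefault((int(x // cell), int(y // cell)), []).append(city)
    tour = []
    for key in sorted(clusters):  # fixed cell order keeps the result deterministic
        tour.extend(nearest_neighbor_tour(clusters[key], coords))
    return tour

def tour_length(tour, coords):
    """Length of the closed tour, returning to the starting city."""
    return sum(
        math.dist(coords[a], coords[b])
        for a, b in zip(tour, tour[1:] + tour[:1])
    )

# Made-up city coordinates for illustration.
coords = [(0.1, 0.2), (0.8, 0.9), (1.5, 0.3), (1.2, 0.7), (0.4, 1.6), (1.9, 1.8)]
elite = decompose_and_merge(coords)
elite_length = tour_length(elite, coords)
```

The merged tour visits every city exactly once, so it could be injected into a genetic algorithm's initial population as the elite individual for a second stage.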

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2209.13077

Country:

Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.05)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
