AITopics | subfunction

Collaborating Authors

subfunction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization

Neural Information Processing SystemsJun-15-2026, 08:40:40 GMT

We consider a decentralized setup in which the participants collaboratively train and serve a large neural network, and where each participant only processes a subset of the model. In this setup, we explore the possibility of unmaterializable weights, where a full weight set is never available to any one participant. We introduce Unextractable Protocol Models (UPMs): a training and inference framework that leverages the sharded model setup to ensure model shards (i.e., subsets) held by participants are incompatible at different time steps. UPMs periodically inject timevarying, random, invertible transforms at participant boundaries; preserving the overall network function yet rendering cross-time assemblies incoherent. On Qwen2.5-0.5B and Llama-3.2-1B, 10 000 transforms leave FP32 perplexity unchanged ( PPL< 0.01; Jensen-Shannon drift < 4 10 5), and we show how to control growth for lower precision datatypes. Applying a transform every 30s adds 3% latency, 0.1% bandwidth, and 10% GPU-memory overhead at inference, while training overhead falls to 1.6% time and < 1% memory. We consider several attacks, showing that the requirements of direct attacks are impractical and easy to defend against, and that gradient-based fine-tuning of stitched partitions consumes 60% of the tokens required to train from scratch. By enabling models to be collaboratively trained yet not extracted, UPMs make it practical to embed programmatic incentive mechanisms in community-driven decentralized training.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits

Ahmad, Areeb, Joshi, Abhinav, Modi, Ashutosh

arXiv.org Artificial IntelligenceNov-26-2025

Transformer-based language models exhibit complex and distributed behavior, yet their internal computations remain poorly understood. Existing mechanistic interpretability methods typically treat attention heads and multilayer perceptron layers (MLPs) (the building blocks of a transformer architecture) as indivisible units, overlooking possibilities of functional substructure learned within them. In this work, we introduce a more fine-grained perspective that decomposes these components into orthogonal singular directions, revealing superposed and independent computations within a single head or MLP. We validate our perspective on widely used standard tasks like Indirect Object Identification (IOI), Gender Pronoun (GP), and Greater Than (GT), showing that previously identified canonical functional heads, such as the name mover, encode multiple overlapping subfunctions aligned with distinct singular directions. Nodes in a computational graph, that are previously identified as circuit elements show strong activation along specific low-rank directions, suggesting that meaningful computations reside in compact subspaces. While some directions remain challenging to interpret fully, our results highlight that transformer computations are more distributed, structured, and compositional than previously assumed. This perspective opens new avenues for fine-grained mechanistic interpretability and a deeper understanding of model internals.

large language model, machine learning, singular direction, (20 more...)

arXiv.org Artificial Intelligence

2511.20273

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Complex System Diagnostics Using a Knowledge Graph-Informed and Large Language Model-Enhanced Framework

Marandi, Saman, Hu, Yu-Shu, Modarres, Mohammad

arXiv.org Artificial IntelligenceSep-1-2025

In this paper, we present a novel diagnostic framework that integrates Knowledge Graphs (KGs) and Large Language Models (LLMs) to support system diagnostics in high-reliability systems such as nuclear power plants. Traditional diagnostic modeling struggles when systems become too complex, making functional modeling a more attractive approach. Our approach introduces a diagnostic framework grounded in the functional modeling principles of the Dynamic Master Logic (DML) model. It incorporates two coordinated LLM components, including an LLM-based workflow for automated construction of DML logic from system documentation and an LLM agent that facilitates interactive diagnostics. The generated logic is encoded into a structured KG, referred to as KG-DML, which supports hierarchical fault reasoning. Expert knowledge or operational data can also be incorporated to refine the model's precision and diagnostic depth. In the interaction phase, users submit natural language queries, which are interpreted by the LLM agent. The agent selects appropriate tools for structured reasoning, including upward and downward propagation across the KG-DML. Rather than embedding KG content into every prompt, the LLM agent distinguishes between diagnostic and interpretive tasks. For diagnostics, the agent selects and executes external tools that perform structured KG reasoning. For general queries, a Graph-based Retrieval-Augmented Generation (Graph-RAG) approach is used, retrieving relevant KG segments and embedding them into the prompt to generate natural explanations. A case study on an auxiliary feedwater system demonstrated the framework's effectiveness, with over 90% accuracy in key elements and consistent tool and argument extraction, supporting its use in safety-critical diagnostics.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/app15179428

2505.21291

Country: North America > United States (0.46)

Genre:

Workflow (0.67)
Research Report (0.64)

Industry: Energy > Power Industry > Utilities > Nuclear (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Subfunction Structure Matters: A New Perspective on Local Optima Networks

Thomson, S. L., Przewozniczek, M. W.

arXiv.org Artificial IntelligenceApr-28-2025

Local optima networks (LONs) capture fitness landscape information. They are typically constructed in a black-box manner; information about the problem structure is not utilised. This also applies to the analysis of LONs: knowledge about the problem, such as interaction between variables, is not considered. We challenge this status-quo with an alternative approach: we consider how LON analysis can be improved by incorporating subfunction-based information - this can either be known a-priori or learned during search. To this end, LONs are constructed for several benchmark pseudo-boolean problems using three approaches: firstly, the standard algorithm; a second algorithm which uses deterministic grey-box crossover; and a third algorithm which selects perturbations based on learned information about variable interactions. Metrics related to subfunction changes in a LON are proposed and compared with metrics from previous literature which capture other aspects of a LON. Incorporating problem structure in LON construction and analysing it can bring enriched insight into optimisation dynamics. Such information may be crucial to understanding the difficulty of solving a given problem with state-of-the-art linkage learning optimisers. In light of the results, we suggest incorporation of problem structure as an alternative paradigm in landscape analysis for problems with known or suspected subfunction structure.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3712256.3726426

2504.17799

Country:

North America > United States (0.47)
Europe > United Kingdom (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Shi, Yuling, Wang, Songsong, Wan, Chengcheng, Gu, Xiaodong

arXiv.org Artificial IntelligenceOct-5-2024

While large language models have made significant strides in code generation, the pass rate of the generated code is bottlenecked on subtle errors, often requiring human intervention to pass tests, especially for complex problems. Existing LLM-based debugging systems treat generated programs as monolithic units, failing to address bugs at multiple levels of granularity, from low-level syntax errors to high-level algorithmic flaws. In this paper, we introduce Multi-Granularity Debugger (MGDebugger), a hierarchical code debugger by isolating, identifying, and resolving bugs at various levels of granularity. MGDebugger decomposes problematic code into a hierarchical tree structure of subfunctions, with each level representing a particular granularity of error. During debugging, it analyzes each subfunction and iteratively resolves bugs in a bottom-up manner. To effectively test each subfunction, we propose an LLM-simulated Python executor, which traces code execution and tracks important variable states to pinpoint errors accurately. Extensive experiments demonstrate that MGDebugger outperforms existing debugging systems, achieving an 18.9% improvement in accuracy over seed generations in HumanEval and a 97.6% repair success rate in HumanEvalFix. Furthermore, MGDebugger effectively fixes bugs across different categories and difficulty levels, demonstrating its robustness and effectiveness.

mgdebugger, subfunction, test case, (17 more...)

arXiv.org Artificial Intelligence

2410.01215

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre:

Workflow (0.93)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Achieving interpretable machine learning by functional decomposition of black-box models into explainable predictor effects

Köhler, David, Rügamer, David, Schmid, Matthias

arXiv.org Machine LearningJul-26-2024

Machine learning (ML) has increased greatly in both popularity and significance, driven by an increase in methods, computing power and data availability [33]. On July 5, 2024, a search on Web of Science for publications including the term "machine learning" yielded more than 350,000 results, corresponding to an average annual increase by more than 20% since 2006. ML models are often characterized by their high generalizability, making them particularly successful when used for supervised learning tasks like classification and risk prediction. In recent years, ML models based on deep artificial neural networks (ANNs) have led to groundbreaking results in the development of high-performing prediction models. The high prediction accuracy of modern ML models is usually achieved by optimizing complex "black-box" architectures with thousands of parameters. As a consequence, they often result in predictions that are difficult, if not impossible, to interpret. This interpretability problem has been hindering the use of ML in fields like medicine, ecology and insurance, where an understanding of the model and its inner workings is paramount to ensure user acceptance and fairness. In a recent environmental study, for example, we explored the use of ML to derive predictions of stream biological condition in the Chesapeake Bay watershed of the mid-Atlantic coast of North America [26]. Clearly, if these predictions are intended to inform future management policies (projecting, e.g., changes in land use, climate and watershed characteristics), they are required to be interpretable in terms of relevant features as well as the directions and strengths of the feature effects.

algorithm, decomposition, orthogonality, (15 more...)

arXiv.org Machine Learning

2407.1865

Country:

North America > United States > Virginia (0.24)
North America > United States > Maryland (0.24)
Atlantic Ocean > North Atlantic Ocean > Chesapeake Bay (0.24)
(3 more...)

Genre: Research Report (1.00)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

NK Hybrid Genetic Algorithm for Clustering

Tinós, Renato, Zhao, Liang, Chicano, Francisco, Whitley, Darrell

arXiv.org Artificial IntelligenceFeb-6-2024

The NK hybrid genetic algorithm for clustering is proposed in this paper. In order to evaluate the solutions, the hybrid algorithm uses the NK clustering validation criterion 2 (NKCV2). NKCV2 uses information about the disposition of $N$ small groups of objects. Each group is composed of $K+1$ objects of the dataset. Experimental results show that density-based regions can be identified by using NKCV2 with fixed small $K$. In NKCV2, the relationship between decision variables is known, which in turn allows us to apply gray box optimization. Mutation operators, a partition crossover, and a local search strategy are proposed, all using information about the relationship between decision variables. In partition crossover, the evaluation function is decomposed into $q$ independent components; partition crossover then deterministically returns the best among $2^q$ possible offspring with computational complexity $O(N)$. The NK hybrid genetic algorithm allows the detection of clusters with arbitrary shapes and the automatic estimation of the number of clusters. In the experiments, the NK hybrid genetic algorithm produced very good results when compared to another genetic algorithm approach and to state-of-art clustering algorithms.

algorithm, criterion, dataset, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TEVC.2018.2828643

2402.03813

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Colorado > Larimer County > Fort Collins (0.04)
North America > United States > Arizona (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Mainline Automatic Train Horn and Brake Performance Metric

Tagiew, Rustam

arXiv.org Artificial IntelligenceJul-5-2023

This paper argues for the introduction of a mainline rail-oriented performance metric for driver-replacing on-board perception systems. Perception at the head of a train is divided into several subfunctions. This article presents a preliminary submetric for the obstacle detection subfunction. To the best of the author's knowledge, no other such proposal for obstacle detection exists. A set of submetrics for the subfunctions should facilitate the comparison of perception systems among each other and guide the measurement of human driver performance. It should also be useful for a standardized prediction of the number of accidents for a given perception system in a given operational design domain. In particular, for the proposal of the obstacle detection submetric, the professional readership is invited to provide their feedback and quantitative information to the author. The analysis results of the feedback will be published separately later.

artificial intelligence, detection, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.02586

Country: Europe > Germany (0.28)

Genre: Research Report (0.70)

Industry:

Transportation > Ground > Rail (1.00)
Transportation > Infrastructure & Services (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Add feedback

Problem examination for AI methods in product design

Rosenthal, Philipp, Niggemann, Oliver

arXiv.org Artificial IntelligenceJan-19-2022

Artificial Intelligence (AI) has significant potential for product design: AI can check technical and non-technical constraints on products, it can support a quick design of new product variants and new AI methods may also support creativity. But currently product design and AI are separate communities fostering different terms and theories. This makes a mapping of AI approaches to product design needs difficult and prevents new solutions. As a solution, this paper first clarifies important terms and concepts for the interdisciplinary domain of AI methods in product design. A key contribution of this paper is a new classification of design problems using the four characteristics decomposability, inter-dependencies, innovation and creativity. Definitions of these concepts are given where they are lacking. Early mappings of these concepts to AI solutions are sketched and verified using design examples. The importance of creativity in product design and a corresponding gap in AI is pointed out for future research.

design problem, product design, synthesis, (17 more...)

arXiv.org Artificial Intelligence

2201.07642

Country:

North America > United States > Wisconsin (0.04)
North America > United States > New York (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.48)

Add feedback

Predicting Unreliable Predictions by Shattering a Neural Network

Ji, Xu, Pascanu, Razvan, Hjelm, Devon, Vedaldi, Andrea, Lakshminarayanan, Balaji, Bengio, Yoshua

arXiv.org Machine LearningJun-15-2021

Piecewise linear neural networks can be split into subfunctions, each with its own activation pattern, domain, and empirical error. Empirical error for the full network can be written as an expectation over empirical error of subfunctions. Constructing a generalization bound on subfunction empirical error indicates that the more densely a subfunction is surrounded by training samples in representation space, the more reliable its predictions are. Further, it suggests that models with fewer activation regions generalize better, and models that abstract knowledge to a greater degree generalize better, all else equal. We propose not only a theoretical framework to reason about subfunction error bounds but also a pragmatic way of approximately evaluating it, which we apply to predicting which samples the network will not successfully generalize to. We test our method on detection of misclassification and out-of-distribution samples, finding that it performs competitively in both cases. In short, some network activation patterns are associated with higher reliability than others, and these can be identified using subfunction error bounds.

activation region, neural network, subfunction, (14 more...)

arXiv.org Machine Learning

2106.08365

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Add feedback