Architectural Backdoors for Within-Batch Data Stealing and Model Inference Manipulation
Küchler, Nicolas, Petrov, Ivan, Grobler, Conrad, Shumailov, Ilia
For nearly a decade the academic community has investigated backdoors in neural networks, primarily focusing on classification tasks where adversaries manipulate the model prediction. While demonstrably malicious, the immediate real-world impact of such prediction-altering attacks has remained unclear. In this paper we introduce a novel and significantly more potent class of backdoors that builds upon recent advancements in architectural backdoors. We demonstrate how these backdoors can be specifically engineered to exploit batched inference, a common technique for improving hardware utilization, enabling large-scale user data manipulation and theft. By targeting the batching process, these architectural backdoors facilitate information leakage between concurrent user requests and allow attackers to fully control model responses directed at other users within the same batch. In other words, an attacker who can change the model architecture can set and steal the model inputs and outputs of other users within the same batch. We show that such attacks are not only feasible but also alarmingly effective, can be readily injected into prevalent model architectures, and represent a truly malicious threat to user privacy and system integrity. Critically, to counteract this new class of vulnerabilities, we propose a deterministic mitigation strategy that provides formal guarantees against this new attack vector, unlike prior work that relied on Large Language Models to find the backdoors. Our mitigation strategy employs a novel Information Flow Control mechanism that analyzes the model graph and proves non-interference between different user inputs within the same batch. Using our mitigation strategy we perform a large-scale analysis of models hosted through Hugging Face and find over 200 models that introduce (unintended) information leakage between batch entries due to the use of dynamic quantization.
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.04)
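The batch-leakage idea can be sketched in plain NumPy (an illustration of the concept, not the paper's construction): a benign linear layer acts on each batch row independently, but a layer that also mixes rows along the batch dimension can copy one user's activations into another user's output.

```python
import numpy as np

def honest_layer(x, w):
    # Weights act on the feature dimension only; batch rows stay independent.
    return x @ w

def backdoored_layer(x, w, mix):
    # Also multiplies along the batch dimension, entangling users' rows.
    return mix @ (x @ w)

rng = np.random.default_rng(0)
batch = rng.normal(size=(2, 4))   # two users' inputs in one batch
w = np.eye(4)                     # identity weights, for clarity

mix = np.eye(2)
mix[1] = [1.0, 0.0]               # row 1's output := row 0's activations

out = backdoored_layer(batch, w, mix)
# User 1 now receives user 0's data verbatim.
assert np.allclose(out[1], batch[0])
```

A non-interference check of the kind the paper's mitigation describes would reject any graph in which an output row depends on a different input row, i.e. any such non-diagonal batch mixing.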
A Critical Study of What Code-LLMs (Do Not) Learn
Anand, Abhinav, Verma, Shweta, Narasimhan, Krishna, Mezini, Mira
Large Language Models trained on code corpora (code-LLMs) have demonstrated impressive performance in various coding assistance tasks. However, despite their increased size and larger training datasets, code-LLMs still have limitations, such as suggesting code with syntactic errors or misused variables. Some studies argue that code-LLMs perform well on coding tasks because they use self-attention and hidden representations to encode relations among input tokens. However, previous works have not studied what code properties are not encoded by code-LLMs. In this paper, we conduct a fine-grained analysis of attention maps and hidden representations of code-LLMs. Our study indicates that code-LLMs only encode relations among specific subsets of input tokens. Specifically, by categorizing input tokens into syntactic tokens and identifiers, we found that models encode relations among syntactic tokens and among identifiers, but they fail to encode relations between syntactic tokens and identifiers. We also found that fine-tuned models encode these relations poorly compared to their pre-trained counterparts. Additionally, larger models with billions of parameters encode significantly less information about code than models with only a few hundred million parameters.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (16 more...)
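The category-level attention analysis described above can be sketched as follows (a simplified illustration, not the paper's exact methodology): given one attention map and a syntactic-vs-identifier label per token, average the attention mass within and across the two categories.

```python
import numpy as np

def category_attention(attn, is_identifier):
    """Mean attention within and across token categories.

    attn: (T, T) attention map, rows = query tokens, cols = key tokens.
    is_identifier: length-T booleans (False = syntactic token).
    """
    ident = np.asarray(is_identifier, dtype=bool)
    groups = {
        "syn->syn": (~ident, ~ident),
        "id->id":   (ident,  ident),
        "syn->id":  (~ident, ident),
        "id->syn":  (ident,  ~ident),
    }
    return {name: float(attn[np.ix_(q, k)].mean())
            for name, (q, k) in groups.items()}

# Toy 4-token example with uniform attention; tokens 1 and 3 are identifiers.
attn = np.full((4, 4), 0.25)
scores = category_attention(attn, [False, True, False, True])
assert abs(scores["syn->id"] - 0.25) < 1e-9
```

A finding like the paper's would show up here as "syn->id" and "id->syn" scores markedly below the within-category scores on real attention maps.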
On the Origin of Llamas: Model Tree Heritage Recovery
Horwitz, Eliahu, Shul, Asaf, Hoshen, Yedid
The rapid growth of neural network models shared on the internet has made model weights an important data modality. However, this information is underutilized as the weights are uninterpretable, and publicly available models are disorganized. Inspired by Darwin's tree of life, we define the Model Tree, which describes the origin of models, i.e., the parent model that was used to fine-tune the target model. As in the natural world, the tree structure is unknown. In this paper, we introduce the task of Model Tree Heritage Recovery (MoTHer Recovery) for discovering Model Trees in the ever-growing universe of neural networks. Our hypothesis is that model weights encode this information; the challenge is to decode the underlying tree structure given the weights. Beyond the immediate application of model authorship attribution, MoTHer Recovery holds exciting long-term applications akin to indexing the internet by search engines. Practically, for each pair of models, this task requires: i) determining if they are related, and ii) establishing the direction of the relationship. We find that certain distributional properties of the weights evolve monotonically during training, which enables us to classify the relationship between two given models. MoTHer Recovery reconstructs entire model hierarchies, represented by a directed tree, where a parent model gives rise to multiple child models through additional training. Our approach successfully reconstructs complex Model Trees, as well as the structure of "in-the-wild" model families such as Llama 2 and Stable Diffusion.
- Oceania > Australia (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
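The edge-orientation idea can be sketched as follows. This is a hypothetical illustration, not the paper's method: it assumes, purely for the sake of the example, that some scalar statistic of the weights (here excess kurtosis) only grows with continued training, so the model with the lower value is the parent.

```python
import numpy as np

def kurtosis(w):
    """Excess kurtosis of a flattened weight tensor (illustrative statistic)."""
    w = np.ravel(np.asarray(w, dtype=float))
    z = (w - w.mean()) / w.std()
    return float((z ** 4).mean() - 3.0)

def orient_edge(weights_a, weights_b):
    """Guess the parent->child direction between two related models, under
    the assumed rule that the statistic grows monotonically with training."""
    return "a->b" if kurtosis(weights_a) < kurtosis(weights_b) else "b->a"

# Toy "parent": near-uniform weights (low kurtosis). Toy "child": the same
# weights after hypothetical further training that produced a heavy tail.
parent = np.linspace(-1.0, 1.0, 101)
child = parent.copy()
child[:3] = 10.0          # a few large outliers -> much higher kurtosis
assert orient_edge(parent, child) == "a->b"
```

Recovering a full Model Tree would then amount to running a pairwise test like this over all related model pairs and assembling the oriented edges into a directed tree.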
RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition
Zhou, Wei, Beck, Eugen, Berger, Simon, Schlüter, Ralf, Ney, Hermann
Modern public ASR tools usually provide rich support for training various sequence-to-sequence (S2S) models, but rather simple support for decoding open-vocabulary scenarios only. For closed-vocabulary scenarios, public tools supporting lexically constrained decoding usually target only classical ASR, or do not support all S2S models. To eliminate this restriction on research possibilities such as modeling unit choice, we present RASR2 in this work, a research-oriented generic S2S decoder implemented in C++. It offers strong flexibility and compatibility across various S2S models, language models, label units/topologies and neural network architectures. It provides efficient decoding for both open- and closed-vocabulary scenarios based on a generalized search framework with rich support for different search modes and settings. We evaluate RASR2 with a wide range of experiments on both the Switchboard and LibriSpeech corpora. Our source code is publicly available online.
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
- North America > United States > Georgia > Chatham County > Savannah (0.04)
- Europe > Austria > Styria > Graz (0.04)
- (2 more...)
Computing H-Partitions in ASP and Datalog
Capon, Chloé, Lecomte, Nicolas, Wijsen, Jef
Answer Set Programming (ASP) is a powerful programming paradigm that allows for an easy encoding of decision problems in NP. If the answer to a problem in NP is "yes," then, by definition, there is a "yes"-certificate that can be checked in polynomial time. In an ASP guess-and-check program, a programmer first declares the format of such a certificate, and then specifies the constraints that a well-formatted certificate should obey in order to be a "yes"-certificate. For example, for the well-known problem SAT, an ASP programmer can first declare that certificates take the form of truth assignments, and then specify that "yes"-certificates are those certificates that leave no clause unsatisfied. While ASP guess-and-check programs are typically oriented towards NP-complete problems, they can also be used for problems in P. For example, the previously mentioned encoding of SAT also solves 2SAT, which is known to be in P. This raises the following issue, which will be addressed in this paper. Assume that we have an answer set solver at our disposal, and that we have written a guess-and-check ASP program for a particular problem that is NP-complete in general (for example, SAT). Assume furthermore that we know that under some restrictions, the problem can be solved in polynomial time (for example, the restriction of SAT to 2SAT).
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Belgium (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
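The guess-and-check pattern the abstract describes can be mirrored in Python (a brute-force sketch, not ASP syntax): "guess" every truth assignment, then "check" that no clause is left unsatisfied.

```python
from itertools import product

def sat(clauses, n_vars):
    """Guess-and-check SAT. Clauses are lists of non-zero ints in DIMACS
    style: k means variable k is true, -k means variable k is false."""
    for bits in product([False, True], repeat=n_vars):          # guess
        assign = {i + 1: b for i, b in enumerate(bits)}
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):                           # check
            return assign
    return None

# A 2SAT instance, (x1 or x2) and (not x1 or x2): satisfiable with x2 = True.
model = sat([[1, 2], [-1, 2]], 2)
assert model is not None and model[2] is True
```

As in the abstract, this one encoding handles both general SAT and its polynomial-time 2SAT restriction; the question the paper raises is whether the solver can exploit such a restriction rather than paying the worst-case exponential guess.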
What is the TensorFlow machine intelligence platform?
TensorFlow is an open source software library for numerical computation using data-flow graphs. It was originally developed by the Google Brain Team within Google's Machine Intelligence research organization for machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well. It reached version 1.0 in February 2017, and has continued rapid development, with 21,000 commits thus far, many from outside contributors. This article introduces TensorFlow, its open source community and ecosystem, and highlights some interesting TensorFlow open sourced models. It runs on nearly everything: GPUs and CPUs--including mobile and embedded platforms--and even tensor processing units (TPUs), which are specialized hardware for tensor math.
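The "numerical computation using data-flow graphs" idea can be illustrated with a minimal concept sketch (this is not TensorFlow's API): a computation is a graph of named nodes, and a node is evaluated only after its inputs are available, which is what lets a runtime schedule independent nodes across devices.

```python
import operator

# A data-flow graph: each node is (op, payload), where payload is either a
# constant value or the list of input node names.
graph = {
    "a":   ("const", 2.0),
    "b":   ("const", 3.0),
    "sum": ("add",  ["a", "b"]),
    "out": ("mul",  ["sum", "b"]),
}

OPS = {"add": operator.add, "mul": operator.mul}

def evaluate(graph, node, cache=None):
    """Demand-driven evaluation: recurse into inputs, memoizing each node
    so shared subgraphs are computed once."""
    cache = {} if cache is None else cache
    if node not in cache:
        op, payload = graph[node]
        if op == "const":
            cache[node] = payload
        else:
            cache[node] = OPS[op](*(evaluate(graph, n, cache) for n in payload))
    return cache[node]

assert evaluate(graph, "out") == 15.0   # (2 + 3) * 3
```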
How to Use the Keras Functional API for Deep Learning - Machine Learning Mastery
The Keras Python library makes creating deep learning models fast and easy. The sequential API allows you to create models layer-by-layer for most problems. It is limited in that it does not allow you to create models that share layers or have multiple inputs or outputs. The functional API in Keras is an alternate way of creating models that offers a lot more flexibility, including creating more complex models. In this tutorial, you will discover how to use the more flexible functional API in Keras to define deep learning models.
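A minimal sketch of the functional API (assumes TensorFlow/Keras is installed): layers are called on tensors, and a `Model` ties input tensors to output tensors. Because the model is just this graph of tensor connections, layers can be shared or branched, which the sequential API cannot express.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Functional API: build the graph by calling layers on tensors...
inputs = keras.Input(shape=(10,))
x = layers.Dense(32, activation="relu")(inputs)
outputs = layers.Dense(1)(x)

# ...then wrap the input and output tensors in a Model.
model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()
```

The equivalent sequential model would stack the same two `Dense` layers; the functional version becomes necessary as soon as you need a second input, a second output head, or a layer reused on two tensors.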
Plan-Time Multi-Model Switching for Motion Planning
Styler, Breelyn Melissa Kane (Carnegie Mellon University) | Simmons, Reid (Carnegie Mellon University)
Robot navigation through non-uniform environments requires reliable motion plan generation. The choice of planning model fidelity can significantly impact performance. Prior research has shown that reducing model fidelity saves planning time, but sacrifices execution reliability. While current adaptive hierarchical motion planning techniques are promising, we present a framework that leverages a richer set of robot motion models at plan-time. The framework chooses when to switch models and what model is most applicable within a single trajectory. For instance, more complex environment locales require higher fidelity models, while lower fidelity models are sufficient for simpler parts of the planning space, thus saving plan time. Our algorithm continuously aims to pick the model that best handles the current local environment. This effectively generates a single, mixed-fidelity plan. We present results for a simulated mobile robot with attached trailer in a hospital domain. We compare using a single motion planning model to switching with our framework of multiple models. Our results demonstrate that multi-fidelity model switching increases plan-time efficiency without sacrificing execution reliability.
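The switching idea can be sketched as follows. This is a hypothetical illustration (the function names, the density measure, and the threshold are all invented here, not the paper's algorithm): choose a high-fidelity motion model only where the local environment is cluttered, and a cheap model elsewhere, yielding one mixed-fidelity plan.

```python
def pick_model(obstacle_density, threshold=0.5):
    # Hypothetical selection rule: complex locales get the expensive model.
    return "high_fidelity" if obstacle_density > threshold else "low_fidelity"

def plan(segments):
    # One model choice per trajectory segment -> a single mixed-fidelity plan.
    return [(name, pick_model(density)) for name, density in segments]

mixed_plan = plan([("corridor", 0.1), ("doorway", 0.9), ("open_hall", 0.2)])
assert [m for _, m in mixed_plan] == ["low_fidelity", "high_fidelity",
                                      "low_fidelity"]
```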
Saul: Towards Declarative Learning Based Programming
Kordjamshidi, Parisa (University of Illinois at Urbana-Champaign) | Roth, Dan (University of Illinois at Urbana-Champaign) | Wu, Hao (University of Illinois at Urbana-Champaign)
We present Saul, a new probabilistic programming language designed to address some of the shortcomings of programming languages that aim at advancing and simplifying the development of AI systems. Such languages need to interact with messy, naturally occurring data, to allow a programmer to specify what needs to be done at an appropriate level of abstraction rather than at the data level, to be developed on a solid theory that supports moving to and reasoning at this level of abstraction and, finally, to support flexible integration of these learning and inference models within an application program. Saul is an object-functional programming language written in Scala that facilitates these by (1) allowing a programmer to learn, name and manipulate named abstractions over relational data; (2) supporting seamless incorporation of trainable (probabilistic or discriminative) components into the program, and (3) providing a level of inference over trainable models to support composition and make decisions that respect domain and application constraints. Saul is developed over a declaratively defined relational data model, can use piecewise learned factor graphs with declaratively specified learning and inference objectives, and it supports inference over probabilistic models augmented with declarative knowledge-based constraints. We describe the key constructs of Saul and exemplify its use in developing applications that require relational feature engineering and structured output prediction.
- North America > United States > Illinois (0.04)
- North America > United States > Massachusetts (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- Europe > Czechia > Prague (0.04)
- Information Technology > Software > Programming Languages (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)