AITopics | zorro

Collaborating Authors

zorro

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ZORRO: Zero-Knowledge Robustness and Privacy for Split Learning (Full Version)

Sheybani, Nojan, Pegoraro, Alessandro, Knauer, Jonathan, Rieger, Phillip, Mollakuqe, Elissa, Koushanfar, Farinaz, Sadeghi, Ahmad-Reza

arXiv.org Artificial IntelligenceSep-15-2025

Split Learning (SL) is a distributed learning approach that enables resource-constrained clients to collaboratively train deep neural networks (DNNs) by offloading most layers to a central server while keeping in- and output layers on the client-side. This setup enables SL to leverage server computation capacities without sharing data, making it highly effective in resource-constrained environments dealing with sensitive data. However, the distributed nature enables malicious clients to manipulate the training process. By sending poisoned intermediate gradients, they can inject backdoors into the shared DNN. Existing defenses are limited by often focusing on server-side protection and introducing additional overhead for the server. A significant challenge for client-side defenses is enforcing malicious clients to correctly execute the defense algorithm. We present ZORRO, a private, verifiable, and robust SL defense scheme. Through our novel design and application of interactive zero-knowledge proofs (ZKPs), clients prove their correct execution of a client-located defense algorithm, resulting in proofs of computational integrity attesting to the benign nature of locally trained DNN portions. Leveraging the frequency representation of model partitions enables ZORRO to conduct an in-depth inspection of the locally trained models in an untrusted environment, ensuring that each client forwards a benign checkpoint to its succeeding client. In our extensive evaluation, covering different model architectures as well as various attack strategies and data scenarios, we show ZORRO's effectiveness, as it reduces the attack success rate to less than 6\% while causing even for models storing \numprint{1000000} parameters on the client-side an overhead of less than 10 seconds.

artificial intelligence, machine learning, zorro, (15 more...)

arXiv.org Artificial Intelligence

2509.09787

Country:

Asia (1.00)
North America > United States > California (0.93)
Europe (0.68)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law (0.92)
Government > Military (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Zorro: A Flexible and Differentiable Parametric Family of Activation Functions That Extends ReLU and GELU

Roodschild, Matias, Gotay-Sardiñas, Jorge, Jimenez, Victor A., Will, Adrian

arXiv.org Artificial IntelligenceSep-28-2024

Even in recent neural network architectures such as Transformers and Extended LSTM (xLSTM), and traditional ones like Convolutional Neural Networks, Activation Functions are an integral part of nearly all neural networks. They enable more effective training and capture nonlinear data patterns. More than 400 functions have been proposed over the last 30 years, including fixed or trainable parameters, but only a few are widely used. ReLU is one of the most frequently used, with GELU and Swish variants increasingly appearing. However, ReLU presents non-differentiable points and exploding gradient issues, while testing different parameters of GELU and Swish variants produces varying results, needing more parameters to adapt to datasets and architectures. This article introduces a novel set of activation functions called Zorro, a continuously differentiable and flexible family comprising five main functions fusing ReLU and Sigmoid. Zorro functions are smooth and adaptable, and serve as information gates, aligning with ReLU in the 0-1 range, offering an alternative to ReLU without the need for normalization, neuron death, or gradient explosions. Zorro also approximates functions like Swish, GELU, and DGELU, providing parameters to adjust to different datasets and architectures. We tested it on fully connected, convolutional, and transformer architectures to demonstrate its effectiveness.

activation function, architecture, variant, (15 more...)

arXiv.org Artificial Intelligence

2409.19239

Country:

South America > Argentina (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Italy > Sardinia (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluating Neural Language Models as Cognitive Models of Language Acquisition

Martínez, Héctor Javier Vázquez, Heuser, Annika Lea, Yang, Charles, Kodner, Jordan

arXiv.org Artificial IntelligenceOct-30-2023

The success of neural language models (LMs) on many technological tasks has brought about their potential relevance as scientific theories of language despite some clear differences between LM training and child language acquisition. In this paper we argue that some of the most prominent benchmarks for evaluating the syntactic capacities of LMs may not be sufficiently rigorous. In particular, we show that the template-based benchmarks lack the structural diversity commonly found in the theoretical and psychological studies of language. When trained on small-scale data modeling child language acquisition, the LMs can be readily matched by simple baseline models. We advocate for the use of the readily available, carefully curated datasets that have been evaluated for gradient acceptability by large pools of native speakers and are designed to probe the structural basis of grammar specifically. On one such dataset, the LI-Adger dataset, LMs evaluate sentences in a way inconsistent with human language users. We conclude with suggestions for better connecting LMs with the empirical study of child language acquisition.

acquisition, computational linguistic, paradigm, (17 more...)

arXiv.org Artificial Intelligence

2310.20093

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Virginia (0.04)
North America > United States > Pennsylvania (0.04)
(12 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.64)

Add feedback

Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks

Funke, Thorben, Khosla, Megha, Anand, Avishek

arXiv.org Artificial IntelligenceMay-18-2021

With the ever-increasing popularity and applications of graph neural networks, several proposals have been made to interpret and understand the decisions of a GNN model. Explanations for a GNN model differ in principle from other input settings. It is important to attribute the decision to input features and other related instances connected by the graph structure. We find that the previous explanation generation approaches that maximize the mutual information between the label distribution produced by the GNN model and the explanation to be restrictive. Specifically, existing approaches do not enforce explanations to be predictive, sparse, or robust to input perturbations. In this paper, we lay down some of the fundamental principles that an explanation method for GNNs should follow and introduce a metric fidelity as a measure of the explanation's effectiveness. We propose a novel approach Zorro based on the principles from rate-distortion theory that uses a simple combinatorial procedure to optimize for fidelity. Extensive experiments on real and synthetic datasets reveal that Zorro produces sparser, stable, and more faithful explanations than existing GNN explanation approaches.

explanation, fidelity, node, (15 more...)

arXiv.org Artificial Intelligence

2105.08621

Country:

Europe > Germany > Lower Saxony > Hanover (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback