AITopics | rob

Collaborating Authors

rob

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SupplementaryMaterial: BetterSafeThanSorry: PreventingDelusiveAdversarieswith AdversarialTraining

Neural Information Processing SystemsFeb-9-2026, 17:15:19 GMT

The initial learning rate is set to 0.1. A.2 AdversarialTraining Unless otherwise specified, we perform adversarial training to train robust classifiers by following Madry etal.[74]. Specifically,we train against aprojected gradient descent (PGD) adversary, starting from a random initial perturbation of the training data. Unless otherwise specified, we use the values of provided in Table 5 to train our models. We use 7 steps of PGD with a step size of/5. A.3 DelusiveAdversaries Six delusive attacks are considered to validate our proposed defense.

artificial intelligence, machine learning, rnat, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

1d35a777e932235b115645d5141e0342-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 00:31:19 GMT

adversarial training, hc 2, rob, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Baltimore (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

EXP-CAM: Explanation Generation and Circuit Discovery Using Classifier Activation Matching

Suhail, Pirzada, Anand, Aditya, Sethi, Amit

arXiv.org Artificial IntelligenceDec-2-2025

Machine learning models, by virtue of training, learn a large repertoire of decision rules for any given input, and any one of these may suffice to justify a prediction. However, in high-dimensional input spaces, such rules are difficult to identify and interpret. In this paper, we introduce EXP-CAM: an explanation generation and circuit discovery approach using Classifier Activation Matching. EXP-CAM can generate minimal and faithful explanations for the decisions of pre-trained image classifiers that not only preserve the model's decision but are also concise and human-readable. We aim to identify minimal explanations that not only preserve the model's decision but are also concise and human-readable. To achieve this, we train a lightweight auto-encoder to produce binary masks that learns to highlight the decision-wise critical regions of an image while discarding irrelevant background. The training objective integrates activation alignment across multiple layers, consistency at the output label, priors that encourage sparsity, and compactness, along with a robustness constraint that enforces faithfulness. The minimal explanations so generated also lead us to mechanistically interpreting the model internals. In this regard we also introduce a circuit readout procedure wherein using the explanation's forward pass and gradients, we identify active channels and construct a channel-level graph, scoring inter-layer edges by ingress weight magnitude times source activation and feature-to-class links by classifier weight magnitude times feature activation. Together, these contributions provide a practical bridge between minimal input-level explanations and a mechanistic understanding of the internal computations driving model decisions.

explanation, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.25686

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Amulet: a Python Library for Assessing Interactions Among ML Defenses and Risks

Waheed, Asim, Duddu, Vasisht, Zhang, Rui, Szyller, Sebastian

arXiv.org Artificial IntelligenceNov-10-2025

Machine learning (ML) models are susceptible to various risks to security, privacy, and fairness. Most defenses are designed to protect against each risk individually (intended interactions) but can inadvertently affect susceptibility to other unrelated risks (unintended interactions). We introduce Amulet, the first Python library for evaluating both intended and unintended interactions among ML defenses and risks. Amulet is comprehensive by including representative attacks, defenses, and metrics; extensible to new modules due to its modular design; consistent with a user-friendly API template for inputs and outputs; and applicable for evaluating novel interactions. By satisfying all four properties, Amulet offers a unified foundation for studying how defenses interact, enabling the first systematic evaluation of unintended interactions across multiple risks.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.12386

Country:

North America > United States (0.29)
North America > Canada (0.28)
Asia > China (0.28)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Robust Federated Inference

Dhasade, Akash, Farhadkhani, Sadegh, Guerraoui, Rachid, Gupta, Nirupam, Jacovella, Maxime, Kermarrec, Anne-Marie, Pinot, Rafael

arXiv.org Artificial IntelligenceOct-21-2025

Federated inference, in the form of one-shot federated learning, edge ensembles, or federated ensembles, has emerged as an attractive solution to combine predictions from multiple models. This paradigm enables each model to remain local and proprietary while a central server queries them and aggregates predictions. Yet, the robustness of federated inference has been largely neglected, leaving them vulnerable to even simple attacks. To address this critical gap, we formalize the problem of robust federated inference and provide the first robustness analysis of this class of methods. Our analysis of averaging-based aggregators shows that the error of the aggregator is small either when the dissimilarity between honest responses is small or the margin between the two most probable classes is large. Moving beyond linear averaging, we show that problem of robust federated inference with non-linear aggregators can be cast as an adversarial machine learning problem. We then introduce an advanced technique using the DeepSet aggregation model, proposing a novel composition of adversarial training and test-time robust aggregation to robustify non-linear aggregators. Our composition yields significant improvements, surpassing existing robust aggregation methods by 4.7 - 22.2% in accuracy points across diverse benchmarks.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.0031

Country:

Europe (0.67)
North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

1d35a777e932235b115645d5141e0342-Paper-Conference.pdf

Neural Information Processing SystemsOct-11-2025, 00:11:59 GMT

adversarial training, hc 2, rob, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Baltimore (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning

Sheng, Lijun, Liang, Jian, Wang, Zilei, He, Ran

arXiv.org Artificial IntelligenceAug-28-2025

Vision-language models (VLMs), such as CLIP, have gained significant popularity as foundation models, with numerous fine-tuning methods developed to enhance performance on downstream tasks. However, due to their inherent vulnerability and the common practice of selecting from a limited set of open-source models, VLMs suffer from a higher risk of adversarial attacks than traditional vision models. Existing defense techniques typically rely on adversarial fine-tuning during training, which requires labeled data and lacks of flexibility for downstream tasks. T o address these limitations, we propose robust test-time prompt tuning (R-TPT), which mitigates the impact of adversarial attacks during the inference stage. W e first reformulate the classic marginal entropy objective by eliminating the term that introduces conflicts under adversarial conditions, retaining only the pointwise entropy minimization. Furthermore, we introduce a plug-and-play reliability-based weighted ensembling strategy, which aggregates useful information from reliable augmented views to strengthen the defense. R-TPT enhances defense against adversarial attacks without requiring labeled training data while offering high flexibility for inference tasks. Extensive experiments on widely used benchmarks with various attacks demonstrate the effectiveness of R-TPT.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.11195

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.77)
Government > Military (0.77)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Off-Switching Not Guaranteed

Neth, Sven

arXiv.org Artificial IntelligenceFeb-12-2025

We have seen rapid progress in the field of Artificial Intelligence (AI). If this progress continues, perhaps one day we will create powerful artificial agents. If we do so, how do we ensure that such AI agents do not go out of control? One approach is to make sure that we can switch off AI agents when they act against our interests. Put another way, we want to make sure that AI agents will defer to us.

ai agent, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.08864

Country:

North America > United States > Minnesota (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Add feedback

Explaining Categorical Feature Interactions Using Graph Covariance and LLMs

Shen, Cencheng, Edge, Darren, Larson, Jonathan, Priebe, Carey E.

arXiv.org Machine LearningJan-24-2025

Modern datasets often consist of numerous samples with abundant features and associated timestamps. Analyzing such datasets to uncover underlying events typically requires complex statistical methods and substantial domain expertise. A notable example, and the primary data focus of this paper, is the global synthetic dataset from the Counter Trafficking Data Collaborative (CTDC) -- a global hub of human trafficking data containing over 200,000 anonymized records spanning from 2002 to 2022, with numerous categorical features for each record. In this paper, we propose a fast and scalable method for analyzing and extracting significant categorical feature interactions, and querying large language models (LLMs) to generate data-driven insights that explain these interactions. Our approach begins with a binarization step for categorical features using one-hot encoding, followed by the computation of graph covariance at each time. This graph covariance quantifies temporal changes in dependence structures within categorical data and is established as a consistent dependence measure under the Bernoulli distribution. We use this measure to identify significant feature pairs, such as those with the most frequent trends over time or those exhibiting sudden spikes in dependence at specific moments. These extracted feature pairs, along with their timestamps, are subsequently passed to an LLM tasked with generating potential explanations of the underlying events driving these dependence changes. The effectiveness of our method is demonstrated through extensive simulations, and its application to the CTDC dataset reveals meaningful feature pairs and potential data stories underlying the observed feature interactions.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Machine Learning

2501.14932

Country: