AITopics | aaa

5fa29a2f163ce2020769eca8956e2d77-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 08:46:06 GMT

accuracy, attacker, international conference, (11 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Information Technology (0.95)

Technology:

Information Technology > Security & Privacy (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks

Neural Information Processing Systems | Dec-24-2025, 07:35:47 GMT

The score-based query attacks (SQAs) pose practical threats to deep neural networks by crafting adversarial perturbations within dozens of queries, only using the model's output scores. Nonetheless, we note that if the loss trend of the outputs is slightly perturbed, SQAs could be easily misled and thereby become much less effective. Following this idea, we propose a novel defense, namely Adversarial Attack on Attackers (AAA), to confound SQAs towards incorrect attack directions by slightly modifying the output logits. In this way, (1) SQAs are prevented regardless of the model's worst-case robustness; (2) the original model predictions are hardly changed, i.e., no degradation on clean accuracy; (3) the calibration of confidence scores can be improved simultaneously. Extensive experiments are provided to verify the above advantages. For example, by setting $\ell_\infty=8/255$ on CIFAR-10, our proposed AAA helps WideResNet-28 secure 80.59% accuracy under Square attack (2500 queries), while the best prior defense (i.e., adversarial training) only attains 67.44%. Since AAA attacks SQA's general greedy strategy, such advantages of AAA over 8 defenses can be consistently observed on 8 CIFAR-10/ImageNet models under 6 SQAs, using different attack targets, bounds, norms, losses, and strategies.

adversarial attack, attack, mitigate black-box score-based query attack, (8 more...)

Neural Information Processing Systems

Industry:

Information Technology > Security & Privacy (0.66)
Government > Military (0.66)
Transportation > Air (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

5fa29a2f163ce2020769eca8956e2d77-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 15:50:35 GMT

accuracy, defense performance, github, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

074177d3eb6371e32c16c55a3b8f706b-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-1-2025, 23:15:59 GMT

artificial intelligence, ihda, test error, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.31)

Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks

Sizhe Chen

Neural Information Processing Systems | Aug-15-2025, 05:52:56 GMT

Following this idea, we propose a novel defense, namely Adversarial Attack on Attackers (AAA), to confound SQAs towards incorrect attack directions by slightly modifying the output logits.

accuracy, attacker, international conference, (11 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry:

Information Technology > Security & Privacy (0.86)
Government > Military (0.62)

Technology:

Information Technology > Security & Privacy (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Minerva: A Programmable Memory Test Benchmark for Language Models

Xia, Menglin, Ruehle, Victor, Rajmohan, Saravan, Shokri, Reza

arXiv.org Artificial Intelligence | Feb-5-2025

How effectively can LLM-based AI assistants utilize their memory (context) to perform various tasks? Traditional data benchmarks, which are often manually crafted, suffer from several limitations: they are static, susceptible to overfitting, difficult to interpret, and lack actionable insights--failing to pinpoint the specific capabilities a model lacks when it does not pass a test. In this paper, we present a framework for automatically generating a comprehensive set of tests to evaluate models' abilities to use their memory effectively. Our framework extends the range of capability tests beyond the commonly explored (passkey, key-value, needle in the haystack) search, a dominant focus in the literature. Specifically, we evaluate models on atomic tasks such as searching, recalling, editing, matching, comparing information in context memory, and performing basic operations when inputs are structured into distinct blocks, simulating real-world data. Additionally, we design composite tests to investigate the models' ability to maintain state while operating on memory. Our benchmark enables an interpretable, detailed assessment of memory capabilities of LLMs.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.03358

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks

Neural Information Processing Systems | Oct-11-2024, 07:08:38 GMT

The score-based query attacks (SQAs) pose practical threats to deep neural networks by crafting adversarial perturbations within dozens of queries, only using the model's output scores. Nonetheless, we note that if the loss trend of the outputs is slightly perturbed, SQAs could be easily misled and thereby become much less effective. Following this idea, we propose a novel defense, namely Adversarial Attack on Attackers (AAA), to confound SQAs towards incorrect attack directions by slightly modifying the output logits. In this way, (1) SQAs are prevented regardless of the model's worst-case robustness; (2) the original model predictions are hardly changed, i.e., no degradation on clean accuracy; (3) the calibration of confidence scores can be improved simultaneously. Extensive experiments are provided to verify the above advantages.

adversarial attack, attacker, mitigate black-box score-based query attack, (5 more...)

Neural Information Processing Systems

Industry:

Information Technology > Security & Privacy (0.64)
Government > Military (0.64)
Transportation > Air (0.40)

Technology:

Information Technology > Security & Privacy (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

FATRER: Full-Attention Topic Regularizer for Accurate and Robust Conversational Emotion Recognition

Mao, Yuzhao, Lu, Di, Wang, Xiaojie, Zhang, Yang

arXiv.org Artificial Intelligence | Jul-23-2023

This paper concentrates on the understanding of interlocutors' emotions evoked in conversational utterances. Previous studies in this literature mainly focus on more accurate emotional predictions, while ignoring model robustness when the local context is corrupted by adversarial attacks. To maintain robustness while ensuring accuracy, we propose an emotion recognizer augmented by a full-attention topic regularizer, which enables an emotion-related global view when modeling the local context in a conversation. A joint topic modeling strategy is introduced to implement regularization from both representation and loss perspectives. To avoid over-regularization, we drop the constraints on prior distributions that exist in traditional topic modeling and perform probabilistic approximations based entirely on attention alignment. Experiments show that our models obtain more favorable results than state-of-the-art models, and gain convincing robustness under three types of adversarial attacks.

modeling, representation, robustness, (14 more...)

arXiv.org Artificial Intelligence

2307.12221

Country:

Asia > Middle East > Yemen (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.69)
Government > Military (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.65)

Did ChatGPT Just Lie To Me? - The Scholarly Kitchen

#artificialintelligence | Jan-13-2023, 17:25:13 GMT

To understand how Artificial Intelligence (AI) is affecting science publishing, we need to push these systems to their extremes, analyze how they perform, and expose their vulnerabilities. Only then can we discuss how they will transform our industry. Earlier this week, Todd Carpenter asked ChatGPT some generic questions about the potential role of AI in scientific communication and, as you can imagine, it generated some generic, hedged, inoffensive output. I wanted to see how ChatGPT would perform with scientific controversies -- situations in which the scientific community supported one belief and the public another. Or, in situations where there was no consensus in the scientific community.

artificial intelligence, chatgpt, health & medicine, (18 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks

Chen, Sizhe, Huang, Zhehao, Tao, Qinghua, Wu, Yingwen, Xie, Cihang, Huang, Xiaolin

arXiv.org Artificial Intelligence | Dec-15-2022

The score-based query attacks (SQAs) pose practical threats to deep neural networks by crafting adversarial perturbations within dozens of queries, only using the model's output scores. Nonetheless, we note that if the loss trend of the outputs is slightly perturbed, SQAs could be easily misled and thereby become much less effective. Following this idea, we propose a novel defense, namely Adversarial Attack on Attackers (AAA), to confound SQAs towards incorrect attack directions by slightly modifying the output logits. In this way, (1) SQAs are prevented regardless of the model's worst-case robustness; (2) the original model predictions are hardly changed, i.e., no degradation on clean accuracy; (3) the calibration of confidence scores can be improved simultaneously. Extensive experiments are provided to verify the above advantages. For example, by setting $\ell_\infty=8/255$ on CIFAR-10, our proposed AAA helps WideResNet-28 secure 80.59% accuracy under Square attack (2500 queries), while the best prior defense (i.e., adversarial training) only attains 67.44%. Since AAA attacks SQA's general greedy strategy, such advantages of AAA over 8 defenses can be consistently observed on 8 CIFAR-10/ImageNet models under 6 SQAs, using different attack targets, bounds, norms, losses, and strategies. Moreover, AAA calibrates better without hurting the accuracy. Our code is available at https://github.com/Sizhe-Chen/AAA.
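
The core idea in this abstract, reversing the loss trend an attacker observes while leaving the predicted class untouched, can be illustrated with a small toy sketch. Everything below is an assumption made for illustration: the function names `reversed_margin` and `post_process`, the interval width `tau`, the strength `alpha`, and the sawtooth mapping itself are hypothetical and are not taken from the authors' code at the repository above. The sketch also ignores the calibration objective mentioned in the abstract.

```python
import math


def reversed_margin(margin, tau=1.0, alpha=1.0):
    """Map a true margin to a reported margin whose local trend is reversed.

    Assumption: margins are split into intervals of width `tau`; inside each
    interval the reported value is mirrored around the interval centre.
    """
    attractor = (math.floor(margin / tau) + 0.5) * tau  # centre of the interval
    return attractor - alpha * (margin - attractor)


def post_process(logits, tau=1.0, alpha=1.0):
    """Return logits with the same top-1 class but a reversed margin trend.

    Only the top logit is changed, so the runner-up and all other logits
    stay as they were and the predicted class is preserved.
    """
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    top, runner_up = order[0], order[1]
    margin = logits[top] - logits[runner_up]  # always >= 0
    out = list(logits)
    out[top] = logits[runner_up] + reversed_margin(margin, tau, alpha)
    return out


# A greedy attacker lowers the true margin from 2.3 to 2.1 ...
before = post_process([2.3, 0.0, -1.0])
after = post_process([2.1, 0.0, -1.0])
# ... but the reported margin goes up, so the step looks like a failure.
print(before[0] - before[1], after[0] - after[1])
```

With `0 < alpha <= 1` the reported margin stays strictly positive, so the top-1 prediction never changes; a greedy query attack that keeps only steps with a lower observed margin would discard the step that actually helped it.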

accuracy, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2205.12134

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.86)
Government > Military (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
