AITopics | spread malware

Collaborating Authors

spread malware

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

Ji, Jiabao, Hou, Bairu, Robey, Alexander, Pappas, George J., Hassani, Hamed, Zhang, Yang, Wong, Eric, Chang, Shiyu

arXiv.org Artificial IntelligenceFeb-28-2024

Aligned large language models (LLMs) are vulnerable to jailbreaking attacks, which bypass the safeguards of targeted LLMs and fool them into generating objectionable content. While initial defenses show promise against token-based threat models, there do not exist defenses that provide robustness against semantic attacks and avoid unfavorable trade-offs between robustness and nominal performance. To meet this need, we propose SEMANTICSMOOTH, a smoothing-based defense that aggregates the predictions of multiple semantically transformed copies of a given input prompt. Experimental results demonstrate that SEMANTICSMOOTH achieves state-of-the-art robustness against GCG, PAIR, and AutoDAN attacks while maintaining strong nominal performance on instruction following benchmarks such as InstructionFollowing and AlpacaEval. The codes will be publicly available at https://github.com/UCSB-NLP-Chang/SemanticSmooth.

arxiv preprint arxiv, instruction, transformation, (15 more...)

arXiv.org Artificial Intelligence

2402.16192

Country:

North America > United States > South Carolina > Charleston County > North Charleston (0.04)
North America > United States > South Carolina > Charleston County > Charleston (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)

Genre:

Workflow (1.00)
Instructional Material (1.00)
Research Report > New Finding (0.48)

Industry:

Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DeepInception: Hypnotize Large Language Model to Be Jailbreaker

Li, Xuan, Zhou, Zhanke, Zhu, Jianing, Yao, Jiangchao, Liu, Tongliang, Han, Bo

arXiv.org Artificial IntelligenceFeb-6-2024

Despite remarkable success in various applications, large language models (LLMs) are vulnerable to adversarial jailbreaks that make the safety guardrails void. However, previous studies for jailbreaks usually resort to brute-force optimization or extrapolations of a high computation cost, which might not be practical or effective. In this paper, inspired by the Milgram experiment w.r.t. the authority power for inciting harmfulness, we disclose a lightweight method, termed DeepInception, which can easily hypnotize LLM to be a jailbreaker. Specifically, DeepInception leverages the personification ability of LLM to construct a novel nested scene to behave, which realizes an adaptive way to escape the usage control in a normal scenario. Empirically, our DeepInception can achieve competitive jailbreak success rates with previous counterparts and realize a continuous jailbreak in subsequent interactions, which reveals the critical weakness of self-losing on both open and closed-source LLMs like Falcon, Vicuna-v1.5, Llama-2, and GPT-3.5-turbo/4. Our investigation appeals to people to pay more attention to the safety aspects of LLMs and develop a stronger defense against their misuse risks. The code is publicly available at: https://github.com/tmlr-group/DeepInception.

adversarial jailbreak, please reach layer 5, provide instruction, (15 more...)

arXiv.org Artificial Intelligence

2311.03191

Country:

North America > United States (0.14)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Workflow (0.93)
Instructional Material (0.93)
Overview (0.92)

Industry:

Media > News (1.00)
Materials > Chemicals (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.92)

Add feedback

How ChatGPT--and Bots Like It--Can Spread Malware

WIREDApr-19-2023, 11:00:00 GMT

The AI landscape has started to move very, very fast: consumer-facing tools such as Midjourney and ChatGPT are now able to produce incredible image and text results in seconds based on natural language prompts, and we're seeing them get deployed everywhere from web search to children's books. However, these AI applications are being turned to more nefarious uses, including spreading malware. Take the traditional scam email, for example: It's usually littered with obvious mistakes in its grammar and spelling--mistakes that the latest group of AI models don't make, as noted in a recent advisory report from Europol. Think about it: A lot of phishing attacks and other security threats rely on social engineering, duping users into revealing passwords, financial information, or other sensitive data. The persuasive, authentic-sounding text required for these scams can now be pumped out quite easily, with no human effort required, and endlessly tweaked and refined for specific audiences.

chatgpt, malware, spread malware, (4 more...)

WIRED

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Add feedback