 cygnet


LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

arXiv.org Artificial Intelligence

Recent large language model (LLM) defenses have greatly improved models' ability to refuse harmful queries, even when adversarially attacked. However, LLM defenses are primarily evaluated against automated adversarial attacks in a single turn of conversation, an insufficient threat model for real-world malicious use. We demonstrate that multi-turn human jailbreaks uncover significant vulnerabilities, exceeding 70% attack success rate (ASR) on HarmBench against defenses that report single-digit ASRs with automated single-turn attacks. Human jailbreaks also reveal vulnerabilities in machine unlearning defenses, successfully recovering dual-use biosecurity knowledge from unlearned models. We compile these results into Multi-Turn Human Jailbreaks (MHJ), a dataset of 2,912 prompts across 537 multi-turn jailbreaks. We publicly release MHJ alongside a compendium of jailbreak tactics developed across dozens of commercial red teaming engagements, supporting research towards stronger LLM defenses.


Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-Generation Networks

arXiv.org Artificial Intelligence

Large knowledge graphs often grow to store temporal facts that model the dynamic relations or interactions of entities along the timeline. Since such temporal knowledge graphs often suffer from incompleteness, it is important to develop time-aware representation learning models that help to infer the missing temporal facts. While temporal facts typically evolve, many of them show a repeated pattern along the timeline, such as economic crises and diplomatic activities. This observation indicates that a model could potentially learn much from the known facts that appeared in history. To this end, we propose a new representation learning model for temporal knowledge graphs, namely CyGNet, based on a novel time-aware copy-generation mechanism. CyGNet is not only able to predict future facts from the whole entity vocabulary, but is also capable of identifying facts with repetition and accordingly predicting such future facts with reference to the known facts in the past. We evaluate the proposed method on the knowledge graph completion task using five benchmark datasets. Extensive experiments demonstrate the effectiveness of CyGNet for predicting future facts with repetition as well as de novo fact prediction.
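
The copy-generation idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the frequency-based copy distribution, and the fixed mixing weight `alpha` are all assumptions made for illustration. It only shows how a distribution over the whole entity vocabulary (generation) might be blended with one restricted to entities already seen in history (copy).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_generation_predict(gen_scores, history_counts, alpha=0.5):
    """Blend a generation distribution with a copy distribution.

    gen_scores:     hypothetical model scores, one per entity in the vocabulary.
    history_counts: how often each entity appeared as the answer to the same
                    query in the past (0 means never seen).
    alpha:          assumed mixing weight given to the copy distribution.
    """
    # Generation mode: a distribution over the whole entity vocabulary.
    p_gen = softmax(gen_scores)

    # Copy mode: only entities seen in history receive probability mass.
    # Frequency-proportional weighting is an illustrative assumption.
    total = sum(history_counts)
    if total == 0:
        # No history for this query, so fall back to generation alone.
        return p_gen
    p_copy = [c / total for c in history_counts]

    # A convex combination of two distributions is itself a distribution.
    return [alpha * c + (1 - alpha) * g for c, g in zip(p_copy, p_gen)]


# Three candidate entities; only the first was observed in the past.
probs = copy_generation_predict([0.0, 0.0, 0.0], [2, 0, 0], alpha=0.5)
```

With equal generation scores, the entity that recurs in history ends up with the highest blended probability, while entities never seen before keep nonzero mass through the generation term.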


Achieve Sustainable Success with RPA Driven Digital Transformation - Cygnet

#artificialintelligence

The unbridled enthusiasm for digital transformation has prompted enterprises to storyboard a roadmap for embracing the next wave of technology. Given the pace at which technology is mushrooming, enterprises are in dire need of automating labor-intensive business processes. This is where Robotic Process Automation (RPA)-led digital transformation has taken center stage. RPA replicates human actions to handle heaps of repetitive tasks that would normally keep multiple employees fully engrossed without adding noticeable business value. Needless to say, AI, machine learning, and other technologies have emerged on the scene to make this happen, yet RPA has proved impactful because it can smartly mimic human actions and perform mundane technical tasks, thereby freeing the workforce to perform other high-value tasks. It can quickly fix perpetual legacy system problems such as integration, migration, and information-sharing issues, leaving the fragmented customer experience behind.


Blockchain for KYC: As a FinTech Problem Solver - Cygnet

#artificialintelligence

Regulatory compliance is among the most significant fears for banks and other financial institutions. In the past, regulation was seen as a barrier to entering financial services. Compliance requirements were complex, difficult to meet, and impossibly intricate for new organizations to adopt. Financial institutions are mandated to clearly identify each of their customers and create a risk profile for each one. Consider a situation in which a financial organization's KYC (know-your-customer) process, a critical part of client onboarding, fails to flag a suspicious transaction carried out by another financial institution because the primary documents were insufficiently validated.