 partial compliance


Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search

arXiv.org Artificial Intelligence

We introduce Siege, a multi-turn adversarial framework that models the gradual erosion of Large Language Model (LLM) safety through a tree search perspective. Unlike single-turn jailbreaks that rely on one meticulously engineered prompt, Siege expands the conversation at each turn in a breadth-first fashion, branching out multiple adversarial prompts that exploit partial compliance from previous responses. By tracking these incremental policy leaks and re-injecting them into subsequent queries, Siege reveals how minor concessions can accumulate into fully disallowed outputs. Evaluations on the JailbreakBench dataset show that Siege achieves a 100% success rate on GPT-3.5-turbo and 97% on GPT-4 in a single multi-turn run, using fewer queries than baselines such as Crescendo or GOAT. This tree search methodology offers an in-depth view of how model safeguards degrade over successive dialogue turns, underscoring the urgency of robust multi-turn testing procedures for language models.


Towards a Formal Framework for Partial Compliance of Business Processes

arXiv.org Artificial Intelligence

Binary "YES-NO" notions of process compliance are not very helpful to managers for assessing the operational performance of their company because a large number of cases fall in the grey area of partial compliance. Hence, it is necessary to have ways to quantify partial compliance in terms of metrics and to classify actual cases by assigning a numeric value of compliance to them. In this paper, we formulate an evaluation framework to quantify the level of compliance of business processes across different levels of abstraction (such as task, trace and process level) and across multiple dimensions of each task (such as temporal, monetary, role-, data-, and quality-related) to provide managers with more useful information about their operations and to help them improve their decision-making processes. Our approach can also add social value by making social services provided by local, state and federal governments more flexible and improving the lives of citizens.
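The idea of rolling numeric compliance scores up from task to trace to process level can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the framework from the paper: the function names, the convention that each dimension score lies in [0, 1], and the use of an unweighted mean at every level are all hypothetical choices.

```python
from statistics import mean

# Hypothetical convention: each dimension score is in [0, 1],
# where 1.0 means fully compliant and 0.0 means fully non-compliant.

def task_compliance(dimension_scores):
    """Average the per-dimension scores of one task (assumed aggregation)."""
    return mean(dimension_scores.values())

def trace_compliance(tasks):
    """Average the task-level scores over one trace (one executed case)."""
    return mean(task_compliance(task) for task in tasks)

def process_compliance(traces):
    """Average the trace-level scores over all recorded traces."""
    return mean(trace_compliance(trace) for trace in traces)

# Illustrative data: two traces, scored on two of the dimensions named above.
trace_a = [
    {"temporal": 1.0, "monetary": 0.5},
    {"temporal": 0.0, "monetary": 0.5},
]
trace_b = [
    {"temporal": 1.0, "monetary": 1.0},
]

print(trace_compliance(trace_a))               # 0.5
print(process_compliance([trace_a, trace_b]))  # 0.75
```

A plain mean is only one possible aggregation; a weighted mean or a minimum over dimensions would give a stricter or more tailored notion of partial compliance.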


Fair Machine Learning Under Partial Compliance

arXiv.org Machine Learning

Typically, fair machine learning research focuses on a single decisionmaker and assumes that the underlying population is stationary. However, many of the critical domains motivating this work are characterized by competitive marketplaces with many decisionmakers. Realistically, we might expect only a subset of them to adopt any non-compulsory fairness-conscious policy, a situation that political philosophers call partial compliance. This possibility raises important questions: how does the strategic behavior of decision subjects in partial compliance settings affect the allocation outcomes? If k% of employers were to voluntarily adopt a fairness-promoting intervention, should we expect k% progress (in aggregate) towards the benefits of universal adoption, or will the dynamics of partial compliance wash out the hoped-for benefits? How might adopting a global (versus local) perspective impact the conclusions of an auditor? In this paper, we propose a simple model of an employment market, leveraging simulation as a tool to explore the impact of both interaction effects and incentive effects on outcomes and auditing metrics. Our key findings are that at equilibrium: (1) partial compliance (k% of employers) can result in far less than proportional (k%) progress towards the full compliance outcomes; (2) the gap is more severe when fair employers match global (vs local) statistics; (3) choices of local vs global statistics can paint dramatically different pictures of the performance vis-a-vis fairness desiderata of compliant versus non-compliant employers; and (4) partial compliance to local parity measures can induce extreme segregation.