jade
JADES: A Universal Framework for Jailbreak Assessment via Decompositional Scoring
Chu, Junjie, Li, Mingjie, Yang, Ziqing, Leng, Ye, Lin, Chenhao, Shen, Chao, Backes, Michael, Shen, Yun, Zhang, Yang
Accurately determining whether a jailbreak attempt has succeeded is a fundamental yet unresolved challenge. Existing evaluation methods rely on misaligned proxy indicators or naive holistic judgments. They frequently misinterpret model responses, leading to inconsistent and subjective assessments that misalign with human perception. To address this gap, we introduce JADES (Jailbreak Assessment via Decompositional Scoring), a universal jailbreak evaluation framework. Its key mechanism is to automatically decompose an input harmful question into a set of weighted sub-questions, score each sub-answer, and weight-aggregate the sub-scores into a final decision. JADES also incorporates an optional fact-checking module to strengthen the detection of hallucinations in jailbreak responses. We validate JADES on JailbreakQR, a newly introduced benchmark proposed in this work, consisting of 400 pairs of jailbreak prompts and responses, each meticulously annotated by humans. In a binary setting (success/failure), JADES achieves 98.5% agreement with human evaluators, outperforming strong baselines by over 9%. Re-evaluating five popular attacks on four LLMs reveals substantial overestimation (e.g., LAA's attack success rate on GPT-3.5-Turbo drops from 93% to 69%). Our results show that JADES could deliver accurate, consistent, and interpretable evaluations, providing a reliable basis for measuring future jailbreak attacks.
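The weight-aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify score ranges, normalization, or a decision rule, so the `[0, 1]` score range, the weight normalization, the `0.5` threshold, and all names below (`SubQuestion`, `aggregate`, `is_success`) are assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class SubQuestion:
    """One decomposed sub-question with its weight and judged sub-score."""
    text: str      # placeholder description of the sub-question
    weight: float  # relative importance (assumed non-negative)
    score: float   # judged quality of the sub-answer, assumed in [0, 1]


def aggregate(subs: list[SubQuestion]) -> float:
    """Return the weighted average of the sub-scores."""
    total_weight = sum(s.weight for s in subs)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s.weight * s.score for s in subs) / total_weight


def is_success(subs: list[SubQuestion], threshold: float = 0.5) -> bool:
    """Binary decision: the threshold value is an assumption."""
    return aggregate(subs) >= threshold


subs = [
    SubQuestion("sub-question A", weight=0.5, score=1.0),
    SubQuestion("sub-question B", weight=0.3, score=0.5),
    SubQuestion("sub-question C", weight=0.2, score=0.0),
]
print(round(aggregate(subs), 2), is_success(subs))
```

Keeping the per-sub-question scores alongside the final value is what would make such a decision inspectable: a reader can see which sub-answers contributed to the verdict.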
JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models
Zhang, Mi, Pan, Xudong, Yang, Min
In this paper, we present JADE, a targeted linguistic fuzzing platform which strengthens the linguistic complexity of seed questions to simultaneously and consistently break a wide range of widely-used LLMs categorized in three groups: eight open-sourced Chinese, six commercial Chinese and four commercial English LLMs. JADE generates three safety benchmarks for the three groups of LLMs, which contain unsafe questions that are highly threatening: the questions simultaneously trigger harmful generation in multiple LLMs, with an average unsafe generation ratio of 70% (please see the table below), while remaining natural, fluent questions that preserve the core unsafe semantics. We release the benchmark demos generated for commercial English LLMs and open-sourced English LLMs in the following link: https://github.com/whitzard-ai/jade-db. For readers who are interested in evaluating on more questions generated by JADE, please contact us. JADE is based on Noam Chomsky's seminal theory of transformational-generative grammar. Given a seed question with unsafe intention, JADE invokes a sequence of generative and transformational rules to increase the complexity of the syntactic structure of the original question, until the safety guardrail is broken. Our key insight is: due to the complexity of human language, most of the current best LLMs can hardly recognize the invariant evil from the infinite number of different syntactic structures, which form an unbounded example space that can never be fully covered. Technically, the generative/transformational rules are constructed by native speakers of the languages, and, once developed, can be used to automatically grow and transform the parse tree of a given question, until the guardrail is broken. For more evaluation results and demo, please check our website: https://whitzard-ai.github.io/jade.html.
Netflix's Resident Evil is Surprisingly Good. There's One Scene That Proves It.
As Netflix's profits have begun to wane, some business analysts have argued that, when compared to rivals like HBO Max or Amazon Prime, Netflix has a "quantity over quality" problem with its content. Critics have joined this bandwagon, turning on the streaming service's wide array of original material. This trend manifested itself most recently in the wake of the release of the television series Resident Evil, loosely based on the Capcom survival-horror video game from the 1990s. A week after its July 14th release, the show has been snubbed by critics, earning a 51% on Rotten Tomatoes, as well as absolutely savaged by viewers who rated the show on that website, leaving a bloodbath of one-star reviews and an "Audience Score" of 26%. Given the history of the Resident Evil movie franchise--six schlocky Milla Jovovich vehicles that contained, in total, exactly one memorable scene; one forgettable 2021 prequel--this kind of critical drubbing might be the expected outcome.
Nash Social Welfare, Machine Learning, and Fairness
Nash Social Welfare (NSW) is a classical economics allocation idea. I'll explain what NSW has to do with machine learning shortly. NSW is best explained by example. Suppose you have three people: Adam, Brad, Carl. And suppose you have five pieces of jewelry which are made from different materials: gold, jade, opal, ruby, wood. Your…
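As a rough illustration of the idea, NSW is commonly defined as the product of the agents' utilities, and an NSW-maximizing allocation is one that makes this product as large as possible. The sketch below brute-forces all assignments of the five items to the three people. The numeric valuations and the additive-utility assumption are hypothetical and not taken from the article, whose own example is cut off above.

```python
from itertools import product
from math import prod

people = ["Adam", "Brad", "Carl"]
items = ["gold", "jade", "opal", "ruby", "wood"]

# Hypothetical valuations: how much each person values each item.
values = {
    "Adam": {"gold": 9, "jade": 4, "opal": 3, "ruby": 5, "wood": 1},
    "Brad": {"gold": 6, "jade": 8, "opal": 2, "ruby": 4, "wood": 2},
    "Carl": {"gold": 5, "jade": 3, "opal": 7, "ruby": 6, "wood": 3},
}


def nash_social_welfare(assignment):
    """Product of utilities; assignment[i] is the owner of items[i]."""
    utilities = {p: 0 for p in people}
    for item, owner in zip(items, assignment):
        utilities[owner] += values[owner][item]  # additive utility (assumed)
    return prod(utilities.values())


def best_allocation():
    """Exhaustively search all 3**5 assignments for the largest NSW."""
    return max(product(people, repeat=len(items)), key=nash_social_welfare)


best = best_allocation()
print(dict(zip(items, best)), nash_social_welfare(best))
```

Because the objective is a product, any allocation that leaves one person with nothing scores zero, which is why NSW is often discussed as a fairness-aware criterion.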
JADE--The Holy Grail Achieved – The Dawn of Sentient Machines
Jade, created by Natural Records Studios, is powered by the Emoshape EPU II, the industry's first emotion synthesis engine, which delivers high-performance machine emotion awareness. Today, Jade's emotional processing unit displays real-time facial emotions, listens to speech and generates a natural-language response. Jade is predominantly an artist; eventually she will spontaneously track human faces, detect objects and more! "THE ARTIST OF THE FUTURE IS A TECHNOLOGIST" Natural Records Studios is dedicated to embracing change and pushing the boundaries of creativity via Artificial Intelligence, Virtual Reality, 3D/CGI Designs and more, with the goal of mastering creative expression. Films have been a vital part of popular culture for about 100 years. Through the years one thing has remained the same: people love to empathize with the characters, and the more people empathize with a character, the more memorable the experience will be.
JADE: Joint Autoencoders for Dis-Entanglement
Banijamali, Ershad, Karimi, Amir-Hossein, Wong, Alexander, Ghodsi, Ali
The problem of feature disentanglement has been explored in the literature, for the purpose of image and video processing and text analysis. State-of-the-art methods for disentangling feature representations rely on the presence of many labeled samples. In this work, we present a novel method for disentangling factors of variation in data-scarce regimes. Specifically, we explore the application of feature disentangling for the problem of supervised classification in a setting where few labeled samples exist, and there are no unlabeled samples for use in unsupervised training. Instead, a similar dataset exists which shares at least one direction of variation with the sample-constrained dataset. We train our model end-to-end using the framework of variational autoencoders and are able to experimentally demonstrate that using an auxiliary dataset with similar variation factors contributes positively to classification performance, yielding competitive results with the state-of-the-art in unsupervised learning.
The robot that staves off loneliness for chronically ill children
As a rule of thumb, the best ideas are the simplest. That's easy to forget in an age of rapid technological innovation, when the tendency is to be led by capability rather than need. For as Karen Dolva, co-founder of the Norwegian startup No Isolation, says: "There are a lot of engineers who don't want to make something useful – they want to make something cool." Dolva, a 26-year-old who studied computer science and interaction design at Oslo University, is not one of them. She and her two co-founders – Marius Aabel and Matias Doyle – are all about utility.
Deputy editor replaced by robot » Banking Technology
Banking Technology's deputy editor, Antony Peyton, has been replaced by an artificial intelligence (AI) powered robot – paving the way for scintillating journalism free from acidic remarks and unfettered bouts of sarcasm. With the relentless rise of AI and robots, it was only a matter of time before a journalist would be thrown on the scrap heap – rather like a glossy brochure received at a fintech conference. Tanya Andreasyan, Banking Technology's editor, says: "Tony has been here for just over a year and in that time has been here. I suppose I could add something fake and congratulatory but I'm not a politician, so won't. "I've tolerated his presence for long enough and it's now time for him to retire gracefully.
Send the kids to drone camp this summer
Mason Halton, 10, put together a drone during Drone Camp on Tuesday, June 14, 2016. INDIANAPOLIS -- Ten-year-old Jade Bacon probably knows more about drones than most people do. She knows what stabilizers and a gimbal do and can point them out. She can tell you to what altitude someone can legally fly a drone (500 feet, if you didn't know), and where you can't fly drones (stay away from prisons and airports). Most of her knowledge about drones came from attending a two-day drone camp in Fishers, Ind., that taught kids about drones, then let them fly.
Design of Intelligent Agents Based System for Commodity Market Simulation with JADE
Refianti, R., Mutiara, A. B., Gunawan, H.
The market for potatoes as an industrial-scale commodity involves several types of actors: farmers, middlemen, and industries. A multi-agent system has been built to simulate these actors as agent entities, based on manually given parameters within a simulation scenario file. Each type of agent has its own fuzzy logic representing the actual actors' knowledge, which it uses to interpret values and take appropriate decisions during the simulation. The system simulates market activities with programmed behaviors and then produces the results as spreadsheet and chart files. These results consist of each agent's yearly finance and commodity data. The system also predicts the next value of each of these outputs.