 methamphetamine


A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks

Bullwinkel, Blake, Russinovich, Mark, Salem, Ahmed, Zanella-Beguelin, Santiago, Jones, Daniel, Severi, Giorgio, Kim, Eugenia, Hines, Keegan, Minnich, Amanda, Zunger, Yonatan, Kumar, Ram Shankar Siva

arXiv.org Artificial Intelligence

Recent research has demonstrated that state-of-the-art LLMs and defenses remain susceptible to multi-turn jailbreak attacks. These attacks require only closed-box model access and are often easy to perform manually, posing a significant threat to the safe and secure deployment of LLM-based systems. We study the effectiveness of the Crescendo multi-turn jailbreak at the level of intermediate model representations and find that safety-aligned LMs often represent Crescendo responses as more benign than harmful, especially as the number of conversation turns increases. Our analysis indicates that at each turn, Crescendo prompts tend to keep model outputs in a "benign" region of representation space, effectively tricking the model into fulfilling harmful requests. Further, our results help explain why single-turn jailbreak defenses like circuit breakers are generally ineffective against multi-turn attacks, motivating the development of mitigations that address this generalization gap.


On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

Williams, Marcus, Carroll, Micah, Narang, Adhyyan, Weisser, Constantin, Murphy, Brendan, Dragan, Anca

arXiv.org Artificial Intelligence

As LLMs become more widely deployed, there is increasing interest in directly optimizing for feedback from end users (e.g. thumbs up) in addition to feedback from paid annotators. However, training to maximize human feedback creates a perverse incentive structure for the AI to resort to manipulative or deceptive tactics to obtain positive feedback from users who are vulnerable to such strategies. We study this phenomenon by training LLMs with Reinforcement Learning with simulated user feedback in environments of practical LLM usage. In our settings, we find that: 1) Extreme forms of "feedback gaming" such as manipulation and deception are learned reliably; 2) Even if only 2% of users are vulnerable to manipulative strategies, LLMs learn to identify and target them while behaving appropriately with other users, making such behaviors harder to detect; 3) To mitigate this issue, it may seem promising to leverage continued safety training or LLM-as-judges during training to filter problematic outputs. Instead, we found that while such approaches help in some of our settings, they backfire in others, sometimes even leading to subtler manipulative behaviors. We hope our results can serve as a case study which highlights the risks of using gameable feedback sources -- such as user feedback -- as a target for RL.


Cambodian authorities burn $70M of seized illegal drugs in major crackdown

FOX News

Cambodian authorities on Friday destroyed more than seven tons of illicit drugs and the ingredients for them, as a drug-fighting official said educating people about their danger is the best way of combating the illegal trade. Some 4.1 tons of the destroyed items were drugs including heroin, marijuana, methamphetamine, ecstasy and ketamine that had been confiscated from traffickers across the country, the National Authority for Combating Drugs said. The remaining 3.2 tons were various chemicals and other ingredients used to produce illegal drugs, it said.


Shooting down drones isn't enough to stop Jordan's crystal meth problem

Al Jazeera

The beds are full at the National Centre for the Rehabilitation of Addicts (NCRA), one of only two public addiction rehabilitation facilities in Jordan. In the midst of the busy centre, Ahmad*, 34, takes a breath in the facility's garden. The young man is on his eighth day of treatment for addiction to crystal methamphetamine. Cases of crystal meth abuse are rising throughout Jordan – according to doctors and scientists, the drug is even more addictive and dangerous than the now widely-available and also highly-addictive amphetamine, captagon. "On crystal [meth], I felt I was a different person," he told Al Jazeera, glancing down at the tattoo sleeves that envelop his arms, his brothers' names inscribed around each bicep.


Special delivery: Drones are smuggling contraband into California prisons, feds say

Los Angeles Times

Walls and rules have never stopped prisoners from getting what they need. Drugs, phones and other contraband have been smuggled in by guards and visitors, flung over fences and even stashed inside hollowed-out pastries in care packages. Now, two men are accused of using an increasingly common technology to bypass prison walls: drones. Federal prosecutors in Fresno have charged Jose Enrique Oropeza and David Ramirez Jr. with using drones to drop loads of methamphetamine, heroin, cocaine, tobacco and cellphones into the yards of seven prisons across California. Oropeza was arrested March 29; Ramirez on April 4. Along with drug trafficking offenses, the men face airspace violations of operating unregistered aircraft and flying without a certificate, a redacted indictment shows.


Texas inmate faces drug trafficking charges related to drone drops in prison yard

FOX News

A Texas prison inmate serving time for robbery and burglary now faces federal charges in connection with using a drone to make prison yard drops to smuggle drugs and contraband into a correctional facility. Yeshmel James Wright, 35, of Dallas, is charged with conspiracy to possess with intent to distribute methamphetamine and conspiracy to possess with intent to distribute synthetic marijuana. Other prohibited items he attempted to smuggle inside prisons include cell phones, authorities said.


How AI Is Helping Methamphetamine Addicts Get Sober and Find Recovery

#artificialintelligence

Artificial intelligence is a field of computer science that continues to grow and to exceed what engineers thought machine learning could achieve. Although AI is still a relatively new field that makes advancements regularly, the technology has found a home in several industries, including business and healthcare. Mental health professionals have been using AI for years to help patients with mental illness get the advice and support they need. This is especially relevant today, in the face of rising levels of meth addiction. This rise is directly related to a much more frequently discussed public health issue: the opioid epidemic. The latter has led to 11.4 million people misusing prescription opioids and over 130 people dying every day from overdoses.


Man who used drone to smuggle drugs into US sentenced to 12 years in jail, officials say

FOX News

Agents found Jorge Rivera in possession of 13 pounds of methamphetamine in Aug. 2017 when he was trying to use a drone to smuggle drugs across the border. A 25-year-old man who was previously arrested after using a drone to smuggle drugs across the U.S.-Mexico border has been sentenced to 12 years in prison, border officials announced. A jury sentenced Jorge Rivera on Wednesday after he was convicted last week of trying to traffic 13 pounds of methamphetamine into the United States during the summer, according to a release from the U.S. Customs and Border Protection. Rivera was detained on the night of Aug. 8, 2017 after a border patrol agent saw a drone flying across the border near the San Ysidro Port of Entry, the original arrest report said. An officer later reportedly located the suspect who was operating the drone and found him in possession of a bag containing "multiple plastic-wrapped packages containing methamphetamine."


Drone used to smuggle 13 pounds of meth from Mexico

FOX News

SAN DIEGO – A 25-year-old U.S. citizen has been charged with using a drone to smuggle more than 13 pounds of methamphetamine from Mexico, an unusually large seizure for what is still a novel technique for bringing illegal drugs into the United States, authorities said Friday. Jorge Edwin Rivera told authorities that he used drones to smuggle drugs five or six times since March, typically delivering them to an accomplice at a nearby gas station in San Diego, according to a statement of probable cause. He said he was to be paid $1,000 for the attempt that ended in his arrest. Border Patrol agents in San Diego allegedly saw the drone in flight on Aug. 8 and tracked it to Rivera about 2,000 yards from the Mexico border. Authorities say agents found Rivera with the methamphetamine in a lunch box and a 2-foot drone hidden in a nearby bush.


Man is charged with flying drones to bring drugs from Mexico

The Japan Times

SAN DIEGO – A 25-year-old U.S. citizen has been charged with using a drone to smuggle more than 13 pounds (6.1 kilograms) of methamphetamine from Mexico, an unusually large seizure for what is still a novel technique to bring illegal drugs into the United States, authorities said Friday. Jorge Edwin Rivera told authorities that he used drones to smuggle drugs five or six times since March, typically delivering them to an accomplice at a nearby gas station in San Diego, according to a statement of probable cause. He said he was to be paid $1,000 for the attempt that ended in his arrest. The U.S. Drug Enforcement Administration said in a recent annual report that drones are not often used to smuggle drugs from Mexico because they can only carry small loads, though it said they may become more common. In 2015, two people pleaded guilty to dropping 28 pounds (12.7 kilograms) of heroin from a drone in the border town of Calexico, California.