buster
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
Qi, Chengwen, Ma, Ren, Li, Bowen, Du, He, Hui, Binyuan, Wu, Jinwang, Laili, Yuanjun, He, Conghui
First-order logic (FOL) reasoning, which involves sequential deduction, is pivotal for intelligent systems and serves as a valuable task for evaluating reasoning capabilities, particularly in chain-of-thought (CoT) contexts. Existing benchmarks often rely on extensive human annotation or handcrafted templates, making it difficult to achieve the necessary complexity, scalability, and diversity for robust evaluation. To address these limitations, we propose a novel framework called ProverGen that synergizes the generative strengths of Large Language Models (LLMs) with the rigor and precision of symbolic provers, enabling the creation of a scalable, diverse, and high-quality FOL reasoning dataset, ProverQA. ProverQA is also distinguished by its inclusion of accessible and logically coherent intermediate reasoning steps for each problem. Our evaluation shows that state-of-the-art LLMs struggle to solve ProverQA problems, even with CoT prompting, highlighting the dataset's challenging nature. We also finetune Llama3.1-8B-Instruct on a separate training set generated by our framework. The finetuned model demonstrates consistent improvements on both in-distribution and out-of-distribution test sets, suggesting the value of our proposed data generation framework. Code available at: https://github.com/opendatalab/ProverGen
Buster: Implanting Semantic Backdoor into Text Encoder to Mitigate NSFW Content Generation
Zhao, Xin, Chen, Xiaojun, Xuan, Yuexin, Zhao, Zhendong, Jia, Xiaojun, Li, Xinfeng, Wang, Xiaofeng
The rise of deep learning models in the digital era has raised substantial concerns regarding the generation of Not-Safe-for-Work (NSFW) content. Existing defense methods primarily involve model fine-tuning and post-hoc content moderation. Nevertheless, these approaches largely lack scalability in eliminating harmful content, degrade the quality of benign image generation, or incur high inference costs. To address these challenges, we propose an innovative framework named Buster, which injects backdoors into the text encoder to prevent NSFW content generation. Buster leverages deep semantic information rather than explicit prompts as triggers, redirecting NSFW prompts towards targeted benign prompts. Additionally, Buster employs energy-based training data generation through Langevin dynamics for adversarial knowledge augmentation, thereby ensuring robustness in harmful concept definition. This approach demonstrates exceptional resilience and scalability in mitigating NSFW content. Particularly, Buster fine-tunes the text encoder of Text-to-Image models within merely five minutes, showcasing its efficiency. Our extensive experiments demonstrate that Buster outperforms nine state-of-the-art baselines, achieving a superior NSFW content removal rate of at least 91.2% while preserving the quality of harmless images.
Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER
Zamai, Andrew, Zugarini, Andrea, Rigutini, Leonardo, Ernandes, Marco, Maggini, Marco
Recently, several specialized instruction-tuned Large Language Models (LLMs) for Named Entity Recognition (NER) have emerged. Compared to traditional NER approaches, these models have strong generalization capabilities. Existing LLMs mainly focus on zero-shot NER in out-of-domain distributions, being fine-tuned on an extensive number of entity classes that often highly or completely overlap with test sets. In this work instead, we propose SLIMER, an approach designed to tackle never-seen-before named entity tags by instructing the model on fewer examples, and by leveraging a prompt enriched with definition and guidelines. Experiments demonstrate that definition and guidelines yield better performance, faster and more robust learning, particularly when labelling unseen Named Entities. Furthermore, SLIMER performs comparably to state-of-the-art approaches in out-of-domain zero-shot NER, while being trained on a reduced tag set.
BUSTER: a "BUSiness Transaction Entity Recognition" dataset
Zugarini, Andrea, Zamai, Andrew, Ernandes, Marco, Rigutini, Leonardo
Albeit Natural Language Processing has seen major breakthroughs in the last few years, transferring such advances into real-world business cases can be challenging. One of the reasons resides in the displacement between popular benchmarks and actual data. Lack of supervision, unbalanced classes, noisy data and long documents often affect real problems in vertical domains such as finance, law and health. To support industry-oriented research, we present BUSTER, a BUSiness Transaction Entity Recognition dataset. The dataset consists of 3779 manually annotated documents on financial transactions. We establish several baselines exploiting both general-purpose and domain-specific language models. The best performing model is also used to automatically annotate 6196 documents, which we release as an additional silver corpus to BUSTER.
Artificial Intelligence And Neural Networks For Everyone, Even Kids.
Artificial Intelligence (or AI) is a field in computer science that focuses on solving problems by applying learning techniques (and some math). In some ways, AI and the field, in general, focuses on building programs that try and imitate the way your own brain works. But let's talk about learning some more because it's important in understanding artificial intelligence. There are so many ways we as humans or even other animals learn. Let's take my dog, Buster. When Buster was a pup I wanted to teach him to roll over, but I had two main problems.
Night of the Test(Automation)Busters
Then we internals took over and tried to add new features, while not breaking the existing ones… …today the monolith is still in place, but most of its functionality has been replaced by microservices that communicate via asynchronous messaging and deliver their own frontends. In this session we will talk about challenges we faced over the past three years, about the "best practices" that failed while scaling up from 0 to 40 teams and the new challenges we are facing today. In BDD the formalized examples use a natural language-based DSL driven by the Given/When/Then keywords. At the same time, property-based testing (PBT) uses abstract (mathematical) formulas to declare expectations for the output values given some constraints on the input. The PBT tools try to disprove that the application fulfills these requirements by taking samples from the valid input value space.
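The contrast drawn above between a concrete Given/When/Then example and a property checked against sampled inputs can be sketched in Python. This is a minimal illustration under stated assumptions, not the tooling the session refers to: the function under test (dedupe_sorted), the sampler (gen_int_list) and the checker (check_property) are hypothetical names, and the hand-rolled random sampling stands in for what a real PBT tool would do.

```python
import random


def dedupe_sorted(values):
    """Hypothetical function under test: return the distinct values in sorted order."""
    return sorted(set(values))


def example_check():
    """Example-based check in the Given/When/Then spirit: one concrete input."""
    given = [3, 1, 3, 2]              # Given a list with a duplicate
    result = dedupe_sorted(given)     # When it is deduplicated
    return result == [1, 2, 3]        # Then the distinct values come back sorted


def gen_int_list(rng):
    """Sample one input from the valid input space: a short list of small integers."""
    return [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]


def check_property(prop, gen, trials=200, seed=0):
    """Try to falsify `prop` on sampled inputs; return a counterexample or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        sample = gen(rng)
        if not prop(sample):
            return sample
    return None


def prop_sorted_unique(xs):
    """Property: output is strictly increasing and has the same elements as the input."""
    out = dedupe_sorted(xs)
    strictly_increasing = all(a < b for a, b in zip(out, out[1:]))
    return strictly_increasing and set(out) == set(xs)


def prop_length_preserved(xs):
    """Deliberately wrong property: deduplication never changes the length."""
    return len(dedupe_sorted(xs)) == len(xs)
```

Calling check_property(prop_sorted_unique, gen_int_list) finds no counterexample, while check_property(prop_length_preserved, gen_int_list) returns a sampled list containing a duplicate, which is the "disproving" step the abstract describes.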
Offbeat indie 'Buster's Mal Heart' showcases Rami Malek's gifts
Rami Malek haunts "Buster's Mal Heart" like an alien being just getting to know human form. In this finely calibrated indie from writer/director Sarah Adina Smith, the Emmy-winning "Mr. Robot" star is used to disconcertingly good effect, his large, lidded eyes like reservoirs of hope and pain. If Malek's role -- a mountain man nicknamed Buster who rants about an upcoming apocalyptic Inversion, but who was once a dedicated family man and hotel concierge named Jonah -- reminds you of his split-personality TV role, it's less a glomming-on than a burnishing of the actor's gift for conveying paralyzing rootlessness. In the Jonah scenes, Malek is a clean-cut, teetering Lynchian figure, dreaming of a better life for wife Marty (Kate Lyn Sheil) and their toddler daughter, but drawn to a mysterious conspiracy theorist figure (DJ Qualls) who feeds his nagging belief that his fate is mechanistic and predetermined.
'Mr. Robot' Star Rami Malek To Replace Sacha Baron Cohen As Freddie Mercury In Queen Biopic
"Mr. Robot" star Rami Malek has been tapped to play Queen frontman Freddie Mercury in a biopic about the band entitled "Bohemian Rhapsody." The actor will replace Sacha Baron Cohen, who left the project due to creative differences, according to Entertainment Weekly. The screenplay for the film was written by Anthony McCarten. The movie will be directed by Bryan Singer. Original Queen members Brian May and Roger Taylor will serve as the movie's music producers. Cohen was tapped to play Mercury in 2010. He eventually dropped out of the project in 2013 because of creative differences with the members of the band. In an interview with Howard Stern, the actor explained the reason behind the departure, saying: "The problem is -- and I think it's with any biopic, and I fully understand why Queen wanted to do this -- if you're in control of your rights and your life story, why wouldn't you depict yourself as great as possible?" Cohen also said he had an issue with the living members of the group wanting to end the movie with the band carrying on after Mercury died of AIDS. Aside from the upcoming film, fans can also look forward to seeing Malek in the thriller "Buster's Mal Heart." The film, which debuted at the Toronto International Film Festival, will hit theaters in early 2017, according to Deadline. The film follows the story of Buster, a mountain man who squats in empty homes. Before he became what he is, Buster was a family guy. The movie explores how the shift from one life to another happened. The USA TV show will return next year, although an air date has not been announced.
Swipe Buster lets you check to see if YOUR partner is cheating on Tinder
A new online tool can search Tinder to find out if your partner has recently been swiping. The service, called Swipe Buster, costs 5 and uses public information from the dating platform to 'bust' a suspected cheater. This allows anyone to search for specific Tinder users without even making an account – but the ease of 'spying' now has many users concerned about the security of their data.