 Generative AI


MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models

arXiv.org Artificial Intelligence

Text-to-video models have demonstrated impressive capabilities in producing diverse and captivating video content, showcasing a notable advancement in generative AI. However, these models generally lack fine-grained control over motion patterns, limiting their practical applicability. We introduce MotionFlow, a novel framework designed for motion transfer in video diffusion models. Our method utilizes cross-attention maps to accurately capture and manipulate spatial and temporal dynamics, enabling seamless motion transfers across various contexts. Our approach does not require training and operates at test time by leveraging the inherent capabilities of pre-trained video diffusion models. In contrast to traditional approaches, which struggle with comprehensive scene changes while maintaining consistent motion, MotionFlow successfully handles such complex transformations through its attention-based mechanism. Our qualitative and quantitative experiments demonstrate that MotionFlow significantly outperforms existing models in both fidelity and versatility even during drastic scene alterations.


Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation

arXiv.org Artificial Intelligence

The detection of sensitive content in large datasets is crucial for ensuring that shared and analysed data is free from harmful material. However, current moderation tools, such as external APIs, suffer from limitations in customisation, accuracy across diverse sensitive categories, and privacy concerns. Additionally, existing datasets and open-source models focus predominantly on toxic language, leaving gaps in detecting other sensitive categories such as substance abuse or self-harm. In this paper, we put forward a unified dataset tailored for social media content moderation across six sensitive categories: conflictual language, profanity, sexually explicit material, drug-related content, self-harm, and spam. By collecting and annotating data with consistent retrieval strategies and guidelines, we address the shortcomings of previous focalised research. Our analysis demonstrates that fine-tuning large language models (LLMs) on this novel dataset yields significant improvements in detection performance compared to open off-the-shelf models such as LLaMA, and even proprietary OpenAI models, which underperform by 10-15% overall. This limitation is even more pronounced on popular moderation APIs, which cannot be easily tailored to specific sensitive content categories, among others.


BadGPT-4o: stripping safety finetuning from GPT models

arXiv.org Artificial Intelligence

LLM vendors expend substantial effort to secure their models and make them unhelpful to adversaries like cybercriminals (Touvron et al. 2023, Section 4.3) (OpenAI et al. 2024, Section 3) (OpenAI 2024a). However, LLMs have been repeatedly "jailbroken" out of these constraints (Chao et al. 2024; Mazeika et al. 2024; Souly et al. 2024). No robust LLM security measures are known. Classic jailbreaks encode LLM prompts to bypass model safeguards. They tend to be unstable, add a token overhead, and reduce model performance (Chao et al. 2024; Mazeika et al. 2024; Souly et al. 2024).


A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

arXiv.org Artificial Intelligence

Text-to-SQL systems facilitate smooth interaction with databases by translating natural language queries into Structured Query Language (SQL), bridging the gap between non-technical users and complex database management systems. This survey provides a comprehensive overview of the evolution of AI-driven text-to-SQL systems, highlighting their foundational components, advancements in large language model (LLM) architectures, and the critical role of datasets such as Spider, WikiSQL, and CoSQL in driving progress. We examine the applications of text-to-SQL in domains like healthcare, education, and finance, emphasizing their transformative potential for improving data accessibility. Additionally, we analyze persistent challenges, including domain generalization, query optimization, support for multi-turn conversational interactions, and the limited availability of datasets tailored for NoSQL databases and dynamic real-world scenarios. To address these challenges, we outline future research directions, such as extending text-to-SQL capabilities to support NoSQL databases, designing datasets for dynamic multi-turn interactions, and optimizing systems for real-world scalability and robustness. By surveying current advancements and identifying key gaps, this paper aims to guide the next generation of research and applications in LLM-based text-to-SQL systems.


Is Your Paper Being Reviewed by an LLM? Investigating AI Text Detectability in Peer Review

arXiv.org Artificial Intelligence

Peer review is a critical process for ensuring the integrity of published scientific research. Confidence in this process is predicated on the assumption that experts in the relevant domain give careful consideration to the merits of manuscripts which are submitted for publication. With the recent rapid advancements in the linguistic capabilities of large language models (LLMs), a new potential risk to the peer review process is that negligent reviewers will rely on LLMs to perform the often time-consuming process of reviewing a paper. In this study, we investigate the ability of existing AI text detection algorithms to distinguish between peer reviews written by humans and those written by different state-of-the-art LLMs. Our analysis shows that existing approaches fail to identify many GPT-4o-written reviews without also producing a high number of false positive classifications. To address this deficiency, we propose a new detection approach which surpasses existing methods in the identification of GPT-4o-written peer reviews at low levels of false positive classifications. Our work reveals the difficulty of accurately identifying AI-generated text at the individual review level, highlighting the urgent need for new tools and methods to detect this type of unethical application of generative AI.


TFT-multi: simultaneous forecasting of vital sign trajectories in the ICU

arXiv.org Artificial Intelligence

Trajectory forecasting in healthcare data has been an important area of research in precision care and clinical integration for computational methods. In recent years, generative AI models have demonstrated promising results in capturing short- and long-range dependencies in time series data. While these models have also been applied in healthcare, most of them only predict one value at a time, which is unrealistic in a clinical setting where multiple measures are taken at once. In this work, we extend the temporal fusion transformer (TFT) framework, a multi-horizon time series prediction tool, and propose TFT-multi, an end-to-end framework that can predict multiple vital trajectories simultaneously. We apply TFT-multi to forecast 5 vital signs recorded in the intensive care unit: blood pressure, pulse, SpO2, temperature and respiratory rate. We hypothesize that by jointly predicting these measures, which are often correlated with one another, we can make more accurate predictions, especially in variables with large missingness. We validate our model on the public MIMIC dataset and an independent institutional dataset, and demonstrate that this approach outperforms state-of-the-art univariate prediction tools including the original TFT and Prophet, as well as vector regression modeling for multivariate prediction. Furthermore, we perform a case study analysis by applying our pipeline to forecast blood pressure changes in response to actual and hypothetical pressor administration.


Here's What OpenAI's $200 Monthly ChatGPT Pro Subscription Includes

WIRED

Earlier today, OpenAI launched ChatGPT Pro, a $200 monthly subscription for its flagship chatbot. This release is the first of many expected during the next 12 days, as the San Francisco startup has scheduled a slew of announcements to roll out starting today. Everything from OpenAI's $20 monthly subscription is included at this price level as well as significantly more access to the GPT-4o and o1 artificial intelligence models. With a ChatGPT Pro subscription, which will cost $2,400 for a full year, users can also use an exclusive model from OpenAI called o1 pro mode that wields more computing power to process answers. "Power users of ChatGPT, at this point, they really use it a lot, and they want more compute than $20 can buy," said CEO Sam Altman during the video broadcast announcing the new premium tier.


OpenAI wants $200 a month for its most advanced features

Engadget

OpenAI kicked off its "12 Days of OpenAI" series of livestreams with the announcement of a new, more expensive tier for its flagship chatbot. Starting today, ChatGPT users can pay $200 per month for ChatGPT Pro. Included in the package is unlimited access to the company's latest model, o1, which, following a limited preview earlier in the year, is now faster and 34 percent less likely to produce a major error when answering difficult real-world questions. ChatGPT Pro also comes with access to GPT-4o, o1-mini and the company's Advanced Voice mode, but the reason most power users are likely to splurge is the addition of an o1 "pro mode" that gives the chatbot additional compute power to reason through the most complex problems. "In evaluations from external expert testers, o1 pro mode produces more reliably accurate and comprehensive responses, especially in areas like data science, programming, and case law analysis," OpenAI says of the feature.


Believe it or not, ChatGPT gets over 1 billion messages every single day

PCWorld

There has been a lot of talk about AI chatbots over the past few years, but how much are they actually used? OpenAI's CEO Sam Altman shared in a tweet (spotted by MSPoweruser) some figures that blew us away. ChatGPT apparently has 300 million weekly active users, and the AI chatbot receives over 1 billion messages every day. Altman also boasts that over 1.3 million developers in the US alone have built upon OpenAI for various tools and services. Maybe that isn't too surprising when you consider how ChatGPT can improve day-to-day life.


The Download: OpenAI's defense contract, and making food from microbes

MIT Technology Review

You have been born into an era of intelligent machines. They have watched over you almost since your conception. They let your parents listen in on your tiny heartbeat, track your gestation on an app, and post your sonogram on social media. Well before you were born, you were known to the algorithm. Your arrival coincided with the 125th anniversary of this magazine.