 Generative AI


Senators introduce bill to protect individuals against AI-generated deepfakes

Engadget

Today, a group of senators introduced the NO FAKES Act, a bill that would make it illegal to create digital recreations of a person's voice or likeness without that individual's consent. The measure, whose sponsors include Amy Klobuchar (D-Minn.) and Thom Tillis (R-N.C.), is fully titled the Nurture Originals, Foster Art, and Keep Entertainment Safe Act of 2024. If it passes, the NO FAKES Act would create an option for people to seek damages when their voice, face or body is recreated by AI. Both individuals and companies would be held liable for producing, hosting or sharing unauthorized digital replicas, including ones made by generative AI. We've already seen many instances of celebrities finding imitations of themselves out in the world.


Semantic Successive Refinement: A Generative AI-aided Semantic Communication Framework

arXiv.org Artificial Intelligence

Semantic Communication (SC) is an emerging technology aiming to surpass the Shannon limit. Traditional SC strategies often minimize signal distortion between the original and reconstructed data, neglecting perceptual quality, especially in low Signal-to-Noise Ratio (SNR) environments. To address this issue, we introduce a novel Generative AI Semantic Communication (GSC) system for single-user scenarios. This system leverages deep generative models to establish a new paradigm in SC. Specifically, at the transmitter end, it employs a joint source-channel coding mechanism based on the Swin Transformer for efficient semantic feature extraction and compression. At the receiver end, an advanced Diffusion Model (DM) reconstructs high-quality images from degraded signals, enhancing perceptual details. Additionally, we present a Multi-User Generative Semantic Communication (MU-GSC) system utilizing an asynchronous processing model. This model effectively manages multiple user requests and optimally utilizes system resources for parallel processing. Simulation results on public datasets demonstrate that our generative AI semantic communication systems achieve superior transmission efficiency and enhanced communication content quality across various channel conditions. Compared to CNN-based DeepJSCC, our methods improve the Peak Signal-to-Noise Ratio (PSNR) by 17.75% in Additive White Gaussian Noise (AWGN) channels and by 20.86% in Rayleigh channels.
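
For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how PSNR and an AWGN channel at a given SNR are commonly computed. It is not the authors' code. The function names (`psnr`, `awgn`, `relative_gain_percent`) are hypothetical, the 8-bit peak value of 255 is an assumption, and reading the reported percentages as relative gains over the baseline PSNR in dB is also an assumption that the abstract does not confirm.

```python
import math
import random


def psnr(original, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel sequences.

    max_value is the peak pixel value (255 assumes 8-bit images).
    """
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    # Mean squared error between the two signals.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def awgn(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise scaled to the target SNR in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR (linear) = signal_power / noise_power, so solve for the noise power.
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def relative_gain_percent(baseline_db, improved_db):
    """Relative improvement of one PSNR value over another, in percent (assumed reading)."""
    return 100.0 * (improved_db - baseline_db) / baseline_db


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    pixels = [float(rng.randrange(256)) for _ in range(1000)]
    received = awgn(pixels, snr_db=10.0, rng=rng)
    print("PSNR after AWGN at 10 dB SNR:", psnr(pixels, received))
```

Under the assumed reading, a baseline of 20 dB improved to 23.55 dB would correspond to the 17.75% figure; the actual baseline values are not given in the abstract.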


Tabular Data Augmentation for Machine Learning: Progress and Prospects of Embracing Generative AI

arXiv.org Artificial Intelligence

Machine learning (ML) on tabular data is ubiquitous, yet obtaining abundant high-quality tabular data for model training remains a significant obstacle. Numerous works have focused on tabular data augmentation (TDA) to enhance the original table with additional data, thereby improving downstream ML tasks. Recently, there has been a growing interest in leveraging the capabilities of generative AI for TDA. Therefore, we believe it is time to provide a comprehensive review of the progress and future prospects of TDA, with a particular emphasis on the trending generative AI. Specifically, we present an architectural view of the TDA pipeline, comprising three main procedures: pre-augmentation, augmentation, and post-augmentation. Pre-augmentation encompasses preparation tasks that facilitate subsequent TDA, including error handling, table annotation, table simplification, table representation, table indexing, table navigation, schema matching, and entity matching. Augmentation systematically analyzes current TDA methods, categorized into retrieval-based methods, which retrieve external data, and generation-based methods, which generate synthetic data. We further subdivide these methods based on the granularity of the augmentation process at the row, column, cell, and table levels. Post-augmentation focuses on the datasets, evaluation and optimization aspects of TDA. We also summarize current trends and future directions for TDA, highlighting promising opportunities in the era of generative AI. In addition, the accompanying papers and related resources are continuously updated and maintained in the GitHub repository at https://github.com/SuDIS-ZJU/awesome-tabular-data-augmentation to reflect ongoing advancements in the field.


Between the AI and Me: Analysing Listeners' Perspectives on AI- and Human-Composed Progressive Metal Music

arXiv.org Artificial Intelligence

Generative AI models have recently blossomed, significantly impacting artistic and musical traditions. Research investigating how humans interact with and judge these models is therefore crucial. Through a listening and reflection study, we explore participants' perspectives on AI- vs human-generated progressive metal, in symbolic format, using rock music as a control group. AI-generated examples were produced by ProgGP, a Transformer-based model. We propose a mixed methods approach to assess the effects of generation type (human vs. AI), genre (progressive metal vs. rock), and curation process (random vs. cherry-picked). This combines quantitative feedback on genre congruence, preference, creativity, consistency, playability, humanness, and repeatability, and qualitative feedback to provide insights into listeners' experiences. A total of 32 progressive metal fans completed the study. Our findings validate the use of fine-tuning to achieve genre-specific specialization in AI music generation, as listeners could distinguish between AI-generated rock and progressive metal. Despite some AI-generated excerpts receiving similar ratings to human music, listeners exhibited a preference for human compositions. Thematic analysis identified key features for genre and AI vs. human distinctions. Finally, we consider the ethical implications of our work in promoting musical data diversity within MIR research by focusing on an under-explored genre.


OpenAI rolls out advanced Voice Mode and no, it won't sound like ScarJo

Engadget

OpenAI has started rolling out its advanced Voice Mode feature. Starting today, a small number of paying ChatGPT users will be able to have a tête-à-tête with the AI chatbot. All ChatGPT Plus members should receive access to the expanded toolset by the fall of this year. In an announcement on X, the company said this advanced version of its Voice Mode "offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions."


The AI Search War Has Begun

The Atlantic - Technology

Every second of every day, people across the world type tens of thousands of queries into Google, adding up to trillions of searches a year. Google and a few other search engines are the portal through which several billion people navigate the internet. Many of the world's most powerful tech companies, including Google, Microsoft, and OpenAI, have recently spotted an opportunity to remake that gateway with generative AI, and they are racing to seize it. And as of this week, the generative-AI search wars are in full swing. The value of an AI-powered search bar is straightforward: Instead of having to open and read multiple links, wouldn't it be better to type your query into a chatbot and receive an immediate, comprehensive answer?


OpenAI has released a new ChatGPT bot that you can talk to

MIT Technology Review

The voice mode is powered by OpenAI's new GPT-4o model, which combines voice, text, and vision capabilities. To gather feedback, the company is initially launching the chatbot to a "small group of users" paying for ChatGPT Plus, but it says it will make the bot available to all ChatGPT Plus subscribers this fall. OpenAI says it will notify customers who are part of the first rollout wave in the ChatGPT app and provide instructions on how to use the new model. The new voice feature, which was announced in May, is being launched a month later than originally planned because the company said it needed more time to improve safety features, such as the model's ability to detect and refuse unwanted content. The company also said it was preparing its infrastructure to offer real-time responses to millions of users.


UK regulator looks at Google's partnership with Anthropic

The Guardian

The Competition and Markets Authority has begun a preliminary investigation into a partnership between Google and the AI startup Anthropic, marking the latest in a string of investigations into deals between big tech companies and smaller AI ones. Google invested $2bn (about £1.56bn) into Anthropic in 2023, shortly after signing a cloud computing agreement with the startup, which develops the Claude LLM and chatbot. The CMA is now considering whether the partnership has "resulted in the creation of a relevant merger situation" which would allow the agency to begin a formal investigation. It is inviting comments over the next two weeks. The move comes amid broader concerns about competition in the generative AI sector. A deal between Amazon and Anthropic is also being investigated by the CMA as a potential merger after Amazon took a $4bn stake in the company and signed a deal to become one of the startup's cloud computing providers.


TechScape: Will OpenAI's $5bn gamble on chatbots pay off? Only if you use them

The Guardian

What if you build it and they don't come? It's fair to say the shine is coming off the AI boom. Soaring valuations are starting to look unstable next to the sky-high spending required to sustain them.


The Responsible Development of Automated Student Feedback with Generative AI

arXiv.org Artificial Intelligence

Abstract--Contribution: This paper identifies four critical ethical considerations for implementing generative AI tools to provide automated feedback to students. Background: Providing rich feedback to students is essential for supporting student learning. Recent advances in generative AI, particularly with large language models (LLMs), provide the opportunity to deliver repeatable, scalable and instant automatically generated feedback to students, making abundant a previously scarce and expensive learning resource. Intended Outcomes: The goal of this work is to enable the use of AI systems to automate mundane assessment and feedback tasks, without introducing a "tyranny of the majority", where ...
[Figure: A visualisation of Bloom's revised taxonomy, modified from [6].]
The release of powerful language technology tools based on generative language modelling (e.g., ChatGPT, GPT-4(o), Claude, Gemini, Llama; [1]-[3]), marked a significant ... For example, days after the release of ChatGPT, students, educators, and the public alike discovered the potential of the application for assisting with a range of teaching and learning tasks, but ...
... are going to use AI tools in their working lives, we should aim to train them in their use. While assessment is a clear space of development for this type of educational technology, we argue that the real potential of generative language modelling can be found in student feedback.
Hence, this current state has effectively locked some engineering courses into a focus, where a particular set of questions are iterated over. ... common patterns of student answers and standardize responses to them, rather than having to make bespoke responses to ...
E. D. Lindsay is with the UNESCO Centre for Problem Based Learning ... M. Zhang is with the Department of Computer Science, Aalborg University, A.C. Meyers Vænge 15, 2450 København SV, Denmark. A. Johri is the Director of the Technocritical Research on AI, Learning ... J. Bjerva is with the Department of Computer Science, Aalborg University, ... Manuscript revised on July 31, 2024.