Generative AI
Receiver-Centric Generative Semantic Communications
Liu, Xunze, Sun, Yifei, Wang, Zhaorui, You, Lizhao, Pan, Haoyuan, Wang, Fangxin, Cui, Shuguang
This paper investigates semantic communications between a transmitter and a receiver, where original data, such as videos of interest to the receiver, is stored at the transmitter. Although significant progress has been made in semantic communications, a fundamental design problem is that the semantic information is extracted based on certain criteria at the transmitter alone, without considering the receiver's specific information needs. As a result, critical information of primary concern to the receiver may be lost. In such cases, the semantic transmission becomes meaningless to the receiver, as all received information is irrelevant to its interests. To solve this problem, this paper presents a receiver-centric generative semantic communication system, where each transmission is initiated by the receiver. Specifically, the receiver first sends its request for the desired semantic information to the transmitter at the start of each transmission. Then, the transmitter extracts the required semantic information accordingly. A key challenge is how the transmitter understands the receiver's requests for semantic information and extracts the required semantic information in a reasonable and robust manner. We address this challenge by designing a well-structured framework and leveraging off-the-shelf generative AI products, such as GPT-4, along with several specialized tools for detection and estimation. Evaluation results demonstrate the feasibility and effectiveness of the proposed new semantic communication system.
OpenAI will pay DotDash Meredith at least $16 million per year to license its content
OpenAI is paying the digital media company Dotdash Meredith at least $16 million per year to license its content, according to public financial documents reviewed by Adweek. We already knew about this burgeoning partnership, but we didn't have a financial figure. The actual payout could rise above $16 million per year, as it only reflects the "fixed" component of the payment. The "variable" component will be calculated in the future, according to a recent earnings call led by the chief operating and financial officer of Dotdash Meredith's parent company IAC. "If you look at Q3 of 2024, licensing revenue was up about $4.1 million year over year. The lion's share of that would be driven by the OpenAI license," CFO Chris Halpin said.
The US Patent and Trademark Office Banned Staff From Using Generative AI
The US Patent and Trademark Office banned the use of generative artificial intelligence for any purpose last year, citing security concerns with the technology as well as the propensity of some tools to exhibit "bias, unpredictability, and malicious behavior," according to an April 2023 internal guidance memo obtained by WIRED through a public records request. Jamie Holcombe, the chief information officer of the USPTO, wrote that the office is "committed to pursuing innovation within our agency" but are still "working to bring these capabilities to the office in a responsible way." Paul Fucito, press secretary for the USPTO, clarified to WIRED that employees can use "state-of-the-art generative AI models" at work--but only inside the agency's internal testing environment. "Innovators from across the USPTO are now using the AI Lab to better understand generative AI's capabilities and limitations and to prototype AI-powered solutions to critical business needs," Fucito wrote in an email. Outside of the testing environment, USPTO staff are barred from relying on AI programs like OpenAI's ChatGPT or Anthropic's Claude for work tasks.
Breaking the Cycle of Recurring Failures: Applying Generative AI to Root Cause Analysis in Legacy Banking Systems
Jin, Siyuan, Bei, Zhendong, Chen, Bichao, Xia, Yong
Traditional banks face significant challenges in digital transformation, primarily due to legacy system constraints and fragmented ownership. Recent incidents show that such fragmentation often results in superficial incident resolutions, leaving root causes unaddressed and causing recurring failures. We introduce a novel approach to post-incident analysis, integrating knowledge-based GenAI agents with the "Five Whys" technique to examine problem descriptions and change request data. This method uncovered that approximately 70% of the incidents previously attributed to management or vendor failures were due to underlying internal code issues. We present a case study to show the impact of our method. By scanning over 5,000 projects, we identified over 400 files with a similar root cause. Overall, we leverage the knowledge-based agents to automate and elevate root cause analysis, transforming it into a more proactive process. These agents can be applied across other phases of the software development lifecycle, further improving development processes.
Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of a Virtual Campus Environment with OpenAI GPT Integration with Unity 3D
TG, Adithya, N, Abhinavaram, Srinivasa, Gowri
This paper presents a new approach to learning multiple languages, with Hindi as the language to be learnt in our case, that integrates virtual reality environments with AI-enabled tutoring systems through OpenAI's GPT API calls. We have developed a scenario with a virtual campus environment in Unity that focuses on a detailed representation of the 11th floor of our university's building, where most of the cultural and technological activities take place. Within this virtual environment, an AI tutor powered by OpenAI's GPT model, called through an API, moves around with the user. The tutor provides language learning support in Hindi, as GPT is able to handle language translation. Our approach mainly uses speech-to-text, text-to-text conversion, and text-to-speech capabilities to facilitate real-time interaction between users and the AI tutor when an internet connection is available. This research demonstrates the use of VR technology combined with AI tutoring for immersive, interactive language learning experiences.
Building Trust: Foundations of Security, Safety and Transparency in AI
Sidhpurwala, Huzaifa, Mollett, Garth, Fox, Emily, Bestavros, Mark, Chen, Huamin
This paper explores the rapidly evolving ecosystem of publicly available AI models, and their potential implications on the security and safety landscape. As AI models become increasingly prevalent, understanding their potential risks and vulnerabilities is crucial. We review the current security and safety scenarios while highlighting challenges such as tracking issues, remediation, and the apparent absence of AI model lifecycle and ownership processes. Comprehensive strategies to enhance security and safety for both model developers and end-users are proposed. This paper aims to provide some of the foundational pieces for more standardized security, safety, and transparency in the development and operation of AI models and the larger open ecosystems and communities forming around them. Generative AI, a branch of artificial intelligence focused on AI production of content such as text, images and video, has seen significant advancements since the introduction of generative adversarial networks (GANs) in 2014 (Goodfellow et al., 2014), which improved data generation but faced issues like training instability. The development of transformers and self-attention mechanisms in 2017 (Vaswani et al., 2017) facilitated further improvements in natural language processing, leading to large language models (LLMs) like GPT (Radford et al., 2018) with highly advanced text generation capabilities. Diffusion models (Sohl-Dickstein et al., 2015) have also seen rapid advancement in image and video generation. This rapid advancement in technology capability has been matched by an equally rapid uptake in adoption. As with any new technology, it is worth noting that the industry is still identifying new and valuable uses for AI and these market predictions may fluctuate as use cases are tested in real world environments with real world problems.
For the purpose of clarity we shall be using the term public model, for a model which is publicly available for download and use. LLMs are the next evolution of data science, a field focused on math and data. Unlike traditional systems and applications which rely on logic and programming for a specified outcome, large language model development typically consists of architecture research and design, which is then coded.
A Review on Generative AI Models for Synthetic Medical Text, Time Series, and Longitudinal Data
Loni, Mohammad, Poursalim, Fatemeh, Asadi, Mehdi, Gharehbaghi, Arash
This paper presents the results of a novel scoping review on the practical models for generating three different types of synthetic health records (SHRs): medical text, time series, and longitudinal data. The innovative aspects of the review, which incorporate study objectives, data modality, and research methodology of the reviewed studies, uncover the importance and the scope of the topic for the digital medicine context. In total, 52 publications met the eligibility criteria for generating medical time series (22), longitudinal data (17), and medical text (13). Privacy preservation was found to be the main research objective of the studied papers, along with class imbalance, data scarcity, and data imputation as the other objectives. The adversarial network-based, probabilistic, and large language models exhibited superiority for generating synthetic longitudinal data, time series, and medical texts, respectively. Finding a reliable performance measure to quantify SHR re-identification risk is the major research gap of the topic.
InvisMark: Invisible and Robust Watermarking for AI-generated Image Provenance
Xu, Rui, Hu, Mengya, Lei, Deren, Li, Yaxi, Lowe, David, Gorevski, Alex, Wang, Mingyu, Ching, Emily, Deng, Alex
The proliferation of AI-generated images has intensified the need for robust content authentication methods. We present InvisMark, a novel watermarking technique designed for high-resolution AI-generated images. Our approach leverages advanced neural network architectures and training strategies to embed imperceptible yet highly robust watermarks. InvisMark achieves state-of-the-art performance in imperceptibility (PSNR$\sim$51, SSIM $\sim$ 0.998) while maintaining over 97\% bit accuracy across various image manipulations. Notably, we demonstrate the successful encoding of 256-bit watermarks, significantly expanding payload capacity while preserving image quality. This enables the embedding of UUIDs with error correction codes, achieving near-perfect decoding success rates even under challenging image distortions. We also address potential vulnerabilities against advanced attacks and propose mitigation strategies. By combining high imperceptibility, extended payload capacity, and resilience to manipulations, InvisMark provides a robust foundation for ensuring media provenance in an era of increasingly sophisticated AI-generated content. Source code of this paper is available at: https://github.com/microsoft/InvisMark.
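The two kinds of figures quoted in this abstract, PSNR for imperceptibility and bit accuracy for robustness, can be illustrated with a minimal Python sketch. This is not the InvisMark implementation: the function names `psnr`, `bit_accuracy`, and `uuid_to_bits` are hypothetical, images are treated as flat lists of 8-bit pixel values, and how the paper actually splits its 256-bit payload between UUID and error correction bits is not stated here. The only fixed fact used is that a UUID is 128 bits long, so a 256-bit payload leaves room for redundancy.

```python
import math
import uuid


def psnr(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(sent_bits, decoded_bits):
    """Fraction of watermark bits recovered correctly after decoding."""
    if len(sent_bits) != len(decoded_bits) or not sent_bits:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for s, d in zip(sent_bits, decoded_bits) if s == d)
    return matches / len(sent_bits)


def uuid_to_bits(identifier):
    """Expand a UUID into its 128 bits, most significant bit first."""
    return [(identifier.int >> (127 - i)) & 1 for i in range(128)]


if __name__ == "__main__":
    payload = uuid_to_bits(uuid.uuid4())
    # Hypothetical decoder output: flip the first four bits to simulate errors.
    decoded = [1 - b for b in payload[:4]] + payload[4:]
    print(f"payload length: {len(payload)} bits")
    print(f"bit accuracy:   {bit_accuracy(payload, decoded):.4f}")
    print(f"PSNR:           {psnr([10, 20, 30, 40], [11, 20, 29, 40]):.2f} dB")
```

A higher PSNR means the watermarked image is closer to the original, and a bit accuracy near 1.0 means the payload survives the manipulation; an error correction code would then be responsible for repairing the remaining flipped bits.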
There's No Longer Any Doubt That Hollywood Writing Is Powering AI
Editor's note: This analysis is part of The Atlantic's investigation into the OpenSubtitles data set. You can access the search tool directly here. Find The Atlantic's search tool for books used to train AI here. For as long as generative-AI chatbots have been on the internet, Hollywood writers have wondered if their work has been used to train them. The chatbots are remarkably fluent with movie references, and companies seem to be training them on all available sources.
Search the Hollywood AI Database
Use this search tool to see how writing from 139,000 movies and TV shows has trained generative AI. By Alex Reisner. Editor's note: This search tool is part of The Atlantic's investigation into the OpenSubtitles data set. You can read more about this data set and how it's been used to train AI here. Find The Atlantic's search tool for books used to train AI here. About the Author: Alex Reisner is a freelance writer, programmer, and technical consultant.