Generative AI
Fox News AI Newsletter: China gains ground
AMERICA MUST WIN: OpenAI's Chris Lehane is warning of America's shrinking lead in the artificial intelligence space as the company releases its economic blueprint and policy proposals for the U.S. 'ONCE UPON A TIME': A happily-ever-after with someone a woman believed was Hollywood hunk Brad Pitt quickly turned into a living nightmare. AI TRANSFORMER HOMES: AC Future, a leading developer of AI-enabled sustainable living solutions, has partnered with world-renowned Italian design house Pininfarina to create a groundbreaking collection of transformable living spaces. This innovative collaboration has resulted in three distinct products: AI-THd (AI Transformer Home Drivable), AI-THu (AI Transformer Home Unit) and AI-THt (AI Transformer Home Trailer).
CogView: Mastering Text-to-Image Generation via Transformers
Text-to-image generation in the general domain has long been an open problem, requiring both a powerful generative model and cross-modal understanding. We propose CogView, a 4-billion-parameter Transformer with a VQ-VAE tokenizer, to advance this problem. We also demonstrate finetuning strategies for various downstream tasks. CogView achieves the state-of-the-art FID on the blurred MS COCO dataset, outperforming previous GAN-based models and a recent similar work, DALL-E.
On Analyzing Generative and Denoising Capabilities of Diffusion-based Deep Generative Models
The main strength of diffusion-based deep generative models (DDGMs) comes from their unique setup, in which a model (the backward diffusion process) is trained to reverse the forward diffusion process, which gradually adds noise to the input signal. Although DDGMs are well studied, it is still unclear how the small amount of noise is transformed during the backward diffusion process. Here, we analyze this problem to gain more insight into the behavior of DDGMs and their denoising and generative capabilities. We observe a fluid transition point that changes the functionality of the backward diffusion process from generating a (corrupted) image from noise to denoising the corrupted image into the final sample. Based on this observation, we propose dividing a DDGM into two parts: a denoiser and a generator.
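To make the forward diffusion process mentioned in this abstract concrete, here is a minimal sketch of how noise can be added gradually to a signal. It assumes the common DDPM-style Gaussian formulation with a linear noise schedule. The abstract does not specify either, so the schedule values and the function names (`linear_betas`, `cumulative_alphas`, `forward_diffuse`) are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the default values are illustrative."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t): the fraction of signal variance kept up to step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0: sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_betas(1000))
x0 = [1.0, -1.0, 0.5, 0.0]  # toy "image" with four pixels
slightly_noisy = forward_diffuse(x0, 10, alpha_bars, rng)      # early step: mostly signal
almost_pure_noise = forward_diffuse(x0, 999, alpha_bars, rng)  # final step: mostly noise
```

Under these assumptions, early steps leave the signal only lightly corrupted while late steps are dominated by noise, which loosely mirrors the denoising and generating regimes the abstract distinguishes for the backward process.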
OSOA: One-Shot Online Adaptation of Deep Generative Models for Lossless Compression
Explicit deep generative models (DGMs), e.g., VAEs and Normalizing Flows, have been shown to offer an effective data modelling alternative for lossless compression. However, DGMs themselves normally require large storage space and thus undermine the advantage brought by accurate data density estimation. To eliminate the requirement of saving separate models for different target datasets, we propose a novel setting that starts from a pretrained deep generative model and compresses the data batches while adapting the model with a dynamical system for only one epoch. We formalise this setting as that of One-Shot Online Adaptation (OSOA) of DGMs for lossless compression and propose a vanilla algorithm under this setting. Experimental results show that vanilla OSOA can save significant time versus training bespoke models and space versus using one model for all targets. With the same adaptation step number or adaptation time, vanilla OSOA is shown to exhibit better space efficiency, e.g., 47% less space, than fine-tuning the pretrained model and saving the fine-tuned model. Moreover, we showcase the potential of OSOA and motivate more sophisticated OSOA algorithms by showing further space or time efficiency with multiple updates per batch and early stopping.
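The link between density estimation and compressed size can be illustrated with the standard information-theoretic fact that an ideal entropy coder spends about -log2 p(x) bits on a symbol the model assigns probability p(x). The sketch below is not the OSOA algorithm. It only compares the ideal code length of a toy string under two hypothetical models, one generic and one matched to the data, to show why adapting a model to the target data can save space.

```python
import math


def ideal_code_length_bits(symbols, model):
    """Sum of -log2 p(symbol): the length an ideal entropy coder would approach."""
    return sum(-math.log2(model[s]) for s in symbols)


data = "aaaaaaab"
generic_model = {"a": 0.5, "b": 0.5}      # hypothetical model not fitted to this data
adapted_model = {"a": 0.875, "b": 0.125}  # hypothetical model matched to the data

generic_bits = ideal_code_length_bits(data, generic_model)
adapted_bits = ideal_code_length_bits(data, adapted_model)
print(f"generic: {generic_bits:.2f} bits, adapted: {adapted_bits:.2f} bits")
```

In this toy case the matched model needs roughly half the bits of the generic one. The cost of storing or transmitting the adapted model itself is ignored here, and that cost is the overhead the abstract says OSOA is designed to avoid.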
Semi-supervised Learning with Deep Generative Models
The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
Generative Physical AI in Vision: A Survey
Liu, Daochang, Zhang, Junyu, Dinh, Anh-Dung, Park, Eunbyung, Zhang, Shichao, Xu, Chang
Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication. This transformation builds upon a foundation of generative models to produce realistic images, videos, and 3D or 4D content. Traditionally, generative models primarily focus on visual fidelity while often neglecting the physical plausibility of generated content. This gap limits their effectiveness in applications requiring adherence to real-world physical laws, such as robotics, autonomous systems, and scientific simulations. As generative AI evolves to increasingly integrate physical realism and dynamic simulation, its potential to function as a "world simulator" expands, enabling the modeling of interactions governed by physics and bridging the divide between virtual and physical realities. This survey systematically reviews this emerging field of physics-aware generative AI in computer vision, categorizing methods based on how they incorporate physical knowledge, either through explicit simulation or implicit learning. We analyze key paradigms, discuss evaluation protocols, and identify future research directions. By offering a comprehensive overview, this survey aims to help future developments in physically grounded generation for vision. The reviewed papers are summarized at https://github.com/BestJunYu/Awesome-Physics-aware-Generation.
An Integrated Approach to AI-Generated Content in e-health
Ahmed, Tasnim, Choudhury, Salimur
Artificial Intelligence-Generated Content, a subset of Generative Artificial Intelligence, holds significant potential for advancing the e-health sector by generating diverse forms of data. In this paper, we propose an end-to-end class-conditioned framework that addresses the challenge of data scarcity in health applications by generating synthetic medical images and text data, evaluated on practical applications such as retinopathy detection, skin infections and mental health assessments. Our framework integrates Diffusion and Large Language Models (LLMs) to generate data that closely match real-world patterns, which is essential for improving downstream task performance and model robustness in e-health applications. Experimental results demonstrate that the synthetic images produced by the proposed diffusion model outperform those of traditional GAN architectures. Similarly, in the text modality, data generated by an uncensored LLM achieve significantly better alignment with real-world data than data from censored models in replicating the authentic tone.
Fanar: An Arabic-Centric Multimodal Generative AI Platform
Fanar Team, Abbas, Ummar, Ahmad, Mohammad Shahmeer, Alam, Firoj, Altinisik, Enes, Asgari, Ehsannedin, Boshmaf, Yazan, Boughorbel, Sabri, Chawla, Sanjay, Chowdhury, Shammur, Dalvi, Fahim, Darwish, Kareem, Durrani, Nadir, Elfeky, Mohamed, Elmagarmid, Ahmed, Eltabakh, Mohamed, Fatehkia, Masoomali, Fragkopoulos, Anastasios, Hasanain, Maram, Hawasly, Majd, Husaini, Mus'ab, Jung, Soon-Gyo, Lucas, Ji Kim, Magdy, Walid, Messaoud, Safa, Mohamed, Abubakr, Mohiuddin, Tasnim, Mousi, Basel, Mubarak, Hamdy, Musleh, Ahmad, Naeem, Zan, Ouzzani, Mourad, Popovic, Dorde, Sadeghi, Amin, Sencar, Husrev Taha, Shinoy, Mohammed, Sinan, Omar, Zhang, Yifan, Ali, Ahmed, Kheir, Yassine El, Ma, Xiaosong, Ruan, Chaoyi
We present Fanar, a platform for Arabic-centric multimodal generative AI systems that supports language, speech and image generation tasks. At the heart of Fanar are Fanar Star and Fanar Prime, two highly capable Arabic Large Language Models (LLMs) that are best in class on well-established benchmarks for similarly sized models. Fanar Star is a 7B (billion) parameter model that was trained from scratch on nearly 1 trillion clean and deduplicated Arabic, English and Code tokens. Fanar Prime is a 9B parameter model continually trained on the Gemma-2 9B base model on the same 1 trillion token set. Both models are concurrently deployed and designed to address different types of prompts, which are transparently routed through a custom-built orchestrator. The Fanar platform provides many other capabilities, including a customized Islamic Retrieval Augmented Generation (RAG) system for handling religious prompts and a Recency RAG for summarizing information about current or recent events that have occurred after the pre-training data cut-off date. The platform provides additional cognitive capabilities, including in-house bilingual speech recognition that supports multiple Arabic dialects, and voice and image generation that is fine-tuned to better reflect regional characteristics. Finally, Fanar provides an attribution service that can be used to verify the authenticity of fact-based generated content. The design, development, and implementation of Fanar was entirely undertaken at Hamad Bin Khalifa University's Qatar Computing Research Institute (QCRI) and was sponsored by Qatar's Ministry of Communications and Information Technology to enable sovereign AI technology development.
A Generative Security Application Engineering Curriculum
Feng, Wu-chang, Baker-Robinson, David
Generative AI and large language models (LLMs) are transforming security by automating many tasks that are currently performed manually. With such automation changing the practice of security as we know it, it is imperative that we prepare future students for the technology landscape they will ultimately face. Towards this end, we describe an initial curriculum and course that attempts to show students how to apply generative AI to solve problems in security. By refocusing security education and training on aspects uniquely suited for humans and showing students how to leverage automation for the rest, we believe we can better align security education practices with generative AI as it evolves.
Sam Altman's OpenAI backing initiative headed by several anti-Trump staff pushing liberal causes
OpenAI has partnered with a new AI initiative led by a group co-founded with outgoing Special Presidential Envoy for Climate John Kerry that has pushed left-wing causes and has several board members aligned with Democrats. OpenAI, led by CEO Sam Altman, is backing an initiative known as AI 2030, which is aimed at shaping "public dialogue about U.S. competition against China on AI," Politico reported in October. The initiative is led by the "non-partisan" think tank American Security Project (ASP), where Kerry was a founding member and served two stints on the board of directors. ASP has promoted the idea that climate change is a national security threat, and argued on its website that pulling out of the Iran Nuclear Deal was a bad idea that "harms national security."