 Generative AI


Re.Dis.Cover Place with Generative AI: Exploring the Experience and Design of City Wandering with Image-to-Image AI

arXiv.org Artificial Intelligence

The HCI field has demonstrated a growing interest in leveraging emerging technologies to enrich urban experiences. However, few studies investigate the experience and design space of AI image technology (AIGT) applications for playful urban interaction, despite its widespread adoption. To explore this gap, we conducted an exploratory study involving four participants who wandered and photographed within Eindhoven Centre and interacted with an image-to-image AI. Preliminary findings present their observations, the effect of their familiarity with places, and how AIGT becomes an explorer's tool or co-speculator. We then highlight AIGT's capability of supporting playfulness, reimaginations, and rediscoveries of places through defamiliarizing and familiarizing cityscapes. Additionally, we propose the metaphor of AIGT as a 'tourist' to discuss its opportunities for engaging explorations and risks of stereotyping places. Collectively, our research provides initial empirical insights and design considerations, inspiring future HCI endeavors for creating urban play with generative AI.


Deep Generative Modeling Reshapes Compression and Transmission: From Efficiency to Resiliency

arXiv.org Artificial Intelligence

Information theory and machine learning are inextricably linked and have even been referred to as "two sides of the same coin". One particularly elegant connection is the essential equivalence between probabilistic generative modeling and data compression or transmission. In this article, we reveal the dual functionality of deep generative models that reshapes both data compression for efficiency and transmission error concealment for resiliency. We present how the contextual predictive capabilities of powerful generative models can be well positioned to be strong compressors and estimators. In this sense, we advocate for viewing the deep generative modeling problem through the lens of end-to-end communications, and evaluate the compression and error restoration capabilities of foundation generative models. We show that the kernel of many large generative models is a powerful predictor that can capture complex relationships among semantic latent variables, and the communication viewpoints provide novel insights into semantic feature tokenization, contextual learning, and usage of deep generative models. In summary, our article highlights the essential connections of generative AI to source and channel coding techniques, and motivates researchers to make further explorations in this emerging topic.
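The link between prediction and compression mentioned in this abstract can be illustrated with a small sketch. It is not taken from the article: it assumes a deliberately simple adaptive byte-frequency model (the function name `ideal_code_length_bits` and the smoothing parameter `alpha` are hypothetical) and sums the ideal code length of -log2 p(symbol) bits per symbol, which is the length an arithmetic coder driven by that model would approach.

```python
import math
from collections import Counter

def ideal_code_length_bits(data: bytes, alpha: float = 1.0) -> float:
    """Sum of -log2 p(symbol) under an adaptive order-0 byte model.

    Assumption for illustration: the "generative model" is just a running
    byte-frequency table with add-alpha smoothing over 256 possible bytes.
    """
    counts = Counter()
    seen = 0
    bits = 0.0
    for symbol in data:
        # Predictive probability of the next byte given the bytes seen so far.
        p = (counts[symbol] + alpha) / (seen + alpha * 256)
        bits += -math.log2(p)
        # Update the model only after "coding" the symbol, as a decoder would.
        counts[symbol] += 1
        seen += 1
    return bits

repetitive = b"ab" * 50          # 100 bytes, highly predictable
varied = bytes(range(100))       # 100 bytes, every symbol is new

repetitive_bits = ideal_code_length_bits(repetitive)
varied_bits = ideal_code_length_bits(varied)
print(f"repetitive: {repetitive_bits:.1f} bits, varied: {varied_bits:.1f} bits")
```

The better the model predicts the next symbol, the fewer bits are needed, so the repetitive input costs less than the varied one. A stronger predictor, such as a large generative model, would play the same role as the frequency table here.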


Survey for Landing Generative AI in Social and E-commerce Recsys -- the Industry Perspectives

arXiv.org Artificial Intelligence

Recently, generative AI (GAI), with its emerging capabilities, has presented unique opportunities for augmenting and revolutionizing industrial recommender systems (Recsys). Despite growing research efforts at the intersection of these fields, the integration of GAI into industrial Recsys remains in its infancy, largely due to the intricate nature of modern industrial Recsys infrastructure, operations, and product sophistication. Drawing upon our experiences in successfully integrating GAI into several major social and e-commerce platforms, this survey aims to comprehensively examine the underlying system and AI foundations, solution frameworks, and connections to key research advancements, as well as summarize the practical insights and challenges encountered in the endeavor to integrate GAI into industrial Recsys. As pioneering work in this domain, we hope to outline the representative developments of relevant fields, shed light on practical GAI adoptions in the industry, and motivate future research.


The Impact of AI on Academic Research and Publishing

arXiv.org Artificial Intelligence

Keywords: Artificial Intelligence, Large Language Models, Academic Research, Publishing Ethics, Scholarly Publishing

Abstract: Generative artificial intelligence (AI) technologies like ChatGPT have significantly impacted academic writing and publishing through their ability to generate content at levels comparable to or surpassing human writers. Through a review of recent interdisciplinary literature, this paper examines ethical considerations surrounding the integration of AI into academia, focusing on the potential for this technology to be used for scholarly misconduct and the oversight necessary when using it for writing, editing, and reviewing of scholarly papers. The findings highlight the need for collaborative approaches to AI usage among publishers, editors, reviewers, and authors to ensure that this technology is used ethically and productively.

Introduction: Generative artificial intelligence technologies have rapidly transformed our daily lives, with one of the most profound impacts observed in the realm of writing. These models can produce content at a level that either matches or surpasses the quality of an average human writer. This transformation holds particular significance in academia, where faculty members are traditionally expected to engage in extensive scholarly writing. The increasing prevalence of generative artificial intelligence in academia raises substantial ethical concerns.


Benchmarking Counterfactual Image Generation

arXiv.org Artificial Intelligence

Generative AI has revolutionised visual content editing, empowering users to effortlessly modify images and videos. However, not all edits are equal. To perform realistic edits in domains such as natural images or medical imaging, modifications must respect causal relationships inherent to the data generation process. Such image editing falls into the counterfactual image generation regime. Evaluating counterfactual image generation is substantially complex: not only does it lack observable ground truths, but it also requires adherence to causal constraints. Although several counterfactual image generation methods and evaluation metrics exist, a comprehensive comparison within a unified setting is lacking. We present a comparison framework to thoroughly benchmark counterfactual image generation methods. We integrate all models that have been used for the task at hand and expand them to novel datasets and causal graphs, demonstrating the superiority of Hierarchical VAEs across most datasets and metrics. Our framework is implemented in a user-friendly Python package that can be extended to incorporate additional SCMs, causal methods, generative models, and datasets for the community to build on.


Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

arXiv.org Artificial Intelligence

In recent years, large language models (LLMs) have demonstrated notable success across various tasks, but the trustworthiness of LLMs is still an open problem. One specific threat is the potential to generate toxic or harmful responses. Attackers can craft adversarial prompts that induce harmful responses from LLMs. In this work, we pioneer a theoretical foundation in LLM security by identifying bias vulnerabilities within safety fine-tuning, and we design a black-box jailbreak method named DRA (Disguise and Reconstruction Attack), which conceals harmful instructions through disguise and prompts the model to reconstruct the original harmful instruction within its completion. We evaluate DRA across various open-source and closed-source models, showcasing state-of-the-art jailbreak success rates and attack efficiency. Notably, DRA boasts a 91.1% attack success rate on the OpenAI GPT-4 chatbot.


Efficient Shapley Values for Attributing Global Properties of Diffusion Models to Data Group

arXiv.org Artificial Intelligence

As diffusion models are deployed in real-world settings, data attribution is needed to ensure fair acknowledgment for contributors of high-quality training data and to identify sources of harmful content. Previous work focuses on identifying individual training samples important for the generation of a given image. However, instead of focusing on a given generated image, some use cases require understanding global properties of the distribution learned by a diffusion model (e.g., demographic diversity). Furthermore, training data for diffusion models are often contributed in groups rather than separately (e.g., multiple artworks from the same artist). Hence, here we tackle the problem of attributing global properties of diffusion models to groups of training data. Specifically, we develop a method to efficiently estimate Shapley values by leveraging model pruning and fine-tuning. We empirically demonstrate the utility of our method with three use cases: (i) global image quality for a DDPM trained on a CIFAR dataset, (ii) demographic diversity for an LDM trained on CelebA-HQ, and (iii) overall aesthetic quality for a Stable Diffusion model LoRA-finetuned on Post-Impressionist artworks.
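To make the notion of group-level Shapley attribution concrete, here is a minimal sketch. It is not the estimator described in this abstract, which relies on model pruning and fine-tuning; instead it computes exact Shapley values by enumerating all orderings, which is only feasible for a handful of groups. The group names and the `toy_utility` function are hypothetical stand-ins for "train a model on these groups and measure a global property".

```python
from itertools import permutations

def shapley_values(groups, utility):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {g: 0.0 for g in groups}
    orderings = list(permutations(groups))
    for order in orderings:
        coalition = frozenset()
        previous = utility(coalition)
        for g in order:
            coalition = coalition | {g}
            current = utility(coalition)
            # Marginal gain from adding group g to the groups before it.
            totals[g] += current - previous
            previous = current
    return {g: total / len(orderings) for g, total in totals.items()}

def toy_utility(coalition):
    """Hypothetical global property score for a model trained on `coalition`."""
    base = {"artist_a": 3.0, "artist_b": 1.0, "artist_c": 0.0}
    score = sum(base[g] for g in coalition)
    # Assumed synergy: the two groups together add more than their sum.
    if {"artist_a", "artist_b"} <= coalition:
        score += 2.0
    return score

phi = shapley_values(["artist_a", "artist_b", "artist_c"], toy_utility)
print(phi)
```

In a real setting each call to the utility would mean retraining or otherwise approximating a diffusion model on a subset of groups, which is the cost an efficient estimator aims to reduce. The values sum to the utility of the full set minus the utility of the empty set, a standard property of Shapley values.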


LLMs Meet Multimodal Generation and Editing: A Survey

arXiv.org Artificial Intelligence

With the recent advancement in large language models (LLMs), there is a growing interest in combining LLMs with multimodal learning. Previous surveys of multimodal large language models (MLLMs) mainly focus on multimodal understanding. This survey elaborates on multimodal generation and editing across various domains, comprising image, video, 3D, and audio. Specifically, we summarize the notable advancements with milestone works in these fields and categorize these studies into LLM-based and CLIP/T5-based methods. Then, we summarize the various roles of LLMs in multimodal generation and exhaustively investigate the critical technical components behind these methods and the multimodal datasets utilized in these studies. Additionally, we dig into tool-augmented multimodal agents that can leverage existing generative models for human-computer interaction. Lastly, we discuss the advancements in the generative AI safety field, investigate emerging applications, and discuss future prospects. Our work provides a systematic and insightful overview of multimodal generation and processing, which is expected to advance the development of Artificial Intelligence for Generative Content (AIGC) and world models. A curated list of all related papers can be found at https://github.com/YingqingHe/Awesome-LLMs-meet-Multimodal-Generation


SynthAI: A Multi Agent Generative AI Framework for Automated Modular HLS Design Generation

arXiv.org Artificial Intelligence

In this paper, we introduce SynthAI, a new method for the automated creation of High-Level Synthesis (HLS) designs. SynthAI integrates ReAct agents, Chain-of-Thought (CoT) prompting, web search technologies, and the Retrieval-Augmented Generation (RAG) framework within a structured decision graph. This innovative approach enables the systematic decomposition of complex hardware design tasks into multiple stages and smaller, manageable modules. As a result, SynthAI produces synthesizable designs that closely adhere to user-specified design objectives and functional requirements. We further validate the capabilities of SynthAI through several case studies, highlighting its proficiency in generating complex, multi-module logic designs from a single initial prompt. The SynthAI code is provided via the following repo: https://github.com/sarashs/FPGA_AGI


Governance of Generative Artificial Intelligence for Companies

arXiv.org Artificial Intelligence

Generative Artificial Intelligence (GenAI), specifically large language models like ChatGPT, has swiftly entered organizations without adequate governance, posing both opportunities and risks. Despite extensive debates on GenAI's transformative nature and regulatory measures, limited research addresses organizational governance, encompassing technical and business perspectives. Our review paper fills this gap by surveying recent works with the purpose of developing a framework for GenAI governance within companies. This framework outlines the scope, objectives, and governance mechanisms tailored to harness business opportunities as well as mitigate risks associated with GenAI integration. Our research contributes a focused approach to GenAI governance, offering practical insights for companies navigating the challenges of GenAI adoption and highlighting research gaps.