 Generative AI


GraphSPNs: Sum-Product Networks Benefit From Canonical Orderings

arXiv.org Artificial Intelligence

Deep generative models have recently made remarkable progress in capturing complex probability distributions over graphs. However, they are intractable and thus unable to answer even the most basic probabilistic inference queries without resorting to approximations. Therefore, we propose graph sum-product networks (GraphSPNs), a tractable deep generative model that provides exact and efficient inference over (arbitrary parts of) graphs. We investigate different principles for making SPNs permutation invariant. We demonstrate that GraphSPNs are able to (conditionally) generate novel and chemically valid molecular graphs, and are competitive with, and sometimes even better than, existing intractable models. We find that (Graph)SPNs benefit from ensuring permutation invariance via canonical ordering.


VRCopilot: Authoring 3D Layouts with Generative AI Models in VR

arXiv.org Artificial Intelligence

Immersive authoring provides an intuitive medium for users to create 3D scenes via direct manipulation in Virtual Reality (VR). Recent advances in generative AI have enabled the automatic creation of realistic 3D layouts. However, it is unclear how the capabilities of generative AI can be used in immersive authoring to support fluid interactions, user agency, and creativity. We introduce VRCopilot, a mixed-initiative system that integrates pre-trained generative AI models into immersive authoring to facilitate human-AI co-creation in VR. VRCopilot presents multimodal interactions to support rapid prototyping and iterations with AI, and intermediate representations such as wireframes to augment user controllability over the created content. Through a series of user studies, we evaluated the potential and challenges of manual, scaffolded, and automatic creation in immersive authoring. We found that scaffolded creation using wireframes enhanced user agency compared to automatic creation. We also found that manual creation via multimodal specification offers the highest sense of creativity and agency.


DiffZOO: A Purely Query-Based Black-Box Attack for Red-teaming Text-to-Image Generative Model via Zeroth Order Optimization

arXiv.org Artificial Intelligence

Current text-to-image (T2I) synthesis diffusion models raise misuse concerns, particularly in creating prohibited or not-safe-for-work (NSFW) images. To address this, various safety mechanisms and red teaming attack methods have been proposed to enhance or expose the T2I model's capability to generate unsuitable content. However, many red teaming attack methods assume knowledge of the text encoders, limiting their practical usage. In this work, we rethink the case of purely black-box attacks without prior knowledge of the T2I model. To overcome the unavailability of gradients and the inability to optimize attacks within a discrete prompt space, we propose DiffZOO, which applies Zeroth Order Optimization to procure gradient approximations and harnesses both C-PRV and D-PRV to enhance attack prompts within the discrete prompt domain. We evaluated our method across multiple safety mechanisms of the T2I diffusion model and online servers. Experiments on multiple state-of-the-art safety mechanisms show that DiffZOO attains an 8.5% higher average attack success rate than previous works, hence its promise as a practical red teaming tool for T2I models.
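
As a rough illustration of the zeroth-order idea mentioned in the abstract above, the sketch below estimates a gradient from function evaluations alone, using a generic two-point random-direction estimator on a toy continuous objective. It is not DiffZOO's algorithm: the names `zo_gradient`, `black_box`, and `minimize`, the step sizes, and the quadratic objective are all assumptions made for the example, and the discrete prompt handling (C-PRV, D-PRV) is not modeled.

```python
import random


def zo_gradient(f, x, mu=1e-3, num_dirs=20, seed=0):
    """Estimate the gradient of f at x using only evaluations of f.

    Averages directional slopes along random Gaussian directions
    (a generic two-point zeroth-order estimator, assumed for illustration).
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_dirs):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        # Central finite difference along direction u.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / num_dirs
    return grad


def black_box(x):
    """Hypothetical objective we can query but not differentiate."""
    return sum((xi - 1.0) ** 2 for xi in x)


def minimize(f, x0, lr=0.05, steps=300):
    """Plain gradient descent driven by the zeroth-order estimate."""
    x = list(x0)
    for t in range(steps):
        g = zo_gradient(f, x, seed=t)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    result = minimize(black_box, [0.0, 3.0])
    print(result, black_box(result))
```

The estimator needs two queries per random direction, so the query budget grows with `num_dirs`; that trade-off between estimate quality and query count is the main design knob in this kind of sketch.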


SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization

arXiv.org Artificial Intelligence

Recent studies have revealed that, during inference on generative AI models such as transformers, the importance of different weights exhibits substantial context-dependent variations. This points to a promising potential for adaptively configuring weight quantization to improve generative AI inference efficiency. Although configurable weight quantization can readily leverage the hardware support for variable-precision arithmetic in modern GPUs and AI accelerators, little prior research has studied how one could exploit variable weight quantization to proportionally improve AI model memory access speed and energy efficiency. Motivated by the rapidly maturing CXL ecosystem, this work develops a CXL-based design solution to fill this gap. The key is to allow CXL memory controllers to play an active role in supporting and exploiting runtime configurable weight quantization. Using the transformer as a representative generative AI model, we carried out experiments that demonstrate the effectiveness of the proposed design solution.
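
To make "runtime configurable weight quantization" concrete, here is a minimal sketch of symmetric uniform quantization in which the bit width is an ordinary runtime parameter. It only illustrates the precision versus error trade-off; it says nothing about the CXL memory-controller design, and the function names and sample weights are hypothetical.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization to signed `bits`-bit integer codes."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error at the given bit width."""
    codes, scale = quantize(weights, bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


if __name__ == "__main__":
    weights = [0.12, -0.53, 0.98, -0.05, 0.31]  # made-up example values
    for bits in (8, 4):
        print(bits, max_error(weights, bits))
```

Fewer bits mean fewer bytes to move per weight at the cost of a larger reconstruction error, which is the trade-off a context-dependent precision policy would be tuning.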


Iranian group used ChatGPT to try to influence US election, OpenAI says

The Guardian

OpenAI said on Friday it had taken down accounts of an Iranian group for using its ChatGPT chatbot to generate content meant for influencing the US presidential election and other issues. The operation, identified as Storm-2035, used ChatGPT to generate content focused on topics such as commentary on the candidates on both sides in the US elections, the conflict in Gaza and Israel's presence at the Olympic Games and then shared it via social media accounts and websites, OpenAI said. Investigation by the Microsoft-backed AI company showed ChatGPT was used for generating long-form articles and shorter social media comments. OpenAI said the operation did not appear to have achieved meaningful audience engagement. The majority of the identified social media posts received few or no likes, shares or comments and the company did not see indications of web articles being shared across social media.


OpenAI shut down an Iranian influence op that used ChatGPT to generate bogus news articles

Engadget

OpenAI said on Friday that it thwarted an Iranian influence campaign that used ChatGPT to generate fake news stories and social posts aimed at Americans. The company said it identified and banned accounts generating content for five websites (in English and Spanish) pretending to be news outlets, spreading "polarizing messages" on issues like the US presidential campaign, LGBTQ rights and the war in Gaza. The operation was identified as "Storm-2035," part of a series of influence campaigns Microsoft identified last week as "connected with the Iranian government." In addition to the news posts, it included "a dozen accounts on X and one on Instagram" connected to the operation. OpenAI said the op didn't appear to have gained any meaningful traction.


Why Does AI Art Look Like That?

The Atlantic - Technology

This week, X launched an AI-image generator, allowing paying subscribers of Elon Musk's social platform to make their own art. So--naturally--some users appear to have immediately made images of Donald Trump flying a plane toward the World Trade Center; Mickey Mouse wielding an assault rifle, and another of him enjoying a cigarette and some beer on the beach; and so on. Some of the images that people have created using the tool are deeply unsettling; others are just strange, or even kind of funny. They depict wildly different scenarios and characters. But somehow they all kind of look alike, bearing unmistakable hallmarks of AI art that have cropped up in recent years thanks to products such as Midjourney and DALL-E.


What Is Gemini Live and How Do You Use It?

WIRED

Google launched a barrage of new hardware this week, from the Pixel 9 smartphones to new wireless earbuds. Underpinning all the shiny gadgetry is Google's Gemini artificially intelligent assistant. The chatbot launched earlier this year and is now the default assistant on the Pixel 9 series and is already available on millions of Android phones worldwide. But there's a new way to talk to this chatbot that's now rolling out: Gemini Live. This is Google's response to OpenAI's GPT-4o, a way to talk to the assistant naturally, much like a normal voice conversation between two humans (or at least that's the goal).


VERA: Validation and Evaluation of Retrieval-Augmented Systems

arXiv.org Artificial Intelligence

The increasing use of Retrieval-Augmented Generation (RAG) systems in various applications necessitates stringent protocols to ensure RAG systems' accuracy, safety, and alignment with user intentions. In this paper, we introduce VERA (Validation and Evaluation of Retrieval-Augmented Systems), a framework designed to enhance the transparency and reliability of outputs from large language models (LLMs) that utilize retrieved information. VERA improves the way we evaluate RAG systems in two important ways: (1) it introduces a cross-encoder based mechanism that combines a set of multidimensional metrics into a single comprehensive ranking score, addressing the challenge of prioritizing individual metrics, and (2) it employs Bootstrap statistics on LLM-based metrics across the document repository to establish confidence bounds, ensuring the repository's topical coverage and improving the overall reliability of retrieval systems. Through several use cases, we demonstrate how VERA can strengthen decision-making processes and trust in AI applications. Our findings not only contribute to the theoretical understanding of LLM-based RAG evaluation metrics but also promote the practical implementation of responsible AI systems, marking a significant advancement in the development of reliable and transparent generative AI technologies.
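
The bootstrap step in point (2) of the abstract above can be pictured with a small percentile-bootstrap sketch: resample per-document metric scores with replacement and read confidence bounds off the distribution of resampled means. This is a generic textbook procedure, not VERA's implementation; `bootstrap_ci`, the resample count, and the scores below are illustrative assumptions.

```python
import random
import statistics


def bootstrap_ci(scores, num_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    # Mean of each resample, drawn with replacement from the original scores.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n))
        for _ in range(num_resamples)
    )
    lower = means[int((alpha / 2) * num_resamples)]
    upper = means[int((1 - alpha / 2) * num_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up per-document metric scores, e.g. from an LLM-based judge.
    scores = [0.82, 0.91, 0.77, 0.88, 0.95, 0.69, 0.84, 0.90, 0.73, 0.86]
    print(bootstrap_ci(scores))
```

A narrow interval suggests the metric is stable across the sampled documents; a wide one suggests more documents (or a less noisy metric) are needed before drawing conclusions.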


Blockchain-Enabled Accountability in Data Supply Chain: A Data Bill of Materials Approach

arXiv.org Artificial Intelligence

Data governance is critical in the era of advanced artificial intelligence (AI), particularly with the proliferation of large-scale generative AI that necessitates extensive datasets for model training and fine-tuning. Organisations that navigate complex data supply chains involving multiple stakeholders and varied tools face challenges in ensuring the traceability, verifiability, and reproducibility of data. This complexity is compounded in cross-departmental or cross-organisational data exchanges, where maintaining data accountability becomes increasingly significant. The issue has been exacerbated by the emergence of large-scale generative AI models such as Large Language Models (LLMs) [1]. As enterprises and research institutions all need large and high-quality corpora for model development and enhancement, the lack of effective governance frameworks to manage data creation, usage, and transfer, especially across diverse stakeholders, becomes evident. Within a data supply chain, which involves continuing dataset artifact transformation and dissemination, stakeholders need to i) ensure data traceability in terms of the origin, authorisation and operations conducted on the dataset artifacts, ii) achieve data verifiability with authenticated sources and licences, iii) preserve data reproducibility in case questions are raised about specific processing or transfer steps, and consequently, iv) ensure overall accountability so that the responsible stakeholders can be identified if violations are detected. Nevertheless, current data governance models, often tied to specific platforms and focusing on data storage schemes (e.g., object storage, InterPlanetary File System), secure trading protocols [2, 3], and privacy regulations (e.g., the General Data Protection Regulation), fall short in addressing the dynamic nature of data flows from the perspective of the overall data supply chain and the requirement for platform-agnostic traceability solutions.