 Generative AI


Ethical Concerns of Generative AI and Mitigation Strategies: A Systematic Mapping Study

arXiv.org Artificial Intelligence

The evolution of Generative AI, particularly Large Language Models (LLMs), has seen remarkable advancements since 2020 with the introduction of models like ChatGPT and Bard. LLMs have revolutionized tasks such as writing assistance, code generation, and customer support automation by leveraging vast amounts of data to generate coherent and contextually relevant natural language (NL) responses [1, 2]. As a subset of Generative AI (systems designed to create new content), LLMs go beyond traditional AI techniques, which focus primarily on analyzing existing data. LLMs, in contrast, are capable of generating text, images, and music that mimic human creativity [3]. This capability is powered by advancements in neural network architectures, especially transformers, which enable LLMs to learn the nuances of human language and produce semantically accurate content [4].


A hybrid marketplace of ideas

arXiv.org Artificial Intelligence

The convergence of humans and artificial intelligence systems introduces new dynamics into the cultural and intellectual landscape. Complementing emerging cultural evolution concepts such as machine culture, AI agents represent a significant techno-sociological development, particularly within the anthropological study of Web3 as a community focused on decentralization through blockchain. Despite their growing presence, the cultural significance of AI agents remains largely unexplored in academic literature. Toward this end, we conceived hybrid netnography, a novel interdisciplinary approach that examines the cultural and intellectual dynamics within digital ecosystems by analyzing the interactions and contributions of both human and AI agents as co-participants in shaping narratives, ideas, and cultural artifacts. We argue that, within the Web3 community on the social media platform X, these agents challenge traditional notions of participation and influence in public discourse, creating a hybrid marketplace of ideas: a conceptual space where human-generated and AI-generated ideas coexist and compete for attention. We examine the current state of AI agents in idea generation, propagation, and engagement, positioning their role as cultural agents through the lens of memetics and encouraging further inquiry into their cultural and societal impact. Additionally, we address the implications of this paradigm for privacy, intellectual property, and governance, highlighting the societal and legal challenges of integrating AI agents into the hybrid marketplace of ideas.


Generative AI Policies under the Microscope: How CS Conferences Are Navigating the New Frontier in Scholarly Writing

arXiv.org Artificial Intelligence

While Gen-AI offers significant benefits in content generation and task automation [9], it can also be misused and abused in nefarious applications [7], with more significant risks toward long-tail populations and regions [6]. Professionals in fields like journalism and law remain cautious due to concerns over hallucinations and ethical issues, but scholars in Computer Science (CS), the field where Gen-AI originated, appear to be cautiously but actively exploring its use. For instance, [3] reports increased use of large language models (LLMs) in CS scholarly articles (up to 17.5%) compared to Mathematics articles (up to 6.3%), and [2] reports that between 6.5% and 16.9% of peer reviews at ICLR 2024, NeurIPS 2023, CoRL 2023, and EMNLP 2023 may have been significantly altered by LLMs beyond minor revisions. Considering researchers' increasing adoption of Gen-AI, it is crucial to establish usage guidelines and well-defined policies to promote fair and ethical practices in scholarly writing and peer reviews. Previous research has also examined Gen-AI policies by major publishers such as Elsevier and Springer [5], but there is still a lack of clear understanding of how CS conferences are adapting to this paradigm shift.


Large Model Based Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends

arXiv.org Artificial Intelligence

With the rapid advancement of large models (LMs), the development of general-purpose intelligent agents powered by LMs has become a reality. It is foreseeable that in the near future, LM-driven general AI agents will serve as essential tools in production tasks, capable of autonomous communication and collaboration without human intervention. This paper investigates scenarios involving the autonomous collaboration of future LM agents. We review the current state of LM agents, the key technologies enabling LM agent collaboration, and the security and privacy challenges they face during cooperative operations. To this end, we first explore the foundational principles of LM agents, including their general architecture, key components, enabling technologies, and modern applications. We then discuss practical collaboration paradigms from data, computation, and knowledge perspectives to achieve connected intelligence among LM agents. After that, we analyze the security vulnerabilities and privacy risks associated with LM agents, particularly in multi-agent settings, examining underlying mechanisms and reviewing current and potential countermeasures. Lastly, we propose future research directions for building robust and secure LM agent ecosystems.


A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI

arXiv.org Machine Learning

Multi-modal generative AI systems, such as those combining vision and language, rely on contrastive pre-training to learn representations across different modalities. While their practical benefits are widely acknowledged, a rigorous theoretical understanding of the contrastive pre-training framework remains limited. This paper develops a theoretical framework to explain the success of contrastive pre-training in downstream tasks, such as zero-shot classification, conditional diffusion models, and vision-language models. We introduce the concept of approximate sufficient statistics, a generalization of the classical sufficient statistics, and show that near-minimizers of the contrastive pre-training loss are approximately sufficient, making them adaptable to diverse downstream tasks. We further propose the Joint Generative Hierarchical Model for the joint distribution of images and text, showing that transformers can efficiently approximate relevant functions within this model via belief propagation. Building on this framework, we derive sample complexity guarantees for multi-modal learning based on contrastive pre-trained representations. Numerical simulations validate these theoretical findings, demonstrating the strong generalization performance of contrastively pre-trained transformers in various multi-modal tasks.
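As a rough illustration of the contrastive pre-training objective discussed in this abstract, the sketch below computes a symmetric, CLIP-style InfoNCE loss over a small batch of paired image and text embeddings. The abstract does not give the exact loss the authors analyze, so the function name `contrastive_loss`, the temperature value, and the symmetric form are all assumptions made for illustration.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss; pair i of image_emb and text_emb is the positive."""
    if len(image_emb) != len(text_emb):
        raise ValueError("image and text batches must have the same size")
    imgs = [_normalize(v) for v in image_emb]
    txts = [_normalize(v) for v in text_emb]
    # Cosine similarity of every image with every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
    shuffled_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned :", contrastive_loss(images, aligned_texts))
    print("shuffled:", contrastive_loss(images, shuffled_texts))
```

Correctly paired embeddings should give a loss near zero, while mismatched pairs give a much larger one; minimizing this quantity is what pulls matching image and text representations together.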


Generative AI for Cel-Animation: A Survey

arXiv.org Artificial Intelligence

The traditional celluloid (cel) animation production pipeline encompasses multiple essential steps, including storyboarding, layout design, keyframe animation, inbetweening, and colorization, which demand substantial manual effort, technical expertise, and significant time investment. These challenges have historically impeded the efficiency and scalability of cel-animation production. The rise of generative artificial intelligence (GenAI), encompassing large language models, multimodal models, and diffusion models, offers innovative solutions by automating tasks such as inbetween frame generation, colorization, and storyboard creation. This survey explores how GenAI integration is revolutionizing traditional animation workflows by lowering technical barriers, broadening accessibility for a wider range of creators through tools like AniDoc, ToonCrafter, and AniSora, and enabling artists to focus more on creative expression and artistic innovation. Despite its potential, issues such as maintaining visual consistency, ensuring stylistic coherence, and addressing ethical considerations continue to pose challenges. Furthermore, this paper discusses future directions and explores potential advancements in AI-assisted animation. For further exploration and resources, please visit our GitHub repository: https://github.com/yunlong10/Awesome-AI4Animation


FBI's new warning about AI-driven scams that are after your cash

FOX News

Kurt Knutsson discusses some tips to keep you safe. The FBI is issuing a warning that criminals are increasingly using generative AI technologies, particularly deepfakes, to exploit unsuspecting individuals. This alert serves as a reminder of the growing sophistication and accessibility of these technologies and the urgent need for vigilance in protecting ourselves from potential scams. Let's explore what deepfakes are, how they're being used by criminals and what steps you can take to safeguard your personal information.


British AI startup with government ties is developing tech for military drones

The Guardian

A company that has worked closely with the UK government on artificial intelligence safety, the NHS and education is also developing AI for military drones. The consultancy Faculty AI has "experience developing and deploying AI models on to UAVs", or unmanned aerial vehicles, according to a defence industry partner company. Faculty has emerged as one of the most active companies selling AI services in the UK. Unlike the likes of OpenAI, DeepMind or Anthropic, it does not develop models itself, instead focusing on reselling models, notably from OpenAI, and consulting on their use in government and industry. Faculty gained particular prominence in the UK after working on data analysis for the Vote Leave campaign before the Brexit vote.


BiasGuard: Guardrailing Fairness in Machine Learning Production Systems

arXiv.org Artificial Intelligence

As machine learning (ML) systems increasingly impact critical sectors such as hiring, financial risk assessments, and criminal justice, the imperative to ensure fairness has intensified due to potential negative implications. While much ML fairness research has focused on enhancing training data and processes, addressing the outputs of already deployed systems has received less attention. This paper introduces 'BiasGuard', a novel approach designed to act as a fairness guardrail in production ML systems. BiasGuard leverages Test-Time Augmentation (TTA) powered by Conditional Generative Adversarial Network (CTGAN), a cutting-edge generative AI model, to synthesize data samples conditioned on inverted protected attribute values, thereby promoting equitable outcomes across diverse groups. This method aims to provide equal opportunities for both privileged and unprivileged groups while significantly enhancing the fairness metrics of deployed systems without the need for retraining. Our comprehensive experimental analysis across diverse datasets reveals that BiasGuard enhances fairness by 31% while only reducing accuracy by 0.09% compared to non-mitigated benchmarks. Additionally, BiasGuard outperforms existing post-processing methods in improving fairness, positioning it as an effective tool to safeguard against biases when retraining the model is impractical.
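The test-time augmentation idea in this abstract can be sketched in a heavily simplified form. The paper synthesizes samples with CTGAN conditioned on inverted protected attribute values; the sketch below swaps that generator for a direct flip of a binary protected attribute and averages the two predictions. The names `tta_predict` and `biased_model`, the dictionary-based sample format, and the averaging rule are hypothetical and are not taken from the paper.

```python
def tta_predict(model, sample, protected_key):
    """Average the model's score on a sample and on its flipped counterpart.

    Assumes `sample` is a dict of features and that the protected attribute
    is binary and encoded as 0 or 1. The flip stands in for the CTGAN-based
    sample synthesis described in the abstract.
    """
    counterfactual = dict(sample)  # copy so the caller's sample is untouched
    counterfactual[protected_key] = 1 - sample[protected_key]
    return 0.5 * (model(sample) + model(counterfactual))


def biased_model(sample):
    """Toy scoring function that unfairly rewards group == 1."""
    return 0.3 + 0.4 * sample["group"] + 0.2 * sample["income"]


if __name__ == "__main__":
    applicant_a = {"group": 0, "income": 1.0}
    applicant_b = {"group": 1, "income": 1.0}
    print("raw scores :", biased_model(applicant_a), biased_model(applicant_b))
    print("with TTA   :",
          tta_predict(biased_model, applicant_a, "group"),
          tta_predict(biased_model, applicant_b, "group"))
```

In this toy setting the two applicants differ only in the protected attribute, so the averaged scores coincide without retraining the model. A generator-based approach such as the one described in the abstract would presumably also adjust features correlated with the protected attribute, which a plain flip does not.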


Wavelet-Driven Generalizable Framework for Deepfake Face Forgery Detection

arXiv.org Artificial Intelligence

The evolution of digital image manipulation, particularly with the advancement of deep generative models, significantly challenges existing deepfake detection methods, especially when the origin of the deepfake is obscure. To tackle the increasing complexity of these forgeries, we propose Wavelet-CLIP, a deepfake detection framework that integrates wavelet transforms with features derived from the ViT-L/14 architecture, pre-trained in the CLIP fashion. Wavelet-CLIP utilizes Wavelet Transforms to deeply analyze both spatial and frequency features from images, thus enhancing the model's capability to detect sophisticated deepfakes. To verify the effectiveness of our approach, we conducted extensive evaluations against existing state-of-the-art methods for cross-dataset generalization and detection of unseen images generated by standard diffusion models. Our method showcases outstanding performance, achieving an average AUC of 0.749 for cross-data generalization and 0.893 for robustness against unseen deepfakes, outperforming all compared methods. The code can be reproduced from the repo: https://github.com/lalithbharadwajbaru/Wavelet-CLIP
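To make the wavelet step more concrete, here is a minimal single-level Haar discrete wavelet transform on a one-dimensional feature vector. The abstract does not state which wavelet family Wavelet-CLIP uses or exactly where in the pipeline it is applied, so the choice of Haar and the function names `haar_dwt` and `haar_idwt` are assumptions for illustration only.

```python
import math

_SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """Split an even-length signal into low- and high-frequency halves.

    Returns (approx, detail): scaled pairwise sums (coarse trend) and
    scaled pairwise differences (fine, high-frequency variation).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / _SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / _SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, reconstructing the original signal."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / _SQRT2)
        signal.append((a - d) / _SQRT2)
    return signal


if __name__ == "__main__":
    features = [4.0, 4.0, 2.0, 0.0]
    low, high = haar_dwt(features)
    print("low-frequency :", low)
    print("high-frequency:", high)
    print("reconstructed :", haar_idwt(low, high))
```

The detail coefficients are zero wherever neighbouring values agree and grow where they change abruptly, which is the kind of frequency information a detector could use alongside spatial features.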