Generative AI
'Time is running out': can a future of undetectable deepfakes be avoided?
With more than 4,000 shares, 20,000 comments, and 100,000 reactions on Facebook, the photo of the elderly woman, sitting behind her homemade 122nd birthday cake, has unquestionably gone viral. "I started decorating cakes from five years old," the caption reads, "and I can't wait to grow my baking journey." The picture is also unquestionably fake. If the curious candles (one seems to float in the air, attached to nothing) or the weird amorphous blobs on the cake in the foreground didn't give it away, then the fact the celebrant would be the oldest person in the world by almost five years should. Thankfully, the stakes for viral supercentenarian cake decorators are low.
How tech giants cut corners to harvest data for AI
The artificial intelligence lab had exhausted every reservoir of reputable English-language text on the internet as it developed its latest AI system. It needed more data to train the next version of its technology -- lots more. So OpenAI researchers created a speech recognition tool called Whisper. It could transcribe the audio from YouTube videos, yielding new conversational text that would make an AI system smarter. Some OpenAI employees discussed how such a move might go against YouTube's rules, three people with knowledge of the conversations said. YouTube, which is owned by Google, prohibits use of its videos for applications that are "independent" of the video platform.
AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts
Ghosh, Shaona, Varshney, Prasoon, Galinkin, Erick, Parisien, Christopher
As Large Language Models (LLMs) and generative AI become more widespread, the content safety risks associated with their use also increase. We find a notable deficiency in high-quality content safety datasets and benchmarks that comprehensively cover a wide range of critical safety areas. To address this, we define a broad content safety risk taxonomy, comprising 13 critical risk and 9 sparse risk categories. Additionally, we curate AEGISSAFETYDATASET, a new dataset of approximately 26,000 human-LLM interaction instances, complete with human annotations adhering to the taxonomy. We plan to release this dataset to the community to further research and to help benchmark LLM models for safety. To demonstrate the effectiveness of the dataset, we instruction-tune multiple LLM-based safety models. We show that our models (named AEGISSAFETYEXPERTS) not only surpass or perform competitively with the state-of-the-art LLM-based safety models and general purpose LLMs, but also exhibit robustness across multiple jail-break attack categories. We also show how using AEGISSAFETYDATASET during the LLM alignment phase does not negatively impact the performance of the aligned models on MT Bench scores. Furthermore, we propose AEGIS, a novel application of a no-regret online adaptation framework with strong theoretical guarantees, to perform content moderation with an ensemble of LLM content safety experts in deployment.
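The abstract does not say which no-regret algorithm AEGIS uses. As a rough illustration of the general idea, the sketch below applies the standard Hedge (multiplicative weights) update to an ensemble of safety experts. The function name, the absolute-error loss, the learning rate `eta`, and the 0.5 decision threshold are all assumptions made for this example, not details from the paper.

```python
import math

def hedge_moderation(expert_votes, labels, eta=0.5):
    """Hedge (multiplicative weights) over a stream of moderation decisions.

    expert_votes: one list per item; each entry is an expert's estimated
                  probability (0.0 to 1.0) that the item is unsafe.
    labels:       ground-truth feedback per item (1 = unsafe, 0 = safe).
    Returns (normalized final weights, list of ensemble predictions).
    NOTE: hypothetical helper for illustration, not the AEGIS implementation.
    """
    n_experts = len(expert_votes[0])
    weights = [1.0] * n_experts  # start with uniform trust in every expert
    predictions = []
    for votes, label in zip(expert_votes, labels):
        total = sum(weights)
        # Weighted average of the experts' "unsafe" probabilities.
        p_unsafe = sum(w * v for w, v in zip(weights, votes)) / total
        predictions.append(1 if p_unsafe >= 0.5 else 0)
        # Shrink each expert's weight exponentially in its loss on this item.
        weights = [w * math.exp(-eta * abs(v - label))
                   for w, v in zip(weights, votes)]
    total = sum(weights)
    return [w / total for w in weights], predictions

if __name__ == "__main__":
    votes = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 0.0, 0.5]]
    final_weights, preds = hedge_moderation(votes, labels=[1, 0, 1])
    print(final_weights, preds)
```

Under this update, an expert that keeps agreeing with the feedback retains its weight while consistently wrong experts decay, which is the behaviour a no-regret guarantee formalizes.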
Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model
Yang, Jichang, Chen, Hegan, Chen, Jia, Wang, Songqi, Wang, Shaocong, Yu, Yifei, Chen, Xi, Wang, Bo, Zhang, Xinyuan, Cui, Binbin, Li, Yi, Lin, Ning, Xu, Meng, Li, Yi, Xu, Xiaoxin, Qi, Xiaojuan, Wang, Zhongrui, Zhang, Xumeng, Shang, Dashan, Wang, Han, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming
Human brains image complicated scenes when reading a novel. Replicating this imagination is one of the ultimate goals of AI-Generated Content (AIGC). However, current AIGC methods, such as score-based diffusion, are still deficient in terms of rapidity and efficiency. This deficiency is rooted in the difference between the brain and digital computers. Digital computers have physically separated storage and processing units, resulting in frequent data transfers during iterative calculations, incurring large time and energy overheads. This issue is further intensified by the conversion of inherently continuous and analog generation dynamics, which can be formulated by neural differential equations, into discrete and digital operations. Inspired by the brain, we propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion, employing emerging resistive memory. The integration of storage and computation within resistive memory synapses surmounts the von Neumann bottleneck, benefiting the generative speed and energy efficiency. The closed-loop feedback integrator is time-continuous, analog, and compact, physically implementing an infinite-depth neural network. Moreover, the software-hardware co-design is intrinsically robust to analog noise. We experimentally validate our solution with 180 nm resistive memory in-memory computing macros. Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively. Moreover, it accomplished reductions in energy consumption by factors of 5.2 and 4.1. Our approach heralds a new horizon for hardware solutions in edge computing for generative AI applications.
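To make the "discrete and digital operations" that the abstract contrasts against more concrete, here is a minimal sketch of a digital Euler solver for the probability-flow ODE of a score-based diffusion model. It is not the paper's analog method. The assumptions are: a variance-exploding noise schedule sigma(t) = t, one-dimensional Gaussian data N(mu, s0^2), and an analytic score function standing in for the trained neural network. All function and parameter names are hypothetical.

```python
def gaussian_score(x, t, mu, s0):
    """Analytic score of the noised marginal N(mu, s0^2 + t^2).

    Stands in for the learned score network (assumption for this sketch).
    """
    return -(x - mu) / (s0 ** 2 + t ** 2)

def probability_flow_euler(x_T, T, mu, s0, steps=10000):
    """Integrate dx/dt = -t * score(x, t) backward from t = T to t = 0.

    With sigma(t) = t, the probability-flow ODE drift is
    -0.5 * d(sigma^2)/dt * score = -t * score.
    Each loop iteration is one discrete update of the kind a digital
    computer must perform many times.
    """
    dt = T / steps
    x = x_T
    for k in range(steps):
        t = T - k * dt
        drift = -t * gaussian_score(x, t, mu, s0)  # dx/dt at (x, t)
        x = x - dt * drift                         # Euler step backward in time
    return x

if __name__ == "__main__":
    # A sample one standard deviation above the mean at t = T
    # should stay roughly one data standard deviation above the mean at t = 0.
    start = 2.0 + (1.0 + 10.0 ** 2) ** 0.5
    print(probability_flow_euler(start, T=10.0, mu=2.0, s0=1.0))
```

The number of `steps` trades accuracy for compute, and every step moves state between memory and the processor; this repeated data movement is the overhead the abstract attributes to digital hardware.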
Automatic Authorities: Power and AI
Forthcoming in Collaborative Intelligence: How Humans and AI are Transforming our World, Arathi Sethumadhavan and Mira Lane (eds.). Seth Lazar, Australian National University. "Man, a child in understanding of himself, has placed in his hands physical tools of incalculable power. He plays with them like a child, and whether they work harm or good is largely a matter of accident. The instrumentality becomes a master and works fatally as if possessed of a will of its own -- not because it has a will but because man has not." Introduction: As rapid advances in Artificial Intelligence and the rise of some of history's most potent corporations meet the diminished neoliberal state, people are increasingly subject to power exercised by means of automated systems. Machine learning, big data, and related computational technologies now underpin vital government services from criminal justice to tax auditing, public health to social services, immigration to defence (Citron, 2008; Calo and Citron, 2020; Engstrom et al., 2020). Google and Amazon connect consumers and producers in new algorithmic markets (Nadler and Cicilline, 2020). Google's search algorithm -- and possibly in the near future OpenAI's GPT-4 or another large language model -- determines, for many, how they find out about everything from how to vote to where to get vaccinated. Meta, Twitter, TikTok, Google and others algorithmically decide whose speech is amplified, reduced, or restricted (Vaidhyanathan, 2011; Pasquale, 2015; Gillespie, 2018; Suzor, 2019). And a new wave of products based on rapid advances in Large Language Models (LLMs) has the potential to further transform our economic and political lives. Automatic Authorities are automated computational systems used to exercise power over us by substantially determining what we may know, what we may have, and what our options will be.
This chapter is based on, and substantially revises, my 'Power and AI: Nature and Justification', in the Oxford Handbook of AI Governance (Justin Bullock et al., eds). My thanks to the publisher for their permission to use this material. But what normative lessons should we draw from these analyses? Power is everywhere, and is not necessarily bad.
Responsible Generative AI: What to Generate and What Not
In recent years, generative AI (GenAI), like large language models and text-to-image models, has received significant attention across various domains. However, ensuring the responsible generation of content by these models is crucial for their real-world applicability. This raises an interesting question: "What should responsible GenAI generate, and what should it not?" To answer the question, this paper investigates the practical responsibility requirements of both textual and visual generative models, outlining five key considerations: generating truthful content, avoiding toxic content, refusing harmful instructions, leaking no training data-related content, and ensuring generated content is identifiable. Specifically, we review recent advancements and challenges in addressing these requirements. Besides, we discuss and emphasize the importance of responsible GenAI across healthcare, education, finance, and artificial general intelligence domains. Through a unified perspective on both textual and visual generative models, this paper aims to provide insights into practical safety-related issues and further benefit the community in building responsible GenAI.
Is English the New Programming Language? How About Pseudo-code Engineering?
Michaelsen, Gian Alexandre, Santos, Renato P. dos
Background: The integration of artificial intelligence (AI) into daily life, particularly through chatbots utilizing natural language processing (NLP), presents both revolutionary potential and unique challenges. This study investigates how different input forms affect the performance of ChatGPT, a leading language model by OpenAI, in understanding and executing complex, multi-intention tasks. Design: Employing a case study methodology supplemented by discourse analysis, the research analyzes ChatGPT's responses to inputs varying from natural language to pseudo-code engineering. The study specifically examines the model's proficiency across four categories: understanding of intentions, interpretability, completeness, and creativity. Setting and Participants: As a theoretical exploration of AI interaction, this study focuses on the analysis of structured and unstructured inputs processed by ChatGPT, without direct human participants. Data collection and analysis: The research utilizes synthetic case scenarios, including the organization of a "weekly meal plan" and a "shopping list," to assess ChatGPT's response to prompts in both natural language and pseudo-code engineering. The analysis is grounded in the identification of patterns, contradictions, and unique response elements across different input formats. Results: Findings reveal that pseudo-code engineering inputs significantly enhance the clarity and determinism of ChatGPT's responses, reducing ambiguity inherent in natural language. Enhanced natural language, structured through prompt engineering techniques, similarly improves the model's interpretability and creativity. Conclusions: The study underscores the potential of pseudo-code engineering in refining human-AI interaction and achieving more deterministic, concise, and direct outcomes, advocating for its broader application across disciplines requiring precise AI responses.
Contextual Chart Generation for Cyber Deception
Nguyen, David D., Liebowitz, David, Nepal, Surya, Kanhere, Salil S., Abuadbba, Sharif
Honeyfiles are security assets designed to attract and detect intruders on compromised systems. Honeyfiles are a type of honeypot that mimic real, sensitive documents, creating the illusion of the presence of valuable data. Interaction with a honeyfile reveals the presence of an intruder, and can provide insights into their goals and intentions. Their practical use, however, is limited by the time, cost and effort associated with manually creating realistic content. The introduction of large language models has made high-quality text generation accessible, but honeyfiles contain a variety of content including charts, tables and images. This content needs to be plausible and realistic, as well as semantically consistent both within honeyfiles and with the real documents they mimic, to successfully deceive an intruder. In this paper, we focus on an important component of the honeyfile content generation problem: document charts. Charts are ubiquitous in corporate documents and are commonly used to communicate quantitative and scientific data. Existing image generation models, such as DALL-E, are rather prone to generating charts with incomprehensible text and unconvincing data. We take a multi-modal approach to this problem by combining two purpose-built generative models: a multitask Transformer and a specialized multi-head autoencoder. The Transformer generates realistic captions and plot text, while the autoencoder generates the underlying tabular data for the plot. To advance the field of automated honeyplot generation, we also release a new document-chart dataset and propose a novel metric Keyword Semantic Matching (KSM). This metric measures the semantic consistency between keywords of a corpus and a smaller bag of words. Extensive experiments demonstrate excellent performance against multiple large language models, including ChatGPT and GPT4.
OpenAI and Google reportedly used transcriptions of YouTube videos to train their AI models
The report, which describes the lengths OpenAI, Google and Meta have gone to in order to maximize the amount of data they can feed to their AIs, cites numerous people with knowledge of the companies' practices. It comes just days after YouTube CEO Neal Mohan said in an interview with Bloomberg Originals that OpenAI's alleged use of YouTube videos to train its new text-to-video generator, Sora, would go against the platform's policies. According to the NYT, OpenAI used its Whisper speech recognition tool to transcribe more than one million hours of YouTube videos, which were then used to train GPT-4. The Information previously reported that OpenAI had used YouTube videos and podcasts to train the two AI systems. OpenAI president Greg Brockman was reportedly among the people on this team.
The Journey to Trustworthy AI- Part 1: Pursuit of Pragmatic Frameworks
Nasr-Azadani, Mohamad M, Chatelain, Jean-Luc
This paper reviews Trustworthy Artificial Intelligence (TAI) and its various definitions. Considering the principles respected in any society, TAI is often characterized by a few attributes, some of which have led to confusion in regulatory or engineering contexts. We argue against using terms such as Responsible or Ethical AI as substitutes for TAI. And to help clarify any confusion, we suggest leaving them behind. Given the subjectivity and complexity inherent in TAI, developing a universal framework is deemed infeasible. Instead, we advocate for approaches centered on addressing key attributes and properties such as fairness, bias, risk, security, explainability, and reliability. We examine the ongoing regulatory landscape, with a focus on initiatives in the EU, China, and the USA. We recognize that differences in AI regulations based on geopolitical and geographical reasons pose an additional challenge for multinational companies. We identify risk as a core factor in AI regulation and TAI. For example, as outlined in the EU-AI Act, organizations must gauge the risk level of their AI products to act accordingly (or risk hefty fines). We compare modalities of TAI implementation and how multiple cross-functional teams are engaged in the overall process. Thus, a brute force approach for enacting TAI renders its efficiency and agility moot. To address this, we introduce our framework Set-Formalize-Measure-Act (SFMA). Our solution highlights the importance of transforming TAI-aware metrics, drivers of TAI, stakeholders, and business/legal requirements into actual benchmarks or tests. Finally, over-regulation driven by panic over powerful AI models can, in fact, harm TAI too. Based on GitHub user-activity data, in 2023, AI open-source projects rose to top projects by contributor count. Enabling innovation in TAI hinges on the independent contributions of the open-source community.