Generative AI
The White House Is Preparing for an AI-Dominated Future
Earlier today, President Joe Biden signed the most sweeping set of regulatory principles on artificial intelligence in America to date: a lengthy executive order that directs all types of government agencies to make sure America is leading the way in developing the technology while also addressing the many dangers that it poses. The order explicitly pushes agencies to establish rules and guidelines, write reports, and create funding and research initiatives for AI--"the most consequential technology of our time," in the president's own words. The scope of the order is impressive, especially given that the generative-AI boom began just about a year ago. But the document's many parts--and there are many--are at times in tension, revealing a broader confusion over what, exactly, America's primary attitude toward AI should be: Is it a threat to national security, or a just society? Is it a geopolitical weapon?
Mark Zuckerberg's Real Cage Fight
This article is from Big Technology, a newsletter by Alex Kantrowitz. Sam Altman sat comfortably between Satya Nadella and Sundar Pichai at a White House gathering of top A.I. CEOs in May--with one noticeable gap in the guest list. With Alphabet, Microsoft, and OpenAI in attendance, it was impossible to miss Mark Zuckerberg's absence. And that appeared to be no accident. The meeting, one administration official said, "was focused on companies currently leading in the space."
Generative AI Is Playing a Surprising Role in Israel-Hamas Disinformation
In the weeks since Hamas launched its October 7 surprise attack on Israel, the ensuing conflict has generated an unprecedented wave of disinformation, an "algorithmically driven fog of war" that has tripped up major news organizations and left social media companies floundering. Yet, amid all of the deceptive images and video moving around on social media, the content generated by artificial intelligence tools has remained relatively peripheral. Even as some wondered if the Israel-Hamas war would be the first conflict dominated by false generative AI images, the technology has had a more complex and subtle impact. "There are definitely AI images circulating but not to the degree where I think it's playing a central role in the spread of information," says Layla Mashkoor, an associate editor at the Atlantic Council's Digital Forensic Research Lab, which studies online disinformation. Primarily, Mashkoor says, AI-generated disinformation is being used by activists to solicit support--or give the impression of wider support--for a particular side.
Generative retrieval-augmented ontologic graph and multi-agent strategies for interpretive large language model-based materials design
Transformer neural networks show promising capabilities, in particular for uses in materials analysis, design and manufacturing, including their capacity to work effectively with human language, symbols, code, and numerical data. Here we explore the use of large language models (LLMs) as a tool that can support engineering analysis of materials, applied to retrieving key information about subject areas, developing research hypotheses, discovering mechanistic relationships across disparate areas of knowledge, and writing and executing simulation codes for active knowledge generation based on physical ground truths. When used as sets of AI agents with specific features, capabilities, and instructions, LLMs can provide powerful problem-solving strategies for analysis and design problems. Our experiments focus on using a fine-tuned model, MechGPT, developed based on training data in the mechanics of materials domain. We first affirm how fine-tuning endows LLMs with a reasonable understanding of domain knowledge. However, when queried outside the context of learned matter, LLMs can have difficulty recalling correct information. We show how this can be addressed using retrieval-augmented Ontological Knowledge Graph strategies that discern how the model understands what concepts are important and how they are related. Illustrated for a use case of relating distinct areas of knowledge - here, music and proteins - such strategies can also provide an interpretable graph structure with rich information at the node, edge and subgraph level. We discuss nonlinear sampling strategies and agent-based modeling applied to complex question answering, code generation and execution in the context of automated force field development from actively learned Density Functional Theory (DFT) modeling, and data analysis.
Addressing Weak Decision Boundaries in Image Classification by Leveraging Web Search and Generative Models
Dammu, Preetam Prabhu Srikar, Feng, Yunhe, Shah, Chirag
Machine learning (ML) technologies are known to be riddled with ethical and operational problems; nevertheless, we are witnessing an increasing thrust by businesses to deploy them in sensitive applications. One major issue among many is that ML models do not perform equally well for underrepresented groups. This puts vulnerable populations in an even more disadvantaged and unfavorable position. We propose an approach that leverages the power of web search and generative models to alleviate some of the shortcomings of discriminative models. We demonstrate our method on an image classification problem using ImageNet's People Subtree subset, and show that it is effective in enhancing robustness and mitigating bias in certain classes that represent vulnerable populations (e.g., female doctor of color). Our new method is able to (1) identify weak decision boundaries for such classes; (2) construct search queries for Google as well as text for generating images through DALL-E 2 and Stable Diffusion; and (3) show how these newly captured training samples could alleviate the population bias issue. While still improving the model's overall performance considerably, we achieve a significant reduction (77.30%) in the model's gender accuracy disparity. In addition to these improvements, we observed a notable enhancement in the classifier's decision boundary, as it is characterized by fewer weak spots and an increased separation between classes. Although we showcase our method on vulnerable populations in this study, the proposed technique is extendable to a wide range of problems and domains.
Transformation vs Tradition: Artificial General Intelligence (AGI) for Arts and Humanities
Liu, Zhengliang, Li, Yiwei, Cao, Qian, Chen, Junwen, Yang, Tianze, Wu, Zihao, Hale, John, Gibbs, John, Rasheed, Khaled, Liu, Ninghao, Mai, Gengchen, Liu, Tianming
Recent advances in artificial general intelligence (AGI), particularly large language models and creative image generation systems, have demonstrated impressive capabilities on diverse tasks spanning the arts and humanities. However, the swift evolution of AGI has also raised critical questions about its responsible deployment in these culturally significant domains traditionally seen as profoundly human. This paper provides a comprehensive analysis of the applications and implications of AGI for text, graphics, audio, and video pertaining to arts and the humanities. We survey cutting-edge systems and their usage in areas ranging from poetry to history, marketing to film, and communication to classical art. We outline substantial concerns pertaining to factuality, toxicity, biases, and public safety in AGI systems, and propose mitigation strategies. The paper argues for multi-stakeholder collaboration to ensure AGI promotes creativity, knowledge, and cultural values without undermining truth or human dignity. Our timely contribution summarizes a rapidly developing field, highlighting promising directions while advocating for responsible progress centering on human flourishing. The analysis lays the groundwork for further research on aligning AGI's technological capacities with enduring social goods.
Denoising Diffusion Probabilistic Models for Hardware-Impaired Communication Systems: Towards Wireless Generative AI
Letafati, Mehdi, Ali, Samad, Latva-aho, Matti
Thanks to the outstanding achievements of state-of-the-art generative models like ChatGPT and diffusion models, generative AI has gained substantial attention across various industrial and academic domains. In this paper, denoising diffusion probabilistic models (DDPMs) are proposed for a practical finite-precision wireless communication system with hardware-impaired transceivers. The intuition behind DDPM is to decompose the data generation process over the so-called "denoising" steps. Inspired by this, a DDPM-based receiver is proposed for a practical wireless communication scheme that faces realistic non-idealities, including hardware impairments (HWI), channel distortions, and quantization errors. It is shown that our approach provides network resilience under low-SNR regimes, near-invariant reconstruction performance with respect to different HWI levels and quantization errors, and robust out-of-distribution performance against non-Gaussian noise. Moreover, the reconstruction performance of our scheme is evaluated in terms of cosine similarity and mean-squared error (MSE), highlighting an improvement of more than 25 dB compared to conventional deep neural network (DNN)-based receivers.
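The two reconstruction metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, signals are treated as plain real-valued lists, and the dB conversion assumes the reported gain is a ratio of MSE values (under that assumption, 25 dB corresponds to a factor of about 316, since 10^2.5 ≈ 316).

```python
import math


def mean_squared_error(sent, received):
    """Average squared difference between two equal-length real signals."""
    if len(sent) != len(received) or not sent:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((s - r) ** 2 for s, r in zip(sent, received)) / len(sent)


def cosine_similarity(sent, received):
    """Cosine of the angle between two signals viewed as vectors."""
    dot = sum(s * r for s, r in zip(sent, received))
    norm_sent = math.sqrt(sum(s * s for s in sent))
    norm_received = math.sqrt(sum(r * r for r in received))
    if norm_sent == 0.0 or norm_received == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_sent * norm_received)


def mse_gain_db(mse_baseline, mse_proposed):
    """Express an MSE ratio in decibels (assumed convention: 10*log10)."""
    return 10.0 * math.log10(mse_baseline / mse_proposed)
```

A cosine similarity near 1 means the reconstructed signal points in the same direction as the transmitted one regardless of scale, which is why it is a useful complement to MSE.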
Adaptive importance sampling for Deep Ritz
Wan, Xiaoliang, Zhou, Tao, Zhou, Yuancheng
We introduce an adaptive sampling method for the Deep Ritz method aimed at solving partial differential equations (PDEs). Two deep neural networks are used. One network is employed to approximate the solution of PDEs, while the other one is a deep generative model used to generate new collocation points to refine the training set. The adaptive sampling procedure consists of two main steps. The first step is solving the PDEs using the Deep Ritz method by minimizing an associated variational loss discretized by the collocation points in the training set. The second step involves generating a new training set, which is then used in subsequent computations to further improve the accuracy of the current approximate solution. We treat the integrand in the variational loss as an unnormalized probability density function (PDF) and approximate it using a deep generative model called bounded KRnet. The new samples and their associated PDF values are obtained from the bounded KRnet. With these new samples and their associated PDF values, the variational loss can be approximated more accurately by importance sampling. Compared to the original Deep Ritz method, the proposed adaptive method improves accuracy, especially for problems characterized by low regularity and high dimensionality. We demonstrate the effectiveness of our new method through a series of numerical experiments.
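The importance-sampling step described above can be sketched on a one-dimensional toy problem. This is only an illustration under stated assumptions, not the paper's method: the integrand `x**8` on [0, 1] and the hand-picked proposal density `5*x**4` are hypothetical stand-ins for the variational-loss integrand and the bounded KRnet, and all function names are made up for this example.

```python
import random


def integrand(x):
    """Toy integrand standing in for the variational-loss integrand."""
    return x ** 8  # exact integral over [0, 1] is 1/9


def proposal_pdf(x):
    """Hand-picked density on [0, 1] that roughly follows the integrand."""
    return 5.0 * x ** 4


def sample_proposal(rng):
    """Draw from proposal_pdf by inverting its CDF, F(x) = x**5."""
    u = 1.0 - rng.random()  # in (0, 1], so the sample is never exactly 0
    return u ** 0.2


def uniform_estimate(n, rng):
    """Plain Monte Carlo estimate with uniformly drawn points."""
    return sum(integrand(rng.random()) for _ in range(n)) / n


def importance_estimate(n, rng):
    """Importance-sampling estimate: average of integrand / proposal density."""
    total = 0.0
    for _ in range(n):
        x = sample_proposal(rng)
        total += integrand(x) / proposal_pdf(x)
    return total / n
```

Because the proposal places more points where the integrand is large, the weights `integrand(x) / proposal_pdf(x)` vary less than the raw integrand does under uniform sampling, so the estimate has lower variance for the same number of points.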
FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability through 5W Question-Answering
Chakraborty, Megha, Pahwa, Khushbu, Rani, Anku, Chatterjee, Shreyas, Dalal, Dwip, Dave, Harshit, G, Ritvik, Gurumurthy, Preethi, Mahor, Adarsh, Mukherjee, Samahriti, Pakala, Aditya, Paul, Ishan, Reddy, Janvita, Sarkar, Arghya, Sensharma, Kinjal, Chadha, Aman, Sheth, Amit P., Das, Amitava
Combating disinformation is one of the burning societal crises -- about 67% of the American population believes that disinformation produces a lot of uncertainty, and 10% of them knowingly propagate disinformation. Evidence shows that disinformation can manipulate democratic processes and public opinion, causing disruption in the share market, panic and anxiety in society, and even death during crises. Therefore, disinformation should be identified promptly and, if possible, mitigated. With approximately 3.2 billion images and 720,000 hours of video shared online daily on social media platforms, scalable detection of multimodal disinformation requires efficient fact verification. Despite progress in automatic text-based fact verification (e.g., FEVER, LIAR), the research community lacks substantial effort in multimodal fact verification. To address this gap, we introduce FACTIFY 3M, a dataset of 3 million samples that pushes the boundaries of the domain of fact verification via a multimodal fake news dataset, in addition to offering explainability through the concept of 5W question-answering. Salient features of the dataset include: (i) textual claims, (ii) ChatGPT-generated paraphrased claims, (iii) associated images, (iv) Stable Diffusion-generated additional images (i.e., visual paraphrases), (v) pixel-level image heatmaps to foster image-text explainability of the claim, (vi) 5W QA pairs, and (vii) adversarial fake news stories.
What the evolution of our own brains can tell us about the future of AI
The explosive growth in artificial intelligence in recent years -- crowned with the meteoric rise of generative AI chatbots like ChatGPT -- has seen the technology take on many tasks that, formerly, only human minds could handle. But despite their increasingly capable linguistic computations, these machine learning systems remain surprisingly inept at making the sorts of cognitive leaps and logical deductions that even the average teenager can consistently get right. In this week's Hitting the Books excerpt, A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains, AI entrepreneur Max Bennett explores the quizzical gap in computer competency by exploring the development of the organic machine AIs are modeled after: the human brain. Focusing on the five evolutionary "breakthroughs," amidst myriad genetic dead ends and unsuccessful offshoots, that led our species to our modern minds, Bennett also shows that the same advancements that took humanity eons to evolve can be adapted to help guide development of the AI technologies of tomorrow. In the excerpt below, we take a look at how generative AI systems like GPT-3 are built to mimic the predictive functions of the neocortex, but still can't quite get a grasp on the vagaries of human speech.