Goto

Collaborating Authors

 Generative AI


A Foundational Multimodal Vision Language AI Assistant for Human Pathology

arXiv.org Artificial Intelligence

The field of computational pathology has witnessed remarkable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders. However, despite the explosive growth of generative artificial intelligence (AI), there has been limited study on building general purpose, multimodal AI assistants tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology using an in-house developed foundational vision encoder pretrained on 100 million histology images from over 100,000 patient cases and 1.18 million pathology image-caption pairs. The vision encoder is then combined with a pretrained large language model and the whole system is finetuned on over 250,000 diverse disease agnostic visual language instructions. We compare PathChat against several multimodal vision language AI assistants as well as GPT4V, which powers the commercially available multimodal general purpose AI assistant ChatGPT-4. When relevant clinical context is provided with the histology image, PathChat achieved a diagnostic accuracy of 87% on multiple-choice questions based on publicly available cases of diverse tissue origins and disease models. Additionally, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive and general vision language AI assistant that can flexibly handle both visual and natural language inputs, PathChat can potentially find impactful applications in pathology education, research, and human-in-the-loop clinical decision making.


Saturn Platform: Foundation Model Operations and Generative AI for Financial Services

arXiv.org Artificial Intelligence

Saturn is an innovative platform that assists Foundation Model (FM) building and its integration with IT operations (Ops). It is custom-made to meet the requirements of data scientists, enabling them to effectively create and implement FMs while enhancing collaboration within their technical domain. By offering a wide range of tools and features, Saturn streamlines and automates different stages of FM development, making it an invaluable asset for data science teams. This white paper introduces prospective applications of generative AI models derived from FMs in the financial sector.


On the Stability of Iterative Retraining of Generative Models on their own Data

arXiv.org Machine Learning

Deep generative models have made tremendous progress in modeling complex data, often exhibiting generation quality that surpasses a typical human's ability to discern the authenticity of samples. Undeniably, a key driver of this success is enabled by the massive amounts of web-scale data consumed by these models. Due to these models' striking performance and ease of availability, the web will inevitably be increasingly populated with synthetic content. Such a fact directly implies that future iterations of generative models must contend with the reality that their training is curated from both clean data and artificially generated data from past models. In this paper, we develop a framework to rigorously study the impact of training generative models on mixed datasets (of real and synthetic data) on their stability. We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough and the proportion of clean training data (w.r.t. synthetic data) is large enough. We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models on CIFAR10 and FFHQ.


Divide-and-Conquer Attack: Harnessing the Power of LLM to Bypass the Censorship of Text-to-Image Generation Model

arXiv.org Artificial Intelligence

Text-to-image generative models offer many innovative services but also raise ethical concerns due to their potential to generate unethical images. Most publicly available text-to-image models employ safety filters to prevent unintended generation intents. In this work, we introduce the Divide-and-Conquer Attack to circumvent the safety filters of state-of-the-art text-to-image models. Our attack leverages LLMs as agents for text transformation, creating adversarial prompts from sensitive ones. We have developed effective helper prompts that enable LLMs to break down sensitive drawing prompts into multiple harmless descriptions, allowing them to bypass safety filters while still generating sensitive images. This means that the latent harmful meaning only becomes apparent when all individual elements are drawn together. Our evaluation demonstrates that our attack successfully circumvents the closed-box safety filter of SOTA DALLE-3 integrated natively into ChatGPT to generate unethical images. This approach, which essentially uses LLM-generated adversarial prompts against GPT-4-assisted DALLE-3, is akin to using one's own spear to breach their shield. It could have more severe security implications than previous manual crafting or iterative model querying methods, and we hope it stimulates more attention towards similar efforts. Our code and data are available at: https://github.com/researchcode001/Divide-and-Conquer-Attack


Generating High-Resolution Regional Precipitation Using Conditional Diffusion Model

arXiv.org Artificial Intelligence

Climate downscaling is a crucial technique within climate research, serving to project low-resolution (LR) climate data to higher resolutions (HR). Previous research has demonstrated the effectiveness of deep learning for downscaling tasks. However, most deep learning models for climate downscaling may not perform optimally for high scaling factors (i.e., 4x, 8x) due to their limited ability to capture the intricate details required for generating HR climate data. Furthermore, climate data behaves differently from image data, necessitating a nuanced approach when employing deep generative models. In response to these challenges, this paper presents a deep generative model for downscaling climate data, specifically precipitation on a regional scale. We employ a denoising diffusion probabilistic model (DDPM) conditioned on multiple LR climate variables. The proposed model is evaluated using precipitation data from the Community Earth System Model (CESM) v1.2.2 simulation. Our results demonstrate significant improvements over existing baselines, underscoring the effectiveness of the conditional diffusion model in downscaling climate data.


Navigating the generative AI era: Introducing the AI assessment scale for ethical GenAI assessment

arXiv.org Artificial Intelligence

Recent developments in Generative Artificial Intelligence (GenAI) have created a paradigm shift in multiple areas of society, and the use of these technologies is likely to become a defining feature of education in coming decades. GenAI offers transformative pedagogical opportunities, while simultaneously posing ethical and academic challenges. Against this backdrop, we outline a practical, simple, and sufficiently comprehensive tool to allow for the integration of GenAI tools into educational assessment: the AI Assessment Scale (AIAS). The AIAS empowers educators to select the appropriate level of GenAI usage in assessments based on the learning outcomes they seek to address. The AIAS offers greater clarity and transparency for students and educators, provides a fair and equitable policy tool for institutions to work with, and offers a nuanced approach which embraces the opportunities of GenAI while recognising that there are instances where such tools may not be pedagogically appropriate or necessary. By adopting a practical, flexible approach that can be implemented quickly, the AIAS can form a much-needed starting point to address the current uncertainty and anxiety regarding GenAI in education. As a secondary objective, we engage with the current literature and advocate for a refocused discourse on GenAI tools in education, one which foregrounds how technologies can help support and enhance teaching and learning, which contrasts with the current focus on GenAI as a facilitator of academic misconduct.


Multimodality of AI for Education: Towards Artificial General Intelligence

arXiv.org Artificial Intelligence

This paper presents a comprehensive examination of how multimodal artificial intelligence (AI) approaches are paving the way towards the realization of Artificial General Intelligence (AGI) in educational contexts. It scrutinizes the evolution and integration of AI in educational systems, emphasizing the crucial role of multimodality, which encompasses auditory, visual, kinesthetic, and linguistic modes of learning. This research delves deeply into the key facets of AGI, including cognitive frameworks, advanced knowledge representation, adaptive learning mechanisms, strategic planning, sophisticated language processing, and the integration of diverse multimodal data sources. It critically assesses AGI's transformative potential in reshaping educational paradigms, focusing on enhancing teaching and learning effectiveness, filling gaps in existing methodologies, and addressing ethical considerations and responsible usage of AGI in educational settings. The paper also discusses the implications of multimodal AI's role in education, offering insights into future directions and challenges in AGI development. This exploration aims to provide a nuanced understanding of the intersection between AI, multimodality, and education, setting a foundation for future research and development in AGI.


E.U. Reaches Deal on World's First Comprehensive AI Rules

TIME - Tech

European Union negotiators clinched a deal Friday on the world's first comprehensive artificial intelligence rules, paving the way for legal oversight of AI technology that has promised to transform everyday life and spurred warnings of existential dangers to humanity. Negotiators from the European Parliament and the bloc's 27 member countries overcame big differences on controversial points including generative AI and police use of face recognition surveillance to sign a tentative political agreement for the Artificial Intelligence Act. "The EU becomes the very first continent to set clear rules for the use of AI." The result came after marathon closed-door talks this week, with the initial session lasting 22 hours before a second round kicked off Friday morning. Officials were under the gun to secure a political victory for the flagship legislation.


Why the EU AI Act was so hard to agree on

MIT Technology Review

First, Melissa tells me, there is a lot of disagreement about foundation models, which has taken up most of the energy and space during the latest debates. There are several definitions of the term "foundation model" floating around, which is part of what's causing the discord, but the core concept has to do with general-purpose AI that can do many different things for various applications. You've probably played around with ChatGPT; that interface is essentially powered by a foundation model, in this case a large language model from OpenAI. Making this more complex, though, is that these technologies can also be plugged into various other applications with more narrow uses, like education or advertising. Initial versions of the EU AI Act didn't explicitly consider foundation models, but Melissa notes that the proliferation of generative AI products over the past year pushed lawmakers to integrate them into the risk framework. In the version of the legislation passed by Parliament in June, all foundation models would be tightly regulated regardless of their assigned risk category or how they are used. This was deemed necessary in light of the vast amount of training data required to build them, as well as IP and privacy concerns and the overall impact they have on other technologies. But of course, tech companies that build foundation models have disputed this and advocate for a more nuanced approach that considers how the models are used. France, Germany, and Italy have flipped their positions and gone so far to say that foundation models should be largely exempt from AI Act regulations.


An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation

arXiv.org Artificial Intelligence

Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. Large Language Models (LLMs) have recently been applied to this problem, utilizing additional training or few-shot learning on examples of existing tests. This paper presents a large-scale empirical evaluation on the effectiveness of LLMs for automated unit test generation without additional training or manual effort, providing the LLM with the signature and implementation of the function under test, along with usage examples extracted from documentation. We also attempt to repair failed generated tests by re-prompting the model with the failing test and error message. We implement our approach in TestPilot, a test generation tool for JavaScript that automatically generates unit tests for all API functions in an npm package. We evaluate TestPilot using OpenAI's gpt3.5-turbo LLM on 25 npm packages with a total of 1,684 API functions. The generated tests achieve a median statement coverage of 70.2% and branch coverage of 52.8%, significantly improving on Nessie, a recent feedback-directed JavaScript test generation technique, which achieves only 51.3% statement coverage and 25.6% branch coverage. We also find that 92.8% of TestPilot's generated tests have no more than 50% similarity with existing tests (as measured by normalized edit distance), with none of them being exact copies. Finally, we run TestPilot with two additional LLMs, OpenAI's older code-cushman-002 LLM and the open LLM StarCoder. Overall, we observed similar results with the former (68.2% median statement coverage), and somewhat worse results with the latter (54.0% median statement coverage), suggesting that the effectiveness of the approach is influenced by the size and training set of the LLM, but does not fundamentally depend on the specific model.