Generative AI
PromptCrafter: Crafting Text-to-Image Prompt through Mixed-Initiative Dialogue with LLM
Baek, Seungho, Im, Hyerin, Ryu, Jiseung, Park, Juhyeong, Lee, Takyeon
Text-to-image generation models can generate images across a diverse range of subjects and styles from a single prompt. Recent works have proposed a variety of interaction methods that help users understand the capabilities of models and utilize them. However, how to support users in efficiently exploring a model's capabilities and creating effective prompts remain open research questions. In this paper, we present PromptCrafter, a novel mixed-initiative system that allows step-by-step crafting of text-to-image prompts. Through the iterative process, users can efficiently explore the model's capabilities and clarify their intent. PromptCrafter also helps users refine prompts by answering clarifying questions generated by a Large Language Model. Lastly, users can revert to a desired step by reviewing the work history. In this workshop paper, we discuss the design process of PromptCrafter and our plans for follow-up studies.
Development of the ChatGPT, Generative Artificial Intelligence and Natural Large Language Models for Accountable Reporting and Use (CANGARU) Guidelines
Cacciamani, Giovanni E., Eppler, Michael B., Ganjavi, Conner, Pekan, Asli, Biedermann, Brett, Collins, Gary S., Gill, Inderbir S.
The swift progress and ubiquitous adoption of Generative AI (GAI), Generative Pre-trained Transformers (GPTs), and large language models (LLMs) like ChatGPT, have spurred queries about their ethical application, use, and disclosure in scholarly research and scientific productions. A few publishers and journals have recently created their own sets of rules; however, the absence of a unified approach may lead to a 'Babel Tower Effect,' potentially resulting in confusion rather than desired standardization. In response to this, we present the ChatGPT, Generative Artificial Intelligence, and Natural Large Language Models for Accountable Reporting and Use Guidelines (CANGARU) initiative, with the aim of fostering a cross-disciplinary global inclusive consensus on the ethical use, disclosure, and proper reporting of GAI/GPT/LLM technologies in academia. The present protocol consists of four distinct parts: a) an ongoing systematic review of GAI/GPT/LLM applications to understand the linked ideas, findings, and reporting standards in scholarly research, and to formulate guidelines for its use and disclosure, b) a bibliometric analysis of existing author guidelines in journals that mention GAI/GPT/LLM, with the goal of evaluating existing guidelines, analyzing the disparity in their recommendations, and identifying common rules that can be brought into the Delphi consensus process, c) a Delphi survey to establish agreement on the items for the guidelines, ensuring principled GAI/GPT/LLM use, disclosure, and reporting in academia, and d) the subsequent development and dissemination of the finalized guidelines and their supplementary explanation and elaboration documents.
Gradient Surgery for One-shot Unlearning on Generative Model
Bae, Seohui, Kim, Seoyoon, Jung, Hyemin, Lim, Woohyung
Recent regulation on the right to be forgotten has generated considerable interest in unlearning for pre-trained machine learning models. To approximate the straightforward yet expensive approach of retraining from scratch, recent machine unlearning methods unlearn a sample by updating the weight parameters to remove its influence. In this paper, we introduce a simple yet effective approach to remove the influence of data on a deep generative model. Inspired by works in multi-task learning, we propose to manipulate gradients to regularize the interplay of influence among samples by projecting gradients onto the normal plane of the gradients to be retained. Our work is agnostic to statistics of the removal samples, outperforming existing baselines while providing theoretical analysis for the first time in unlearning a generative model.
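The projection step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single gradient to be retained, represents gradients as plain Python lists, and uses a hypothetical function name, `project_onto_normal_plane`. The idea shown is only the geometric one, removing from the forget-sample gradient its component along the retained gradient so that the update is orthogonal to it.

```python
def project_onto_normal_plane(g_forget, g_retain, eps=1e-12):
    """Return g_forget with its component along g_retain removed.

    Hypothetical helper for illustration; both arguments are
    equal-length lists of floats standing in for flattened gradients.
    """
    dot = sum(f * r for f, r in zip(g_forget, g_retain))
    norm_sq = sum(r * r for r in g_retain)
    if norm_sq < eps:
        # A (near-)zero retained gradient defines no direction to avoid.
        return list(g_forget)
    scale = dot / norm_sq
    return [f - scale * r for f, r in zip(g_forget, g_retain)]


# Toy example: the retained gradient points along the first axis.
g_forget = [1.0, 2.0]
g_retain = [1.0, 0.0]
projected = project_onto_normal_plane(g_forget, g_retain)

# The projected gradient has no component along g_retain.
overlap = sum(p * r for p, r in zip(projected, g_retain))
```

In this toy case the first coordinate of the forget gradient is cancelled and the second is left unchanged, so a step along `projected` does not move the parameters in the direction the retained gradient cares about, to first order.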
How judges, not politicians, could dictate America's AI rules
If these cases prove successful, they could force OpenAI, Meta, Microsoft, and others to change the way AI is built, trained, and deployed so that it is more fair and equitable. They could also create new ways for artists, authors, and others to be compensated for having their work used as training data for AI models, through a system of licensing and royalties. The generative AI boom has revived American politicians' enthusiasm for passing AI-specific laws. However, we're unlikely to see any such legislation pass in the next year, given the split Congress and intense lobbying from tech companies, says Ben Winters, senior counsel at the Electronic Privacy Information Center. Even the most prominent attempt to create new AI rules, Senator Chuck Schumer's SAFE Innovation framework, does not include any specific policy proposals.
If AI image generators are so smart, why do they struggle to write and count?
AI image produced using the prompt 'hyper-realistic ten hands on a picture with text saying hello'. Generative AI tools such as Midjourney, Stable Diffusion and DALL-E 2 have astounded us with their ability to produce remarkable images in a matter of seconds. Despite their achievements, however, there remains a puzzling disparity between what AI image generators can produce and what we can. For instance, these tools often won't deliver satisfactory results for seemingly simple tasks such as counting objects and producing accurate text. If generative AI has reached such unprecedented heights in creative expression, why does it struggle with tasks even a primary school student could complete? Exploring the underlying reasons helps shed light on the complex numerical nature of AI, and the nuance of its capabilities.
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
Friedrich, Felix, Brack, Manuel, Struppek, Lukas, Hintersdorf, Dominik, Schramowski, Patrick, Luccioni, Sasha, Kersting, Kristian
Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, since they are highly data-driven, relying on billion-sized datasets randomly scraped from the internet, they also suffer from degenerated and biased human behavior, as we demonstrate. In fact, they may even reinforce such biases. To not only uncover but also combat these undesired effects, we present a novel strategy, called Fair Diffusion, to attenuate biases after the deployment of generative text-to-image models. Specifically, we demonstrate shifting a bias, based on human instructions, in any direction yielding arbitrary proportions for, e.g., identity groups. As our empirical evaluation demonstrates, this introduced control enables instructing generative image models on fairness, requiring neither data filtering nor additional training.
AI for the Generation and Testing of Ideas Towards an AI Supported Knowledge Development Environment
New systems employ Machine Learning to sift through large knowledge sources, creating flexible Large Language Models. These models discern context and predict sequential information in various communication forms. Generative AI, leveraging Transformers, generates textual or visual outputs mimicking human responses. It proposes one or multiple contextually feasible solutions for a user to contemplate. However, generative AI does not currently support traceability of ideas, a useful feature provided by search engines indicating origin of information. The narrative style of generative AI has gained positive reception. People learn from stories. Yet, early ChatGPT efforts had difficulty with truth, reference, calculations, and aspects like accurate maps. Current capabilities of referencing locations and linking to apps seem to be better catered by the link-centric search methods we've used for two decades. Deploying truly believable solutions extends beyond simulating contextual relevance as done by generative AI. Combining the creativity of generative AI with the provenance of internet sources in hybrid scenarios could enhance internet usage. Generative AI, viewed as drafts, stimulates thinking, offering alternative ideas for final versions or actions. Scenarios for information requests are considered. We discuss how generative AI can boost idea generation by eliminating human bias. We also describe how search can verify facts, logic, and context. The user evaluates these generated ideas for selection and usage. This paper introduces a system for knowledge workers, Generate And Search Test, enabling individuals to efficiently create solutions previously requiring top collaborations of experts.
Image Captions are Natural Prompts for Text-to-Image Models
Lei, Shiye, Chen, Hao, Zhang, Sen, Zhao, Bo, Tao, Dacheng
With the rapid development of Artificial Intelligence Generated Content (AIGC), it has become common practice in many learning tasks to train or fine-tune large models on synthetic data due to the data-scarcity and privacy leakage problems. Although unlimited data generation is promising, the massive and diverse information conveyed in real images makes it challenging for text-to-image generative models to synthesize informative training data with hand-crafted prompts, which usually leads to inferior generalization performance when training downstream models. In this paper, we theoretically analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts. Then we correspondingly propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data. Specifically, we caption each real image with an advanced captioning model to obtain informative and faithful prompts that extract class-relevant information and clarify the polysemy of class names. The image captions and class names are concatenated to prompt generative models for training image synthesis. Extensive experiments on ImageNette, ImageNet-100, and ImageNet-1K verify that our method significantly improves the performance of models trained on synthetic training data, i.e., 10% classification accuracy improvements on average.
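The prompt construction described in this abstract, concatenating a class name with an image caption, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact template, so the comma separator, the class-name-first ordering, and the function name `build_prompt` are all hypothetical, and the captions here are hand-written stand-ins for the output of a captioning model.

```python
def build_prompt(class_name, caption):
    """Join a class name and a caption into one text-to-image prompt.

    The "class name, caption" template is an assumption made for this
    sketch; the source only says the two are concatenated.
    """
    class_name = class_name.strip()
    caption = caption.strip()
    if not caption:
        # Fall back to the bare class name when no caption is available.
        return class_name
    return f"{class_name}, {caption}"


# Hand-written example captions; a real pipeline would obtain them
# from a captioning model applied to each real training image.
pairs = [
    ("crane", "a large bird standing in shallow water"),
    ("crane", "a yellow machine lifting steel beams at a building site"),
]
prompts = [build_prompt(name, caption) for name, caption in pairs]
```

The two example prompts share the class name "crane" but describe a bird and a machine, which is the kind of polysemy the abstract says captions help to resolve.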
On the application of Large Language Models for language teaching and assessment technology
Caines, Andrew, Benedetto, Luca, Taslimipoor, Shiva, Davis, Christopher, Gao, Yuan, Andersen, Oeistein, Yuan, Zheng, Elliott, Mark, Moore, Russell, Bryant, Christopher, Rei, Marek, Yannakoudakis, Helen, Mullooly, Andrew, Nicholls, Diane, Buttery, Paula
The recent release of very large language models such as PaLM and GPT-4 has made an unprecedented impact in the popular media and public consciousness, giving rise to a mixture of excitement and fear as to their capabilities and potential uses, and shining a light on natural language processing research which had not previously received so much attention. The developments offer great promise for education technology, and in this paper we look specifically at the potential for incorporating large language models in AI-driven language teaching and assessment systems. We consider several research areas - content creation and calibration, assessment and feedback - and also discuss the risks and ethical considerations surrounding generative AI in education technology for language learners. Overall we find that larger language models offer improvements over previous models in text generation, opening up routes toward content generation which had not previously been plausible. For text generation they must be prompted carefully and their outputs may need to be reshaped before they are ready for use. For automated grading and grammatical error correction, tasks whose progress is checked on well-known benchmarks, early investigations indicate that large language models on their own do not improve on state-of-the-art results according to standard evaluation metrics. For grading it appears that linguistic features established in the literature should still be used for best performance, and for error correction it may be that the models can offer alternative feedback styles which are not measured sensitively with existing methods. In all cases, there is work to be done to experiment with the inclusion of large language models in education technology for language learners, in order to properly understand and report on their capacities and limitations, and to ensure that foreseeable risks such as misinformation and harmful bias are mitigated.
AI learned from their work. Now they want compensation.
This past week, comedian Sarah Silverman filed a lawsuit against OpenAI and Facebook parent company Meta, alleging they used a pirated copy of her book in training data because the companies' chatbots can summarize her book accurately. Novelists Mona Awad and Paul Tremblay filed a similar lawsuit against OpenAI. And more than 5,000 authors, including Jodi Picoult, Margaret Atwood and Viet Thanh Nguyen, have signed a petition asking tech companies to get consent from and give credit and compensation to writers whose books were used in training data.