 Generative AI


AI-generated content should be labelled, EU commissioner says

Al Jazeera

Companies deploying AI tools with the ability to generate disinformation, such as ChatGPT and Bard, should label such content as part of their efforts to combat fake news, according to European Commission deputy head Vera Jourova. Unveiled late last year, Microsoft-backed OpenAI's ChatGPT has become the fastest-growing consumer application in history and set off a race among tech companies to bring generative AI products to market. Concerns however are mounting about potential abuse of the technology and the possibility that bad actors and even governments may use it to produce far more disinformation than before. "Signatories who integrate generative AI into their services like Bingchat for Microsoft, Bard for Google should build in necessary safeguards that these services cannot be used by malicious actors to generate disinformation," Jourova told a press conference on Monday. "Signatories who have services with a potential to disseminate AI-generated disinformation should, in turn, put in place technology to recognise such content and clearly label this to users," she said.


The Download: making sense of tech, and Apple's AR ambitions

MIT Technology Review

It's been a busy year. Over the past 12 months, we've witnessed the explosion of generative AI, the collapse of crypto, and a whole lot of promises from lawmakers pledging to slow the march of climate change. While it's easy to feel overwhelmed by all this rapid change, we're here to help. Our MIT Technology Review Explains section is dedicated to untangling the complex, sometimes messy, world of science and technology to help you understand what's happening. Our series of explainers cut through the noise and get to the heart of the issues that really matter, covering everything from biotechnology and cryptocurrency to quantum computing and what's going on in China's tech industry.


Chegg Embraced AI. ChatGPT Ate Its Lunch Anyway

WIRED

Investors were surprised when the online education company Chegg last month revealed that ChatGPT was hurting subscriber growth--the company lost half of its market value overnight. But long before Chegg became an index case for the disruptive force of ChatGPT, its top brass had heard plenty of warnings about the threat and opportunity of generative AI. For years, on afternoon walks outside Chegg's Silicon Valley headquarters, former executives say they had discussed someday slashing costs by tapping AI programs to replace an army of instructors that answer student questions and draft flashcards. Matthew Ramirez, a product leader who left Chegg two years ago, says he even advised CEO Dan Rosensweig in 2020 that generative AI would be the bus that ran down Chegg if it didn't prepare itself. And just weeks after OpenAI launched ChatGPT last November, a source familiar with the exchange says, one Chegg executive had the bot write an email to Rosensweig urging him to develop a ChatGPT rival.


In Defense of Humanity

The Atlantic - Technology

On July 13, 1833, during a visit to the Cabinet of Natural History at the Jardin des Plantes, in Paris, Ralph Waldo Emerson had an epiphany. Peering at the museum's specimens--butterflies, hunks of amber and marble, carved seashells--he felt overwhelmed by the interconnectedness of nature, and humankind's place within it. The experience inspired him to write "The Uses of Natural History," and to articulate a philosophy that put naturalism at the center of intellectual life in a technologically chaotic age--guiding him, along with the collective of writers and radical thinkers known as transcendentalists, to a new spiritual belief system. Through empirical observation of the natural world, Emerson believed, anyone could become "a definer and map-maker of the latitudes and longitudes of our condition"--finding agency, individuality, and wonder in a mechanized age. America was crackling with invention in those years, and everything seemed to be speeding up as a result.


Empowering Business Transformation: The Positive Impact and Ethical Considerations of Generative AI in Software Product Management -- A Systematic Literature Review

arXiv.org Artificial Intelligence

Generative Artificial Intelligence (GAI) has made outstanding strides in recent years, with a sizable impact on software product management. Drawing on pertinent articles from 2016 to 2023, this systematic literature review reveals generative AI's potential applications, benefits, and constraints in this area. The study shows that the technology can assist in idea generation, market research, customer insights, product requirements engineering, and product development. It can help reduce development time and costs through automatic code generation, customer feedback analysis, and more. However, concerns about the technology's accuracy, reliability, and ethical implications persist. Ultimately, generative AI's practical application can significantly improve software product management activities, leading to more efficient use of resources, better product outcomes, and improved end-user experiences.


Stack Over-Flowing with Results: The Case for Domain-Specific Pre-Training Over One-Size-Fits-All Models

arXiv.org Artificial Intelligence

Large pre-trained neural language models have brought immense progress to both NLP and software engineering. Models in OpenAI's GPT series now dwarf Google's BERT and Meta's RoBERTa, which previously set new benchmarks on a wide range of NLP applications. These models are trained on massive corpora of heterogeneous data from web crawls, which enables them to learn general language patterns and semantic relationships. However, the largest models are both expensive to train and deploy and are often closed-source, so we lack access to their data and design decisions. We argue that this trend towards large, general-purpose models should be complemented with single-purpose, more modestly sized pre-trained models. In this work, we take StackOverflow (SO) as a domain example in which large volumes of rich aligned code and text data are available. We adopt standard practices for pre-training large language models, including using a very large context size (2,048 tokens), batch size (0.5M tokens) and training set (27B tokens), coupled with a powerful toolkit (Megatron-LM), to train two models: SOBertBase, with 109M parameters, and SOBertLarge, with 762M parameters, at a budget of just $187 and $800, respectively. We compare the performance of our models with both the previous SOTA model trained on SO data exclusively as well as general-purpose BERT models and OpenAI's ChatGPT on four SO-specific downstream tasks - question quality prediction, closed question prediction, named entity recognition and obsoletion prediction (a new task we introduce). Not only do our models consistently outperform all baselines, the smaller model is often sufficient for strong results. Both models are released to the public. These results demonstrate that pre-training both extensively and properly on in-domain data can yield a powerful and affordable alternative to leveraging closed-source general-purpose models.
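
To give a feel for the scale of the pre-training setup quoted in this abstract, the following is a rough back-of-the-envelope sketch. It assumes that "0.5M" means exactly 500,000 tokens, that "27B" means 27e9 tokens, and that the training set is seen once; none of these details is confirmed by the abstract, and the function names are purely illustrative.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the abstract.
# Assumptions (not stated in the source): 0.5M = 500,000 tokens,
# 27B = 27,000,000,000 tokens, and a single pass over the training set.
CONTEXT_TOKENS = 2_048
BATCH_TOKENS = 500_000
TRAIN_TOKENS = 27_000_000_000


def sequences_per_batch(batch_tokens: int, context_tokens: int) -> int:
    """Number of full-length sequences that fit in one batch."""
    return batch_tokens // context_tokens


def steps_per_pass(train_tokens: int, batch_tokens: int) -> int:
    """Optimizer steps needed to see every training token once (rounded up)."""
    return -(-train_tokens // batch_tokens)  # ceiling division


print(sequences_per_batch(BATCH_TOKENS, CONTEXT_TOKENS))
print(steps_per_pass(TRAIN_TOKENS, BATCH_TOKENS))
```

Under these assumptions a batch holds a few hundred full-length sequences, and one pass over the data takes on the order of tens of thousands of optimizer steps.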


Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction

arXiv.org Artificial Intelligence

Coaching, which involves classroom observation and expert feedback, is a widespread and fundamental part of teacher training. However, the majority of teachers do not have access to consistent, high quality coaching due to limited resources and access to expertise. We explore whether generative AI could become a cost-effective complement to expert feedback by serving as an automated teacher coach. In doing so, we propose three teacher coaching tasks for generative AI: (A) scoring transcript segments based on classroom observation instruments, (B) identifying highlights and missed opportunities for good instructional strategies, and (C) providing actionable suggestions for eliciting more student reasoning. We recruit expert math teachers to evaluate the zero-shot performance of ChatGPT on each of these tasks for elementary math classroom transcripts. Our results reveal that ChatGPT generates responses that are relevant to improving instruction, but they are often not novel or insightful. For example, 82% of the model's suggestions point to places in the transcript where the teacher is already implementing that suggestion. Our work highlights the challenges of producing insightful, novel and truthful feedback for teachers while paving the way for future research to address these obstacles and improve the capacity of generative AI to coach teachers.


Computing Education in the Era of Generative AI

arXiv.org Artificial Intelligence

The computing education community has a rich history of pedagogical innovation designed to support students in introductory courses, and to support teachers in facilitating student learning. Very recent advances in artificial intelligence have resulted in code generation models that can produce source code from natural language problem descriptions -- with impressive accuracy in many cases. The wide availability of these models and their ease of use has raised concerns about potential impacts on many aspects of society, including the future of computing education. In this paper, we discuss the challenges and opportunities such models present to computing educators, with a focus on introductory programming classrooms. We summarize the results of two recent articles, the first evaluating the performance of code generation models on typical introductory-level programming problems, and the second exploring the quality and novelty of learning resources generated by these models. We consider likely impacts of such models upon pedagogical practice in the context of the most recent advances at the time of writing.


STEVE-1: A Generative Model for Text-to-Behavior in Minecraft

arXiv.org Artificial Intelligence

Constructing AI models that respond to text instructions is challenging, especially for sequential decision-making tasks. This work introduces an instruction-tuned Video Pretraining (VPT) model for Minecraft called STEVE-1, demonstrating that the unCLIP approach, utilized in DALL-E 2, is also effective for creating instruction-following sequential decision-making agents. STEVE-1 is trained in two steps: adapting the pretrained VPT model to follow commands in MineCLIP's latent space, then training a prior to predict latent codes from text. This allows us to finetune VPT through self-supervised behavioral cloning and hindsight relabeling, bypassing the need for costly human text annotations. By leveraging pretrained models like VPT and MineCLIP and employing best practices from text-conditioned image generation, STEVE-1 costs just $60 to train and can follow a wide range of short-horizon open-ended text and visual instructions in Minecraft. STEVE-1 sets a new bar for open-ended instruction following in Minecraft with low-level controls (mouse and keyboard) and raw pixel inputs, far outperforming previous baselines. We provide experimental evidence highlighting key factors for downstream performance, including pretraining, classifier-free guidance, and data scaling. All resources, including our model weights, training scripts, and evaluation tools are made available for further research.
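
The abstract names classifier-free guidance as one key factor but does not spell out a formula. As a minimal sketch, the widely used formulation combines conditional and unconditional predictions with a guidance scale; the function name, the use of plain lists, and the exact weighting below are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def guided_logits(cond: Sequence[float],
                  uncond: Sequence[float],
                  scale: float) -> List[float]:
    """Blend conditional and unconditional logits (common CFG form).

    Assumed formula: (1 + scale) * cond - scale * uncond.
    scale = 0 returns the conditional logits unchanged; larger values
    push the prediction further in the direction the condition suggests.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [(1.0 + scale) * c - scale * u for c, u in zip(cond, uncond)]


# Hypothetical logits over two actions, with and without the instruction.
conditional = [1.0, 2.0]
unconditional = [0.5, 1.0]
print(guided_logits(conditional, unconditional, scale=1.0))
```

In an agent setting, the conditional logits would come from the policy given the instruction embedding and the unconditional logits from the same policy with the instruction dropped; that pairing is the general idea of the technique rather than a detail taken from this abstract.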


Retrieval-Augmented Multimodal Language Modeling

arXiv.org Artificial Intelligence

Recent multimodal models such as DALL-E and CM3 have achieved remarkable progress in text-to-image and image-to-text generation. However, these models store all learned knowledge (e.g., the appearance of the Eiffel Tower) in the model parameters, requiring increasingly larger models and training data to capture more knowledge. To integrate knowledge in a more scalable and modular way, we propose a retrieval-augmented multimodal model, which enables a base multimodal model (generator) to refer to relevant text and images fetched by a retriever from external memory (e.g., documents on the web). Specifically, for the retriever, we use a pretrained CLIP, and for the generator, we train a CM3 Transformer on the LAION dataset. Our resulting model, named Retrieval-Augmented CM3 (RA-CM3), is the first multimodal model that can retrieve and generate both text and images. We show that RA-CM3 significantly outperforms baseline multimodal models such as DALL-E and CM3 on both image and caption generation tasks (12 FID and 17 CIDEr improvements on MS-COCO), while requiring much less compute for training (<30% of DALL-E). Moreover, we show that RA-CM3 exhibits novel capabilities, such as faithful image generation and multimodal in-context learning (e.g., image generation from demonstrations).
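
The retrieve-then-generate idea described in this abstract can be sketched in a few lines. This is a toy illustration only: the two-dimensional vectors stand in for CLIP embeddings, the memory entries are invented, and placing retrieved items ahead of the query in the generator input is an assumption about how the generator is conditioned, not something the abstract states.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: Sequence[float],
             memory: Dict[str, Sequence[float]],
             k: int) -> List[str]:
    """Return the names of the k memory items most similar to the query."""
    ranked = sorted(memory,
                    key=lambda name: cosine(query, memory[name]),
                    reverse=True)
    return ranked[:k]


def build_generator_input(query_text: str, retrieved: List[str]) -> List[str]:
    """Assumed conditioning scheme: retrieved items first, then the query."""
    return retrieved + [query_text]


# Hypothetical external memory with toy embeddings.
memory = {
    "eiffel_photo": [1.0, 0.0],
    "cat_photo": [0.0, 1.0],
    "paris_caption": [0.9, 0.1],
}
top = retrieve([1.0, 0.0], memory, k=2)
print(build_generator_input("a photo of the Eiffel Tower", top))
```

The design point this illustrates is that knowledge lives in the external memory rather than in the generator's parameters, so the memory can be extended without retraining the generator.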