Generative AI
On the Tool Manipulation Capability of Open-source Large Language Models
Xu, Qiantong, Hong, Fenglu, Li, Bo, Hu, Changran, Chen, Zhengyu, Zhang, Jian
Recent studies on software tool manipulation with large language models (LLMs) mostly rely on closed model APIs. The industrial adoption of these models is substantially constrained by the security and robustness risks of exposing information to closed LLM API services. In this paper, we ask whether we can enhance open-source LLMs to be competitive with leading closed LLM APIs in tool manipulation, with a practical amount of human supervision. By analyzing common tool manipulation failures, we first demonstrate that open-source LLMs may require training with usage examples, in-context demonstration, and generation style regulation to resolve failures. These insights motivate us to revisit classical methods in the LLM literature and demonstrate that we can adapt them as model alignment with programmatic data generation, system prompts, and in-context demonstration retrievers to enhance open-source LLMs for tool manipulation. To evaluate these techniques, we create ToolBench, a tool manipulation benchmark consisting of diverse software tools for real-world tasks. We demonstrate that our techniques can boost leading open-source LLMs by up to 90% success rate, showing capabilities competitive with OpenAI GPT-4 in 4 out of 8 ToolBench tasks. We show that such enhancement typically requires about one developer day to curate data for each tool, yielding a recipe with a practical amount of human supervision.
Transformer models: an introduction and catalog
Amatriain, Xavier, Sankar, Ananth, Bing, Jie, Bodigutla, Praveen Kumar, Hazen, Timothy J., Kazi, Michaeel
In the past few years we have seen the meteoric appearance of dozens of foundation models of the Transformer family, all of which have memorable and sometimes funny, but not self-explanatory, names. The goal of this paper is to offer a somewhat comprehensive but simple catalog and classification of the most popular Transformer models. The paper also includes an introduction to the most important aspects and innovations in Transformer models. Our catalog will include models that are trained using self-supervised learning (e.g., BERT or GPT-3) as well as those that are further trained using a human-in-the-loop (e.g., the InstructGPT model used by ChatGPT).
Transformative Effects of ChatGPT on Modern Education: Emerging Era of AI Chatbots
Gill, Sukhpal Singh, Xu, Minxian, Patros, Panos, Wu, Huaming, Kaur, Rupinder, Kaur, Kamalpreet, Fuller, Stephanie, Singh, Manmeet, Arora, Priyansh, Parlikad, Ajith Kumar, Stankovski, Vlado, Abraham, Ajith, Ghosh, Soumya K., Lutfiyya, Hanan, Kanhere, Salil S., Bahsoon, Rami, Rana, Omer, Dustdar, Schahram, Sakellariou, Rizos, Uhlig, Steve, Buyya, Rajkumar
ChatGPT, an AI-based chatbot, was released to provide coherent and useful replies based on analysis of large volumes of data. In this article, leading scientists, researchers and engineers discuss the transformative effects of ChatGPT on modern education. This research seeks to improve our knowledge of ChatGPT capabilities and its use in the education sector, identifying potential concerns and challenges. Our preliminary evaluation concludes that ChatGPT performed differently in each subject area including finance, coding and maths. While ChatGPT has the ability to help educators by creating instructional content, offering suggestions and acting as an online educator to learners by answering questions and promoting group work, there are clear drawbacks in its use, such as the possibility of producing inaccurate or false data and circumventing duplicate content (plagiarism) detectors where originality is essential. The often reported hallucinations within Generative AI in general, and also relevant for ChatGPT, can render its use of limited benefit where accuracy is essential. What ChatGPT lacks is a stochastic measure to help provide sincere and sensitive communication with its users. Academic regulations and evaluation practices used in educational institutions need to be updated, should ChatGPT be used as a tool in education. To address the transformative effects of ChatGPT on the learning environment, educating teachers and students alike about its capabilities and limitations will be crucial.
You Don't Have to Be Perfect to Be Amazing: Unveil the Utility of Synthetic Images
Xing, Xiaodan, Felder, Federico, Nan, Yang, Papanastasiou, Giorgos, Walsh, Simon, Yang, Guang
Synthetic images generated from deep generative models have the potential to address data scarcity and data privacy issues. The selection of synthesis models is mostly based on image quality measurements, and most researchers favor synthesis models that produce realistic images, i.e., images with good fidelity scores, such as low Fréchet Inception Distance (FID) and high Peak Signal-To-Noise Ratio (PSNR). However, the quality of synthetic images is not limited to fidelity, and a wide spectrum of metrics should be evaluated to comprehensively measure the quality of synthetic images. In addition, quality metrics are not truthful predictors of the utility of synthetic images, and the relations between these evaluation metrics are not yet clear. In this work, we have established a comprehensive set of evaluators for synthetic images, including fidelity, variety, privacy, and utility. By analyzing more than 100k chest X-ray images and their synthetic copies, we have demonstrated that there is an inevitable trade-off between synthetic image fidelity, variety, and privacy. In addition, we have empirically demonstrated that the utility score does not require images with both high fidelity and high variety. For intra- and cross-task data augmentation, mode-collapsed images and low-fidelity images can still demonstrate high utility. Finally, our experiments have also shown that it is possible to produce images with both high utility and privacy, which can provide a strong rationale for the use of deep generative models in privacy-preserving applications. Our study can shore up comprehensive guidance for the evaluation of synthetic images and elicit further developments for utility-aware deep generative models in medical image synthesis.
Can Generative AI Bots Be Trusted?
In November 2022, OpenAI released ChatGPT, a major step forward in creative artificial intelligence. ChatGPT is OpenAI's interface to a "large language model," a new breed of AI based on a neural network trained on billions of words of text. ChatGPT generates natural language responses to queries (prompts) on those texts. In bringing working versions of this technology to the public, ChatGPT has unleashed a huge wave of experimentation and commentary. It has inspired moods of awe, amazement, fear, and perplexity.
Google and the European Commission will collaborate on AI ground rules
The world's governments have taken note of generative AI's potential for massive disruption and are acting accordingly. European Commission (EC) industry chief Thierry Breton said Wednesday that the Commission would work with Alphabet on a voluntary pact to establish artificial intelligence ground rules, according to Reuters. Breton met with Google CEO Sundar Pichai in Brussels to discuss the arrangement, which will include input from companies based in Europe and other regions. The EU has a history of enacting strict technology rules, and the alliance gives Google a chance to provide input while steering clear of trouble down the road. The compact aims to set up guidelines ahead of official legislation like the EU's proposed AI Act, which will take much longer to develop and enact.
Sam Altman's World Tour Hopes to Reassure AI Doomers
The excitement around the London arrival of OpenAI CEO Sam Altman was palpable from the queue that snaked its way around the University College London building ahead of his speech on Wednesday afternoon. Hundreds of eager-faced students and admirers of OpenAI's chatbot ChatGPT had come here to watch the UK leg of Altman's world tour, where he expects to travel to around 17 cities. This week, he has already visited Paris and Warsaw. Last week he was in Lagos. But the queue was soundtracked by a small group of people who had traveled to loudly express their anxiety that AI is advancing too fast.
OpenAI Could Quit Europe Over New AI Rules, CEO Sam Altman Warns
OpenAI CEO Sam Altman said Wednesday his company could "cease operating" in the European Union if it is unable to comply with the provisions of new artificial intelligence legislation that the bloc is currently preparing. "We're gonna try to comply," Altman said on the sidelines of a panel discussion at University College London, part of an ongoing tour of European countries. He said he had met with E.U. regulators to discuss the AI Act as part of his tour, and added that OpenAI had "a lot" of criticisms of the way the act is currently worded. Altman said that OpenAI's skepticism centered on the E.U. law's designation of "high risk" systems as it is currently drafted. The law is still undergoing revisions, but under its current wording it may require large AI models like OpenAI's ChatGPT and GPT-4 to be designated as "high risk," forcing the companies behind them to comply with additional safety requirements.
When does Windows Copilot launch? Here's everything you need to know.
As you've undoubtedly noticed, AI-related news is everywhere, and its influence continues to grow. Just last week, OpenAI released an iOS version of ChatGPT (an Android version is coming soon) that runs directly on your iPhone and adds the ability to speak your request for information into its interactive chatbot user interface. Now, Microsoft has announced that it's bringing a range of new generative AI-powered features to Windows 11 starting in June. The main component is called Windows Copilot, a set of text-driven assistive capabilities that make using your PC easier and more intuitive. The company also announced the ability to integrate Bing Chat plug-ins into Windows, meaning that many of the impressive capabilities Microsoft brought to its Bing search engine will be available directly in Windows.
ChatGPT maker OpenAI calls for AI regulation, warning of 'existential risk'
Over the next decade, "it's conceivable that … AI systems will exceed expert skill level in most domains, and carry out as much productive activity as one of today's largest corporations," the OpenAI team wrote. "In terms of both potential upsides and downsides, superintelligence will be more powerful than other technologies humanity has had to contend with in the past. We can have a dramatically more prosperous future; but we have to manage risk to get there."