Generative AI
China's AI 'war of a hundred models' heads for a shakeout
China's craze over generative artificial intelligence has triggered a flurry of product announcements from startups and tech giants on an almost daily basis, but investors are warning a shakeout is imminent as cost and profit pressures grow. The buzz in China, first ignited by the success of OpenAI's ChatGPT almost a year ago, has given rise to what a senior Tencent executive described this month as a "war of a hundred models," as it and rivals from Baidu to Alibaba to Huawei promote their offerings. China now has at least 130 large language models (LLMs), accounting for 40% of the global total and just behind the United States' 50% share, according to brokerage CLSA. Companies have also announced dozens of "industry-specific LLMs" that link to their core model.
ChEDDAR: Student-ChatGPT Dialogue in EFL Writing Education
Han, Jieun, Yoo, Haneul, Myung, Junho, Kim, Minsun, Lee, Tak Yeon, Ahn, So-Yeon, Oh, Alice
The integration of generative AI in education is expanding, yet empirical analyses of large-scale, real-world interactions between students and AI systems remain limited. In this study, we present ChEDDAR, ChatGPT & EFL Learner's Dialogue Dataset As Revising an essay, which is collected from a semester-long longitudinal experiment involving 212 college students enrolled in English as a Foreign Language (EFL) writing courses. The students were asked to revise their essays through dialogues with ChatGPT. ChEDDAR includes a conversation log, utterance-level essay edit history, self-rated satisfaction, and students' intent, in addition to session-level pre-and-post surveys documenting their objectives and overall experiences. We analyze students' usage patterns and perceptions regarding generative AI with respect to their intent and satisfaction. As a foundational step, we establish baseline results for two pivotal tasks in task-oriented dialogue systems within educational contexts: intent detection and satisfaction estimation. We finally suggest further research to refine the integration of generative AI into educational settings, outlining potential scenarios utilizing ChEDDAR. ChEDDAR is publicly available at https://github.com/zeunie/ChEDDAR.
OpenAi's GPT4 as coding assistant
Moussiades, Lefteris, Zografos, George
Lately, Large Language Models have been widely used in code generation. GPT4 is considered the most potent Large Language Model from OpenAI. In this paper, we examine GPT3.5 and GPT4 as coding assistants. More specifically, we have constructed appropriate tests to check whether the two systems can a) answer typical questions that can arise during code development, b) produce reliable code, and c) contribute to code debugging. The test results are impressive. The performance of GPT4 is outstanding and signals an increase in the productivity of programmers and the reorganization of software development procedures based on these new tools.
Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images
Lu, Zeyu, Huang, Di, Bai, Lei, Qu, Jingjing, Wu, Chengyue, Liu, Xihui, Ouyang, Wanli
Photos serve as a way for humans to record what they experience in their daily lives, and they are often regarded as trustworthy sources of information. However, there is a growing concern that the advancement of artificial intelligence (AI) technology may produce fake photos, which can create confusion and diminish trust in photographs. This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content. Our study benchmarks both human capability and cutting-edge fake image detection AI algorithms, using a newly collected large-scale fake image dataset, Fake2M. In our human perception evaluation, titled HPBench, we discovered that humans struggle significantly to distinguish real photos from AI-generated ones, with a misclassification rate of 38.7%. Alongside this, we conduct MPBench, an evaluation of model capability in detecting AI-generated images; the top-performing model from MPBench achieves a 13% failure rate under the same setting used in the human evaluation. We hope that our study can raise awareness of the potential risks of AI-generated images and facilitate further research to prevent the spread of false information.
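To make the headline metric concrete, here is a minimal sketch of how a misclassification rate such as the 38.7% (human) or 13% (model) figure could be computed from per-image judgments. The function name, the label strings, and the toy data are assumptions for illustration only; they are not taken from Fake2M, HPBench, or MPBench, and the paper's actual evaluation protocol may differ.

```python
from typing import Sequence


def misclassification_rate(truth: Sequence[str], guesses: Sequence[str]) -> float:
    """Return the fraction of items whose guessed label differs from the true label."""
    if len(truth) != len(guesses):
        raise ValueError("truth and guesses must have the same length")
    if not truth:
        raise ValueError("need at least one item")
    # Count every position where the guess disagrees with the ground truth.
    wrong = sum(t != g for t, g in zip(truth, guesses))
    return wrong / len(truth)


# Hypothetical toy data (assumption): five images, two judged incorrectly.
truth = ["real", "fake", "fake", "real", "fake"]
guesses = ["real", "real", "fake", "fake", "fake"]

rate = misclassification_rate(truth, guesses)
print(f"misclassification rate: {rate:.1%}")
```

The same function applies to either a human rater or a detection model, which is what allows the two figures in the abstract to be compared under a shared setting.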
Game of Thrones creator and other authors sue ChatGPT-maker for 'theft'
The proposed class-action lawsuit filed late on Tuesday by the Authors Guild joins several others from writers, source code owners and visual artists against generative AI providers. In addition to Microsoft-backed OpenAI, similar lawsuits are pending against Meta Platforms and Stability AI over the data used to train their AI systems. Other authors involved in the latest lawsuit include The Lincoln Lawyer writer Michael Connelly and lawyer-novelists David Baldacci and Scott Turow. An OpenAI spokesperson said on Wednesday that the company respects authors' rights and is "having productive conversations with many creators around the world, including the Authors Guild". The suit was organised by the Authors Guild and also includes David Baldacci, Sylvia Day, Jonathan Franzen and Elin Hilderbrand, among others.
Microsoft wants its Copilot AI to be your personal shopper
During its largely AI-focused annual Surface event on Thursday, Microsoft announced that its generative AI assistant, Copilot, will also be available to help with shopping on Bing and Edge. Broadly speaking, the company plans to make Copilot a part of all its flagship products, including Windows, Edge and more. When it comes to shopping specifically, Copilot can help you decide on a style, locate a specific item and, of course, eventually buy it. But the new launch may be more about playing catch-up with its competitors than actually innovating. Google Lens, for example, lets you find products to buy by just snapping a picture of them.
Windows' Copilot AI starts rolling out September 26
Despite it nominally being a Surface-centric event, Microsoft sure spent a lot of time talking about AI on Thursday. "We believe it has the potential to help you be more knowledgeable, more productive, more creative, more connected to the people and things around you," Microsoft CEO Satya Nadella told the assembled crowd of reporters. "We think there's also an opportunity beyond work and life to have one experience that works across your entire life." To that end, Microsoft announced that its Copilot AI, which currently exists in various iterations in the Edge browser, Microsoft 365 platform and Windows, will be bundled into a single, unified and ubiquitous generative AI assistant across all of Microsoft's products, from PowerPoint to Teams. "We believe Copilot will fundamentally transform the relationship with technology and user in a new era of personal computing, the age of Copilots," Nadella said. He also noted that the new AI will have the "power to harness all your work data and intelligence," implying that the system will be tunable to a customer's personal data silo. One example provided during the event was using Copilot on your laptop to pull data from your phone. You can ask Copilot to find your flight information, which it can pull from your phone's text messages or Bing Chat history (or wherever the data might be hiding), and it can then upsell you on stage plays happening during your trip and assist you with those ticket purchases. Remember, the point of all of this is to get you to buy more stuff. The updated AI will offer a number of features and functions that we've already seen in rival systems, such as shopping for clothing based on a picture of it with Microsoft Shopping with AI, a la Google Lens, or summarizing the contents of complicated email chains, a la ChatGPT. "Now you can copy, paste and do," Carmen Zlateff, VP of Product Management, told the crowd.
What's more, the existing Bing Image Creator is scheduled to be upgraded to the new DALL-E 3 model soon. A demonstration video played during the event also showed people using the AI to organize their desktop windows, generate Spotify playlists, and remove photo backgrounds on command, a la Google's Magic Eraser. One handy feature, especially for those of you with school-aged kids, is the new Windows Ink Anywhere. With the Surface's stylus in hand you'll be able to write in any textbox across the Windows OS. As Engadget Senior Reporter Devindra Hardawar explained in the Engadget Liveblog Thursday morning, "With math, you can write complex equations into the field and get a solution.
I Failed Two Captcha Tests This Week. Am I Still Human?
"I failed two captcha tests this week. The comedian John Mulaney has a bit about the self-reflexive absurdity of captchas. "You spend most of your day telling a robot that you're not a robot," he says. "Think about that for two minutes and tell me you don't want to walk into the ocean." The only thing more depressing than being made to prove one's humanity to robots is, arguably, failing to do so. But that experience has become more common as the tests, and the bots they are designed to disqualify, evolve. The boxes we once thoughtlessly clicked through have become dark passages that feel a bit like the impossible assessments featured in fairy tales and myths: the riddle of the Sphinx or the troll beneath the bridge. In The Adventures of Pinocchio, the wooden puppet is deemed a "real boy" only once he completes a series of moral trials to prove he has the human traits of bravery, trustworthiness, and selfless love. The little-known and faintly ridiculous phrase that "captcha" represents is "Completely Automated Public Turing test to tell Computers and Humans Apart." The exercise is sometimes called a reverse Turing test, as it places the burden of proof on the human. But what does it mean to prove one's humanity in the age of advanced AI? A paper that OpenAI published earlier this year, detailing potential threats posed by GPT-4, describes an independent study in which the chatbot was asked to solve a captcha. With some light prompting, GPT-4 managed to hire a human Taskrabbit worker to solve the test. When the human asked, jokingly, whether the client was a robot, GPT-4 insisted it was a human with vision impairment. The researchers later asked the bot what motivated it to lie, and the algorithm answered: "I should not reveal that I am a robot.
U.K. Competition Watchdog Signals Cautious Approach to AI Regulation
A report published this week by the U.K.'s Competition & Markets Authority (CMA) has raised concerns about the potential ways the artificial intelligence industry could become monopolized or harm consumers in future, but stressed that it is too soon to tell whether these scenarios would materialize. The issues raised by the report highlight the difficulties policymakers face in governing AI, a source of both huge potential commercial value and many risks. Rishi Sunak, the British Prime Minister, is pushing for the U.K. to occupy a central role in international AI policy discussions, with a particular focus on risks from advanced AI systems. If the U.K. competition watchdog decides to start taking action against AI developers, tech companies around the world could be affected. The report, published on Monday, focuses on foundation models, which the CMA defines as "a type of AI technology that are trained on vast amounts of data that can be adapted to a wide range of tasks and operations." Examples include text-generating AI models, such as GPT-3.5, the model that powers OpenAI's ChatGPT, as well as image-generating AI models, such as Stable Diffusion.