Generative AI
'The future is here': Sam Altman shows off OpenAI's cutting edge video generator that can turn ANY command into an HD movie
In the Bling Zoo, a tiger wears a giant gold medallion, a monkey sports a bejeweled crown, and a turtle munches on a bowl of diamonds. Unfortunately, this fantastical destination does not exist. 'Bling Zoo' was just one of a series of videos Sora created Thursday when CEO Sam Altman asked his followers on X (formerly Twitter) to submit prompts for the model to turn into videos. The results were so realistic that they led one observer to comment: 'This one convinced me the future is here and it's going to be OK.' One user requested that Sora create 'An instructional cooking session for homemade gnocchi hosted by a grandmother social media influencer set in a rustic Tuscan country kitchen with cinematic lighting.' This prompt led to the most realistic video containing a human that Altman posted on Thursday.
ChatGPT vs. Gemini: Which AI Chatbot Subscription Is Right for You?
The problem with testing AI chatbot subscriptions like Google's Gemini Advanced and OpenAI's ChatGPT Plus is their generality. The same tool is used for disparate applications; the same software service that developers in San Francisco are using to build their latest app might also be used by parents in Kansas to plan a Paw Patrol birthday. Even though companies often tout esoteric benchmarks to prove their chatbot's superiority, it can be hard to discern how a chatbot's technical prowess translates into a better experience for you, the user. Google is the latest company to offer one of its best AI chatbots as a subscription product; in early February, the company began offering access to Gemini Advanced for $20 per month. In doing so, Google was following the precedent set by OpenAI, which sells access to its GPT-4-powered chatbot for $20 a month.
OpenAI shows off life-like videos made with AI
The tech companies say they are monitoring the use of their tools and have instituted some policies against using them to produce political content. But enforcing those rules may be difficult. In January, OpenAI suspended a developer that had made a bot of the Democratic candidate Dean Phillips, only after a report in The Washington Post. The developer had made similar bots of political candidates in the fall.
The AI Industry Is Stuck on One Very Specific Way to Use a Chatbot
A perfect day in Los Angeles starts with a stroll along the Venice Beach boardwalk. After that, Beverly Hills, then Hollywood to see the Walk of Fame, then Griffith Park for a hike, then Chinatown for dim sum, then downtown, perhaps to catch an evening show at the Walt Disney Concert Hall. Or at least, that's what a chatbot thinks a "perfect day" is. This agenda was custom-made for me by Microsoft Copilot after I told it I had one day in town to explore the sights and asked it to plan accordingly. "Here's a jam-packed 24-hour itinerary," Copilot responded, before rattling off an eight-part answer. What I didn't tell Copilot is that I already live here--and know that such an itinerary is perfect only if your idea of bliss is spending most of the day traversing one of the country's most sprawling, traffic-clogged cities, frantically popping from landmark to landmark. I asked Copilot to make me a travel itinerary because Microsoft has trotted it out as an example of how people can use the ChatGPT-like assistant. It can supposedly help you pick a destination, compare flight prices, and settle on attractions that are "popular with tourists--or just a little more off the beaten path." Of all the things you might ask a chatbot, AI companies love to suggest you ask for help planning upcoming travel. Open up ChatGPT and you might see this hypothetical prompt: "Plan a trip to see the best of New York in 3 days." Google's Gemini chatbot offers similar ones. Meta's line of chatbot assistants on Instagram and Facebook includes "Lorena," your own personal travel expert. And Rabbit, the company behind a new AI gadget, pulled out the travel example for a keynote video last month. If one were to play AI-marketing bingo, "trip itinerary" would get crossed off basically every time. More than a year into the generative-AI revolution, companies so frequently suggest that people use their tools in this way that you'd think chatbots would excel at it.
In theory, chatbots that can instantaneously create travel plans are a marketer's dream. The use case is easy to understand: Planning a vacation can be a real challenge for people. First, it involves toggling among flight listings, hotel availability, and ticketing websites for major attractions. Then, it requires more nuanced research, to figure out which local restaurants are actually good and which are overpriced tourist scams, or what time to set off for a big hike that won't leave you in the woods after sunset. Most of this travel information already lives on the internet or in books, meaning that it has likely already been incorporated into a chatbot's training data. "There are probably thousands of places on webpages that describe a trip to Boston," Kathleen Creel, a professor of philosophy and computer science at Northeastern University, told me. "There's people on Reddit talking about living in Boston and what they like."
Sora: OpenAI launches tool that instantly creates video from text
OpenAI revealed a tool on Thursday that can generate videos from text prompts. The new model, nicknamed Sora after the Japanese word for "sky", can produce realistic footage up to a minute long that adheres to a user's instructions on both subject matter and style. According to a company blogpost, the model is also able to create a video based on a still image or extend existing footage with new material. "We're teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction," the blogpost reads. One video included among several initial examples from the company was based on the prompt: "A movie trailer featuring the adventures of the 30-year-old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors."
OpenAI's new Sora model can generate minute-long videos from text prompts
OpenAI on Thursday announced Sora, a brand new model that generates high-definition videos up to one minute in length from text prompts. Sora, which means "sky" in Japanese, won't be available to the general public any time soon. Instead, OpenAI is making it available to a small group of academics and researchers who will assess harm and its potential for misuse. "Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background," the company said on its website. "The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world."
OpenAI teases an amazing new generative video model called Sora
"We think building models that can understand video, and understand all these very complex interactions of our world, is an important step for all future AI systems," says Tim Brooks, a scientist at OpenAI. OpenAI gave us a preview of Sora (which means sky in Japanese) under conditions of strict secrecy. In an unusual move, the firm would only share information about Sora if we agreed to wait until after news of the model was made public to seek the opinions of outside experts. OpenAI has not released a technical report or demonstrated the model actually working. And it says it won't be releasing Sora anytime soon.
OpenAI's Sora Turns AI Prompts Into Photorealistic Videos
We already know that OpenAI's chatbots can pass the bar exam without going to law school. Now, just in time for the Oscars, a new OpenAI app called Sora hopes to master cinema without going to film school. For now a research product, Sora is going out to a few select creators and a number of security experts who will red-team it for safety vulnerabilities. OpenAI plans to make it available to all wannabe auteurs at some unspecified date, but it decided to preview it in advance. Other companies, from giants like Google to startups like Runway, have already revealed text-to-video AI projects.
Investors Share Predictions for Artificial Intelligence in 2024 and Beyond
As investors were wowed by ChatGPT and the rapid progress made by artificial intelligence in recent years, money poured into the industry. Generative AI and AI-related startups raised nearly $50 billion in 2023, according to Crunchbase, a business data provider. Already in 2024, share prices for firms that play a role in manufacturing the advanced chips required for the most powerful AI models have skyrocketed, with Nvidia, AMD, and Arm share prices up 27%, 51%, and 82% respectively.
Rethinking Machine Unlearning for Large Language Models
Liu, Sijia, Yao, Yuanshun, Jia, Jinghan, Casper, Stephen, Baracaldo, Nathalie, Hase, Peter, Xu, Xiaojun, Yao, Yuguang, Li, Hang, Varshney, Kush R., Bansal, Mohit, Koyejo, Sanmi, Liu, Yang
We explore machine unlearning (MU) in the domain of large language models (LLMs), referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities, while maintaining the integrity of essential knowledge generation and not affecting causally unrelated information. We envision LLM unlearning becoming a pivotal element in the life-cycle management of LLMs, potentially standing as an essential foundation for developing generative AI that is not only safe, secure, and trustworthy, but also resource-efficient without the need of full retraining. We navigate the unlearning landscape in LLMs from conceptual formulation, methodologies, metrics, and applications. In particular, we highlight the often-overlooked aspects of existing LLM unlearning research, e.g., unlearning scope, data-model interaction, and multifaceted efficacy assessment. We also draw connections between LLM unlearning and related areas such as model editing, influence functions, model explanation, adversarial training, and reinforcement learning. Furthermore, we outline an effective assessment framework for LLM unlearning and explore its applications in copyright and privacy safeguards and sociotechnical harm reduction.