Generative AI
AI music generators could be a boon for artists -- but also problematic
It was only five years ago that electronic punk band YACHT entered the recording studio with a daunting task: They would train an AI on 14 years of their music, then synthesize the results into the album "Chain Tripping." "I'm not interested in being a reactionary," YACHT member and tech writer Claire L. Evans said in a documentary about the album. "I don't want to return to my roots and play acoustic guitar because I'm so freaked out about the coming robot apocalypse, but I also don't want to jump into the trenches and welcome our new robot overlords either." But our new robot overlords are making a whole lot of progress in the space of AI music generation. Even though the Grammy-nominated "Chain Tripping" was released in 2019, the technology behind it is already becoming outdated.
Council Post: Recent Advancements In Artificial Intelligence
Over the last few years, artificial intelligence (AI) has worked its way into every area of our lives. If you're a programmer, chances are you've started working with GitHub's Copilot, an AI tool that turns natural language prompts into coding suggestions to expedite programming. If you're a writer, you might have come across OpenAI's GPT-3 or similar autoregressive language models that use deep learning to create human-like text. It was just a few years ago that such AI programs were in their infancy. Now they are becoming ubiquitous tools in writing and coding.
Optimization of Annealed Importance Sampling Hyperparameters
Goshtasbpour, Shirin, Perez-Cruz, Fernando
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models. Although AIS is guaranteed to provide an unbiased estimate for any set of hyperparameters, common implementations rely on simple heuristics, such as geometric-average bridging distributions between the initial and target distributions, which affect estimation performance when the computation budget is limited. In order to reduce the number of sampling iterations, we present a parametric AIS process with flexible intermediary distributions defined by a residual density with respect to the geometric mean path. Our method allows parameter sharing between annealing distributions, the use of a fixed linear schedule for discretization, and amortization of hyperparameter selection in latent variable models. We assess the performance of Optimized-Path AIS for marginal likelihood estimation of deep generative models and compare it to more computationally intensive AIS.
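As a rough illustration of the baseline this abstract refers to, here is a minimal sketch of plain AIS with the geometric path and a linear schedule. It is not the authors' Optimized-Path method. The one-dimensional Gaussian target, the random-walk Metropolis transition, and all function names and default settings are assumptions chosen so that the true normalizer is known and the estimate can be checked.

```python
import math
import random


def log_p0(x):
    """Log density of the normalized initial distribution N(0, 1)."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def make_log_target(mu, sigma):
    """Unnormalized Gaussian target (toy assumption).

    Its true normalizing constant is sigma * sqrt(2 * pi).
    """
    def log_f(x):
        z = (x - mu) / sigma
        return -0.5 * z * z
    return log_f


def ais_log_z(log_f, n_chains=1000, n_steps=100, mh_steps=2,
              step_size=0.5, seed=0):
    """Estimate log Z of exp(log_f) with geometric-path AIS."""
    rng = random.Random(seed)
    # Linear schedule: beta_0 = 0 (initial) up to beta_K = 1 (target).
    betas = [k / n_steps for k in range(n_steps + 1)]

    def log_bridge(x, beta):
        # Geometric path: p0(x)^(1 - beta) * f(x)^beta, in log space.
        return (1.0 - beta) * log_p0(x) + beta * log_f(x)

    log_weights = []
    for _ in range(n_chains):
        x = rng.gauss(0.0, 1.0)  # exact sample from the initial distribution
        log_w = 0.0
        for k in range(1, n_steps + 1):
            # Importance-weight update between consecutive bridges.
            log_w += (betas[k] - betas[k - 1]) * (log_f(x) - log_p0(x))
            # Random-walk Metropolis moves that leave bridge k invariant.
            for _move in range(mh_steps):
                proposal = x + step_size * rng.gauss(0.0, 1.0)
                log_accept = (log_bridge(proposal, betas[k])
                              - log_bridge(x, betas[k]))
                if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
                    x = proposal
        log_weights.append(log_w)

    # Log of the average weight, computed stably (log-mean-exp).
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights) / n_chains)


if __name__ == "__main__":
    estimate = ais_log_z(make_log_target(mu=2.0, sigma=0.5))
    truth = math.log(0.5 * math.sqrt(2.0 * math.pi))
    print(f"AIS estimate of log Z: {estimate:.3f} (true value {truth:.3f})")
```

The average of the weights is an unbiased estimate of Z for any schedule, but its variance depends on the number of steps and the choice of bridging distributions, which is the cost the abstract aims to reduce.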
Interview: Why Mastering Language Is So Difficult for AI
The field of artificial intelligence has never lacked for hype. Back in 1965, AI pioneer Herb Simon declared, "Machines will be capable, within 20 years, of doing any work a man can do." That hasn't happened -- but there certainly have been noteworthy advances, especially with the rise of "deep learning" systems, in which programs plow through massive data sets looking for patterns, and then try to make predictions. Perhaps most famously, AIs that use deep learning can now beat the best human Go players (some years after computers bested humans at chess and Jeopardy). Mastering language has proven tougher, but a program called GPT-3, developed by OpenAI, can produce human-like text, including poetry and prose, in response to prompts.
Google Creates AI That Turns Text Into 3D Objects
DreamFusion, Google's next-gen, AI-powered text-to-3D-image generator, is here. A proof-of-concept paper is here, at least. DreamFusion is an evolution of Dream Fields, a text-to-3D-image generator revealed by Google back in 2021. And like Dream Fields, DreamFusion creates its 3D images by combining a Neural Radiance Field (NeRF) -- or a neural network that can create synthetic 3D scenes using partial 2D datasets -- with a pre-trained text-to-image prompt model. Unlike Dream Fields, which utilized OpenAI's CLIP technology as that latter pre-trained model, DreamFusion now uses its own: Imagen, Google's DALL-E 2 competitor.
Google won't let you talk to the latest language AI. This startup will
AI and InWorld AI have all been founded by ex-Google employees. After years of buildup, AI appears to be advancing rapidly with the release of systems like the text-to-image generator DALL-E, which was quickly followed by text-to-video and text-to-3D video tools announced by Meta and Google in recent weeks. Industry insiders say this recent brain drain is partly a response to corporate labs growing increasingly closed off, in response to pressure to responsibly deploy AI. At smaller companies, engineers are freer to push ahead, which could lead to fewer safeguards.
DALL·E 2 Pre-Training Mitigations
In order to share the magic of DALL·E 2 with a broad audience, we needed to reduce the risks associated with powerful image generation models. To this end, we put various guardrails in place to prevent generated images from violating our content policy. This post focuses on pre-training mitigations, a subset of these guardrails which directly modify the data that DALL·E 2 learns from. In particular, DALL·E 2 is trained on hundreds of millions of captioned images from the internet, and we remove and reweight some of these images to change what the model learns. Since training data shapes the capabilities of any learned model, data filtering is a powerful tool for limiting undesirable model capabilities.
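The idea of removing some training examples and reweighting the rest can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used for DALL·E 2: the `risk_score` and `group` fields, the threshold, and the share-restoring weight formula are all hypothetical.

```python
from collections import Counter


def filter_and_reweight(examples, threshold=0.5):
    """Drop high-risk examples, then weight the rest to restore group shares.

    `examples` is a list of dicts with hypothetical keys:
      'caption'    - the image caption
      'group'      - a coarse category label (assumed, for illustration)
      'risk_score' - output of some content classifier in [0, 1] (assumed)
    """
    kept = [ex for ex in examples if ex["risk_score"] < threshold]

    before = Counter(ex["group"] for ex in examples)
    after = Counter(ex["group"] for ex in kept)
    n_before, n_after = len(examples), len(kept)

    weighted = []
    for ex in kept:
        group = ex["group"]
        # Ratio of the group's share before filtering to its share after,
        # so groups thinned by the filter are up-weighted.
        weight = (before[group] / n_before) / (after[group] / n_after)
        weighted.append({**ex, "weight": weight})
    return weighted


if __name__ == "__main__":
    data = [
        {"caption": "a1", "group": "a", "risk_score": 0.9},
        {"caption": "a2", "group": "a", "risk_score": 0.1},
        {"caption": "b1", "group": "b", "risk_score": 0.2},
        {"caption": "b2", "group": "b", "risk_score": 0.3},
    ]
    for row in filter_and_reweight(data):
        print(row["caption"], round(row["weight"], 3))
```

A training loop could then multiply each example's loss by its weight, so that filtering one group more heavily does not silently shrink that group's influence on the model.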
Meta enters the AI arms race with a creepy DALL-E 2 for video
AI image generation has been let loose and it seems there's no going back. With DALL-E 2 now open to all, another player has entered the fray not wanting to lose out – and it's none other than Facebook's parent company Meta. And while DALL-E 2 currently works its magic only with static images, Meta's revealed that it's working on a similar tool for video. As with AI image generators such as DALL-E 2, users will be able to type in a descriptive text prompt, and the tool will generate four output options. The tool, named Make-A-Video (give them a break, they were too busy with the tech to work on names), isn't yet public, but Meta AI has been taking requests on Twitter. The results are as creepy as they are astonishing.
Can Artificial Intelligence Reconstruct Ancient Mosaics?
Moral-Andrés, Fernando, Merino-Gómez, Elena, Reviriego, Pedro, Lombardi, Fabrizio
A large number of ancient mosaics have not reached us because they have been destroyed by erosion, earthquakes, or looting, or have even been used as materials in newer construction. To make things worse, among the small fraction of mosaics that we have been able to recover, many are damaged or incomplete. Therefore, restoration and reconstruction of mosaics play a fundamental role in preserving cultural heritage and in understanding the role of mosaics in ancient cultures. This reconstruction has traditionally been done manually and, more recently, using computer graphics programs, but always by humans. In recent years, Artificial Intelligence (AI) has made impressive progress in the generation of images from text descriptions and reference images. State-of-the-art AI tools such as DALL-E2 can generate high-quality images from text prompts and can take a reference image to guide the process. In August 2022, DALL-E2 launched a new feature called outpainting that takes as input an incomplete image and a text prompt and then generates a complete image, filling in the missing parts. In this paper, we explore whether this innovative technology can be used to reconstruct mosaics with missing parts. Hence, a set of ancient mosaics has been reconstructed using DALL-E2; the results are promising, showing that AI is able to interpret the key features of the mosaics and to produce reconstructions that capture the essence of the scene. However, in some cases AI fails to reproduce some details or geometric forms, or introduces elements that are not consistent with the rest of the mosaic. This suggests that as AI image generation technology matures in the next few years, it could be a valuable tool for mosaic reconstruction going forward.
Deep Objects Is Using Artificial Intelligence to Democratize Good Design
A quick run through popular program DALL-E 2 for terms like 'Virgil Abloh-inspired sneaker' or 'Yeezy sneaker' spits out a 'best guess' that resembles dollar-bin unlicensed bootlegs. It's clunky, sterile, and lacks the narrative of what excites us about these designers. If we want AI to help 'push culture forward', these are not the machines for the job. In rethinking how artificial intelligence can improve design, Deep Objects sought to create a model where human input was key, building an AI engine that democratizes the design of cultural artifacts. Built by the creative studio FTR (whose credits include Nike, PUMA, Google, Marni, Kendrick Lamar, Travis Scott, and Daft Punk), the team has been working on the project in secret for nearly two years. WHITEPAPER ISSUE 01 Your first real peek into [ DEEPOBJECTS ] and why we believe the world of design is in need of a shake up https://t.co/K6naXctz0J