AITopics | Generative AI


Generative AI


Testing OpenAI's Whisper with a Scottish accent

#artificialintelligence Oct-15-2022, 20:20:43 GMT

OpenAI's recent release of Whisper boasts human-level robustness and accuracy in speech recognition. I'm not Scottish (although I was born pretty close), but I immediately wanted to test it with a Scottish accent and compare it to "human-level". Having bought an unexciting new iPhone, at least I could put its A16 Bionic chip with 16-core Neural Engine through its paces for my experiment. Once the boring tech stuff was out of the way, I shared the test app on TestFlight with a few colleagues, yielding much amusement with its borderline magical results. Here's a little clip from the start of Trainspotting, which is particularly challenging for machines to understand; a Scottish accent over the top of Iggy Pop isn't something you'd train for.

openai, scottish accent, testing openai, (4 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)


Artyficial intelligence: what does creative AI mean for marketers? - Raconteur

#artificialintelligence Oct-15-2022, 20:20:24 GMT

It wasn't meant to happen like this. Yes, the robots were always going to come for everyone's jobs, but it was the menial ones that were set to go first. Freed from the need to fill out spreadsheets and perform administrative duties, we were all supposed to have extra time to indulge in more creative, fulfilling pursuits. Yet Microsoft Excel still exists while AI algorithms are producing works of art that are both commercially viable and critically respected. An AI artist, Jason Allen, recently caused outrage among old-school digital artists by winning a digital art competition. One of the writers of US publication The Atlantic, Charlie Warzel, provoked the ire of illustrators around the world by choosing to adorn an article about controversial radio host Alex Jones with an AI-generated caricature as opposed to using a stock photo or commissioning a portrait.

artist, artyficial intelligence, creativity, (12 more...)

#artificialintelligence

Country: Asia > Middle East > Jordan (0.05)

Industry: Media (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)


Ultra-Large AI Models Are Over

#artificialintelligence Oct-15-2022, 01:10:23 GMT

I don't mean 'over' as in "you won't see a new large AI model ever again" but as in "AI companies have reasons to not pursue them as a core research goal--indefinitely." This article isn't a critique of the past years--even if I don't buy the "scale is all you need" argument, I acknowledge just how far scaling has advanced the field. A parallel can be drawn between the 2020-2022 scaling race and--allowing for the differences--the 50s-70s space race. Both advanced science significantly as a byproduct of other intentions. While space exploration was innovative in nature, the quest for novelty isn't present in the "bigger is better" AI trend: To conquer space, the US and USSR had to design novel paths toward a clear goal. In contrast, AI companies have blindly followed a predefined path without knowing why or whether it'd lead us anywhere. You can't put the cart before the horse.

agi, intelligence, llm, (15 more...)

#artificialintelligence

Country:

Europe > Russia (0.24)
Asia > Russia (0.24)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.33)


LAION-5B: An open large-scale dataset for training next generation image-text models

Schuhmann, Christoph, Beaumont, Romain, Vencu, Richard, Gordon, Cade, Wightman, Ross, Cherti, Mehdi, Coombes, Theo, Katta, Aarush, Mullis, Clayton, Wortsman, Mitchell, Schramowski, Patrick, Kundurthy, Srivatsa, Crowson, Katherine, Schmidt, Ludwig, Kaczmarczyk, Robert, Jitsev, Jenia

arXiv.org Artificial Intelligence Oct-15-2022

Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on expensive accurate labels used in standard vision unimodal supervised learning. The resulting models showed capabilities of strong text-guided image generation and transfer to downstream tasks, while performing remarkably at zero-shot classification with noteworthy out-of-distribution robustness. Since then, large-scale language-vision models like ALIGN, BASIC, GLIDE, Flamingo and Imagen made further improvements. Studying the training and capabilities of such models requires datasets containing billions of image-text pairs. Until now, no datasets of this size have been made openly available for the broader research community. To address this problem and democratize research on large-scale multi-modal models, we present LAION-5B - a dataset consisting of 5.85 billion CLIP-filtered image-text pairs, of which 2.32B contain English language. We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale. Additionally we provide several nearest neighbor indices, an improved web-interface for dataset exploration and subset generation, and detection scores for watermark, NSFW, and toxic content detection. Announcement page https://laion.ai/laion-5b-a-new-era-of-open-large-scale-multi-modal-datasets/
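The abstract calls the pairs "CLIP-filtered" without describing the procedure. As a rough illustration only, the sketch below assumes that filtering means keeping an image-text pair when the cosine similarity between its image and text embeddings reaches a threshold. The embeddings are toy vectors rather than real CLIP outputs, and the function names and the threshold value of 0.3 are hypothetical, not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def filter_pairs(pairs, threshold=0.3):
    """Keep (image_embedding, text_embedding, caption) triples whose
    embeddings are at least `threshold` similar.

    The threshold is a hypothetical value chosen for this toy example.
    """
    return [p for p in pairs if cosine_similarity(p[0], p[1]) >= threshold]


# Toy stand-ins for embeddings that a real pipeline would get from an encoder.
toy_pairs = [
    ([1.0, 0.0], [0.9, 0.1], "a photo of a cat"),         # well aligned
    ([1.0, 0.0], [0.0, 1.0], "click here to subscribe"),  # unrelated text
]

kept = filter_pairs(toy_pairs)
print([caption for _, _, caption in kept])
```

In this toy run only the well-aligned pair survives, which is the intuition behind filtering noisy web-scraped captions by embedding similarity.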

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.08402

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > Canada > Ontario > Toronto (0.04)
(6 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.67)
Government (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)


AI Sentience: I asked OpenAI about Google Lamda and Blake Lemoine

#artificialintelligence Oct-14-2022, 17:50:22 GMT

The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly. Human: Hello, who are you? AI: I am an AI created by OpenAI. How can I help you today?

consciousness, google lamda and blake lemoine, openai, (7 more...)

#artificialintelligence

Genre: Personal > Interview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.65)


Microsoft brings DALL-E 2 to the masses with Designer and Image Creator

#artificialintelligence Oct-14-2022, 09:12:27 GMT

Microsoft is making a major investment in DALL-E 2, OpenAI's AI-powered system that generates images from text, by bringing it to first-party apps and services. During its Ignite conference this week, Microsoft announced that it's integrating DALL-E 2 with the newly announced Microsoft Designer app and Image Creator tool in Bing and Microsoft Edge. With the advent of DALL-E 2 and open source alternatives like Stable Diffusion in recent years, AI image generators have exploded in popularity. In September, OpenAI said that more than 1.5 million users were actively creating over 2 million images a day with DALL-E 2, including artists, creative directors and authors. Brands such as Stitch Fix, Nestlé and Heinz have piloted DALL-E 2 for ad campaigns and other commercial use cases, while certain architectural firms have used DALL-E 2 and tools akin to it to conceptualize new buildings.

dall-e 2, image creator, microsoft, (12 more...)

#artificialintelligence

Country: North America > United States (0.30)

Industry: Government > Regional Government > North America Government > United States Government (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)


Photographer Creates AI Girlfriend to Stave Off Nosy Relatives

#artificialintelligence Oct-14-2022, 07:48:39 GMT

Unmesh Dinda from PiXimperfect has displayed the awesome power of artificially intelligent (AI) photo editing by creating a girlfriend that doesn't exist. Dinda's convincing selfie of a loved-up couple on a city break even has extremely realistic lighting and shadows that fit perfectly within the photo. The only catch: Dinda is the only real human in the photo and the woman was created through the power of AI. "If your relatives are more concerned about you getting married than you are, you need to send them a photo like this. This will keep them wondering for a while," Dinda says on his YouTube video. Last month, OpenAI announced that DALL-E will allow users to edit images with human faces after previously banning the practice.

artificial intelligence, machine learning, social media, (8 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Social Media (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.36)


Unconditional Image-Text Pair Generation with Multimodal Cross Quantizer

Lee, Hyungyung, Park, Sungjin, Lee, Joonseok, Choi, Edward

arXiv.org Artificial Intelligence Oct-14-2022

Although deep generative models have gained a lot of attention, most of the existing works are designed for unimodal generation. In this paper, we explore a new method for unconditional image-text pair generation. We design Multimodal Cross-Quantization VAE (MXQ-VAE), a novel vector quantizer for joint image-text representations, with which we discover that a joint image-text representation space is effective for semantically consistent image-text pair generation. To learn a multimodal semantic correlation in a quantized space, we combine VQ-VAE with a Transformer encoder and apply an input masking strategy. Specifically, MXQ-VAE accepts a masked image-text pair as input and learns a quantized joint representation space, so that the input can be converted to a unified code sequence, then we perform unconditional image-text pair generation with the code sequence. Extensive experiments show the correlation between the quantized joint space and the multimodal generation capability on synthetic and real-world datasets. In addition, we demonstrate the superiority of our approach in these two aspects over several baselines. The source code is publicly available at: https://github.com/ttumyche/MXQ-VAE.
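The abstract describes converting the input into a "unified code sequence" through a quantized representation space. The sketch below is not the MXQ-VAE implementation. It only illustrates the generic vector-quantization step used in VQ-VAE-style models, where each feature vector is replaced by the index of its nearest codebook entry. The codebook, the feature values, and the function names are hypothetical toy choices.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry.

    Returns the resulting list of integer codes (the "code sequence").
    """
    codes = []
    for v in vectors:
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(v, codebook[i]),
        )
        codes.append(nearest)
    return codes


# Toy codebook with three entries; a trained model would learn these.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

# Toy stand-ins for joint image-text features produced by an encoder.
joint_features = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]

codes = quantize(joint_features, codebook)
print(codes)
```

A discrete sequence like this is what a generative model can then be trained to sample from, with a decoder mapping sampled codes back to an image and a caption.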

image-text pair, machine learning, natural language, (12 more...)

arXiv.org Artificial Intelligence

2204.07537

Country:

North America > United States > Massachusetts (0.04)
North America > United States > California (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)


📝 📺 Edge#234: Inside Meta AI's Make-A-Video

#artificialintelligence Oct-13-2022, 13:45:49 GMT

On Thursdays, we dive deep into one of the freshest research papers or technology frameworks that is worth your attention. Our goal is to keep you up to date with new developments in AI to complement the concepts we debate in other editions of our newsletter. Text-to-Video (T2V) is considered the next frontier for generative artificial intelligence (AI) models. While the text-to-image (T2I) space is experiencing a revolution with models like DALL-E, Stable Diffusion, and Midjourney, T2V still remains a monumental challenge. Recently, researchers from Meta AI unveiled Make-A-Video, a T2V model able to create realistic short video clips from textual inputs.

edge, make-a-video, meta ai

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.68)


AI-generated imagery is the new clip art as Microsoft adds DALL-E to its Office suite

#artificialintelligence Oct-13-2022, 11:55:16 GMT

Microsoft doesn't say whether its Designer app can generate images of people, for example. The company says OpenAI has filtered "explicit sexual and violent content from the dataset used to train the model" and that it's also "deployed filters to limit generation of images that violate content policy" and "additional query blocking on sensitive topics." But such filters are always permeable, and the tools could still be used to generate troubling imagery -- from NSFW creations to offensive or insensitive content.

ai-generated imagery, new clip art, office suite, (1 more...)

#artificialintelligence

Industry: Information Technology > Services (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.82)
