Multimodal models are fast becoming a reality -- consequences be damned
Roughly a year ago, VentureBeat wrote about progress in the AI and machine learning field toward developing multimodal models, or models that can understand the meaning of text, videos, audio, and images together in context. Back then, the work was in its infancy and faced formidable challenges, not least of which concerned biases amplified in training datasets. But breakthroughs have been made. This year, OpenAI released DALL-E and CLIP, two multimodal models that the research lab claims are "a step toward systems with [a] deeper understanding of the world." DALL-E, inspired by the surrealist artist Salvador Dalí, was trained to generate images from simple text descriptions.
Dec-21-2021, 22:35:05 GMT