Google's new AI image analysis is pretty LiT - and beats OpenAI

Apr-23-2022, 15:40:10 GMT–#artificialintelligence

Google demonstrates impressive AI image analysis: its multimodally trained LiT model outperforms OpenAI's CLIP. Pairs of images and text descriptions, usually scraped en masse from the Internet, have proven to be a powerful resource for training artificial intelligence. Instead of relying on manually curated image databases like ImageNet, where people hand-label numerous images for each category such as dog, cat, or table, newer image analysis models learn from comparatively unstructured masses of images and text. They learn multimodally and in a self-supervised fashion. A particularly prominent example is OpenAI's CLIP, which is used, among other places, in the new DALL-E 2. These self-supervised models have one major advantage: they learn much more robust representations of visual categories, since they do not have to rely on categories manually defined by humans.
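To make the idea of learning from image-text pairs more concrete, here is a minimal sketch of a contrastive objective of the kind CLIP-style models are trained with: matching image and text embeddings are pulled together, mismatched ones pushed apart. It is not Google's or OpenAI's actual code. The function names, the toy embeddings, and the temperature value of 0.07 are assumptions chosen for illustration, and real models produce the embeddings with large neural encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are left as they are)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def similarity_matrix(image_embs, text_embs):
    """Cosine similarity between every image and every text embedding."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image loss over one batch.

    Assumes image_embs[i] and text_embs[i] come from the same pair.
    The temperature value is an illustrative assumption.
    """
    sims = similarity_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(transposed))

def zero_shot_classify(image_emb, label_embs):
    """Return the index of the label text most similar to the image."""
    sims = similarity_matrix([image_emb], label_embs)[0]
    return max(range(len(sims)), key=sims.__getitem__)
```

Because categories are just text embeddings here, a new category such as "table" can be added by embedding its description, without collecting a hand-labeled image set for it first.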




