Google's new AI image analysis is pretty LiT - and beats OpenAI

#artificialintelligence 

Google demonstrates impressive artificial intelligence image analysis: the multimodally trained LiT model outperforms OpenAI's CLIP. The combination of images and text descriptions, usually pulled en masse from the Internet, has proven to be a powerful resource for training artificial intelligence. Instead of relying on manually curated image databases like ImageNet, where people assign numerous images to categories such as dog, cat, or table, newer image analysis models rely on comparatively unstructured masses of images and text. They learn multimodally and in a self-supervised manner. A particularly prominent example is OpenAI's CLIP, which is used, among other places, in the new DALL-E 2. These self-supervised AI models have one major advantage: they learn much more robust representations of visual categories, since they do not have to rely on categories manually defined by humans.
