mindtech
Surveillance AI needs fake data to track people. These companies are supplying it.
Companies are building software that uses AI to monitor people's behavior and interpret their emotions and body language in real life, virtually and even in the metaverse. But to develop that AI, they need fake data, and startups are stepping in to supply it.
Synthetic data companies are providing millions of images, videos and sometimes audio samples generated for the sole purpose of training or improving AI models. Those models could become part of our everyday lives in controversial forms of AI such as facial recognition, emotion AI and other algorithmic systems used to keep track of people's behavior.
While in the past companies building computer vision-based AI often relied on publicly available datasets, AI developers are now looking to customized synthetic data to "address more and more domain-specific problems that have zero data you can actually access," said Ofir Zuk, co-founder and CEO of synthetic data company Datagen.
Synthetic data companies including Datagen, Mindtech and Synthesis AI represent a corner of an increasingly compartmentalized AI industry.
Synthetic Images for AI Training
The upshot: Mindtech provides a capability for creating fully annotated synthetic training images to complement real images for improved AI training.
We've spent a lot of time looking at AI training and AI inference and the architectures and processes used for each. Where the AI task involves images, we've blithely referred to the need for training sets; that's easy, right? After all, if you're trying to train your algorithm to recognize a dog, then just give it a bunch of pictures of dogs (OK, tagged with "This one contains a dog") and then a bunch of pictures without dogs ("This one contains no dog"), and off you go! Right?
And behemoths like Google and Facebook have oodles of images and videos (videos being collections of frames, each of which is an image), thanks to the free stuff willingly served up by unsuspecting users (including images from now and from 10 years ago that help improve aging algorithms).
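To make the "tag them and off you go" idea concrete, here is a minimal Python sketch of what such a tagged training set might look like as a data structure. It is an illustration only: the `LabeledImage` type, the helper functions and the file paths are all hypothetical, and nothing here reflects how Mindtech or any other vendor actually stores its annotations.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LabeledImage:
    """One training example: an image path plus its tag (hypothetical type)."""
    path: Path
    contains_dog: bool


def build_training_set(dog_paths, other_paths):
    """Tag each image as a positive ("contains a dog") or negative example."""
    positives = [LabeledImage(Path(p), True) for p in dog_paths]
    negatives = [LabeledImage(Path(p), False) for p in other_paths]
    return positives + negatives


def class_balance(examples):
    """Return the fraction of examples tagged as containing a dog."""
    if not examples:
        return 0.0
    return sum(e.contains_dog for e in examples) / len(examples)


# Placeholder file names; a real set would hold thousands of images or more.
training_set = build_training_set(
    ["dogs/001.jpg", "dogs/002.jpg"],
    ["other/001.jpg", "other/002.jpg", "other/003.jpg"],
)
```

Even in this toy form, every image needs a tag before training can start, and that tagging has to be done by someone or something.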