I don't know about you, but my house is often cluttered in the mornings. Digging my keys out of that clutter sometimes eats up a lot of time and becomes quite an agonizing endeavor. Perhaps if I could scan the room with some sort of computer algorithm, I would not have to waste minutes looking for my keys on those wretched mornings, right? And that's where object detection comes in. Now, while real-life object detection is still being fine-tuned, the process is entirely possible on digital media thanks to the remarkable power of object detection algorithms.
We are pleased to introduce the ability to export high-resolution keyframes from Azure Media Services' Video Indexer. Whereas keyframes were previously exported at a reduced resolution compared to the source video, high-resolution keyframe extraction gives you original-quality images and lets you apply the image-based artificial intelligence models provided by the Microsoft Computer Vision and Custom Vision services to gain even more insights from your video. This unlocks a wealth of pre-trained and custom model capabilities. You can use the keyframes extracted from Video Indexer, for example, to identify logos for monetization and brand-safety needs, to add scene descriptions for accessibility, or to accurately identify very specific objects relevant to your organization, such as a particular type of car or place. Let's look at some of the use cases this new capability enables.
UOAIS: a novel #AI to segment unseen objects via the HOM scheme

Segmenting objects is an essential skill for robotic manipulation in unstructured environments. Although previous works achieved encouraging results, they were limited to segmenting only the visible regions of unseen objects.

Highlights:
- UOAIS detects occlusions on unseen object instances
- UOAIS predicts both visible and amodal masks
- Introduces Hierarchical Occlusion Modeling (HOM)
- SOTA on three benchmarks (tabletop, indoor, and bin environments)

Link to paper, code and project in the first comment. Want to know more about #AI? Follow ARGO Vision or ping me.

#artificialintelligence #machinelearning #deeplearningai #deepneuralnetworks #neuralnetworks #ml #deeplearning #computervision #nvidiavgpu #nvidia
At Google I/O, the global tech giant announced a batch of free courses to help budding developers explore the potential of machine learning and artificial intelligence technology across various open-source frameworks and platforms, such as TensorFlow.js, TensorFlow Lite, and Vertex AI. We have compiled a list of all the machine learning and artificial intelligence courses announced at Google I/O. One of them is an excellent course for beginners, especially if you want to tackle comment spam: it introduces you to TensorFlow.js and machine learning, and helps you build a comment-spam detection system using TensorFlow.js. Here, you will learn the concepts behind machine learning and how to identify spam using text classification.
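The course builds its spam detector with TensorFlow.js, but the underlying idea, text classification over bag-of-words features, is framework-agnostic. As a minimal sketch of that idea (not the course's actual code), here is a tiny multinomial Naive Bayes classifier in plain Python; the toy dataset and all names are invented for illustration.

```python
import math
from collections import Counter

# Toy comment-spam classifier: multinomial Naive Bayes over bag-of-words.
# The tiny labeled dataset below is invented purely for illustration.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("limited offer win cash", "spam"),
    ("great article thanks for sharing", "ham"),
    ("interesting point about machine learning", "ham"),
    ("thanks this tutorial helped a lot", "ham"),
]

def train(examples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_counts}
    vocab = set()
    for text, label in examples:
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, class_counts, word_counts, vocab):
    """Pick the class maximizing log prior + log likelihood."""
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        # Add-one (Laplace) smoothing so unseen words don't zero out a class.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify("win a free prize", *model))        # spam on this toy data
print(classify("thanks for the article", *model))  # ham on this toy data
```

A real system (like the one built in the course) would use learned embeddings and a neural classifier rather than raw counts, but the training-then-scoring shape is the same.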
Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider. How should a system go about choosing a subset of the possible training examples that still allows for learning accurate, generalizable models? To help address this question, we draw inspiration from a highly efficient practical learning system: the human child. Using head-mounted cameras, eye-gaze trackers, and a model of foveated vision, we collected first-person (egocentric) images that represent a highly accurate approximation of the "training data" that toddlers' visual systems collect in everyday, naturalistic learning contexts. We used state-of-the-art computer vision learning models (convolutional neural networks) to help characterize the structure of these data, and found that child data produce significantly better object models than egocentric data experienced by adults in exactly the same environment. By using the CNNs as a modeling tool to investigate the properties of the child data that may enable this rapid learning, we found that child data exhibit a unique combination of quality and diversity, with not only many similar large, high-quality object views but also a greater number and diversity of rare views. This novel methodology of analyzing the visual "training data" used by children may not only reveal insights for improving machine learning but also suggest new experimental tools to better understand infant learning in developmental psychology.