 enlarge


Photoshop took my favorite feature and massively boosted it with AI

PCWorld

Adobe Photoshop has taken a simple but powerful tool, Generative Fill, and beefed up its AI capabilities in a big way alongside a new Firefly AI art tool.


GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification

arXiv.org Artificial Intelligence

Class imbalance is the phenomenon that some classes have far fewer instances than others, which is ubiquitous in real-world graph-structured scenarios. Recent studies find that off-the-shelf Graph Neural Networks (GNNs) under-represent minor class samples. We investigate this phenomenon and discover that the main cause of this failure is that the subspaces of minor classes are squeezed by those of the major ones in the latent space. We are naturally inspired to enlarge the decision boundaries of minor classes and propose a general framework, GraphSHA, by Synthesizing HArder minor samples. Furthermore, to keep the enlarged minor boundary from violating the subspaces of neighbor classes, we also propose a module called SemiMixup, which transmits enlarged boundary information to the interior of the minor classes while blocking information propagation from minor classes to neighbor classes. Empirically, GraphSHA shows its effectiveness in enlarging the decision boundaries of minor classes, as it outperforms various baseline methods in class-imbalanced node classification with different GNN backbone encoders over seven public benchmark datasets. Code is available at https://github.com/wenzhilics/GraphSHA.
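The idea of synthesizing a harder minor-class sample can be illustrated with a generic mixup-style interpolation of feature vectors. This is only a minimal sketch and not the GraphSHA implementation: the function name `synthesize_minor_sample`, the mixing weight `lam`, and the choice of plain linear interpolation are assumptions made for illustration, and the SemiMixup edge handling described in the abstract is not modeled.

```python
def synthesize_minor_sample(anchor, auxiliary, lam):
    """Interpolate between a minor-class anchor and an auxiliary sample.

    Hypothetical helper, not taken from the GraphSHA code base.
    lam = 1.0 reproduces the anchor; smaller values move the synthetic
    sample toward the auxiliary (for example, neighbor-class) sample,
    which makes it "harder" in the sense of lying nearer the boundary.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(anchor) != len(auxiliary):
        raise ValueError("feature vectors must have the same length")
    # Element-wise convex combination of the two feature vectors.
    return [lam * a + (1.0 - lam) * b for a, b in zip(anchor, auxiliary)]


if __name__ == "__main__":
    minor_anchor = [1.0, 0.0]
    neighbor_class_sample = [0.0, 1.0]
    print(synthesize_minor_sample(minor_anchor, neighbor_class_sample, 0.75))
```

The synthetic sample would keep the minor-class label, so a classifier trained on it is pushed to assign more of the latent space to the minor class.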


Using Voice Transformations to Create Additional Training Talkers for Word Spotting

Neural Information Processing Systems

Speech recognizers provide good performance for most users, but the error rate often increases dramatically for a small percentage of talkers who are "different" from the talkers used for training. One expensive solution to this problem is to gather more training data in an attempt to sample these outlier users. A second solution, explored in this paper, is to artificially enlarge the number of training talkers by transforming the speech of existing training talkers. This approach is similar to enlarging the training set for OCR digit recognition by warping the training digit images, but is more difficult because continuous speech has a much larger number of dimensions (e.g. [...]). We explored the use of simple linear spectral warping to enlarge a 48-talker training database used for word spotting.
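Linear spectral warping can be sketched as a rescaling of the frequency axis of a magnitude spectrum. The excerpt does not give the paper's exact procedure, so everything below is an assumption for illustration: the function name `warp_spectrum`, the warp factor `alpha`, the use of linear interpolation between bins, and the choice to zero-fill bins that map beyond the original range.

```python
def warp_spectrum(spectrum, alpha):
    """Linearly warp the frequency axis of a magnitude spectrum.

    Hypothetical helper. Output bin k takes the value the input had at
    position k / alpha, so alpha > 1 moves spectral features to higher
    bins and alpha < 1 moves them to lower bins.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(spectrum)
    warped = []
    for k in range(n):
        src = k / alpha           # fractional source position
        lo = int(src)             # lower neighboring bin
        if lo >= n - 1:
            # At the last bin use it directly; beyond it, zero-fill.
            warped.append(spectrum[-1] if src <= n - 1 else 0.0)
            continue
        frac = src - lo
        # Linear interpolation between the two neighboring bins.
        warped.append((1.0 - frac) * spectrum[lo] + frac * spectrum[lo + 1])
    return warped


if __name__ == "__main__":
    frame = [0.0, 1.0, 2.0, 3.0]
    print(warp_spectrum(frame, 2.0))
```

Applying several values of `alpha` to every frame of an existing talker's speech would yield additional, artificially varied training talkers, which is the enlargement strategy the abstract describes.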


How to increase a small photo/image to a perfect print?

#artificialintelligence

Have you been thinking about ways to make small-sized images and photos look good when printed? In this article you will learn how to use Artificial Intelligence to turn your small photos and images…


Bringing AI to Visual Inspection

#artificialintelligence

What started as a simple home repair project ended with multiple trips to the hardware store, cursing in the aisles, and a vow to never buy from a specific manufacturer ever again. The cause was a single defective bolt, which had evaded quality inspection and been packaged, shipped, and unfortunately purchased by me. The product packaging, installation instructions, and final functionality were all exemplary. But a single defective bolt, which costs only pennies in the product's bill of materials, was enough to sour me on the whole experience. Manufacturers and brand owners are under tremendous pressure to ensure premium end-to-end product quality, especially as consumers increasingly demand perfection.


I Think I Need AI! What is AI?

#artificialintelligence

Manufacturers of all sizes struggle with the cost of poor product quality, whether that translates into slower production, decreased profits, or unnecessary waste. Even worse, poor quality can do irreversible damage to brand reputation. In the food and beverage market, 20 percent of consumers say they will not purchase from a brand following a product recall. While artificial intelligence (AI) is gaining favor as a solution to quality problems, it brings a number of new, sometimes confusing, terms. As a first step, many manufacturers ask "What is AI?" Machine vision is a mainstay on today's manufacturing floor, thanks to programmers' ability to continuously train inspection systems to make automated decisions.


Repair Old Photos with AI Photo Restoration

#artificialintelligence

Most of our old photos were taken with cameras that are no longer in use. These devices were the best of their time, but they serve little purpose today. In addition, most of them shot only in black and white. We all want to revisit the past and bring these images back to life by whatever means are available. Several tools can restore them.


Synthetic Data: Changing Race In Facial Images To Address Bias In Medical Datasets

#artificialintelligence

UCLA researchers have developed a method to change the apparent race of faces in datasets that are used to train medical machine learning systems, in an attempt to redress the racial bias that many common datasets suffer from. The new technique is capable of producing photorealistic and physiologically accurate synthetic video at an average rate of 0.005 seconds per frame, and is hoped to aid the development of new diagnostic systems for remote healthcare diagnosis and monitoring – a field that has expanded greatly under COVID restrictions. The system is intended to improve the applicability of remote photoplethysmography (rPPG), a computer vision technique that evaluates facial video content to detect volumetric changes in blood supply in a non-invasive manner. Though the work, which utilizes convolutional neural networks (CNNs), incorporates previous research code published by the UK's Durham University in 2020, the new application is intended to preserve pulsatile signals in the original test data, rather than just visually changing the apparent race of the data, as the 2020 research does. The first part of the encoder-decoder system uses the Durham race transfer model, pre-trained on VGGFace2, to generate proxy target frames with the prior Caucasian-to-African component of the Durham research.


Synthetic Data: Bridging The Occlusion Gap With Grand Theft Auto

#artificialintelligence

Researchers at the University of Illinois have created a new computer vision dataset that uses synthetic imagery generated by a Grand Theft Auto game engine to help solve one of the thorniest obstacles in semantic segmentation – recognizing objects that are only partly visible in source images and videos. To this end, as described in the paper, the researchers have used the GTA-V video game engine to generate a synthetic dataset that not only features a record-breaking number of occlusion instances, but also provides perfect semantic segmentation and labelling, and accounts for temporal information in a way that is not addressed by similar open source datasets. The video below, published as supporting material for the research, illustrates the advantages of a complete 3D understanding of a scene, in that obscured objects are known and exposed in the scene in all circumstances, enabling the evaluating system to learn to associate partially occluded views with the entire (labeled) object. The resulting dataset, called SAIL-VOS 3D, is claimed by the authors to be the first synthetic video mesh dataset with frame-by-frame annotation, instance-level segmentation, ground truth depth for scene views and 2D annotations delineated by bounding boxes. The annotations of SAIL-VOS 3D include depth, instance-level modal and amodal segmentation, semantic labels and 3D meshes.
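The difference between modal and amodal segmentation can be made concrete with a small calculation. A modal mask covers only the visible pixels of an object, while an amodal mask covers its full extent, including the occluded part. The sketch below is not taken from the SAIL-VOS 3D tooling: the function name `occlusion_rate` and the representation of masks as nested lists of 0 and 1 are assumptions for illustration.

```python
def occlusion_rate(modal_mask, amodal_mask):
    """Return the fraction of an object's pixels that are hidden.

    Hypothetical helper. Both masks are equally sized 2D lists of 0/1,
    where the modal mask marks visible pixels and the amodal mask marks
    the full object, visible or not.
    """
    visible = sum(sum(row) for row in modal_mask)
    full = sum(sum(row) for row in amodal_mask)
    if full == 0:
        raise ValueError("amodal mask contains no object pixels")
    # Hidden share = 1 - (visible pixels / total object pixels).
    return 1.0 - visible / full


if __name__ == "__main__":
    amodal = [[1, 1],
              [1, 1]]   # the whole object covers four pixels
    modal = [[1, 0],
             [0, 0]]    # only one of them is visible
    print(occlusion_rate(modal, amodal))
```

In a synthetic scene, both masks come directly from the renderer, which is why a game engine can supply amodal labels that would be impractical to annotate by hand on real footage.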


ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 2 - DAVE Developer's Wiki

#artificialintelligence

This Technical Note (TN for short) belongs to the series introduced here. Specifically, it illustrates the execution of this inference application (fruit classifier) on the Mito8M SoM, a system-on-module based on the NXP i.MX8M SoC. The kernel and the root file system of the tested platform were built with the L4.14.98_2.0.0 release of the Yocto Board Support Package for the i.MX 8 family of devices. They were built with support for eIQ: "a collection of software and development tools for NXP microprocessors and microcontrollers to do inference of neural network models on embedded systems". To run the model on the target, a new C application was written.