Survey of visual anomaly detection in industrial manufacturing using deep learning
The recent rapid development of deep learning has laid a milestone in visual anomaly detection (VAD). In this paper ... provide a comprehensive review of deep learning-based visual anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics and datasets.
Survey of Neural Radiance Field in 3D Vision
Neural Radiance Field (NeRF), a new novel view synthesis with implicit scene representation, has taken the field of Computer Vision by storm. As a novel view synthesis and 3D reconstruction method, NeRF models find applications in robotics, urban mapping, autonomous navigation, virtual reality/augmented reality, and more. Since the original paper by Mildenhall et al., more than 250 preprints were published, with more than 100 eventually being accepted in tier-one Computer Vision conferences. Given NeRF's popularity and the current interest in this research area, ... believe it necessary to compile a comprehensive survey of NeRF papers from the past two years ... organized into both architecture-based and application-based taxonomies.
Classify small images accurately using little memory and CPU with ImageSig
ImageSig: A signature transform for ultra-lightweight image recognition
arXiv paper abstract: https://arxiv.org/abs/2205.06929v1
arXiv PDF paper: https://arxiv.org/pdf/2205.06929v1.pdf
This paper introduces a new lightweight method for image recognition. ImageSig is based on computing signatures and does not require a convolutional structure or an attention-based encoder. ... achieves: a) an accuracy for 64 X 64 RGB images
Find location in video matching a sentence with TAN
Temporal Alignment Networks for Long-term Video
arXiv paper abstract: https://arxiv.org/abs/2204.02968
arXiv PDF paper: https://arxiv.org/pdf/2204.02968.pdf
The objective ... is a temporal alignment network that ingests long term video sequences, and associated text sentences, in order to: (1) determine if a sentence is alignable with the video; and (2) if it is alignable, then determine its alignment. The challenge is to train such networks from
Get 3D models of multiple objects in RGB video with RayTran
RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers
arXiv paper abstract: https://arxiv.org/abs/2203.13296
arXiv PDF paper: https://arxiv.org/pdf/2203.13296.pdf
... propose a transformer-based neural network architecture for multi-object 3D reconstruction from RGB videos. ... represent its knowledge: as a global 3D grid of features and an array of view-specific 2D grids. ... ex
Human pose estimation with 80% smaller model and 68% less CPU using STNet
Towards Simple and Accurate Human Pose Estimation with Stair Network
arXiv paper abstract: https://arxiv.org/abs/2202.09115v1
arXiv PDF paper: https://arxiv.org/pdf/2202.09115v1.pdf
In ... keypoint coordinates regression task. ... existing approaches adopt complicated networks with a large number of parameters, leading to a heavy model with poor cost-effectiveness in practice. ... To overcome ... develop a small yet discrimi
Real-time 3D object detection on low CPU headsets
Realtime 3D Object Detection for Headsets
arXiv paper abstract: https://arxiv.org/abs/2201.08812v1
arXiv PDF paper: https://arxiv.org/pdf/2201.08812v1.pdf
Mobile headsets should be capable of understanding 3D physical environments to offer a truly immersive experience for augmented/mixed reality (AR/MR). However, their small form-factor and limited computation resources make it extremely challenging to execute in real-time 3D vision algorithms ...