AITopics | projection unit

Geometric Perception based Efficient Text Recognition

Every Scene Text Recognition (STR) task consists of two prominent sub-tasks: text localization and text recognition. However, in real-world applications with fixed camera positions, such as equipment monitor reading, image-based data entry, and printed document data extraction, the underlying data tends to be regular scene text. In these tasks, generic, bulky models have significant disadvantages compared to customized, efficient models in terms of deployability, data privacy, and reliability. This paper therefore introduces the underlying concepts, theory, implementation, and experimental results needed to develop models that are highly specialized for the task, achieving not only SOTA performance but also minimal model weights, shorter inference time, and high reliability. We introduce a novel deep learning architecture (GeoTRNet), trained to identify digits in a regular scene image using only the geometrical features present, mimicking human perception of text recognition. The code is publicly available at https://github.com/ACRA-FL/GeoTRNet

artificial intelligence, machine learning, pattern recognition

arXiv.org Artificial Intelligence

2302.03873

Country: Asia > Sri Lanka > Western Province > Colombo > Colombo (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.86)
Media (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Review: DBPN & D-DBPN -- Deep Back-Projection Networks For Super-Resolution (Super Resolution)

#artificialintelligence May-6-2020, 20:25:16 GMT

Multiple networks are constructed from the original DBPN: S (T = 2), M (T = 4), and L (T = 6). For feature extraction, we use conv(3, 128) followed by conv(1, 32). Then, we use conv(1, 1) for reconstruction. The input and output images are luminance only. The S network gives a higher PSNR than VDSR, DRCN, and LapSRN.
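The PSNR comparison above can be made concrete with a small sketch. It assumes the standard definition, PSNR = 10 · log10(peak² / MSE), applied to luminance-only images stored as nested lists, with a peak value of 255 for 8-bit data. The function name `psnr` and the toy images are illustrative and do not come from the review.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two luminance images.

    Both images are lists of rows of pixel values with identical shapes.
    `peak` is the maximum possible pixel value (255 assumed for 8-bit data).
    """
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")

    n_pixels = sum(len(row) for row in reference)
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for r, e in zip(reference, estimate)
        for a, b in zip(r, e)
    ) / n_pixels

    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: a 2x2 ground-truth patch and two hypothetical reconstructions.
truth = [[100, 100], [100, 100]]
coarse = [[110, 90], [100, 100]]   # larger error
fine = [[101, 99], [100, 100]]     # smaller error
print(psnr(truth, coarse), psnr(truth, fine))
```

A reconstruction closer to the ground truth has a smaller mean squared error and therefore a higher PSNR, which is the sense in which one network "gives a higher PSNR" than another.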

convolutional layer, enlargement, lapsrn

#artificialintelligence

Technology: Information Technology > Artificial Intelligence (0.57)


RenderNet: A deep convolutional network for differentiable rendering from 3D shapes

Nguyen-Phuoc, Thu H., Li, Chuan, Balaban, Stephen, Yang, Yongliang

Neural Information Processing Systems Dec-31-2018

Traditional computer graphics rendering pipelines are designed for procedurally generating 2D images from 3D shapes with high performance. The nondifferentiability due to discrete operations (such as visibility computation) makes it hard to explicitly correlate rendering parameters and the resulting image, posing a significant challenge for inverse rendering tasks. Recent work on differentiable rendering achieves differentiability either by designing surrogate gradients for non-differentiable operations or via an approximate but differentiable renderer. These methods, however, are still limited when it comes to handling occlusion, and restricted to particular rendering effects. We present RenderNet, a differentiable rendering convolutional network with a novel projection unit that can render 2D images from 3D shapes. Spatial occlusion and shading calculation are automatically encoded in the network. Our experiments show that RenderNet can successfully learn to implement different shaders, and can be used in inverse rendering tasks to estimate shape, pose, lighting and texture from a single image.
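The abstract does not spell out how the projection unit works internally. The following is only a minimal sketch, under the assumption that such a unit folds the depth axis of a 3D feature grid into the channel axis and then applies the same learned linear map at every pixel (a 1x1 convolution). The function name `projection_unit`, the nested-list layout, and the weights are all hypothetical, and nonlinearities are omitted.

```python
def projection_unit(voxels, weights, bias):
    """Collapse a 3D feature grid into a 2D feature map.

    voxels:  nested list indexed [row][col][depth][channel]
    weights: nested list indexed [out_channel][depth * channel]
    bias:    list indexed [out_channel]
    Returns a nested list indexed [row][col][out_channel].
    """
    feature_map = []
    for row in voxels:
        out_row = []
        for cell in row:
            # Fold depth into channels: a D x C block becomes one flat vector.
            flat = [value for depth_slice in cell for value in depth_slice]
            if any(len(w_row) != len(flat) for w_row in weights):
                raise ValueError("weight rows must have length depth * channel")
            # 1x1 convolution: the same linear map is applied at each pixel.
            out_row.append([
                sum(w * v for w, v in zip(w_row, flat)) + b
                for w_row, b in zip(weights, bias)
            ])
        feature_map.append(out_row)
    return feature_map


# A single pixel with depth 2 and 2 channels, projected to 2 output channels.
grid = [[[[1.0, 2.0], [3.0, 4.0]]]]
w = [[1.0, 0.0, 0.0, 0.0],   # picks out the first voxel's first channel
     [1.0, 1.0, 1.0, 1.0]]   # sums everything along the viewing ray
b = [0.0, 0.5]
print(projection_unit(grid, w, b))
```

Because every output pixel sees all depth samples along its ray at once, a learned map of this kind can in principle weight nearer voxels over farther ones, which is one way occlusion could be encoded in the network rather than computed by a discrete visibility test.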

artificial intelligence, machine learning, rendernet

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


