input image size
Dual-Resolution Correspondence Networks: Supplementary Material
Xinghui Li
In Section 1, we provide five alternatives to the FPN-like structure for fusing the dual-resolution feature maps of the feature backbone. The channels of all feature maps are aligned to 1024 by 1×1 conv layers. As shown in Figure 2, all five variants have similar overall performance. Additionally, we compare type (a) and type (e) with their 256-channel counterparts in Figure 4. Increasing the number of channels does not affect the performance of type (e). This further supports type (a) as the more suitable choice for DualRC-Net. Figure 4: Comparison between 256 and 1024 output feature channels for type (a) and type (e).
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > Canada (0.05)
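The channel alignment mentioned in the Dual-Resolution Correspondence Networks excerpt can be illustrated with a minimal sketch. A 1×1 convolution is a per-pixel linear map across channels, so it changes the channel count without touching the spatial resolution. The function name `conv1x1`, the toy shapes, and the weights below are assumptions for illustration (4 output channels stand in for 1024); this is not the authors' implementation.

```python
def conv1x1(feature, weight):
    """Apply a bias-free 1x1 convolution to a feature map.

    feature: nested list of shape [C_in][H][W]
    weight:  nested list of shape [C_out][C_in]
    returns: nested list of shape [C_out][H][W]
    """
    c_in = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    out = []
    for w_row in weight:
        if len(w_row) != c_in:
            raise ValueError("each weight row needs one entry per input channel")
        # Every output pixel is a weighted sum over the input channels at that pixel.
        plane = [
            [sum(w_row[c] * feature[c][y][x] for c in range(c_in)) for x in range(width)]
            for y in range(height)
        ]
        out.append(plane)
    return out


# Toy coarse-resolution map: 2 channels, 1x2 pixels (hypothetical values).
coarse = [[[1.0, 2.0]], [[3.0, 4.0]]]
# Toy fine-resolution map: 3 channels, 2x2 pixels (hypothetical values).
fine = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(3)]

# Hypothetical weights mapping each map to 4 channels.
w_coarse = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
w_fine = [[1.0, 1.0, 1.0] for _ in range(4)]

aligned_coarse = conv1x1(coarse, w_coarse)
aligned_fine = conv1x1(fine, w_fine)
print(len(aligned_coarse), len(aligned_fine))  # both maps now share a channel count
```

After this step the two maps have the same number of channels but still differ in resolution, which is why a separate fusion structure is needed.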
Scalable Federated Learning for Clients with Different Input Image Sizes and Numbers of Output Categories
Nitta, Shuhei, Suzuki, Taiji, Mulet, Albert Rodríguez, Yaguchi, Atsushi, Hirai, Ryusuke
Federated learning is a privacy-preserving training method in which a model is trained across multiple clients without sharing their confidential data. However, previous work on federated learning does not explore suitable neural network architectures for clients with different input image sizes and different numbers of output categories. In this paper, we propose an effective federated learning method named ScalableFL, in which the depth and width of each client's local model are adjusted according to that client's input image size and number of output categories. In addition, we provide a new bound for the generalization gap of federated learning. In particular, this bound helps to explain the effectiveness of our scalable neural network approach. We demonstrate the effectiveness of ScalableFL in several heterogeneous client settings for both image classification and object detection tasks.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
MobileNet Model Information
We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks. We introduce two simple global hyper-parameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to choose the right-sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications and use cases including object detection, fine-grained classification, face attributes and large-scale geo-localization.
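To make the saving from depthwise separable convolutions concrete, the sketch below counts multiply-adds for a standard convolution and for a depthwise convolution followed by a 1×1 pointwise convolution. It assumes square kernels, stride 1, and padding that keeps the feature map size unchanged. The function names and the layer shape are illustrative assumptions, not figures from the abstract.

```python
def standard_conv_mult_adds(kernel, in_ch, out_ch, fmap):
    """Multiply-adds of a standard convolution (square kernel, square feature map)."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def separable_conv_mult_adds(kernel, in_ch, out_ch, fmap):
    """Multiply-adds of a depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # one filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap           # 1x1 conv mixes the channels
    return depthwise + pointwise


# Illustrative layer shape (an assumption chosen for the example).
kernel, in_ch, out_ch, fmap = 3, 512, 512, 14
standard = standard_conv_mult_adds(kernel, in_ch, out_ch, fmap)
separable = separable_conv_mult_adds(kernel, in_ch, out_ch, fmap)
ratio = separable / standard
print(f"standard: {standard:,}  separable: {separable:,}  ratio: {ratio:.3f}")
```

Under these assumptions the ratio equals 1/out_ch + 1/kernel², so with 3×3 kernels the separable form needs roughly one eighth to one ninth of the computation.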
Implementing a fully convolutional network (FCN) in TensorFlow 2
Using a model pre-trained on large datasets such as ImageNet or COCO, we can quickly specialize such models to work on our own dataset. This process is called transfer learning. Pre-trained models for image classification and object detection are usually trained on fixed input image sizes. These typically range from 224x224x3 to around 512x512x3 and mostly have an aspect ratio of 1, i.e. the width and height of the image are equal. If they are not equal, the images are resized to have equal height and width.
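The resizing step described above can be sketched in plain Python. This is a minimal nearest-neighbour resize, assuming the image is a nested list of pixel values. The helper name `resize_nearest` is hypothetical, and real pipelines would normally use a library routine with better interpolation.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2D image (list of rows) to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# A 2x4 (non-square) toy image forced to a square 4x4 input.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
]
square = resize_nearest(image, 4, 4)
for row in square:
    print(row)
```

Each source row is repeated twice here, so the content is stretched vertically. That aspect-ratio distortion is the cost of forcing a non-square image into a fixed square input.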
Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading
Sahlsten, Jaakko, Jaskari, Joel, Kivinen, Jyri, Turunen, Lauri, Jaanio, Esa, Hietala, Kustaa, Kaski, Kimmo
Diabetes is a globally prevalent disease that can cause visible microvascular complications such as diabetic retinopathy and macular edema in the retina of the human eye, images of which are today used for manual disease screening. This labor-intensive task could greatly benefit from automatic detection using deep learning techniques. Here we present a deep learning system that identifies referable diabetic retinopathy comparably to or better than previous studies, although we use only a small fraction of images ( 1/4) in training but are aided by higher image resolutions. We also provide novel results for five different screening and clinical grading systems for diabetic retinopathy and macular edema classification, including results for accurately classifying images according to the clinical five-grade diabetic retinopathy and four-grade diabetic macular edema scales. These results suggest that a deep learning system could increase the cost-effectiveness of screening while attaining higher than recommended performance, and that the system could be applied in clinical examinations requiring finer grading.
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)