AITopics | input image size

Dual-Resolution Correspondence Networks: Supplementary Material, Xinghui Li

Neural Information Processing Systems, Aug-16-2025, 11:01:10 GMT

In Section 1, we provide five alternatives to the FPN-like structure for fusing the dual-resolution feature maps of the feature backbone. The channels of all feature maps are aligned to 1024 by 1×1 conv layers. As shown in Figure 2, all five variants have similar overall performance. Additionally, we compare type (a) and type (e) with their 256-channel counterparts in Figure 4. We can see that increasing the number of channels does not affect the performance of type (e). This further justifies that type (a) is a more suitable choice for DualRC-Net.

Figure 4: Comparison between 256 and 1024 output feature channels for type (a) and type (e).
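As a rough illustration of what an FPN-like fusion with 1×1 channel alignment can look like, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the nearest-neighbour upsampling, the element-wise addition, and the toy sizes (2 output channels instead of 1024) are all assumptions made for illustration.

```python
# Hypothetical sketch of an FPN-like fusion step; names and sizes are
# illustrative assumptions, not taken from the DualRC-Net code.

def conv1x1(fmap, weight):
    """Apply a 1x1 convolution: a per-pixel linear map over channels.

    fmap:   nested list shaped [C_in][H][W]
    weight: nested list shaped [C_out][C_in]
    """
    c_in, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [
            [sum(row[c] * fmap[c][y][x] for c in range(c_in)) for x in range(w)]
            for y in range(h)
        ]
        for row in weight
    ]


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a [C][H][W] map by an integer factor."""
    return [
        [
            [ch[y // factor][x // factor] for x in range(len(ch[0]) * factor)]
            for y in range(len(ch) * factor)
        ]
        for ch in fmap
    ]


def fuse(fine, coarse, w_fine, w_coarse, factor):
    """Align channels with 1x1 convs, upsample the coarse map, then add."""
    fine_p = conv1x1(fine, w_fine)
    coarse_p = upsample_nearest(conv1x1(coarse, w_coarse), factor)
    return [
        [[a + b for a, b in zip(row_f, row_c)] for row_f, row_c in zip(ch_f, ch_c)]
        for ch_f, ch_c in zip(fine_p, coarse_p)
    ]


# Toy inputs: a 2-channel fine map (4x4) and a 3-channel coarse map (2x2),
# both projected to 2 output channels.
fine = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
coarse = [[[2.0] * 2 for _ in range(2)] for _ in range(3)]
w_fine = [[1.0, 0.0], [0.0, 1.0]]              # identity projection
w_coarse = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # mixes coarse channels
fused = fuse(fine, coarse, w_fine, w_coarse, factor=2)
```

The 1×1 projection is what lets maps with different channel counts be added together; the five variants compared in the supplementary material differ in how the maps are combined, which this sketch does not attempt to reproduce.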

benchmark, dualrc-net, input image size, (10 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
North America > Canada (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.31)
Information Technology > Artificial Intelligence > Vision (0.30)

Scalable Federated Learning for Clients with Different Input Image Sizes and Numbers of Output Categories

Nitta, Shuhei, Suzuki, Taiji, Mulet, Albert Rodríguez, Yaguchi, Atsushi, Hirai, Ryusuke

arXiv.org Artificial Intelligence, Nov-15-2023

Federated learning is a privacy-preserving training method which consists of training from a plurality of clients but without sharing their confidential data. However, previous work on federated learning does not explore suitable neural network architectures for clients with different input image sizes and different numbers of output categories. In this paper, we propose an effective federated learning method named ScalableFL, where the depths and widths of the local models for each client are adjusted according to the client's input image size and number of output categories. In addition, we provide a new bound for the generalization gap of federated learning. In particular, this bound helps to explain the effectiveness of our scalable neural network approach. We demonstrate the effectiveness of ScalableFL in several heterogeneous client settings for both image classification and object detection tasks.

anchor, artificial intelligence, machine learning, (15 more...)

2311.08716

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

MobileNet Model Information

#artificialintelligence, Oct-2-2021, 14:25:10 GMT

Abstract: We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks. We introduce two simple global hyper-parameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to choose the right-sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications and use cases including object detection, fine-grained classification, face attributes and large-scale geo-localization.
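To make the latency/accuracy trade-off concrete, the sketch below counts multiply-accumulate operations for a standard convolution and for a depthwise separable one. The abstract does not name the two hyper-parameters; the MobileNets paper calls them the width multiplier (alpha) and the resolution multiplier (rho), and the function and argument names here are illustrative assumptions rather than part of any library.

```python
# Illustrative cost model; function and argument names are assumptions.

def standard_conv_cost(dk, m, n, df):
    """Multiply-adds of a standard conv layer.

    dk: kernel size, m: input channels, n: output channels,
    df: spatial size of the (square) output feature map.
    """
    return dk * dk * m * n * df * df


def separable_conv_cost(dk, m, n, df, alpha=1.0, rho=1.0):
    """Multiply-adds of a depthwise separable conv layer.

    alpha scales the channel counts (width multiplier) and
    rho scales the feature-map resolution (resolution multiplier).
    """
    m_s, n_s, df_s = alpha * m, alpha * n, rho * df
    depthwise = dk * dk * m_s * df_s * df_s  # one dk x dk filter per channel
    pointwise = m_s * n_s * df_s * df_s      # 1x1 conv that mixes channels
    return depthwise + pointwise


# Example layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
full = standard_conv_cost(3, 512, 512, 14)
separable = separable_conv_cost(3, 512, 512, 14)
slim = separable_conv_cost(3, 512, 512, 14, alpha=0.5)
ratio = separable / full  # equals 1/n + 1/dk**2 when alpha = rho = 1
```

With alpha = rho = 1 the ratio of separable to standard cost reduces to 1/n + 1/dk², so for 3×3 kernels the separable layer needs roughly eight to nine times fewer multiply-adds.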

artificial intelligence, machine learning, mobilenet model information, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Implementing a fully convolutional network (FCN) in TensorFlow 2

#artificialintelligence, Jan-18-2020, 10:07:40 GMT

Using a pre-trained model that was trained on huge datasets such as ImageNet or COCO, we can quickly specialize these architectures to work for our own dataset. This process is called transfer learning. Pre-trained models for image classification and object detection tasks are usually trained on fixed input image sizes. These typically range from 224x224x3 to around 512x512x3 and mostly have an aspect ratio of 1, i.e., the width and height of the image are equal. If they are not equal, the images are resized to have equal height and width.
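The resizing step can be sketched as a small size calculation. The passage describes resizing to equal height and width; the second function below shows a common alternative that the passage does not mention, which keeps the aspect ratio and pads the remainder. Both function names and the default target of 224 are assumptions chosen for illustration.

```python
# Illustrative helpers; names and the 224 default are assumptions.

def stretch_scales(width, height, target=224):
    """Per-axis scale factors when an image is resized straight to a square.

    Unequal factors mean the image content is distorted.
    """
    return target / width, target / height


def letterbox_dims(width, height, target=224):
    """Aspect-preserving resize followed by padding to target x target.

    Returns (new_w, new_h, pad_x, pad_y), where pad_x and pad_y are the
    total padding in pixels needed along each axis.
    """
    scale = target / max(width, height)
    new_w = min(target, max(1, round(width * scale)))
    new_h = min(target, max(1, round(height * scale)))
    return new_w, new_h, target - new_w, target - new_h


# A 640x480 image resized for a 224x224 network input.
sx, sy = stretch_scales(640, 480)
dims = letterbox_dims(640, 480)
```

For the 640x480 example the two stretch factors differ, so a plain resize squeezes the image horizontally, while the letterbox variant keeps the proportions at the cost of padded rows.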

artificial intelligence, dimension, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading

Sahlsten, Jaakko, Jaskari, Joel, Kivinen, Jyri, Turunen, Lauri, Jaanio, Esa, Hietala, Kustaa, Kaski, Kimmo

arXiv.org Machine Learning, Apr-16-2019

Diabetes is a globally prevalent disease that can cause visible microvascular complications such as diabetic retinopathy and macular edema in the retina of the human eye, images of which are today used for manual disease screening. This labor-intensive task could greatly benefit from automatic detection using deep learning techniques. Here we present a deep learning system that identifies referable diabetic retinopathy comparably to or better than systems presented in previous studies, although we use only a small fraction of images ( 1/4) in training but are aided by higher image resolutions. We also provide novel results for five different screening and clinical grading systems for diabetic retinopathy and macular edema classification, including results for accurately classifying images according to the clinical five-grade diabetic retinopathy and four-grade diabetic macular edema scales. These results suggest that a deep learning system could increase the cost-effectiveness of screening while attaining higher than recommended performance, and that the system could be applied in clinical examinations requiring finer grading.

artificial intelligence, diabetic retinopathy, machine learning, (17 more...)

1904.08764

Country: Europe > Finland > Central Finland > Jyväskylä (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
