AITopics | partial convolution

Collaborating Authors

partial convolution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Partial Convolution Meets Visual Attention

Huang, Haiduo, Yang, Fuwei, Li, Dong, Liu, Ji, Tian, Lu, Peng, Jinzhang, Ren, Pengju, Barsoum, Emad

arXiv.org Artificial IntelligenceMar-4-2025

Designing an efficient and effective neural network has remained a prominent topic in computer vision research. Depthwise onvolution (DWConv) is widely used in efficient CNNs or ViTs, but it needs frequent memory access during inference, which leads to low throughput. FasterNet attempts to introduce partial convolution (PConv) as an alternative to DWConv but compromises the accuracy due to underutilized channels. To remedy this shortcoming and consider the redundancy between feature map channels, we introduce a novel Partial visual ATtention mechanism (PAT) that can efficiently combine PConv with visual attention. Our exploration indicates that the partial attention mechanism can completely replace the full attention mechanism and reduce model parameters and FLOPs. Our PAT can derive three types of blocks: Partial Channel-Attention block (PAT_ch), Partial Spatial-Attention block (PAT_sp) and Partial Self-Attention block (PAT_sf). First, PAT_ch integrates the enhanced Gaussian channel attention mechanism to infuse global distribution information into the untouched channels of PConv. Second, we introduce the spatial-wise attention to the MLP layer to further improve model accuracy. Finally, we replace PAT_ch in the last stage with the self-attention mechanism to extend the global receptive field. Building upon PAT, we propose a novel hybrid network family, named PATNet, which achieves superior top-1 accuracy and inference speed compared to FasterNet on ImageNet-1K classification and excel in both detection and segmentation on the COCO dataset. Particularly, our PATNet-T2 achieves 1.3% higher accuracy than FasterNet-T2, while exhibiting 25% higher GPU throughput and 24% lower CPU latency.

attention mechanism, mechanism, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2503.03148

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Partial Channel Network: Compute Fewer, Perform Better

Huang, Haiduo, Xia, Tian, zhao, Wenzhe, Ren, Pengju

arXiv.org Artificial IntelligenceFeb-3-2025

Designing a module or mechanism that enables a network to maintain low parameters and FLOPs without sacrificing accuracy and throughput remains a challenge. To address this challenge and exploit the redundancy within feature map channels, we propose a new solution: partial channel mechanism (PCM). Specifically, through the split operation, the feature map channels are divided into different parts, with each part corresponding to different operations, such as convolution, attention, pooling, and identity mapping. Based on this assumption, we introduce a novel partial attention convolution (PATConv) that can efficiently combine convolution with visual attention. Our exploration indicates that the PATConv can completely replace both the regular convolution and the regular visual attention while reducing model parameters and FLOPs. Moreover, PATConv can derive three new types of blocks: Partial Channel-Attention block (PAT_ch), Partial Spatial-Attention block (PAT_sp), and Partial Self-Attention block (PAT_sf). In addition, we propose a novel dynamic partial convolution (DPConv) that can adaptively learn the proportion of split channels in different layers to achieve better trade-offs. Building on PATConv and DPConv, we propose a new hybrid network family, named PartialNet, which achieves superior top-1 accuracy and inference speed compared to some SOTA models on ImageNet-1K classification and excels in both detection and segmentation on the COCO dataset. Our code is available at https://github.com/haiduo/PartialNet.

artificial intelligence, convolution, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.01303

Country: Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Masked Autoencoder with Swin Transformer Network for Mitigating Electrode Shift in HD-EMG-based Gesture Recognition

Laamerad, Kasra, Shabanpour, Mehran, Islam, Md. Rabiul, Mohammadi, Arash

arXiv.org Artificial IntelligenceOct-6-2024

Multi-channel surface Electromyography (sEMG), also referred to as high-density sEMG (HD-sEMG), plays a crucial role in improving gesture recognition performance for myoelectric control. Pattern recognition models developed based on HD-sEMG, however, are vulnerable to changing recording conditions (e.g., signal variability due to electrode shift). This has resulted in significant degradation in performance across subjects, and sessions. In this context, the paper proposes the Masked Autoencoder with Swin Transformer (MAST) framework, where training is performed on a masked subset of HDsEMG channels. A combination of four masking strategies, i.e., random block masking; temporal masking; sensor-wise random masking, and; multi-scale masking, is used to learn latent representations and increase robustness against electrode shift. The masked data is then passed through MAST's three-path encoder-decoder structure, leveraging a multi-path Swin-Unet architecture that simultaneously captures time-domain, frequency-domain, and magnitude-based features of the underlying HD-sEMG signal. These augmented inputs are then used in a self-supervised pre-training fashion to improve the model's generalization capabilities. Experimental results demonstrate the superior performance of the proposed MAST framework in comparison to its counterparts.

machine learning, pattern recognition, recognition, (15 more...)

arXiv.org Artificial Intelligence

2410.17261

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

YOLO-PPA based Efficient Traffic Sign Detection for Cruise Control in Autonomous Driving

Zhang, Jingyu, Zhang, Wenqing, Tan, Chaoyi, Li, Xiangtian, Sun, Qianyi

arXiv.org Artificial IntelligenceSep-5-2024

It is very important to detect traffic signs efficiently and accurately in autonomous driving systems. However, the farther the distance, the smaller the traffic signs. Existing object detection algorithms can hardly detect these small scaled signs.In addition, the performance of embedded devices on vehicles limits the scale of detection models.To address these challenges, a YOLO PPA based traffic sign detection algorithm is proposed in this paper.The experimental results on the GTSDB dataset show that compared to the original YOLO, the proposed method improves inference efficiency by 11.2%. The mAP 50 is also improved by 93.2%, which demonstrates the effectiveness of the proposed YOLO PPA.

accuracy, traffic sign, yolo-ppa, (15 more...)

arXiv.org Artificial Intelligence

2409.0332

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
Asia > Singapore (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report (0.82)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.73)
Information Technology > Robotics & Automation (0.63)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.73)

Add feedback

Adapting to Skew: Imputing Spatiotemporal Urban Data with 3D Partial Convolutions and Biased Masking

Han, Bin, Howe, Bill

arXiv.org Artificial IntelligenceJan-10-2023

We adapt image inpainting techniques to impute large, irregular missing regions in urban settings characterized by sparsity, variance in both space and time, and anomalous events. Missing regions in urban data can be caused by sensor or software failures, data quality issues, interference from weather events, incomplete data collection, or varying data use regulations; any missing data can render the entire dataset unusable for downstream applications. To ensure coverage and utility, we adapt computer vision techniques for image inpainting to operate on 3D histograms (2D space + 1D time) commonly used for data exchange in urban settings. Adapting these techniques to the spatiotemporal setting requires handling skew: urban data tend to follow population density patterns (small dense regions surrounded by large sparse areas); these patterns can dominate the learning process and fool the model into ignoring local or transient effects. To combat skew, we 1) train simultaneously in space and time, and 2) focus attention on dense regions by biasing the masks used for training to the skew in the data. We evaluate the core model and these two extensions using the NYC taxi data and the NYC bikeshare data, simulating different conditions for missing data. We show that the core model is effective qualitatively and quantitatively, and that biased masking during training reduces error in a variety of scenarios. We also articulate a tradeoff in varying the number of timesteps per training sample: too few timesteps and the model ignores transient events; too many timesteps and the model is slow to train with limited performance gain.

artificial intelligence, data quality, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2301.04233

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Transportation > Passenger (0.46)
Transportation > Ground > Road (0.46)
Transportation > Infrastructure & Services (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Efficient data-driven gap filling of satellite image time series using deep neural networks with partial convolutions

Appel, Marius

arXiv.org Artificial IntelligenceAug-18-2022

The abundance of gaps in satellite image time series often complicates the application of deep learning models such as convolutional neural networks for spatiotemporal modeling. Based on previous work in computer vision on image inpainting, this paper shows how three-dimensional spatiotemporal partial convolutions can be used as layers in neural networks to fill gaps in satellite image time series. To evaluate the approach, we apply a U-Net-like model on incomplete image time series of quasi-global carbon monoxide observations from the Sentinel-5P satellite. Prediction errors were comparable to two considered statistical approaches while computation times for predictions were up to three orders of magnitude faster, making the approach applicable to process large amounts of satellite data. Partial convolutions can be added as layers to other types of neural networks, making it relatively easy to integrate with existing deep learning models. However, the approach does not quantify prediction errors and further research is needed to understand and improve model transferability. The implementation of spatiotemporal partial convolutions and the U-Net-like model is available as open-source software.

convolution, partial convolution, prediction, (14 more...)

arXiv.org Artificial Intelligence

2208.08781

Country:

Europe > Austria > Vienna (0.14)
Oceania > Australia (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Does Generative Face Completion Help Face Recognition?

Mathai, Joe, Masi, Iacopo, AbdAlmageed, Wael

arXiv.org Artificial IntelligenceJun-6-2019

Face occlusions, covering either the majority or discriminative parts of the face, can break facial perception and produce a drastic loss of information. Biometric systems such as recent deep face recognition models are not immune to obstructions or other objects covering parts of the face. While most of the current face recognition methods are not optimized to handle occlusions, there have been a few attempts to improve robustness directly in the training stage. Unlike those, we propose to study the effect of generative face completion on the recognition. We offer a face completion encoder-decoder, based on a convolutional operator with a gating mechanism, trained with an ample set of face occlusions. To systematically evaluate the impact of realistic occlusions on recognition, we propose to play the occlusion game: we render 3D objects onto different face parts, providing precious knowledge of what the impact is of effectively removing those occlusions. Extensive experiments on the Labeled Faces in the Wild (LFW), and its more difficult variant LFW-BLUFR, testify that face completion is able to partially restore face perception in machine vision systems for improved recognition.

artificial intelligence, machine learning, occlusion, (17 more...)

arXiv.org Artificial Intelligence

1906.02858

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.66)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions

Xiong, Hao, Tao, Dacheng

arXiv.org Machine LearningDec-24-2018

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS) that results in focal injury to the grey and white matter. The presence of white matter lesions biases morphometric analyses such as registration, individual longitudinal measurements and tissue segmentation for brain volume measurements. Lesion-inpainting with intensities derived from surround healthy tissue represent one approach to alleviate such problems. However, existing methods inpaint lesions based on texture information derived from local surrounding tissue, often leading to inconsistent inpainting and the generation of artifacts such as intensity discrepancy and blurriness. Based on these observations, we propose non-local partial convolutions (NLPC) which integrates a Unet-like network with the non-local module. The non-local module is exploited to capture long range dependencies between the lesion area and remaining normal-appearing brain regions. Then, the lesion area is filled by referring to normal-appearing regions with more similar features. This method generates inpainted regions that appear more realistic and natural. Our quantitative experimental results also demonstrate superiority of this technique of existing state-of-the-art inpainting methods.

intensity, lesion, lesion mask, (13 more...)

arXiv.org Machine Learning

1901.00055

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
Europe > United Kingdom > England (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

deeppomf/DeepCreamPy

#artificialintelligenceNov-8-2018, 01:16:51 GMT

A deep learning-based tool to automatically replace censored artwork in hentai with plausible reconstructions. A neural network fills in the censored regions. DeepCreamPy has a pre-built binary for Windows 64-bit available here. DeepCreamPy works on Windows, Mac, and Linux. The decensorship is for color hentai images that have minor to moderate censorship of the penis or vagina.

artificial intelligence, deep learning, machine learning, (7 more...)

#artificialintelligence

Industry: Law > Civil Rights & Constitutional Law (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)

Add feedback

New AI Imaging Technique Reconstructs Photos with Realistic Results - NVIDIA Developer News Center

#artificialintelligenceMay-7-2018, 21:05:51 GMT

The researchers said existing deep learning based image inpainting methods suffer because the outputs for missing pixels necessarily depend on the value of the input that must be supplied to the neural network for those missing pixels. This leads to artifacts such as color discrepancy and blurriness in the images. To fix this problem, the NVIDIA team developed a method that guarantees the output for missing pixels does not depend on the input value supplied for those pixels. This method uses a "partial convolution" layer that renormalizes each output depending on the validity of its corresponding receptive field. This renormalization ensures that the value of the output is independent of the values of the missing pixels in each receptive field.

ai imaging technique reconstruct photo, artificial intelligence, machine learning, (6 more...)

#artificialintelligence

Industry:

Information Technology > Hardware (0.66)
Media > News (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback