AITopics | Roy, Swalpa Kumar

Collaborating Authors

Roy, Swalpa Kumar

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reliable or Deceptive? Investigating Gated Features for Smooth Visual Explanations in CNNs

Mitra, Soham, Sukul, Atri, Roy, Swalpa Kumar, Singh, Pravendra, Verma, Vinay

arXiv.org Artificial IntelligenceApr-30-2024

Deep learning models have achieved remarkable success across diverse domains. However, the intricate nature of these models often impedes a clear understanding of their decision-making processes. This is where Explainable AI (XAI) becomes indispensable, offering intuitive explanations for model decisions. In this work, we propose a simple yet highly effective approach, ScoreCAM++, which introduces modifications to enhance the promising ScoreCAM method for visual explainability. Our proposed approach involves altering the normalization function within the activation layer utilized in ScoreCAM, resulting in significantly improved results compared to previous efforts. Additionally, we apply an activation function to the upsampled activation layers to enhance interpretability. This improvement is achieved by selectively gating lower-priority values within the activation layer. Through extensive experiments and qualitative comparisons, we demonstrate that ScoreCAM++ consistently achieves notably superior performance and fairness in interpreting the decision-making process compared to both ScoreCAM and previous methods.

activation layer, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2404.19341

Country: Asia > India > Uttarakhand (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GLFNET: Global-Local (frequency) Filter Networks for efficient medical image segmentation

Tragakis, Athanasios, Liu, Qianying, Kaul, Chaitanya, Roy, Swalpa Kumar, Dai, Hang, Deligianni, Fani, Murray-Smith, Roderick, Faccio, Daniele

arXiv.org Artificial IntelligenceMar-1-2024

We propose a novel transformer-style architecture called Global-Local Filter Network (GLFNet) for medical image segmentation and demonstrate its state-of-the-art performance. We replace the self-attention mechanism with a combination of global-local filter blocks to optimize model efficiency. The global filters extract features from the whole feature map whereas the local filters are being adaptively created as 4x4 patches of the same feature map and add restricted scale information. In particular, the feature extraction takes place in the frequency domain rather than the commonly used spatial (image) domain to facilitate faster computations. The fusion of information from both spatial and frequency spaces creates an efficient model with regards to complexity, required data and performance. We test GLFNet on three benchmark datasets achieving state-of-the-art performance on all of them while being almost twice as efficient in terms of GFLOP operations.

artificial intelligence, machine learning, segmentation, (18 more...)

arXiv.org Artificial Intelligence

2403.00396

Country:

Europe > United Kingdom (0.29)
Asia (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

DeepArt: A Benchmark to Advance Fidelity Research in AI-Generated Content

Wang, Wentao, Huang, Xuanyao, Wang, Tianyang, Roy, Swalpa Kumar

arXiv.org Artificial IntelligenceDec-24-2023

This paper explores the image synthesis capabilities of GPT-4, a leading multi-modal large language model. We establish a benchmark for evaluating the fidelity of texture features in images generated by GPT-4, comprising manually painted pictures and their AI-generated counterparts. The contributions of this study are threefold: First, we provide an in-depth analysis of the fidelity of image synthesis features based on GPT-4, marking the first such study on this state-of-the-art model. Second, the quantitative and qualitative experiments fully reveals the limitations of the GPT-4 model in image synthesis. Third, we have compiled a unique benchmark of manual drawings and corresponding GPT-4-generated images, introducing a new task to advance fidelity research in AI-generated content (AIGC). The dataset is available at: \url{https://github.com/rickwang28574/DeepArt}.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2312.10407

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Spatial Gated Multi-Layer Perceptron for Land Use and Land Cover Mapping

Jamali, Ali, Roy, Swalpa Kumar, Hong, Danfeng, Atkinson, Peter M, Ghamisi, Pedram

arXiv.org Artificial IntelligenceAug-9-2023

Convolutional Neural Networks (CNNs) are models that are utilized extensively for the hierarchical extraction of features. Vision transformers (ViTs), through the use of a self-attention mechanism, have recently achieved superior modeling of global contextual information compared to CNNs. However, to realize their image classification strength, ViTs require substantial training datasets. Where the available training data are limited, current advanced multi-layer perceptrons (MLPs) can provide viable alternatives to both deep CNNs and ViTs. In this paper, we developed the SGU-MLP, a learning algorithm that effectively uses both MLPs and spatial gating units (SGUs) for precise land use land cover (LULC) mapping. Results illustrated the superiority of the developed SGU-MLP classification algorithm over several CNN and CNN-ViT-based models, including HybridSN, ResNet, iFormer, EfficientFormer and CoAtNet. The proposed SGU-MLP algorithm was tested through three experiments in Houston, USA, Berlin, Germany and Augsburg, Germany. The SGU-MLP classification model was found to consistently outperform the benchmark CNN and CNN-ViT-based algorithms. For example, for the Houston experiment, SGU-MLP significantly outperformed HybridSN, CoAtNet, Efficientformer, iFormer and ResNet by approximately 15%, 19%, 20%, 21%, and 25%, respectively, in terms of average accuracy. The code will be made publicly available at https://github.com/aj1365/SGUMLP

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2308.05235

Country: Europe > Germany > Berlin (0.24)

Genre: Research Report (0.64)

Industry: Law > Real Estate Law (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multimodal Fusion Transformer for Remote Sensing Image Classification

Roy, Swalpa Kumar, Deria, Ankur, Hong, Danfeng, Rasti, Behnood, Plaza, Antonio, Chanussot, Jocelyn

arXiv.org Artificial IntelligenceJun-20-2023

Vision transformers (ViTs) have been trending in image classification tasks due to their promising performance when compared to convolutional neural networks (CNNs). As a result, many researchers have tried to incorporate ViTs in hyperspectral image (HSI) classification tasks. To achieve satisfactory performance, close to that of CNNs, transformers need fewer parameters. ViTs and other similar transformers use an external classification (CLS) token which is randomly initialized and often fails to generalize well, whereas other sources of multimodal datasets, such as light detection and ranging (LiDAR) offer the potential to improve these models by means of a CLS. In this paper, we introduce a new multimodal fusion transformer (MFT) network which comprises a multihead cross patch attention (mCrossPA) for HSI land-cover classification. Our mCrossPA utilizes other sources of complementary information in addition to the HSI in the transformer encoder to achieve better generalization. The concept of tokenization is used to generate CLS and HSI patch tokens, helping to learn a {distinctive representation} in a reduced and hierarchical feature space. Extensive experiments are carried out on {widely used benchmark} datasets {i.e.,} the University of Houston, Trento, University of Southern Mississippi Gulfpark (MUUFL), and Augsburg. We compare the results of the proposed MFT model with other state-of-the-art transformers, classical CNNs, and conventional classifiers models. The superior performance achieved by the proposed model is due to the use of multihead cross patch attention. The source code will be made available publicly at \url{https://github.com/AnkurDeria/MFT}.}

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TGRS.2023.3286826

2203.16952

Country:

Europe (1.00)
Asia (0.67)
North America > United States > Oklahoma (0.24)
(2 more...)

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.42)
Education > Educational Setting (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback