AITopics | mbconv

Collaborating Authors

mbconv

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

747d3443e319a22747fbb873e8b2f9f2-Supplemental.pdf

Neural Information Processing SystemsFeb-8-2026, 23:00:15 GMT

It can be derived that the posterior processf|O is also a GP, we denote its mean function and21 kernel function asµn and κn respectively. To reduce the time consumption and take advantage of parallelization, we train several different32 networks at a time. When selecting the first BSSC, equation 2 can be used directly. Therefore, we use the expectedvalue of EI function (EEI, [4])instead. ResNet18/50 consists of 6 stages as illustrated in Figure 1.

artificial intelligence, machine learning, mbconv, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.90)

Add feedback

747d3443e319a22747fbb873e8b2f9f2-Supplemental.pdf

Neural Information Processing SystemsOct-3-2025, 06:18:38 GMT

additional detail, autobss, bssc, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

STA-Net: A Decoupled Shape and Texture Attention Network for Lightweight Plant Disease Classification

Qiu, Zongsen

arXiv.org Artificial IntelligenceSep-5-2025

Responding to rising global food security needs, precision agriculture and deep learning-based plant disease diagnosis have become crucial. Yet, deploying high-precision models on edge devices is challenging. Most lightweight networks use attention mechanisms designed for generic object recognition, which poorly capture subtle pathological features like irregular lesion shapes and complex textures. To overcome this, we propose a twofold solution: first, using a training-free neural architecture search method (DeepMAD) to create an efficient network backbone for edge devices; second, introducing the Shape-Texture Attention Module (STAM). STAM splits attention into two branches -- one using deformable convolutions (DCNv4) for shape awareness and the other using a Gabor filter bank for texture awareness. On the public CCMT plant disease dataset, our STA-Net model (with 401K parameters and 51.1M FLOPs) reached 89.00% accuracy and an F1 score of 88.96%. Ablation studies confirm STAM significantly improves performance over baseline and standard attention models. Integrating domain knowledge via decoupled attention thus presents a promising path for edge-deployed precision agriculture AI. The source code is available at https://github.com/RzMY/STA-Net.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.03754

Genre: Research Report (0.64)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Searching a High-Performance Feature Extractor for Text Recognition Network

Zhang, Hui, Yao, Quanming, Kwok, James T., Bai, Xiang

arXiv.org Artificial IntelligenceSep-26-2022

Feature extractor plays a critical role in text recognition (TR), but customizing its architecture is relatively less explored due to expensive manual tweaking. In this work, inspired by the success of neural architecture search (NAS), we propose to search for suitable feature extractors. We design a domain-specific search space by exploring principles for having good feature extractors. The space includes a 3D-structured space for the spatial model and a transformed-based space for the sequential model. As the space is huge and complexly structured, no existing NAS algorithms can be applied. We propose a two-stage algorithm to effectively search in the space. In the first stage, we cut the space into several blocks and progressively train each block with the help of an auxiliary head. We introduce the latency constraint into the second stage and search sub-network from the trained supernet via natural gradient descent. In experiments, a series of ablation studies are performed to better understand the designed space, search algorithm, and searched architectures. We also compare the proposed method with various state-of-the-art ones on both hand-written and scene TR tasks. Extensive results show that our approach can achieve better recognition performance with less latency.

artificial intelligence, machine learning, pattern recognition, (20 more...)

arXiv.org Artificial Intelligence

2209.13139

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.63)

Add feedback

MaxViT: Multi-Axis Vision Transformer

Tu, Zhengzhong, Talebi, Hossein, Zhang, Han, Yang, Feng, Milanfar, Peyman, Bovik, Alan, Li, Yinxiao

arXiv.org Artificial IntelligenceSep-9-2022

Transformers have recently gained significant attention in the computer vision community. However, the lack of scalability of self-attention mechanisms with respect to image size has limited their wide adoption in state-of-the-art vision backbones. In this paper we introduce an efficient and scalable attention model we call multi-axis attention, which consists of two aspects: blocked local and dilated global attention. These design choices allow global-local spatial interactions on arbitrary input resolutions with only linear complexity. We also present a new architectural element by effectively blending our proposed attention model with convolutions, and accordingly propose a simple hierarchical vision backbone, dubbed MaxViT, by simply repeating the basic building block over multiple stages. Notably, MaxViT is able to ''see'' globally throughout the entire network, even in earlier, high-resolution stages. We demonstrate the effectiveness of our model on a broad spectrum of vision tasks. On image classification, MaxViT achieves state-of-the-art performance under various settings: without extra data, MaxViT attains 86.5% ImageNet-1K top-1 accuracy; with ImageNet-21K pre-training, our model achieves 88.7% top-1 accuracy. For downstream tasks, MaxViT as a backbone delivers favorable performance on object detection as well as visual aesthetic assessment. We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module. The source code and trained models will be available at https://github.com/google-research/maxvit.

arxiv preprint arxiv, maxvit, proceedings, (9 more...)

arXiv.org Artificial Intelligence

2204.01697

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback