Kinetics-400 [1] is a large-scale action recognition dataset with trimmed video clips of around 10 seconds' duration. It is collected from realistic YouTube videos and covers 400 categories of human activities. In total, it contains around 240K training videos and 20K validation videos. Specifically, when training Kinetics-200/-400 from scratch, we adopt a cosine learning-rate decay schedule with an initial learning rate of 0.1. The initial learning rate is 0.005 and decays by 0.1 at epochs 20 and 40.
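The two schedules mentioned above can be written as plain functions of the epoch. The rates and milestones are the ones quoted (a cosine decay from 0.1, and a 0.005 base rate with ×0.1 steps at epochs 20 and 40); the function names, default arguments, and total-epoch count are illustrative, not from the paper.

```python
import math

def cosine_lr(epoch, total_epochs, base_lr=0.1):
    """Cosine learning-rate decay from base_lr down to 0 over total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

def step_lr(epoch, base_lr=0.005, gamma=0.1, milestones=(20, 40)):
    """Step decay: multiply the rate by gamma at each milestone epoch."""
    factor = gamma ** sum(epoch >= m for m in milestones)
    return base_lr * factor
```

In practice these would be wrapped in a framework scheduler; writing them out makes clear that the cosine schedule decays smoothly every epoch while the step schedule only changes at the two milestones.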
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
In the past few years, transformers have achieved promising performance on various computer vision tasks. Unfortunately, the immense inference overhead of most existing vision transformers prevents them from being deployed on edge devices such as cellphones and smart watches. Knowledge distillation is a widely used paradigm for compressing cumbersome architectures into compact students by transferring information.
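The distillation paradigm the paragraph refers to can be sketched in its classic soft-target form (Hinton-style temperature-softened cross-entropy between teacher and student) — a generic baseline, not the paper's fine-grained manifold variant; all names below are illustrative.

```python
import math

def softmax_t(logits, t):
    """Temperature-softened softmax over a list of logits."""
    m = max(logits)
    e = [math.exp((v - m) / t) for v in logits]
    s = sum(e)
    return [v / s for v in e]

def distillation_loss(student_logits, teacher_logits, t=4.0):
    """Soft-target KD loss: cross-entropy between the softened teacher
    and student distributions, scaled by t**2 as in Hinton et al."""
    p = softmax_t(teacher_logits, t)
    q = softmax_t(student_logits, t)
    return -t * t * sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the student's softened distribution matches the teacher's, which is what "transferring information" means concretely in this setting.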
General Transform: A Unified Framework for Adaptive Transform to Enhance Representations
Budiutama, Gekko, Daimon, Shunsuke, Nishi, Hirofumi, Matsushita, Yu-ichiro
Discrete transforms, such as the discrete Fourier transform, are widely used in machine learning to improve model performance by extracting meaningful features. However, with numerous transforms available, selecting an appropriate one often depends on understanding the dataset's properties, making the approach less effective when such knowledge is unavailable. In this work, we propose General Transform (GT), an adaptive transform-based representation designed for machine learning applications. Unlike conventional transforms, GT learns a data-driven mapping tailored to the dataset and task of interest. Here, we demonstrate that models incorporating GT outperform conventional transform-based approaches across computer vision and natural language processing tasks, highlighting its effectiveness in diverse learning scenarios.
Keywords: machine learning, deep learning, feature extraction
1. Introduction
Deep neural networks have consistently pushed the boundaries of performance on tasks in computer vision, natural language processing, and beyond.
Corresponding author Email address: bgekko@quemix.com
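The paper's exact formulation of GT is not reproduced here. As a minimal sketch of the underlying idea — a transform whose mapping is learned rather than fixed — one can parameterize a convex combination of fixed candidate transforms (identity and DCT-II below) with trainable logits; gradient descent on a downstream loss would then pick the mixture. All function names and the candidate set are illustrative assumptions.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    e = [math.exp(v - m) for v in logits]
    s = sum(e)
    return [v / s for v in e]

def general_transform(x, logits):
    """Apply a learned convex combination of candidate transforms
    (here: identity and DCT-II) to the input vector x."""
    w = softmax(logits)
    candidates = [list(x), dct2(x)]
    return [sum(wi * c[j] for wi, c in zip(w, candidates))
            for j in range(len(x))]
```

With logits strongly favoring the first candidate, the mapping collapses to the identity; training would instead discover whatever mixture suits the data.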
Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition
Zou, Shun, Zou, Yi, Zhang, Mingya, Luo, Shipeng, Chen, Zhihao, Gao, Guangwei
In recent years, Transformer has witnessed significant progress in food recognition. However, most existing approaches still face two critical challenges in lightweight food recognition: (1) the quadratic complexity and redundant feature representation from interactions with irrelevant tokens; (2) static feature recognition and single-scale representation, which overlook the unstructured, non-fixed nature of food images and the need for multi-scale features. To address these, we propose an adaptive and efficient sparse Transformer architecture (Fraesormer) with two core designs: Adaptive Top-k Sparse Partial Attention (ATK-SPA) and Hierarchical Scale-Sensitive Feature Gating Network (HSSFGN). ATK-SPA uses a learnable Gated Dynamic Top-K Operator (GDTKO) to retain critical attention scores, filtering low query-key matches that hinder feature aggregation. It also introduces a partial channel mechanism to reduce redundancy and promote expert information flow, enabling local-global collaborative modeling. HSSFGN employs a gating mechanism to achieve multi-scale feature representation, enhancing contextual semantic information. Extensive experiments show that Fraesormer outperforms state-of-the-art methods. Code is available at https://zs1314.github.io/Fraesormer.
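The top-k sparsification idea behind ATK-SPA can be sketched in a few lines: keep only the k largest query-key scores per query row and renormalize over the survivors, so low-scoring matches contribute exactly zero weight. In the paper, k comes from the learnable GDTKO; here it is a fixed parameter for illustration, and the function name is an assumption.

```python
import math

def topk_sparse_attention(scores, k):
    """For each query row of raw attention scores, keep only the k
    largest entries and softmax over them; all others get weight 0."""
    out = []
    for row in scores:
        keep = sorted(range(len(row)), key=lambda j: row[j],
                      reverse=True)[:k]
        exps = {j: math.exp(row[j]) for j in keep}
        z = sum(exps.values())
        out.append([exps.get(j, 0.0) / z for j in range(len(row))])
    return out
```

Compared with a dense softmax, the dropped entries are hard zeros rather than merely small, which is what removes the interactions with irrelevant tokens.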
ASRank: Zero-Shot Re-Ranking with Answer Scent for Document Retrieval
Abdallah, Abdelrahman, Mozafari, Jamshid, Piryani, Bhawna, Jatowt, Adam
Retrieval-Augmented Generation (RAG) models have drawn considerable attention in modern open-domain question answering. The effectiveness of RAG depends on the quality of the top retrieved documents. However, conventional retrieval methods sometimes fail to rank the most relevant documents at the top. In this paper, we introduce ASRank, a new re-ranking method that scores retrieved documents using a zero-shot answer scent, relying on a pre-trained large language model to compute the likelihood that document-derived answers align with the answer scent. Our approach demonstrates marked improvements across several datasets, including NQ, TriviaQA, WebQA, ArchivalQA, HotpotQA, and Entity Questions. Notably, ASRank increases Top-1 retrieval accuracy on NQ from $19.2\%$ to $46.5\%$ for MSS and from $22.1\%$ to $47.3\%$ for BM25. It also shows strong retrieval performance compared to state-of-the-art methods (47.3 Top-1 by ASRank vs. 35.4 by UPR, both with BM25).
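Stripped of the model specifics, the re-ranking loop is: score each retrieved document by how likely its derived answer aligns with the answer scent, then sort descending. The sketch below takes the LLM likelihood as a caller-supplied callable — a stand-in assumption, not ASRank's actual model call — so the control flow can be shown without an API dependency.

```python
def rerank(docs, answer_scent, log_likelihood):
    """Re-rank documents by a caller-supplied scorer that estimates how
    well each document's derived answer aligns with the answer scent."""
    scored = [(log_likelihood(doc, answer_scent), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]
```

Any scorer with the same signature slots in, which is also how one would A/B a cheap lexical scorer against the LLM-based one.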
Continual Learning Using Only Large Language Model Prompting
Qiu, Jiabao, Ke, Zixuan, Liu, Bing
We introduce CLOB, a novel continual learning (CL) paradigm wherein a large language model (LLM) is regarded as a black box. Learning is done incrementally via only verbal prompting. CLOB does not fine-tune any part of the LLM or add any trainable parameters to it. It is particularly suitable for LLMs that are accessible via APIs. We also propose a new CL technique, called CIS, based on incremental summarization that also overcomes the LLM's input length limit. Experiments show CIS outperforms baselines by a very large margin.
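A prompt-only incremental update in the spirit of CIS can be sketched as: fold each batch of new labeled examples into a running textual summary by asking the black-box LLM to rewrite it, capping the summary so it fits the model's input limit. The `llm` callable, prompt wording, and character limit below are all illustrative assumptions, not the paper's exact protocol.

```python
def cis_update(llm, summary, new_examples, limit=2000):
    """Incrementally fold new labeled examples into a running task
    summary via prompting only; no parameters are ever updated."""
    prompt = (
        "Current task knowledge:\n" + summary + "\n\n"
        "New labeled examples:\n" + "\n".join(new_examples) + "\n\n"
        "Rewrite the task knowledge to incorporate the new examples, "
        f"in at most {limit} characters."
    )
    # Truncate defensively in case the model overshoots the limit.
    return llm(prompt)[:limit]
```

Because only the summary is carried between tasks, the LLM's context window bounds memory usage regardless of how many tasks arrive — the property that lets CIS sidestep the input length limit.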