WaveFormer: A 3D Transformer with Wavelet-Driven Feature Representation for Efficient Medical Image Segmentation
Hasan, Md Mahfuz Al, Zaman, Mahdi, Jawad, Abdul, Santamaria-Pang, Alberto, Lee, Ho Hin, Tarapov, Ivan, See, Kyle, Imran, Md Shah, Roy, Antika, Fallah, Yaser Pourmohammadi, Asadizanjani, Navid, Forghani, Reza
Transformer-based architectures have advanced medical image analysis by effectively modeling long-range dependencies, yet they often struggle in 3D settings due to substantial memory overhead and insufficient capture of fine-grained local features. We address these limitations with WaveFormer, a novel 3D transformer that: i) leverages the fundamental frequency-domain properties of features for contextual representation, and ii) is inspired by the top-down mechanism of the human visual recognition system, making it a biologically motivated architecture. By employing discrete wavelet transformations (DWT) at multiple scales, WaveFormer preserves both global context and high-frequency details while replacing heavy upsampling layers with efficient wavelet-based summarization and reconstruction. This significantly reduces the number of parameters, which is critical for real-world deployment where computational resources and training times are constrained. Furthermore, the model is generic and easily adaptable to diverse applications. Evaluations on BraTS2023, FLARE2021, and KiTS2023 demonstrate performance on par with state-of-the-art methods while offering substantially lower computational complexity.

Keywords: Transformer Model · Multi-level Attention · Discrete Wavelet Transform

1 Introduction

Medical image segmentation is fundamental to clinical applications such as tumor delineation, organ localization, and surgical planning. Deep learning-based approaches, particularly convolutional neural networks (CNNs), have demonstrated significant success by hierarchically extracting features. However, their limited receptive fields hinder the capture of long-range dependencies, a critical shortcoming in 3D applications where spatial context across distant slices is essential.

arXiv:2503.23764v2
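The wavelet-based summarization the abstract describes rests on a standard building block: a level of 3D DWT splits a feature volume into one low-pass band (a coarse summary at half resolution per axis) plus detail bands, and the transform is exactly invertible, so downsampling and reconstruction need no learned upsampling layers. The sketch below is a minimal illustration of that block using a Haar wavelet in NumPy; it is not the paper's implementation, and the function names are illustrative.

```python
import numpy as np

def haar_dwt3d(x):
    """One-level 3D Haar DWT: split a volume into 8 subbands.
    Returns the low-pass (LLL) band plus the 7 detail bands."""
    def split(a, axis):
        lo = (a.take(range(0, a.shape[axis], 2), axis) +
              a.take(range(1, a.shape[axis], 2), axis)) / np.sqrt(2)
        hi = (a.take(range(0, a.shape[axis], 2), axis) -
              a.take(range(1, a.shape[axis], 2), axis)) / np.sqrt(2)
        return lo, hi
    bands = [x]
    for axis in range(3):  # filter along depth, height, then width
        bands = [b for pair in (split(b, axis) for b in bands) for b in pair]
    return bands[0], bands[1:]

def haar_idwt3d(lll, details):
    """Inverse transform: merge the 8 subbands back into the volume."""
    def merge(lo, hi, axis):
        even, odd = (lo + hi) / np.sqrt(2), (lo - hi) / np.sqrt(2)
        shape = list(lo.shape); shape[axis] *= 2
        out = np.empty(shape)
        sl_e = [slice(None)] * lo.ndim; sl_e[axis] = slice(0, None, 2)
        sl_o = [slice(None)] * lo.ndim; sl_o[axis] = slice(1, None, 2)
        out[tuple(sl_e)], out[tuple(sl_o)] = even, odd
        return out
    bands = [lll] + list(details)
    for axis in reversed(range(3)):
        bands = [merge(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]

vol = np.random.rand(8, 8, 8)            # toy "feature volume"
lll, details = haar_dwt3d(vol)
print(lll.shape)                          # (4, 4, 4): coarse summary band
print(np.allclose(haar_idwt3d(lll, details), vol))  # True: lossless
```

Because the low-pass band carries most of the global context at one-eighth the voxel count, attention can be computed there cheaply, with the detail bands retained for exact reconstruction; applying the transform recursively gives the multiple scales the abstract refers to.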
CATSE: A Context-Aware Framework for Causal Target Sound Extraction
Baligar, Shrishail, Kegler, Mikolaj, Irvin, Bryce, Stamenovic, Marko, Newsam, Shawn
Target Sound Extraction (TSE) focuses on the problem of separating sources of interest, indicated by a user's cue, from the input mixture. Most existing solutions operate in an offline fashion and are not suited to the low-latency causal processing constraints imposed by applications in live-streamed content such as augmented hearing. We introduce a family of context-aware low-latency causal TSE models suitable for real-time processing. First, we explore the utility of context by providing the TSE model with oracle information about what sound classes make up the input mixture, where the objective of the model is to extract one or more sources of interest indicated by the user. Since the practical applications of oracle models are limited due to their assumptions, we introduce a composite multi-task training objective involving separation and classification losses. Our evaluation involving single- and multi-source extraction shows the benefit of using context information in the model either by means of providing full context or via the proposed multi-task training loss without the need for full context information. Specifically, we show that our proposed model outperforms size- and latency-matched Waveformer, a state-of-the-art model for real-time TSE.
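The composite multi-task objective described above pairs a separation loss on the extracted waveform with a classification loss on the predicted mixture composition, so the model learns context without oracle labels at inference. The following is a hedged sketch of such an objective with NumPy stand-ins: the SI-SNR separation term and softmax cross-entropy are standard choices, and the weights `alpha`/`beta` are illustrative, not taken from the paper.

```python
import numpy as np

def si_snr_loss(est, ref, eps=1e-8):
    """Negative scale-invariant SNR between estimate and reference."""
    ref_zm, est_zm = ref - ref.mean(), est - est.mean()
    proj = (est_zm @ ref_zm) / (ref_zm @ ref_zm + eps) * ref_zm
    noise = est_zm - proj
    return -10 * np.log10((proj @ proj) / (noise @ noise + eps) + eps)

def cross_entropy(logits, target_idx):
    """Softmax cross-entropy for a mixture-class prediction head."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target_idx]

def composite_loss(est, ref, logits, target_idx, alpha=1.0, beta=0.5):
    # alpha/beta weighting is an assumption for illustration
    return alpha * si_snr_loss(est, ref) + beta * cross_entropy(logits, target_idx)

rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)                 # 1 s of target audio @ 16 kHz
est = ref + 0.1 * rng.standard_normal(16000)     # imperfect extraction
logits = np.array([2.0, 0.1, -1.0])              # 3 candidate sound classes
print(composite_loss(est, ref, logits, target_idx=0))
```

Minimizing the classification term pushes the model's internal representation to encode what classes are present, approximating the benefit of full oracle context.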
Waveformer for modelling dynamical systems
Navaneeth, N, Chakraborty, Souvik
Neural operators have gained recognition as potent tools for learning solutions of a family of partial differential equations. The state-of-the-art neural operators excel at approximating the functional relationship between input functions and the solution space, potentially reducing computational costs and enabling real-time applications. However, they often fall short when tackling time-dependent problems, particularly in delivering accurate long-term predictions. In this work, we propose "Waveformer", a novel operator learning approach for learning solutions of dynamical systems. The proposed Waveformer exploits the wavelet transform to capture the spatial multi-scale behavior of the solution field, and transformers to capture the long-horizon dynamics. We present four numerical examples, involving Burgers' equation, the Kuramoto-Sivashinsky (KS) equation, the Allen-Cahn equation, and the Navier-Stokes equations, to illustrate the efficacy of the proposed approach. The results indicate that the proposed Waveformer can learn the solution operator with high accuracy, outperforming existing state-of-the-art operator learning algorithms by up to an order of magnitude, with its advantage particularly visible in the extrapolation region.
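The spatial encoding step of this approach, decomposing the solution field into multi-scale wavelet coefficients before the transformer models their evolution in time, can be illustrated with a multi-level 1D Haar pyramid. This sketch shows only that encoding step (the transformer rollout is omitted), on an assumed toy field mixing smooth and fine-scale components:

```python
import numpy as np

def haar_pyramid(u, levels):
    """Multi-level 1D Haar decomposition of a spatial field u.
    Returns the coarsest approximation and the detail coefficients
    collected at each level (finest first)."""
    details = []
    for _ in range(levels):
        lo = (u[0::2] + u[1::2]) / np.sqrt(2)   # coarse approximation
        hi = (u[0::2] - u[1::2]) / np.sqrt(2)   # local detail
        details.append(hi)
        u = lo
    return u, details

def haar_reconstruct(u, details):
    """Exactly invert haar_pyramid."""
    for hi in reversed(details):
        out = np.empty(2 * u.size)
        out[0::2] = (u + hi) / np.sqrt(2)
        out[1::2] = (u - hi) / np.sqrt(2)
        u = out
    return u

x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
u0 = np.sin(x) + 0.3 * np.sin(8 * x)        # smooth + fine-scale structure
coarse, details = haar_pyramid(u0, levels=3)
print(coarse.size, [d.size for d in details])    # 8 [32, 16, 8]
print(np.allclose(haar_reconstruct(coarse, details), u0))  # True
```

The coefficient vectors at each level separate the field's scales, so a sequence model operating on them can track slow, large-scale dynamics and fast, fine-scale dynamics with the appropriate resolution, and the exact inverse maps predicted coefficients back to the physical field.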