AITopics | amplitude spectrum

Collaborating Authors

amplitude spectrum

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Fourier Clouds: Fast Bias Correction for Imbalanced Semi-Supervised Learning

Neural Information Processing SystemsJun-20-2026, 19:21:16 GMT

Pseudo-label-based Semi-Supervised Learning (SSL) often suffers from classifier bias, particularly under class imbalance, as inaccurate pseudo-labels tend to exacerbate existing biases towards majority classes. Existing methods, such as CDMAD[30], utilize simplistic reference inputs--typically uniform or blank-colored images--to estimate and correct this bias. However, such simplistic references fundamentally ignore realistic statistical information inherent to real datasets, specifically typical color distributions, texture details, and frequency characteristics. This lack of statistical representativeness can lead the model to inaccurately estimate its inherent bias, limiting the effectiveness of bias correction, particularly under severe class imbalance or substantial distribution mismatches between labeled and unlabeled datasets. To overcome these limitations, we introduce the FARAD (Fourier-Adapted Reference for Accurate Debiasing) System.

artificial intelligence, machine learning, random-phase image, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain Fengpeng Li1 Kemou Li1 Haiwei Wu

Neural Information Processing SystemsFeb-18-2026, 12:14:48 GMT

Recent studies show that adversarial attacks disproportionately impact the patterns within the phase of the sample's frequency spectrum--typically containing

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Macao (0.04)
(25 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (0.36)
Government > Military (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science (0.68)

Add feedback

e5d5344481c495709eaa6397bf51fb89-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 19:49:22 GMT

phase pattern, proceedings, robustness, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Macao (0.04)
(26 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science (0.68)

Add feedback

SFIBA: Spatial-based Full-target Invisible Backdoor Attacks

Yin, Yangxu, Chen, Honglong, Gao, Yudong, Sun, Peng, Li, Zhishuai, Liu, Weifeng

arXiv.org Artificial IntelligenceMay-1-2025

Multi-target backdoor attacks pose significant security threats to deep neural networks, as they can preset multiple target classes through a single backdoor injection. This allows attackers to control the model to misclassify poisoned samples with triggers into any desired target class during inference, exhibiting superior attack performance compared with conventional backdoor attacks. However, existing multi-target backdoor attacks fail to guarantee trigger specificity and stealthiness in black-box settings, resulting in two main issues. First, they are unable to simultaneously target all classes when only training data can be manipulated, limiting their effectiveness in realistic attack scenarios. Second, the triggers often lack visual imperceptibility, making poisoned samples easy to detect. To address these problems, we propose a Spatial-based Full-target Invisible Backdoor Attack, called SFIBA. It restricts triggers for different classes to specific local spatial regions and morphologies in the pixel space to ensure specificity, while employing a frequency-domain-based trigger injection method to guarantee stealthiness. Specifically, for injection of each trigger, we first apply fast fourier transform to obtain the amplitude spectrum of clean samples in local spatial regions. Then, we employ discrete wavelet transform to extract the features from the amplitude spectrum and use singular value decomposition to integrate the trigger. Subsequently, we selectively filter parts of the trigger in pixel space to implement trigger morphology constraints and adjust injection coefficients based on visual effects. We conduct experiments on multiple datasets and models. The results demonstrate that SFIBA can achieve excellent attack performance and stealthiness, while preserving the model's performance on benign samples, and can also bypass existing backdoor defenses.

artificial intelligence, data quality, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2504.21052

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Quality > Data Transformation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain

Li, Fengpeng, Li, Kemou, Wu, Haiwei, Tian, Jinyu, Zhou, Jiantao

arXiv.org Artificial IntelligenceOct-16-2024

To protect deep neural networks (DNNs) from adversarial attacks, adversarial training (AT) is developed by incorporating adversarial examples (AEs) into model training. Recent studies show that adversarial attacks disproportionately impact the patterns within the phase of the sample's frequency spectrum--typically containing crucial semantic information--more than those in the amplitude, resulting in the model's erroneous categorization of AEs. We find that, by mixing the amplitude of training samples' frequency spectrum with those of distractor images for AT, the model can be guided to focus on phase patterns unaffected by adversarial perturbations. As a result, the model's robustness can be improved. Unfortunately, it is still challenging to select appropriate distractor images, which should mix the amplitude without affecting the phase patterns. To this end, in this paper, we propose an optimized Adversarial Amplitude Generator (AAG) to achieve a better tradeoff between improving the model's robustness and retaining phase patterns. Based on this generator, together with an efficient AE production procedure, we design a new Dual Adversarial Training (DAT) strategy. Experiments on various datasets show that our proposed DAT leads to significantly improved robustness against diverse adversarial attacks. The source code is available at https:// github.com/Feng-peng-Li/DAT.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.12307

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Macao (0.04)
(26 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (0.76)
Government > Military (0.76)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Stage-Wise and Prior-Aware Neural Speech Phase Prediction

Liu, Fei, Ai, Yang, Du, Hui-Peng, Lu, Ye-Xin, Zheng, Rui-Chen, Ling, Zhen-Hua

arXiv.org Artificial IntelligenceOct-7-2024

This paper proposes a novel Stage-wise and Prior-aware Neural Speech Phase Prediction (SP-NSPP) model, which predicts the phase spectrum from input amplitude spectrum by two-stage neural networks. In the initial prior-construction stage, we preliminarily predict a rough prior phase spectrum from the amplitude spectrum. The subsequent refinement stage transforms the amplitude spectrum into a refined high-quality phase spectrum conditioned on the prior phase. Networks in both stages use ConvNeXt v2 blocks as the backbone and adopt adversarial training by innovatively introducing a phase spectrum discriminator (PSD). To further improve the continuity of the refined phase, we also incorporate a time-frequency integrated difference (TFID) loss in the refinement stage. Experimental results confirm that, compared to neural network-based no-prior phase prediction methods, the proposed SP-NSPP achieves higher phase prediction accuracy, thanks to introducing the coarse phase priors and diverse training criteria. Compared to iterative phase estimation algorithms, our proposed SP-NSPP does not require multiple rounds of staged iterations, resulting in higher generation efficiency.

amplitude spectrum, sp-nspp, spectrum, (15 more...)

arXiv.org Artificial Intelligence

2410.0499

Country: Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech (0.94)

Add feedback

PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition

Lei, Chengxi, Singh, Satwinder, Hou, Feng, Jia, Xiaoyun, Wang, Ruili

arXiv.org Artificial IntelligenceDec-13-2023

Most of the current speech data augmentation methods operate on either the raw waveform or the amplitude spectrum of speech. In this paper, we propose a novel speech data augmentation method called PhasePerturbation that operates dynamically on the phase spectrum of speech. Instead of statically rotating a phase by a constant degree, PhasePerturbation utilizes three dynamic phase spectrum operations, i.e., a randomization operation, a frequency masking operation, and a temporal masking operation, to enhance the diversity of speech data. We conduct experiments on wav2vec2.0 pre-trained ASR models by fine-tuning them with the PhasePerturbation augmented TIMIT corpus. The experimental results demonstrate 10.9\% relative reduction in the word error rate (WER) compared with the baseline model fine-tuned without any augmentation operation. Furthermore, the proposed method achieves additional improvements (12.9\% and 15.9\%) in WER by complementing the Vocal Tract Length Perturbation (VTLP) and the SpecAug, which are both amplitude spectrum-based augmentation methods. The results highlight the capability of PhasePerturbation to improve the current amplitude spectrum-based augmentation methods.

recognition, spectrum, speech recognition, (13 more...)

arXiv.org Artificial Intelligence

2312.08571

Country:

Asia > Taiwan (0.06)
Oceania > New Zealand (0.05)
Asia > China > Liaoning Province > Dalian (0.04)
(3 more...)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Domain Generalization with Fourier Transform and Soft Thresholding

Pan, Hongyi, Wang, Bin, Zhang, Zheyuan, Zhu, Xin, Jha, Debesh, Cetin, Ahmet Enis, Spampinato, Concetto, Bagci, Ulas

arXiv.org Artificial IntelligenceDec-12-2023

Domain generalization aims to train models on multiple source domains so that they can generalize well to unseen target domains. Among many domain generalization methods, Fourier-transform-based domain generalization methods have gained popularity primarily because they exploit the power of Fourier transformation to capture essential patterns and regularities in the data, making the model more robust to domain shifts. The mainstream Fourier-transform-based domain generalization swaps the Fourier amplitude spectrum while preserving the phase spectrum between the source and the target images. However, it neglects background interference in the amplitude spectrum. To overcome this limitation, we introduce a soft-thresholding function in the Fourier domain. We apply this newly designed algorithm to retinal fundus image segmentation, which is important for diagnosing ocular diseases but the neural network's performance can degrade across different sources due to domain shifts. The proposed technique basically enhances fundus image augmentation by eliminating small values in the Fourier domain and providing better generalization. The innovative nature of the soft thresholding fused with Fourier-transform-based domain generalization improves neural network models' performance by reducing the target images' background interference significantly. Experiments on public data validate our approach's effectiveness over conventional and state-of-the-art methods with superior segmentation metrics.

generalization, soft-thresholding function, spectrum, (12 more...)

arXiv.org Artificial Intelligence

2309.09866

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Italy (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.95)
Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.89)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Quality > Data Transformation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

CheapNET: Improving Light-weight speech enhancement network by projected loss function

Tan, Kaijun, Dai, Benzhe, Li, Jiakui, Mao, Wenyu

arXiv.org Artificial IntelligenceNov-27-2023

Noise suppression and echo cancellation are critical in speech enhancement and essential for smart devices and real-time communication. Deployed in voice processing front-ends and edge devices, these algorithms must ensure efficient real-time inference with low computational demands. Traditional edge-based noise suppression often uses MSE-based amplitude spectrum mask training, but this approach has limitations. We introduce a novel projection loss function, diverging from MSE, to enhance noise suppression. This method uses projection techniques to isolate key audio components from noise, significantly improving model performance. For echo cancellation, the function enables direct predictions on LAEC pre-processed outputs, substantially enhancing performance. Our noise suppression model achieves near state-of-the-art results with only 3.1M parameters and 0.4GFlops/s computational load. Moreover, our echo cancellation model outperforms replicated industry-leading models, introducing a new perspective in speech enhancement.

enhancement, loss function, speech enhancement, (15 more...)

arXiv.org Artificial Intelligence

2311.15959

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Speech (0.67)

Add feedback

PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization

Chattopadhyay, Prithvijit, Sarangmath, Kartik, Vijaykumar, Vivek, Hoffman, Judy

arXiv.org Artificial IntelligenceSep-22-2023

Synthetic data offers the promise of cheap and bountiful training data for settings where labeled real-world data is scarce. However, models trained on synthetic data significantly underperform when evaluated on real-world data. In this paper, we propose Proportional Amplitude Spectrum Training Augmentation (PASTA), a simple and effective augmentation strategy to improve out-of-the-box synthetic-to-real (syn-to-real) generalization performance. PASTA perturbs the amplitude spectra of synthetic images in the Fourier domain to generate augmented views. Specifically, with PASTA we propose a structured perturbation strategy where high-frequency components are perturbed relatively more than the low-frequency ones. For the tasks of semantic segmentation (GTAV-to-Real), object detection (Sim10K-to-Real), and object recognition (VisDA-C Syn-to-Real), across a total of 5 syn-to-real shifts, we find that PASTA outperforms more complex state-of-the-art generalization methods while being complementary to the same.

asta, generalization, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2212.00979

Country:

North America > United States (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.47)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback