AITopics | natural noise

Collaborating Authors

natural noise

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Diffusion Recommender Models and the Illusion of Progress: A Concerning Study of Reproducibility and a Conceptual Mismatch

Benigni, Michael, Dacrema, Maurizio Ferrari, Jannach, Dietmar

arXiv.org Artificial IntelligenceMay-19-2025

Countless new machine learning models are published every year and are reported to significantly advance the state-of-the-art in \emph{top-n} recommendation. However, earlier reproducibility studies indicate that progress in this area may be quite limited. Specifically, various widespread methodological issues, e.g., comparisons with untuned baseline models, have led to an \emph{illusion of progress}. In this work, our goal is to examine whether these problems persist in today's research. To this end, we aim to reproduce the latest advancements reported from applying modern Denoising Diffusion Probabilistic Models to recommender systems, focusing on four models published at the top-ranked SIGIR conference in 2023 and 2024. Our findings are concerning, revealing persistent methodological problems. Alarmingly, through experiments, we find that the latest recommendation techniques based on diffusion models, despite their computational complexity and substantial carbon footprint, are consistently outperformed by simpler existing models. Furthermore, we identify key mismatches between the characteristics of diffusion models and those of the traditional \emph{top-n} recommendation task, raising doubts about their suitability for recommendation. We also note that, in the papers we analyze, the generative capabilities of these models are constrained to a minimum. Overall, our results and continued methodological issues call for greater scientific rigor and a disruptive change in the research and publication culture in this area.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.09364

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization

He, Yuhao, Tian, Jinyu, Zheng, Xianwei, Dong, Li, Li, Yuanman, Zhou, Jiantao

arXiv.org Artificial IntelligenceDec-4-2024

Recent studies have shown that deep learning models are very vulnerable to poisoning attacks. Many defense methods have been proposed to address this issue. However, traditional poisoning attacks are not as threatening as commonly believed. This is because they often cause differences in how the model performs on the training set compared to the validation set. Such inconsistency can alert defenders that their data has been poisoned, allowing them to take the necessary defensive actions. In this paper, we introduce a more threatening type of poisoning attack called the Deferred Poisoning Attack. This new attack allows the model to function normally during the training and validation phases but makes it very sensitive to evasion attacks or even natural noise. We achieve this by ensuring the poisoned model's loss function has a similar value as a normally trained model at each input sample but with a large local curvature. A similar model loss ensures that there is no obvious inconsistency between the training and validation accuracy, demonstrating high stealthiness. On the other hand, the large curvature implies that a small perturbation may cause a significant increase in model loss, leading to substantial performance degradation, which reflects a worse robustness. We fulfill this purpose by making the model have singular Hessian information at the optimal point via our proposed Singularization Regularization term. We have conducted both theoretical and empirical analyses of the proposed method and validated its effectiveness through experiments on image classification tasks. Furthermore, we have confirmed the hazards of this form of poisoning attack under more general scenarios using natural noise, offering a new perspective for research in the field of security.

perturbation, poisoning attack, robustness, (17 more...)

arXiv.org Artificial Intelligence

2411.03752

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Asia > Macao (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Balancing Label Imbalance in Federated Environments Using Only Mixup and Artificially-Labeled Noise

Sang, Kyle, Rabbani, Tahseen, Huang, Furong

arXiv.org Artificial IntelligenceSep-20-2024

Clients in a distributed or federated environment will often hold data skewed towards differing subsets of labels. This scenario, referred to as heterogeneous or non-iid federated learning, has been shown to significantly hinder model training and performance. In this work, we explore the limits of a simple yet effective augmentation strategy for balancing skewed label distributions: filling in underrepresented samples of a particular label class using pseudo-images. While existing algorithms exclusively train on pseudo-images such as mixups of local training data, our augmented client datasets consist of both real and pseudo-images. In further contrast to other literature, we (1) use a DP-Instahide variant to reduce the decodability of our image encodings and (2) as a twist, supplement local data using artificially labeled, training-free 'natural noise' generated by an untrained StyleGAN. These noisy images mimic the power spectra patterns present in natural scenes which, together with mixup images, help homogenize label distribution among clients. We demonstrate that small amounts of augmentation via mixups and natural noise markedly improve label-skewed CIFAR-10 and MNIST training.

mixup, natural image, natural noise, (11 more...)

arXiv.org Artificial Intelligence

2409.13235

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Contextual Text Denoising with Masked Language Models

Sun, Yifu, Jiang, Haoming

arXiv.org Artificial IntelligenceMar-5-2024

Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks. Unfortunately, state-of-the-art models are vulnerable to noisy texts. We propose a new contextual text denoising algorithm based on the ready-to-use masked language model. The proposed algorithm does not require retraining of the model and can be integrated into any NLP system without additional training on paired cleaning training data. We evaluate our method under synthetic noise and natural noise and show that the proposed algorithm can use context information to correct noise text and improve the performance of noisy inputs in several downstream tasks.

algorithm, candidate list, noise, (13 more...)

arXiv.org Artificial Intelligence

1910.1408

Country: Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

TNANet: A Temporal-Noise-Aware Neural Network for Suicidal Ideation Prediction with Noisy Physiological Data

Liu, Niqi, Liu, Fang, Ji, Wenqi, Du, Xinxin, Liu, Xu, Zhao, Guozhen, Mu, Wenting, Liu, Yong-Jin

arXiv.org Artificial IntelligenceJan-23-2024

The robust generalization of deep learning models in the presence of inherent noise remains a significant challenge, especially when labels are subjective and noise is indiscernible in natural settings. This problem is particularly pronounced in many practical applications. In this paper, we address a special and important scenario of monitoring suicidal ideation, where time-series data, such as photoplethysmography (PPG), is susceptible to such noise. Current methods predominantly focus on image and text data or address artificially introduced noise, neglecting the complexities of natural noise in time-series analysis. To tackle this, we introduce a novel neural network model tailored for analyzing noisy physiological time-series data, named TNANet, which merges advanced encoding techniques with confidence learning, enhancing prediction accuracy. Another contribution of our work is the collection of a specialized dataset of PPG signals derived from real-world environments for suicidal ideation prediction. Employing this dataset, our TNANet achieves the prediction accuracy of 63.33% in a binary classification task, outperforming state-of-the-art models. Furthermore, comprehensive evaluations were conducted on three other well-known public datasets with artificially introduced noise to rigorously test the TNANet's capabilities. These tests consistently demonstrated TNANet's superior performance by achieving an accuracy improvement of more than 10% compared to baseline methods.

dataset, suicidal ideation, tnanet, (15 more...)

arXiv.org Artificial Intelligence

2401.12733

Country:

Europe > Denmark (0.04)
North America > United States > Colorado (0.04)
Europe > United Kingdom > Wales (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.48)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency

Wang, Yan, Li, Yuhang, Gong, Ruihao, Liu, Aishan, Wang, Yanfei, Hu, Jian, Yao, Yongqiang, Zhang, Yunchen, Xiao, Tianzi, Yu, Fengwei, Liu, Xianglong

arXiv.org Artificial IntelligenceJul-1-2023

Extensive studies have shown that deep learning models are vulnerable to adversarial and natural noises, yet little is known about model robustness on noises caused by different system implementations. In this paper, we for the first time introduce SysNoise, a frequently occurred but often overlooked noise in the deep learning training-deployment cycle. In particular, SysNoise happens when the source training system switches to a disparate target system in deployments, where various tiny system mismatch adds up to a non-negligible difference. We first identify and classify SysNoise into three categories based on the inference stage; we then build a holistic benchmark to quantitatively measure the impact of SysNoise on 20+ models, comprehending image classification, object detection, instance segmentation and natural language processing tasks. Our extensive experiments revealed that SysNoise could bring certain impacts on model robustness across different tasks and common mitigations like data augmentation and adversarial training show limited effects on it. Together, our findings open a new research topic and we hope this work will raise research attention to deep learning deployment systems accounting for model performance. Based in handling multiple tasks (Krizhevsky et al., 2012; Simonyan on where SysNoise could happen, we classify it into three & Zisserman, 2014; He et al., 2016a; Devlin et al., different types. Pre-processing: Depends on the implementation 2018; Brown et al., 2020), yet they are vulnerable against of input data. Despite the progress devoted to noises made by decoding (JPEG2RGB) algorithms and different interpolation human-being or nature (e.g., adversarial noises (Goodfellow methods for image resize and crop. In practice, have different results when the upsampling operator is different.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2307.0028

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Florida > Miami-Dade County > Miami Beach (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Word Shape Matters: Robust Machine Translation with Visual Embedding

Wang, Haohan, Zhang, Peiyan, Xing, Eric P.

arXiv.org Artificial IntelligenceOct-20-2020

Neural machine translation has achieved remarkable empirical performance over standard benchmark datasets, yet recent evidence suggests that the models can still fail easily dealing with substandard inputs such as misspelled words, To overcome this issue, we introduce a new encoding heuristic of the input symbols for character-level NLP models: it encodes the shape of each character through the images depicting the letters when printed. We name this new strategy visual embedding and it is expected to improve the robustness of NLP models because humans also process the corpus visually through printed letters, instead of machinery one-hot vectors. Empirically, our method improves models' robustness against substandard inputs, even in the test scenario where the models are tested with the noises that are beyond what is available during the training phase.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2010.09997

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Training on Synthetic Noise Improves Robustness to Natural Noise in Machine Translation

Karpukhin, Vladimir, Levy, Omer, Eisenstein, Jacob, Ghazvininejad, Marjan

arXiv.org Machine LearningFeb-4-2019

We consider the problem of making machine translation more robust to character-level variation at the source side, such as typos. Existing methods achieve greater coverage by applying subword models such as byte-pair encoding (BPE) and character-level encoders, but these methods are highly sensitive to spelling mistakes. We show how training on a mild amount of random synthetic noise can dramatically improve robustness to these variations, without diminishing performance on clean text. We focus on translation performance on natural noise, as captured by frequent corrections in Wikipedia edit logs, and show that robustness to such noise can be achieved using a balanced diet of simple synthetic noises at training time, without access to the natural noise data or distribution.

natural noise, noise, synthetic noise, (12 more...)

arXiv.org Machine Learning

1902.01509

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

On Adversarial Examples for Character-Level Neural Machine Translation

Ebrahimi, Javid, Lowd, Daniel, Dou, Dejing

arXiv.org Artificial IntelligenceJun-23-2018

Evaluating on adversarial examples has become a standard procedure to measure robustness of deep learning models. Due to the difficulty of creating white-box adversarial examples for discrete text input, most analyses of the robustness of NLP models have been done through black-box adversarial examples. We investigate adversarial examples for character-level neural machine translation (NMT), and contrast black-box adversaries with a novel white-box adversary, which employs differentiable string-edit operations to rank adversarial changes. We propose two novel types of attacks which aim to remove or change a word in a translation, rather than simply break the NMT. We demonstrate that white-box adversarial examples are significantly stronger than their black-box counterparts in different attack scenarios, which show more serious vulnerabilities than previously known. In addition, after performing adversarial training, which takes only 3 times longer than regular training, we can improve the model's robustness significantly.

adversarial example, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

1806.0903

Country:

Europe > France (0.04)
North America > United States > Oregon (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.66)
Government > Military (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback