AITopics | quality assessment

Collaborating Authors

quality assessment

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation

Neural Information Processing SystemsJun-19-2026, 03:03:47 GMT

Speech signal analysis poses significant challenges, particularly in tasks such as speech quality evaluation and profiling, where the goal is to predict multiple perceptual and objective metrics. For instance, metrics like PESQ (Perceptual Evaluation of Speech Quality), STOI (Short-Time Objective Intelligibility), and MOS (Mean Opinion Score) each capture different aspects of speech quality. However, these metrics often have different scales, assumptions, and dependencies, making joint estimation non-trivial. To address these issues, we introduce ARECHO (Autoregressive Evaluation via Chain-based Hypothesis Optimization), a chain-based, versatile evaluation system for speech assessment grounded in autoregressive dependency modeling. ARECHO is distinguished by three key innovations: (1) a comprehensive speech information tokenization pipeline; (2) a dynamic classifier chain that explicitly captures inter-metric dependencies; and (3) a two-step confidence-oriented decoding algorithm that enhances inference reliability. Experiments demonstrate that ARECHO significantly outperforms the baseline framework across diverse evaluation scenarios, including enhanced speech analysis, speech generation evaluation, and, noisy speech evaluation. Furthermore, its dynamic dependency modeling improves interpretability by capturing inter-metric relationships. Across tasks, ARECHO offers reference-free evaluation using its dynamic classifier chain to support subset queries (single or multiple metrics) and reduces error propagation via confidence-oriented decoding.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia (0.67)
North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Speech (1.00)
(4 more...)

Add feedback

Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models

Neural Information Processing SystemsJun-14-2026, 08:00:56 GMT

Vision-Language Models (VLMs) have demonstrated great potential in interpreting remote sensing (RS) images through language-guided semantic. However, the effectiveness of these VLMs critically depends on high-quality image-text training data that captures rich semantic relationships between visual content and language descriptions. Unlike natural images, RS lacks large-scale interleaved image-text pairs from web data, making data collection challenging. While current approaches rely primarily on rule-based methods or flagship VLMs for data synthesis, a systematic framework for automated quality assessment of such synthetically generated RS vision-language data is notably absent. To fill this gap, we propose a novel score model trained on large-scale RS vision-language preference data for automated quality assessment. Our empirical results demonstrate that fine-tuning CLIP or advanced VLMs (e.g., Qwen2-VL) with the top 30% of data ranked by our score model achieves superior accuracy compared to both full-data fine-tuning and CLIP-score-based ranking approaches. Furthermore, we demonstrate applications of our scoring model for reinforcement learning (RL) training and best-of-N (BoN) test-time scaling, enabling significant improvements in VLM performance for RS tasks. Our code, model, and dataset are publicly available.

artificial intelligence, machine learning, natural language, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.77)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.60)
Information Technology > Artificial Intelligence > Natural Language (0.60)

Add feedback

Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity

Neural Information Processing SystemsJun-11-2026, 22:39:56 GMT

Despite rapid advances in 3D content generation, quality assessment for the generated 3D assets remains challenging.Existing methods mainly rely on image-based metrics and operate solely at the object level, limiting their ability to capture spatial Despite rapid advances in 3D content generation, quality assessment for the generated 3D assets remains challenging.Existing methods mainly rely on image-based metrics and operate solely at the object level, limiting their ability to capture spatial coherence, material authenticity, and high-fidelity local details.1) To address these challenges, we introduce Hi3DEval, a hierarchical evaluation framework tailored for 3D generative content. It combines both object-level and part-level evaluation, enabling holistic assessments across multiple dimensions as well as fine-grained quality analysis. Additionally, we extend texture evaluation beyond aesthetic appearance by explicitly assessing material realism, focusing on attributes such as albedo, saturation, and metallicness.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.62)

Add feedback

Influence Guided Context Selection for Effective Retrieval-Augmented Generation

Neural Information Processing SystemsJun-11-2026, 08:35:32 GMT

Retrieval-Augmented Generation (RAG) addresses large language model (LLM) hallucinations by grounding responses in external knowledge, but its effectiveness is compromised by poor-quality retrieved contexts containing irrelevant or noisy information. While existing approaches attempt to improve performance through context selection based on predefined context quality assessment metrics, they show limited gains over standard RAG. We attribute this limitation to their failure in holistically utilizing available information (query, context list, and generator) for comprehensive quality assessment. Inspired by recent advances in data selection, we reconceptualize context quality assessment as an inference-time data valuation problem and introduce the Contextual Influence Value (CI value).

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Probabilistic data quality assessment for structural monitoring data via outlier-resistant conditional diffusion model

Li, Qi, Huang, Yong, Li, Hui

arXiv.org Machine LearningApr-30-2026

Data quality assessment is an essential step that ensures the reliability of the subsequent structural health monitoring (SHM) tasks. This study proposes a prediction deviation-based SHM data quality assessment method using a univariate implicit auto-regressive model, enabling outlier diagnosis and data cleaning. The proposed conditional diffusion model (CDM) augments the standard diffusion model with a conditional embedding module to incorporate temporal context, quartile normalization to mitigate distribution skew, and a Huber loss to enhance robustness against outliers. Within this univariate implicit autoregressive framework, each data point is assigned an outlier probability, quantifying its degree of "outlier-ness", and a global quality evaluation score is computed to characterize the overall dataset quality. Extensive case studies utilizing operational data from real-world structures demonstrate that the proposed framework significantly improves the accuracy of data quality assessment, outperforming other strong baselines representative of clustering, isolation-based, and deep reconstruction methods. The effectiveness and robustness of the proposed framework are further demonstrated by the findings of ablation experiments and hyperparameter analysis.

artificial intelligence, data quality, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1016/j.eswa.2026.132181

2604.26366

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Ground > Rail (0.69)
Energy (0.67)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Supplementary Materials for Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment

Neural Information Processing SystemsApr-29-2026, 19:21:13 GMT

The details of multiple datasets for OIQA task are presented in Table A. For the dataset that contains scanpath coordinates, we can directly sample viewport sequences from it and use our network to predict the quality scores. However, it is challenging and costly to record user scanpath data for every ODI in realistic scenarios. The scanpath information is likely unavailable when evaluating the quality of a panorama. Therefore, we propose a generalized Recursive Probability Sampling (RPS) method to generate multiple pseudo viewport sequences for the panorama, which assists the network to predict an accurate quality score in a way that is similar to the observer's actual scoring process. In JUFE and JXUFE, each ODI consists of 300 viewport coordinates, recorded using a head-mounted display (HMD).

artificial intelligence, machine learning, quality assessment, (15 more...)

Neural Information Processing Systems

Country: Asia (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.72)

Add feedback

Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment

Neural Information Processing SystemsApr-29-2026, 19:21:09 GMT

Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs) without relying on pristine-quality image information. It is becoming more significant with the increasing advancement of virtual reality (VR) technology. However, the quality assessment of ODIs is severely hampered by the fact that the existing BOIQA pipeline lacks the modeling of the observer's browsing process. To tackle this issue, we propose a novel multi-sequence network for BOIQA called Assessor360, which is derived from the realistic multi-assessor ODI quality assessment procedure. Specifically, we propose a generalized Recursive Probability Sampling (RPS) method for the BOIQA task, combining content and details information to generate multiple pseudo viewport sequences from a given starting point.

artificial intelligence, human computer interaction, machine learning, (20 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop

Neural Information Processing SystemsApr-24-2026, 17:09:32 GMT

No-reference image quality assessment (NR-IQA) aims to quantify how humans perceive visual distortions of digital images without access to their undistorted references. NR-IQA models are extensively studied in computational vision, and are widely used for performance evaluation and perceptual optimization of man-made vision systems. Here we make one of the first attempts to examine the perceptual robustness of NR-IQA models. Under a Lagrangian formulation, we identify insightful connections of the proposed perceptual attack to previous beautiful ideas in computer vision and machine learning. We test one knowledgedriven and three data-driven NR-IQA methods under four full-reference IQA models (as approximations to human perception of just-noticeable differences). Through carefully designed psychophysical experiments, we find that all four NRIQA models are vulnerable to the proposed perceptual attack. More interestingly, we observe that the generated counterexamples are not transferable, manifesting themselves as distinct design flows of respective NR-IQA methods.

artificial intelligence, machine learning, nr-iqa model, (14 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Technology: