AITopics | gop

gop

Information about AI from the News, Publications, and Conferences

Evaluating Logit-Based GOP Scores for Mispronunciation Detection

Parikh, Aditya Kamlesh, Tejedor-Garcia, Cristian, Cucchiarini, Catia, Strik, Helmer

arXiv.org Artificial Intelligence, Sep-1-2025

Pronunciation assessment relies on goodness of pronunciation (GOP) scores, traditionally derived from softmax-based posterior probabilities. However, posterior probabilities may suffer from overconfidence and poor phoneme separation, limiting their effectiveness. This study compares logit-based GOP scores with probability-based GOP scores for mispronunciation detection. We conducted our experiment on two L2 English speech datasets spoken by Dutch and Mandarin speakers, assessing classification performance and correlation with human ratings. Logit-based methods outperform probability-based GOP in classification, but their effectiveness depends on dataset characteristics. The maximum logit GOP shows the strongest alignment with human perception, while a combination of different GOP scores balances probability and logit features. The findings suggest that hybrid GOP methods incorporating uncertainty modeling and phoneme-specific weighting improve pronunciation assessment.
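The contrast between probability-based and logit-based scoring can be illustrated with a minimal sketch. It assumes a phoneme segment is given as a list of per-frame logit vectors and that the target phoneme index is known; the function names and the toy values are hypothetical, and the exact definitions used in the paper may differ.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one frame's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def gop_posterior(frame_logits, target):
    """Probability-based GOP (assumed form): mean log posterior of the
    target phoneme over the frames of the segment."""
    total = sum(log_softmax(frame)[target] for frame in frame_logits)
    return total / len(frame_logits)

def gop_max_logit(frame_logits, target):
    """Logit-based GOP (assumed form): largest raw logit of the
    target phoneme over the frames of the segment."""
    return max(frame[target] for frame in frame_logits)

# Toy segment: 3 frames, 3 phoneme classes (values are illustrative only).
frames = [[2.0, 0.5, -1.0], [3.0, 2.5, 0.0], [1.0, 1.2, -0.5]]
print(gop_posterior(frames, target=0))
print(gop_max_logit(frames, target=0))
```

The posterior-based score is bounded above by 0 because it averages log probabilities, while the max-logit score keeps the unnormalized scale of the network output.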

artificial intelligence, machine learning, speech recognition, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2025-1012

2506.12067

Country: Europe (0.68)

Genre: Research Report > New Finding (0.88)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.47)

Enhancing GOP in CTC-Based Mispronunciation Detection with Phonological Knowledge

Parikh, Aditya Kamlesh, Tejedor-Garcia, Cristian, Cucchiarini, Catia, Strik, Helmer

arXiv.org Artificial Intelligence, Sep-1-2025

Computer-Assisted Pronunciation Training (CAPT) systems employ automatic measures of pronunciation quality, such as the goodness of pronunciation (GOP) metric. GOP relies on forced alignments, which are prone to labeling and segmentation errors due to acoustic variability. While alignment-free methods address these challenges, they are computationally expensive and scale poorly with phoneme sequence length and inventory size. To enhance efficiency, we introduce a substitution-aware alignment-free GOP that restricts phoneme substitutions based on phoneme clusters and common learner errors. We evaluated our GOP on two L2 English speech datasets: My Pronunciation Coach (MPC), which contains child speech, and Speechocean762, which includes child and adult speech. We compared RPS (restricted phoneme substitutions) and UPS (unrestricted phoneme substitutions) setups within alignment-free methods, which outperformed the baseline. We discuss our results and outline avenues for future research.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2025-829

2506.0208

Country: Europe (0.93)

Genre: Research Report > New Finding (0.66)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Segmentation-free Goodness of Pronunciation

Cao, Xinwei, Fan, Zijian, Svendsen, Torbjørn, Salvi, Giampiero

arXiv.org Artificial Intelligence, Jul-30-2025

Mispronunciation detection and diagnosis (MDD) is a significant part of modern computer-aided language learning (CALL) systems. Within MDD, phoneme-level pronunciation assessment is key to helping L2 learners improve their pronunciation. However, most systems are based on a form of goodness of pronunciation (GOP) that requires pre-segmentation of speech into phonetic units. This limits the accuracy of these methods and the possibility of using modern CTC-based acoustic models for their evaluation. In this study, we first propose self-alignment GOP (GOP-SA), which enables the use of CTC-trained ASR models for MDD. Next, we define a more general alignment-free method that takes all possible alignments of the target phoneme into account (GOP-AF). We give a theoretical account of our definition of GOP-AF, an implementation that solves potential numerical issues, and a proper normalization that makes the method applicable to acoustic models with different peakiness over time. We provide extensive experimental results on the CMU Kids and Speechocean762 datasets, comparing the different definitions of our methods and estimating the dependency of GOP-AF on the peakiness of the acoustic models and on the amount of context around the target phoneme. Finally, we compare our methods with recent studies on the Speechocean762 data, showing that the feature vectors derived from the proposed method achieve state-of-the-art results on phoneme-level pronunciation assessment.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.16838

Genre: Research Report > New Finding (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(3 more...)

LINR-PCGC: Lossless Implicit Neural Representations for Point Cloud Geometry Compression

Huang, Wenjie, Yang, Qi, Xia, Shuting, Huang, He, Li, Zhu, Xu, Yiling

arXiv.org Artificial Intelligence, Jul-22-2025

Existing AI-based point cloud compression methods struggle with dependence on specific training data distributions, which limits their real-world deployment. Implicit Neural Representation (INR) methods solve this problem by encoding overfitted network parameters into the bitstream, resulting in more distribution-agnostic results. However, due to limitations on encoding time and decoder size, current INR-based methods only consider lossy geometry compression. In this paper, we propose the first INR-based lossless point cloud geometry compression method, called Lossless Implicit Neural Representations for Point Cloud Geometry Compression (LINR-PCGC). To accelerate encoding, we design a coding framework that operates at the level of a group of point clouds, with an effective network initialization strategy that can reduce encoding time by around 60%. A lightweight coding network based on multiscale SparseConv, consisting of scale context extraction, child node prediction, and model compression modules, is proposed to realize fast inference and a compact decoder size. Experimental results show that our method consistently outperforms traditional and AI-based methods: for example, with the convergence time on the MVUB dataset, our method reduces the bitstream by approximately 21.21% compared to G-PCC TMC13v23 and 21.95% compared to SparsePCGC. Our project can be seen at https://huangwenjie2023.github.io/LINR-PCGC/.

artificial intelligence, machine learning, point cloud, (16 more...)

arXiv.org Artificial Intelligence

2507.15686

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

UAR-NVC: A Unified AutoRegressive Framework for Memory-Efficient Neural Video Compression

Wang, Jia, Zhang, Xinfeng, Zhang, Gai, Zhu, Jun, Tang, Lv, Zhang, Li

arXiv.org Artificial Intelligence, Mar-4-2025

Implicit Neural Representations (INRs) have demonstrated significant potential in video compression by representing videos as neural networks. However, as the number of frames increases, the memory consumption for training and inference increases substantially, posing challenges in resource-constrained scenarios. Inspired by the success of traditional video compression frameworks, which process video frame by frame and can efficiently compress long videos, we adopt this modeling strategy for INRs to decrease memory consumption, while aiming to unify the frameworks from the perspective of timeline-based autoregressive modeling. In this work, we present a novel understanding of INR models from an autoregressive (AR) perspective and introduce a Unified AutoRegressive Framework for memory-efficient Neural Video Compression (UAR-NVC). UAR-NVC integrates timeline-based and INR-based neural video compression under a unified autoregressive paradigm. It partitions videos into several clips and processes each clip using a different INR model instance, leveraging the advantages of both compression frameworks while allowing seamless adaptation to either in form. To further reduce temporal redundancy between clips, we design two modules to optimize the initialization, training, and compression of these model parameters. UAR-NVC supports adjustable latencies by varying the clip length. Extensive experimental results demonstrate that UAR-NVC, with its flexible video clip setting, can adapt to resource-constrained environments and significantly improve performance compared to different baseline models.
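The clip partitioning described in the abstract can be pictured with a small sketch. It only shows how a frame range might be split into fixed-length clips, each of which would then be handled by its own model instance; the function name and the fixed-length policy are assumptions for illustration, not the authors' implementation.

```python
def partition_into_clips(num_frames, clip_length):
    """Split frame indices [0, num_frames) into consecutive clips.

    Returns a list of (start, end) pairs with `end` exclusive. The last
    clip may be shorter than `clip_length`.
    """
    if num_frames < 0 or clip_length <= 0:
        raise ValueError("num_frames must be >= 0 and clip_length > 0")
    return [
        (start, min(start + clip_length, num_frames))
        for start in range(0, num_frames, clip_length)
    ]

# A 10-frame video with clips of 4 frames yields three clips.
print(partition_into_clips(10, 4))  # [(0, 4), (4, 8), (8, 10)]
```

A shorter clip length means fewer frames are held per model instance at once, which is the lever the abstract associates with memory use and adjustable latency.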

compression, gop, video, (15 more...)

arXiv.org Artificial Intelligence

2503.02733

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Multi-Task Decision-Making for Multi-User 360 Video Processing over Wireless Networks

Badnava, Babak, Chakareski, Jacob, Hashemi, Morteza

arXiv.org Artificial Intelligence, Jul-3-2024

We study a multi-task decision-making problem for 360 video processing in a wireless multi-user virtual reality (VR) system that includes an edge computing unit (ECU) to deliver 360 videos to VR users and offer computing assistance for decoding/rendering of video frames. However, this comes at the expense of increased data volume and required bandwidth. To balance this trade-off, we formulate a constrained quality of experience (QoE) maximization problem in which the rebuffering time and quality variation between video frames are bounded by user and video requirements. To solve the formulated multi-user QoE maximization, we leverage deep reinforcement learning (DRL) for multi-task rate adaptation and computation distribution (MTRC). The proposed MTRC approach does not rely on any predefined assumption about the environment; instead, it uses video playback statistics (e.g., past throughput, decoding time, transmission time), video information, and the resulting performance to adjust the video bitrate and computation distribution. We train MTRC with real-world wireless network traces and 360 video datasets to obtain evaluation results in terms of the average QoE, peak signal-to-noise ratio (PSNR), rebuffering time, and quality variation. Our results indicate that MTRC improves the users' QoE compared to a state-of-the-art rate adaptation algorithm. Specifically, we show a 5.97 dB to 6.44 dB improvement in PSNR, a 1.66X to 4.23X improvement in rebuffering time, and a 4.21 dB to 4.35 dB improvement in quality variation.

headset, quality variation, video, (15 more...)

arXiv.org Artificial Intelligence

2407.03426

Country:

North America > United States > New Jersey (0.04)
North America > United States > Kansas (0.04)

Genre: Research Report (0.84)

Industry:

Telecommunications (0.46)
Information Technology > Hardware (0.35)
Leisure & Entertainment (0.35)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

GOP, Dems push to expand Trump-era deal, Father's Day celebration takes deadly turn and more top headlines

FOX News, Jun-19-2023, 10:41:34 GMT

US President Joe Biden meets with China's President Xi Jinping during a virtual summit from the Roosevelt Room of the White House in Washington, DC, November 15, 2021. OLIVE BRANCH - GOP, Dems push Biden admin to expand Trump-era deal to blunt American adversaries. 'VIOLENCE PREVAILS' - Police commander doesn't mince words after Father's Day celebration takes deadly turn. IN GOOD COMPANY? - Americans react to Mark Cuban's claim on wokeness and business. JOURNALISM JAB - Media ignores Biden's 'dumb question' slam on reporter after hounding Trump.

day celebration take deadly turn, expand trump-era deal, top headline, (3 more...)

FOX News

Country:

Asia > China (0.95)
North America > United States > District of Columbia > Washington (0.26)
North America > United States > Texas (0.06)
(2 more...)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence (0.37)

Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Uncertainty Quantification

Yeo, Eun Jung, Choi, Kwanghee, Kim, Sunhee, Chung, Minhwa

arXiv.org Artificial Intelligence, May-28-2023

This paper proposes an improved Goodness of Pronunciation (GoP) that utilizes Uncertainty Quantification (UQ) for automatic speech intelligibility assessment of dysarthric speech. Current GoP methods rely heavily on overconfident neural network predictions, which makes them unsuitable for assessing dysarthric speech given its significant acoustic differences from healthy speech. To alleviate the problem, UQ techniques were applied to GoP by 1) normalizing the phoneme prediction (entropy, margin, maxlogit, logit-margin) and 2) modifying the scoring function (scaling, prior normalization). As a result, prior-normalized maxlogit GoP achieves the best performance, with a relative increase of 5.66%, 3.91%, and 23.65% compared to the baseline GoP for English, Korean, and Tamil, respectively. Furthermore, phoneme analysis is conducted to identify which phoneme scores significantly correlate with intelligibility scores in each language.
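The four uncertainty measures named in the abstract (entropy, margin, maxlogit, logit-margin) can be sketched for a single frame as follows. The definitions below are common textbook forms and are assumptions here; the paper's exact formulations, and its scaling and prior normalization steps, are not reproduced. All function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one frame's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)

def margin(logits, target):
    """Target probability minus the largest competing probability."""
    probs = softmax(logits)
    best_other = max(p for i, p in enumerate(probs) if i != target)
    return probs[target] - best_other

def target_logit(logits, target):
    """Raw (unnormalized) logit of the target phoneme."""
    return logits[target]

def logit_margin(logits, target):
    """Target logit minus the largest competing logit."""
    best_other = max(x for i, x in enumerate(logits) if i != target)
    return logits[target] - best_other

# One frame with three phoneme classes (illustrative values only).
frame = [3.0, 2.5, 0.0]
print(entropy(frame), margin(frame, 0))
print(target_logit(frame, 0), logit_margin(frame, 0))
```

The probability-based measures depend on the softmax normalization, whereas the two logit-based measures work directly on the raw network outputs, which is the distinction the abstract draws when it attributes overconfidence to normalized predictions.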

artificial intelligence, assessment, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.18392

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

GOP reacts to Hunter Biden IRS whistleblower, Fetterman raises eyebrows and more top headlines

FOX News, Apr-20-2023, 11:11:14 GMT

'DEEPLY CONCERNING' - Republicans respond after IRS whistleblower says Hunter Biden investigation is being mishandled. 'FRIGHTENING' - Fetterman's opening statement upon return to Senate after hospitalization raises eyebrows. WATCH THE WALLET - Crypto criminals beware: AI is after you. MISINFORMATION MACHINES - AI chatbot 'hallucinations' could pose political, intellectual, institutional dangers. ROYAL REACTION - Lee Cohen explains why Meghan Markle deserves praise for skipping the coronation.

fetterman raise eyebrow, hunter biden irs whistleblower, irs whistleblower, (4 more...)

FOX News

Country:

Oceania > Australia > New South Wales (0.06)
North America > United States > North Carolina (0.06)

Industry:

Media > News (0.80)
Government > Tax (0.74)
Government > Regional Government > North America Government > United States Government (0.74)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.75)

Spatially Constrained Geodesign Optimization (GOP) for Improving Agricultural Watershed Sustainability

Xie, Yiqun (University of Minnesota, Twin Cities) | Yang, KwangSoo (Florida Atlantic University) | Shekhar, Shashi (University of Minnesota, Twin Cities) | Dalzell, Brent (University of Minnesota, Twin Cities) | Mulla, David (University of Minnesota, Twin Cities)

AAAI Conferences, Feb-4-2017

Given an agricultural watershed containing a set of spatial units, and a set of land management practices, Geodesign Optimization (GOP) aims to find a land management practice for each spatial unit that optimizes overall water quality improvements in the watershed under both a budget constraint and spatial constraints (e.g., minimum contiguous area, shape) arising from farm equipment operation practicalities. GOP is important for the redesign of agricultural watersheds in the Midwestern US to mitigate soil and water quality degradation and loss of habitat. The problem is computationally challenging as a large-scale combinatorial problem (NP-hard) under spatial constraints. Existing optimization techniques do not address spatial constraints and lead to impractical solutions requiring frequent farm equipment reconfiguration. In this paper, we formalize the spatially constrained GOP and propose a novel spatial optimizer that searches for optimal solutions without constraint violations. Our approach is further validated through a Geodesign case study at the Seven Mile Creek watershed in the Midwestern US.

artificial intelligence, constraint, optimization problem, (16 more...)

AAAI Conferences

Workshops at the Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > United States > Minnesota (0.05)
North America > United States > Mississippi (0.04)
North America > United States > Kentucky (0.04)
(2 more...)

Industry:

Food & Agriculture > Agriculture (1.00)
Water & Waste Management > Water Management > Water Supplies & Services (0.57)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
