AITopics | youden

Collaborating Authors

youden

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

OpenAUC: TowardsAUC-Oriented Open-SetRecognition

Neural Information Processing SystemsFeb-11-2026, 01:41:22 GMT

However, how to evaluate model performance in this challenging task is still unsolved.

data mining, machine learning, openauc, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Italy (0.04)
Asia > China (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.30)

Add feedback

HyperCore: Coreset Selection under Noise via Hypersphere Models

Moser, Brian B., Shanbhag, Arundhati S., Nauen, Tobias C., Frolov, Stanislav, Raue, Federico, Folz, Joachim, Dengel, Andreas

arXiv.org Artificial IntelligenceNov-18-2025

The goal of coreset selection methods is to identify representative subsets of datasets for efficient model training. Yet, existing methods often ignore the possibility of annotation errors and require fixed pruning ratios, making them impractical in real-world settings. We present HyperCore, a robust and adaptive coreset selection framework designed explicitly for noisy environments. HyperCore leverages lightweight hypersphere models learned per class, embedding in-class samples close to a hypersphere center while naturally segregating out-of-class samples based on their distance. By using Youden's J statistic, HyperCore can adaptively select pruning thresholds, enabling automatic, noise-aware data pruning without hyperparameter tuning. Our experiments reveal that HyperCore consistently surpasses state-of-the-art coreset selection methods, especially under noisy and low-data regimes. HyperCore effectively discards mislabeled and ambiguous points, yielding compact yet highly informative subsets suitable for scalable and noise-free learning.

artificial intelligence, data mining, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2509.21746

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text

Nguyen, Trieu Hai, Akilesh, Sivaswamy

arXiv.org Artificial IntelligenceOct-1-2025

The rapid development research of Large Language Models (LLMs) based on transformer architectures raises key challenges, one of them being the task of distinguishing between human-written text and LLM-generated text. As LLM-generated textual content, becomes increasingly complex over time, and resembles human writing, traditional detection methods are proving less effective, especially as the number and diversity of LLMs continue to grow with new models and versions being released at a rapid pace. This study proposes VietBinoculars, an adaptation of the Binoculars method with optimized global thresholds, to enhance the detection of Vietnamese LLM-generated text. We have constructed new Vietnamese AI-generated datasets to determine the optimal thresholds for VietBinoculars and to enable benchmarking. The results from our experiments show results show that VietBinoculars achieves over 99\% in all two domains of accuracy, F1-score, and AUC on multiple out-of-domain datasets. It outperforms the original Binoculars model, traditional detection methods, and other state-of-the-art approaches, including commercial tools such as ZeroGPT and DetectGPT, especially under specially modified prompting strategies.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2509.26189

Country:

Asia > Vietnam (0.28)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine (0.68)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Semantic Similarity-Informed Bayesian Borrowing for Quantitative Signal Detection of Adverse Events

Haguinet, François, Painter, Jeffery L, Powell, Gregory E, Callegaro, Andrea, Bate, Andrew

arXiv.org Artificial IntelligenceMay-20-2025

We present a Bayesian dynamic borrowing (BDB) approach to enhance the quantitative identification of adverse events (AEs) in spontaneous reporting systems (SRSs). The method embeds a robust meta-analytic predictive (MAP) prior with a Bayesian hierarchical model and incorporates semantic similarity measures (SSMs) to enable weighted information sharing from clinically similar MedDRA Preferred Terms (PTs) to the target PT. This continuous similarity-based borrowing overcomes limitations of rigid hierarchical grouping in current disproportionality analysis (DPA). Using data from the FDA Adverse Event Reporting System (FAERS) between 2015 and 2019, we evaluate our approach -- termed IC SSM -- against traditional Information Component (IC) analysis and IC with borrowing at the MedDRA high-level group term level (IC HLGT). A reference set (PVLens), derived from FDA product label update, enabled prospective evaluation of method performance in identifying AEs prior to official labeling. The IC SSM approach demonstrated higher sensitivity (1332/2337=0.570, Youden's J=0.246) than traditional IC (Se=0.501, J=0.250) and IC HLGT (Se=0.556, J=0.225), consistently identifying more true positives and doing so on average 5 months sooner than traditional IC. Despite a marginally lower aggregate F1-score and Youden's index, IC SSM showed higher performance in early post-marketing periods or when the detection threshold was raised, providing more stable and relevant alerts than IC HLGT and traditional IC. These findings support the use of SSM-informed Bayesian borrowing as a scalable and context-aware enhancement to traditional DPA methods, with potential for validation across other datasets and exploration of additional similarity metrics and Bayesian strategies using case-level data.

machine learning, natural language, youden, (19 more...)

arXiv.org Artificial Intelligence

2504.12052

Country: North America > United States (0.86)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.68)
Government > Regional Government > North America Government > United States Government > FDA (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

Add feedback

A Cross-lingual Natural Language Processing Framework for Infodemic Management

Pal, Ridam, Pandey, Rohan, Gautam, Vaibhav, Bhagat, Kanav, Sethi, Tavpritesh

arXiv.org Artificial IntelligenceOct-30-2020

The COVID-19 pandemic has put immense pressure on health systems which are further strained due to the misinformation surrounding it. Under such a situation, providing the right information at the right time is crucial. There is a growing demand for the management of information spread using Artificial Intelligence. Hence, we have exploited the potential of Natural Language Processing for identifying relevant information that needs to be disseminated amongst the masses. In this work, we present a novel Cross-lingual Natural Language Processing framework to provide relevant information by matching daily news with trusted guidelines from the World Health Organization. The proposed pipeline deploys various techniques of NLP such as summarizers, word embeddings, and similarity metrics to provide users with news articles along with a corresponding healthcare guideline. A total of 36 models were evaluated and a combination of LexRank based summarizer on Word2Vec embedding with Word Mover distance metric outperformed all other models. This novel open-source approach can be used as a template for proactive dissemination of relevant healthcare information in the midst of misinformation spread associated with epidemics.

machine learning, natural language, news article, (20 more...)

arXiv.org Artificial Intelligence

2010.16357

Country:

Asia > India > Uttar Pradesh > Lucknow (0.04)
North America > Canada > Ontario > Middlesex County > London (0.04)
Asia > China (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback