AITopics

2507.12669

Country: North America > United States > California > San Francisco County > San Francisco (0.29)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Dorner, Florian E., Chen, Yatong, Cruz, André F., Yang, Fanny

ROC-n-reroll: How verifier imperfection affects test-time scaling

arXiv.org Machine LearningJul-17-2025

Test-time scaling aims to improve language model performance by leveraging additional compute during inference. While many works have empirically studied techniques like Best-of-N (BoN) and rejection sampling that make use of a verifier to enable test-time scaling, there is little theoretical understanding of how verifier imperfection affects performance. In this work, we address this gap. Specifically, we prove how instance-level accuracy of these methods is precisely characterized by the geometry of the verifier's ROC curve. Interestingly, while scaling is determined by the local geometry of the ROC curve for rejection sampling, it depends on global properties of the ROC curve for BoN. As a consequence when the ROC curve is unknown, it is impossible to extrapolate the performance of rejection sampling based on the low-compute regime. Furthermore, while rejection sampling outperforms BoN for fixed compute, in the infinite-compute limit both methods converge to the same level of accuracy, determined by the slope of the ROC curve near the origin. Our theoretical results are confirmed by experiments on GSM8K using different versions of Llama and Qwen to generate and verify solutions.

artificial intelligence, machine learning, roc curve, (16 more...)

arXiv.org Machine Learning

2507.12399

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Romanus, Ogonnaya Michael, Boulaguiem, Younes, Molinari, Roberto

Fiducial Matching: Differentially Private Inference for Categorical Data

arXiv.org Machine LearningJul-17-2025

The task of statistical inference, which includes the building of confidence intervals and tests for parameters and effects of interest to a researcher, is still an open area of investigation in a differentially private (DP) setting. Indeed, in addition to the randomness due to data sampling, DP delivers another source of randomness consisting of the noise added to protect an individual's data from being disclosed to a potential attacker. As a result of this convolution of noises, in many cases it is too complicated to determine the stochastic behavior of the statistics and parameters resulting from a DP procedure. In this work, we contribute to this line of investigation by employing a simulation-based matching approach, solved through tools from the fiducial framework, which aims to replicate the data generation pipeline (including the DP step) and retrieve an approximate distribution of the estimates resulting from this pipeline. For this purpose, we focus on the analysis of categorical (nominal) data that is common in national surveys, for which sensitivity is naturally defined, and on additive privacy mechanisms. We prove the validity of the proposed approach in terms of coverage and highlight its good computational and statistical performance for different inferential tasks in simulated and applied data settings.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2507.11762

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Alabama > Lee County > Auburn (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.96)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)

Kwak, Jaehyun, Inhar, Ramahdani Muhammad Izaaz, Yun, Se-Young, Lee, Sung-Ju

QuRe: Query-Relevant Retrieval through Hard Negative Sampling in Composed Image Retrieval

Composed Image Retrieval (CIR) retrieves relevant images based on a reference image and accompanying text describing desired modifications. However, existing CIR methods only focus on retrieving the target image and disregard the relevance of other images. This limitation arises because most methods employing contrastive learning-which treats the target image as positive and all other images in the batch as negatives-can inadvertently include false negatives. This may result in retrieving irrelevant images, reducing user satisfaction even when the target image is retrieved. To address this issue, we propose Query-Relevant Retrieval through Hard Negative Sampling (QuRe), which optimizes a reward model objective to reduce false negatives. Additionally, we introduce a hard negative sampling strategy that selects images positioned between two steep drops in relevance scores following the target image, to effectively filter false negatives. In order to evaluate CIR models on their alignment with human satisfaction, we create Human-Preference FashionIQ (HP-FashionIQ), a new dataset that explicitly captures user preferences beyond target retrieval. Extensive experiments demonstrate that QuRe achieves state-of-the-art performance on FashionIQ and CIRR datasets while exhibiting the strongest alignment with human preferences on the HP-FashionIQ dataset. The source code is available at https://github.com/jackwaky/QuRe.

artificial intelligence, machine learning, target image, (11 more...)

2507.12416

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.77)

Smit, Reuben, Louw, Retief, Kamper, Herman

Towards few-shot isolated word reading assessment

We explore an ASR-free method for isolated word reading assessment in low-resource settings. Our few-shot approach compares input child speech to a small set of adult-provided reference templates. Inputs and templates are encoded using intermediate layers from large self-supervised learned (SSL) models. Using an Afrikaans child speech benchmark, we investigate design options such as discretising SSL features and barycentre averaging of the templates. Idealised experiments show reasonable performance for adults, but a substantial drop for child speech input, even with child templates. Despite the success of employing SSL representations in low-resource speech tasks, our work highlights the limitations of SSL representations for processing child data when used in a few-shot classification system.

artificial intelligence, machine learning, template, (16 more...)

2507.12217

Country: Africa (0.14)

Genre: Research Report > New Finding (0.94)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Self-Adaptive and Robust Federated Spectrum Sensing without Benign Majority for Cellular Networks

Pham, Ngoc Duy, Dayaratne, Thusitha, Vo, Viet, Lai, Shangqi, Abuadbba, Sharif, Suzuki, Hajime, Yuan, Xingliang, Rudolph, Carsten

Advancements in wireless and mobile technologies, including 5G advanced and the envisioned 6G, are driving exponential growth in wireless devices. However, this rapid expansion exacerbates spectrum scarcity, posing a critical challenge. Dynamic spectrum allocation (DSA)--which relies on sensing and dynamically sharing spectrum--has emerged as an essential solution to address this issue. While machine learning (ML) models hold significant potential for improving spectrum sensing, their adoption in centralized ML-based DSA systems is limited by privacy concerns, bandwidth constraints, and regulatory challenges. To overcome these limitations, distributed ML-based approaches such as Federated Learning (FL) offer promising alternatives. This work addresses two key challenges in FL-based spectrum sensing (FLSS). First, the scarcity of labeled data for training FL models in practical spectrum sensing scenarios is tackled with a semi-supervised FL approach, combined with energy detection, enabling model training on unlabeled datasets. Second, we examine the security vulnerabilities of FLSS, focusing on the impact of data poisoning attacks. Our analysis highlights the shortcomings of existing majority-based defenses in countering such attacks. To address these vulnerabilities, we propose a novel defense mechanism inspired by vaccination, which effectively mitigates data poisoning attacks without relying on majority-based assumptions. Extensive experiments on both synthetic and real-world datasets validate our solutions, demonstrating that FLSS can achieve near-perfect accuracy on unlabeled datasets and maintain Byzantine robustness against both targeted and untargeted data poisoning attacks, even when a significant proportion of participants are malicious.

artificial intelligence, dataset, machine learning, (17 more...)

2507.12127

Genre:

Research Report > New Finding (1.00)
Overview (0.93)
Research Report > Promising Solution (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.69)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Interpretable Prediction of Lymph Node Metastasis in Rectal Cancer MRI Using Variational Autoencoders

Keel, Benjamin, Quyn, Aaron, Jayne, David, Mohsin, Maryam, Relton, Samuel D.

Effective treatment for rectal cancer relies on accurate lymph node metastasis (LNM) staging. However, radiological criteria based on lymph node (LN) size, shape and texture morphology have limited diagnostic accuracy. In this work, we investigate applying a Variational Autoencoder (VAE) as a feature encoder model to replace the large pre-trained Convolutional Neural Network (CNN) used in existing approaches. The motivation for using a VAE is that the generative model aims to reconstruct the images, so it directly encodes visual features and meaningful patterns across the data. This leads to a disentangled and structured latent space which can be more interpretable than a CNN. Models are deployed on an in-house MRI dataset with 168 patients who did not undergo neo-adjuvant treatment. The post-operative pathological N stage was used as the ground truth to evaluate model predictions. Our proposed model 'VAE-MLP' achieved state-of-the-art performance on the MRI dataset, with cross-validated metrics of AUC 0.86 +/- 0.05, Sensitivity 0.79 +/- 0.06, and Specificity 0.85 +/- 0.05. Code is available at: https://github.com/benkeel/Lymph_Node_Classification_MIUA.

artificial intelligence, machine learning, prediction, (15 more...)

doi: 10.1007/978-3-031-98691-8_2

2507.11638

Country: Europe > United Kingdom > England (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Synthetic Tabular Data Generation: A Comparative Survey for Modern Techniques

Challagundla, Raju, Dorodchi, Mohsen, Wang, Pu, Lee, Minwoo

As privacy regulations become more stringent and access to real-world data becomes increasingly constrained, synthetic data generation has emerged as a vital solution, especially for tabular datasets, which are central to domains like finance, healthcare and the social sciences. This survey presents a comprehensive and focused review of recent advances in synthetic tabular data generation, emphasizing methods that preserve complex feature relationships, maintain statistical fidelity, and satisfy privacy requirements. A key contribution of this work is the introduction of a novel taxonomy based on practical generation objectives, including intended downstream applications, privacy guarantees, and data utility, directly informing methodological design and evaluation strategies. Therefore, this review prioritizes the actionable goals that drive synthetic data creation, including conditional generation and risk-sensitive modeling. Additionally, the survey proposes a benchmark framework to align technical innovation with real-world demands. By bridging theoretical foundations with practical deployment, this work serves as both a roadmap for future research and a guide for implementing synthetic tabular data in privacy-critical environments.

data mining, machine learning, natural language, (20 more...)

2507.1159

Country: North America > United States > North Carolina (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.46)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(11 more...)

What's Pulling the Strings? Evaluating Integrity and Attribution in AI Training and Inference through Concept Shift

Chang, Jiamin, Li, Haoyang, Pearce, Hammond, Sun, Ruoxi, Li, Bo, Xue, Minhui

The growing adoption of artificial intelligence (AI) has amplified concerns about trustworthiness, including integrity, privacy, robustness, and bias. To assess and attribute these threats, we propose ConceptLens, a generic framework that leverages pre-trained multimodal models to identify the root causes of integrity threats by analyzing Concept Shift in probing samples. ConceptLens demonstrates strong detection performance for vanilla data poisoning attacks and uncovers vulnerabilities to bias injection, such as the generation of covert advertisements through malicious concept shifts. It identifies privacy risks in unaltered but high-risk samples, filters them before training, and provides insights into model weaknesses arising from incomplete or imbalanced training data. Additionally, at the model level, it attributes concepts that the target model is overly dependent on, identifies misleading concepts, and explains how disrupting key concepts negatively impacts the model. Furthermore, it uncovers sociological biases in generative content, revealing disparities across sociological contexts. Strikingly, ConceptLens reveals how safe training and inference data can be unintentionally and easily exploited, potentially undermining safety alignment. Our study informs actionable insights to breed trust in AI systems, thereby speeding adoption and driving greater innovation.

conceptlen, data mining, machine learning, (18 more...)

2504.21042

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.68)
Marketing (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Ochab, Jeremi K., Matias, Mateusz, Boba, Tymoteusz, Walkowiak, Tomasz

StylOch at PAN: Gradient-Boosted Trees with Frequency-Based Stylometric Features

This submission to the binary AI detection task is based on a modular stylometric pipeline, where: public spaCy models are used for text preprocessing (including tokenisation, named entity recognition, dependency parsing, part-of-speech tagging, and morphology annotation) and extracting several thousand features (frequencies of n-grams of the above linguistic annotations); light-gradient boosting machines are used as the classifier. We collect a large corpus of more than 500 000 machine-generated texts for the classifier's training. We explore several parameter options to increase the classifier's capacity and take advantage of that training set. Our approach follows the non-neural, computationally inexpensive but explainable approach found effective previously.

large language model, machine learning, natural language, (21 more...)

2507.12064

Country:

Europe > Poland (0.29)
Europe > Spain (0.28)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)