diagnostic
A Hybrid Machine Learning Approach for Synthetic Data Generation with Post Hoc Calibration for Clinical Tabular Datasets
Mahin, Md Ibrahim Shikder, Arefin, Md Shamsul, Hasan, Md Tanvir
Healthcare research and development face significant obstacles due to data scarcity and stringent privacy regulations, such as HIPAA and the GDPR, restricting access to essential real-world medical data. These limitations impede innovation, delay robust AI model creation, and hinder advancements in patient-centered care. Synthetic data generation offers a transformative solution by producing artificial datasets that emulate real data statistics while safeguarding patient privacy. We introduce a novel hybrid framework for high-fidelity healthcare data synthesis integrating five augmentation methods: noise injection, interpolation, Gaussian Mixture Model (GMM) sampling, Conditional Variational Autoencoder (CVAE) sampling, and SMOTE, combined via a reinforcement learning-based dynamic weight selection mechanism. Its key innovations include advanced calibration techniques -- moment matching, full histogram matching, soft and adaptive soft histogram matching, and iterative refinement -- that align marginal distributions and preserve joint feature dependencies. Evaluated on the Breast Cancer Wisconsin (UCI Repository) and Khulna Medical College cardiology datasets, our calibrated hybrid achieves Wasserstein distances as low as 0.001 and Kolmogorov-Smirnov statistics around 0.01, demonstrating near-zero marginal discrepancy. Pairwise trend scores surpass 90%, and Nearest Neighbor Adversarial Accuracy approaches 50%, confirming robust privacy protection. Downstream classifiers trained on synthetic data achieve up to 94% accuracy and F1 scores above 93%, comparable to models trained on real data. This scalable, privacy-preserving approach matches state-of-the-art methods, sets new benchmarks for joint-distribution fidelity in healthcare, and supports sensitive AI applications.
- North America > United States > Wisconsin (0.25)
- Europe (0.14)
- North America > United States > Michigan (0.04)
- (2 more...)
- Research Report > Promising Solution (0.54)
- Research Report > New Finding (0.46)
Diagnostics of cognitive failures in multi-agent expert systems using dynamic evaluation protocols and subsequent mutation of the processing context
Sorstkins, Andrejs, Bailey, Josh, Baron, Dr Alistair
The rapid evolution of neural architectures - from multilayer perceptrons to large-scale Transformer-based models - has enabled language models (LLMs) to exhibit emergent agentic behaviours when equipped with memory, planning, and external tool use. However, their inherent stochasticity and multi-step decision processes render classical evaluation methods inadequate for diagnosing agentic performance. This work introduces a diagnostic framework for expert systems that not only evaluates but also facilitates the transfer of expert behaviour into LLM-powered agents. The framework integrates (i) curated golden datasets of expert annotations, (ii) silver datasets generated through controlled behavioural mutation, and (iii) an LLM-based Agent Judge that scores and prescribes targeted improvements. These prescriptions are embedded into a vectorized recommendation map, allowing expert interventions to propagate as reusable improvement trajectories across multiple system instances. We demonstrate the framework on a multi-agent recruiter-assistant system, showing that it uncovers latent cognitive failures - such as biased phrasing, extraction drift, and tool misrouting - while simultaneously steering agents toward expert-level reasoning and style. The results establish a foundation for standardized, reproducible expert behaviour transfer in stochastic, tool-augmented LLM agents, moving beyond static evaluation to active expert system refinement.
- Europe > France (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Gradient-Optimized Fuzzy Classifier: A Benchmark Study Against State-of-the-Art Models
Sieverding, Magnus, Steffen, Nathan, Cohen, Kelly
This paper presents a performance benchmarking study of a Gradient-Optimized Fuzzy Inference System (GF) classifier against several state-of-the-art machine learning models, including Random Forest, XGBoost, Logistic Regression, Support Vector Machines, and Neural Networks. The evaluation was conducted across five datasets from the UCI Machine Learning Repository, each chosen for their diversity in input types, class distributions, and classification complexity. Unlike traditional Fuzzy Inference Systems that rely on derivative-free optimization methods, the GF leverages gradient descent to significantly improving training efficiency and predictive performance. Results demonstrate that the GF model achieved competitive, and in several cases superior, classification accuracy while maintaining high precision and exceptionally low training times. In particular, the GF exhibited strong consistency across folds and datasets, underscoring its robustness in handling noisy data and variable feature sets. These findings support the potential of gradient optimized fuzzy systems as interpretable, efficient, and adaptable alternatives to more complex deep learning models in supervised learning tasks.
- North America > United States > Wisconsin (0.05)
- North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- (2 more...)
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
Dai, Wei, Chen, Peilin, Lu, Malinda, Li, Daniel, Wei, Haowen, Cui, Hejie, Liang, Paul Pu
Recent advances in clinical AI have enabled remarkable progress across many clinical domains. However, existing benchmarks and models are primarily limited to a small set of modalities and tasks, which hinders the development of large-scale multimodal methods that can make holistic assessments of patient health and well-being. To bridge this gap, we introduce Clinical Large-Scale Integrative Multimodal Benchmark (CLIMB), a comprehensive clinical benchmark unifying diverse clinical data across imaging, language, temporal, and graph modalities. CLIMB comprises 4.51 million patient samples totaling 19.01 terabytes distributed across 2D imaging, 3D video, time series, graphs, and multimodal data. Through extensive empirical evaluation, we demonstrate that multitask pretraining significantly improves performance on understudied domains, achieving up to 29% improvement in ultrasound and 23% in ECG analysis over single-task learning. Pretraining on CLIMB also effectively improves models' generalization capability to new tasks, and strong unimodal encoder performance translates well to multimodal performance when paired with task-appropriate fusion strategies. Our findings provide a foundation for new architecture designs and pretraining strategies to advance clinical AI research. Code is released at https://github.com/DDVD233/climb.
- South America > Brazil (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- (23 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- (7 more...)
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Jalal-Kamali, Ali, Gurney, Nikolos, Pynadath, David
Understanding how individual traits influence team performance is valuable, but these traits are not always directly observable. Prior research has inferred traits like trust from behavioral data. We analyze conversational data to identify team traits and their correlation with teaming outcomes. Using transcripts from a Minecraft-based search-and-rescue experiment, we apply topic modeling and clustering to uncover key interaction patterns. Our findings show that variations in teaming outcomes can be explained through these inferences, with different levels of predictive power derived from individual traits and team dynamics.
- North America > United States > California > Los Angeles County > Los Angeles (0.15)
- North America > United States > Michigan > Wayne County > Detroit (0.04)
- Europe > Sweden > Skåne County > Malmö (0.04)
- Health & Medicine (1.00)
- Government (0.71)
The MRI Scanner as a Diagnostic: Image-less Active Sampling
Du, Yuning, Dharmakumar, Rohan, Tsaftaris, Sotirios A.
Despite the high diagnostic accuracy of Magnetic Resonance Imaging (MRI), using MRI as a Point-of-Care (POC) disease identification tool poses significant accessibility challenges due to the use of high magnetic field strength and lengthy acquisition times. We ask a simple question: Can we dynamically optimise acquired samples, at the patient level, according to an (automated) downstream decision task, while discounting image reconstruction? We propose an ML-based framework that learns an active sampling strategy, via reinforcement learning, at a patient-level to directly infer disease from undersampled k-space. We validate our approach by inferring Meniscus Tear in undersampled knee MRI data, where we achieve diagnostic performance comparable with ML-based diagnosis, using fully sampled k-space data. We analyse task-specific sampling policies, showcasing the adaptability of our active sampling approach. The introduced frugal sampling strategies have the potential to reduce high field strength requirements that in turn strengthen the viability of MRI-based POC disease identification and associated preliminary screening tools.
- South America > Peru > Lima Department > Lima Province > Lima (0.04)
- Europe > United Kingdom (0.04)
- North America > United States > Indiana > Marion County > Indianapolis (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
A Targeted Accuracy Diagnostic for Variational Approximations
Wang, Yu, Kasprzak, Mikołaj, Huggins, Jonathan H.
Variational Inference (VI) is an attractive alternative to Markov Chain Monte Carlo (MCMC) due to its computational efficiency in the case of large datasets and/or complex models with high-dimensional parameters. However, evaluating the accuracy of variational approximations remains a challenge. Existing methods characterize the quality of the whole variational distribution, which is almost always poor in realistic applications, even if specific posterior functionals such as the component-wise means or variances are accurate. Hence, these diagnostics are of practical value only in limited circumstances. To address this issue, we propose the TArgeted Diagnostic for Distribution Approximation Accuracy (TADDAA), which uses many short parallel MCMC chains to obtain lower bounds on the error of each posterior functional of interest. We also develop a reliability check for TADDAA to determine when the lower bounds should not be trusted. Numerical experiments validate the practical utility and computational efficiency of our approach on a range of synthetic distributions and real-data examples, including sparse logistic regression and Bayesian neural network models.
- Asia > Middle East > Jordan (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Diagnostics
This Special Issue focuses on recent developments in the use of artificial intelligence (AI) for stroke imaging in acute and chronic phases. The use of AI has attracted widespread attention as it relates to the detection of steno-occlusive lesions in the cerebral circulation, tissue level markers of injury in ischemia and hemorrhage and perfusion imaging techniques. Manuscripts should be submitted online at www.mdpi.com Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline.
Deduce Healing with New Scientific Breakthroughs in Diagnostics - Coruzant Technologies
Diagnosis is crucial to providing efficacious healthcare. Let's have a look at the history. Hippocrates (400 BCE) was the first to systemize the process of diagnosis based on a person's symptoms and medical history. About 2500 years later in 1895 the X-ray was discovered by Konrad Roentgen. With this invention, we could now watch radiation pass through the body, leaving shadows of the solid objects.
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.51)
- Health & Medicine > Therapeutic Area > Neurology (0.31)
Mixed Diagnostics for Longitudinal Properties of Electron Bunches in a Free-Electron Laser
Longitudinal properties of electron bunches are critical for the performance of a wide range of scientific facilities. In a free-electron laser, for example, the existing diagnostics only provide very limited longitudinal information of the electron bunch during online tuning and optimization. We leverage the power of artificial intelligence to build a neural network model using experimental data, in order to bring the destructive longitudinal phase space (LPS) diagnostics online virtually and improve the existing current profile online diagnostics which uses a coherent transition radiation (CTR) spectrometer. The model can also serve as a digital twin of the real machine on which algorithms can be tested efficiently and effectively. We demonstrate at the FLASH facility that the encoder-decoder model with more than one decoder can make highly accurate predictions of megapixel LPS images and coherent transition radiation spectra concurrently for electron bunches in a bunch train with broad ranges of LPS shapes and peak currents, which are obtained by scanning all the major control knobs for LPS manipulation.