AITopics

doi: 10.1515/nanoph-2025-0049

2506.20056

Country:

North America > United States > Massachusetts (0.46)
North America > United States > California (0.27)

Genre:

Workflow (1.00)
Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Health & Medicine (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
Energy > Power Industry (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

arXiv.org Machine LearningJul-28-2025

Flow Stochastic Segmentation Networks

Ribeiro, Fabio De Sousa, Todd, Omar, Jones, Charles, Kori, Avinash, Mehta, Raghav, Glocker, Ben

We introduce the Flow Stochastic Segmentation Network (Flow-SSN), a generative segmentation model family featuring discrete-time autoregressive and modern continuous-time flow variants. We prove fundamental limitations of the low-rank parameterisation of previous methods and show that Flow-SSNs can estimate arbitrarily high-rank pixel-wise covariances without assuming the rank or storing the distributional parameters. Flow-SSNs are also more efficient to sample from than standard diffusion-based segmentation models, thanks to most of the model capacity being allocated to learning the base distribution of the flow, constituting an expressive prior. We apply Flow-SSNs to challenging medical imaging benchmarks and achieve state-of-the-art results. Code available: https://github.com/biomedia-mira/flow-ssn.

artificial intelligence, flow-ssn, machine learning, (14 more...)

arXiv.org Machine Learning

2507.18838

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(2 more...)

arXiv.org Machine LearningJul-28-2025

Probably Approximately Correct Causal Discovery

Wei, Mian, Jha, Somesh, Page, David

The discovery of causal relationships is a foundational problem in artificial intelligence, statistics, epidemiology, economics, and beyond. While elegant theories exist for accurate causal discovery given infinite data, real-world applications are inherently resource-constrained. Effective methods for inferring causal relationships from observational data must perform well under finite data and time constraints, where "performing well" implies achieving high, though not perfect accuracy. In his seminal paper A Theory of the Learnable (Valiant, 1984), Valiant highlighted the importance of resource constraints in supervised machine learning, introducing the concept of Probably Approximately Correct (PAC) learning as an alternative to exact learning. Inspired by Valiant's work, we propose the Probably Approximately Correct Causal (PACC) Discovery framework, which extends PAC learning principles to the causal field. This framework emphasizes both computational and sample efficiency for established causal methods such as propensity score techniques and instrumental variable approaches. Furthermore, we show that it can provide theoretical guarantees for other widely used methods, such as the Self-Controlled Case Series (SCCS) method, which had previously lacked such guarantees.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2507.18903

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)
Research Report > Strength High (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Jahn, Erik, Eberhardt, Frederick, Schulman, Leonard J.

Lower Bounds on the Size of Markov Equivalence Classes

arXiv.org Machine LearningJul-28-2025

Causal discovery algorithms typically recover causal graphs only up to their Markov equivalence classes unless additional parametric assumptions are made. The sizes of these equivalence classes reflect the limits of what can be learned about the underlying causal graph from purely observational data. Under the assumptions of acyclicity, causal sufficiency, and a uniform model prior, Markov equivalence classes are known to be small on average. In this paper, we show that this is no longer the case when any of these assumptions is relaxed. Specifically, we prove exponentially large lower bounds for the expected size of Markov equivalence classes in three settings: sparse random directed acyclic graphs, uniformly random acyclic directed mixed graphs, and uniformly random directed cyclic graphs.

artificial intelligence, equivalence class, machine learning, (15 more...)

arXiv.org Machine Learning

2506.20933

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Masouleh, Shayan S. Mousavi, Sanz, Corey A., Jansonius, Ryan P., Cronin, Cara, Hein, Jason E., Hattrick-Simpers, Jason

Human-AI Synergy in Adaptive Active Learning for Continuous Lithium Carbonate Crystallization Optimization

As demand for high-purity lithium surges with the growth of the electric vehicle (EV) industry, cost-effective extraction from lower-grade North American sources like the Smackover Formation is critical. These resources, unlike high-purity South American brines, require innovative purification techniques to be economically viable. Continuous crystallization is a promising method for producing battery-grade lithium carbonate, but its optimization is challenged by a complex parameter space and limited data. This study introduces a Human-in-the-Loop (HITL) assisted active learning framework to optimize the continuous crystallization of lithium carbonate. By integrating human expertise with data-driven insights, our approach accelerates the optimization of lithium extraction from challenging sources. Our results demonstrate the framework's ability to rapidly adapt to new data, significantly improving the process's tolerance to critical impurities like magnesium from the industry standard of a few hundred ppm to as high as 6000 ppm. This breakthrough makes the exploitation of low-grade, impurity-rich lithium resources feasible, potentially reducing the need for extensive pre-refinement processes. By leveraging artificial intelligence, we have refined operational parameters and demonstrated that lower-grade materials can be used without sacrificing product quality. This advancement is a significant step towards economically harnessing North America's vast lithium reserves, such as those in the Smackover Formation, and enhancing the sustainability of the global lithium supply chain.

artificial intelligence, battery, machine learning, (17 more...)

2507.19316

Country:

North America > United States (1.00)
North America > Canada (0.70)

Genre: Research Report > New Finding (1.00)

Industry: Materials > Metals & Mining > Lithium (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Kumari, HMNS, Kumari, HMLS, Nawarathne, UMMPK

Differentiated Thyroid Cancer Recurrence Classification Using Machine Learning Models and Bayesian Neural Networks with Varying Priors: A SHAP-Based Interpretation of the Best Performing Model

Differentiated thyroid cancer DTC recurrence is a major public health concern, requiring classification and predictive models that are not only accurate but also interpretable and uncertainty aware. This study introduces a comprehensive framework for DTC recurrence classification using a dataset containing 383 patients and 16 clinical and pathological variables. Initially, 11 machine learning ML models were employed using the complete dataset, where the Support Vector Machines SVM model achieved the highest accuracy of 0.9481. To reduce complexity and redundancy, feature selection was carried out using the Boruta algorithm, and the same ML models were applied to the reduced dataset, where it was observed that the Logistic Regression LR model obtained the maximum accuracy of 0.9611. However, these ML models often lack uncertainty quantification, which is critical in clinical decision making. Therefore, to address this limitation, the Bayesian Neural Networks BNN with six varying prior distributions, including Normal 0,1, Normal 0,10, Laplace 0,1, Cauchy 0,1, Cauchy 0,2.5, and Horseshoe 1, were implemented on both the complete and reduced datasets. The BNN model with Normal 0,10 prior distribution exhibited maximum accuracies of 0.9740 and 0.9870 before and after feature selection, respectively.

artificial intelligence, machine learning, recurrence, (16 more...)

2507.18987

Country:

Asia (0.46)
Europe (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Thyroid Cancer (0.74)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Ivey, Jonathan, Gauch, Susan, Jurgens, David

NUTMEG: Separating Signal From Noise in Annotator Disagreement

NLP models often rely on human-labeled data for training and evaluation. Many approaches crowdsource this data from a large number of annotators with varying skills, backgrounds, and motivations, resulting in conflicting annotations. These conflicts have traditionally been resolved by aggregation methods that assume disagreements are errors. Recent work has argued that for many tasks annotators may have genuine disagreements and that variation should be treated as signal rather than noise. However, few models separate signal and noise in annotator disagreement. In this work, we introduce NUTMEG, a new Bayesian model that incorporates information about annotator backgrounds to remove noisy annotations from human-labeled training data while preserving systematic disagreements. Using synthetic data, we show that NUTMEG is more effective at recovering ground-truth from annotations with systematic disagreement than traditional aggregation methods. We provide further analysis characterizing how differences in subpopulation sizes, rates of disagreement, and rates of spam affect the performance of our model. Finally, we demonstrate that downstream models trained on NUTMEG-aggregated data significantly outperform models trained on data from traditionally aggregation methods. Our results highlight the importance of accounting for both annotator competence and systematic disagreements when training on human-labeled data.

artificial intelligence, machine learning, natural language, (18 more...)

2507.1889

Country:

North America > United States (1.00)
Asia (0.68)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Communications > Social Media > Crowdsourcing (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)

Early Mortality Prediction in ICU Patients with Hypertensive Kidney Disease Using Interpretable Machine Learning

Si, Yong, Fan, Junyi, Sun, Li, Chen, Shuheng, Ahmadi, Minoo, Pishgar, Elham, Alaei, Kamiar, Placencia, Greg, Pishgar, Maryam

Background: Hypertensive kidney disease (HKD) patients in intensive care units (ICUs) face high short-term mortality, but tailored risk prediction tools are lacking. Early identification of high-risk individuals is crucial for clinical decision-making. Methods: We developed a machine learning framework to predict 30-day in-hospital mortality among ICU patients with HKD using early clinical data from the MIMIC-IV v2.2 database. A cohort of 1,366 adults was curated with strict criteria, excluding malignancy cases. Eighteen clinical features-including vital signs, labs, comorbidities, and therapies-were selected via random forest importance and mutual information filtering. Several models were trained and compared with stratified five-fold cross-validation; CatBoost demonstrated the best performance. Results: CatBoost achieved an AUROC of 0.88 on the independent test set, with sensitivity of 0.811 and specificity of 0.798. SHAP values and Accumulated Local Effects (ALE) plots showed the model relied on meaningful predictors such as altered consciousness, vasopressor use, and coagulation status. Additionally, the DREAM algorithm was integrated to estimate patient-specific posterior risk distributions, allowing clinicians to assess both predicted mortality and its uncertainty. Conclusions: We present an interpretable machine learning pipeline for early, real-time risk assessment in ICU patients with HKD. By combining high predictive performance with uncertainty quantification, our model supports individualized triage and transparent clinical decisions. This approach shows promise for clinical deployment and merits external validation in broader critical care populations.

artificial intelligence, kidney disease, machine learning, (16 more...)

2507.18866

Country:

Asia (0.68)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Sun, Minghui, Goldstein, Benjamin A., Engelhard, Matthew M.

CLEAR: Unlearning Spurious Style-Content Associations with Contrastive LEarning with Anti-contrastive Regularization

Learning representations unaffected by superficial characteristics is important to ensure that shifts in these characteristics at test time do not compromise downstream prediction performance. For instance, in healthcare applications, we might like to learn features that contain information about pathology yet are unaffected by race, sex, and other sources of physiologic variability, thereby ensuring predictions are equitable and generalizable across all demographics. Here we propose Contrastive LEarning with Anti-contrastive Regularization (CLEAR), an intuitive and easy-to-implement framework that effectively separates essential (i.e., task-relevant) characteristics from superficial (i.e., task-irrelevant) characteristics during training, leading to better performance when superficial characteristics shift at test time. We begin by supposing that data representations can be semantically separated into task-relevant content features, which contain information relevant to downstream tasks, and task-irrelevant style features, which encompass superficial attributes that are irrelevant to these tasks, yet may degrade performance due to associations with content present in training data that do not generalize. We then prove that our anti-contrastive penalty, which we call Pair-Switching (PS), minimizes the Mutual Information between the style attributes and content labels. Finally, we instantiate CLEAR in the latent space of a Variational Auto-Encoder (VAE), then perform experiments to quantitatively and qualitatively evaluate the resulting CLEAR-VAE over several image datasets. Our results show that CLEAR-VAE allows us to: (a) swap and interpolate content and style between any pair of samples, and (b) improve downstream classification performance in the presence of previously unseen combinations of content and style. Our code will be made publicly available.

artificial intelligence, machine learning, representation, (17 more...)

2507.18794

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Diagnostic Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution

Park, Yonghyun, Lai, Chieh-Hsin, Hayakawa, Satoshi, Takida, Yuhta, Murata, Naoki, Liao, Wei-Hsiang, Choi, Woosung, Cheuk, Kin Wai, Koo, Junghyun, Mitsufuji, Yuki

While diffusion models excel at image generation, their growing adoption raises critical concerns around copyright issues and model transparency. Existing attribution methods identify training examples influencing an entire image, but fall short in isolating contributions to specific elements, such as styles or objects, that matter most to stakeholders. To bridge this gap, we introduce \emph{concept-level attribution} via a novel method called \emph{Concept-TRAK}. Concept-TRAK extends influence functions with two key innovations: (1) a reformulated diffusion training loss based on diffusion posterior sampling, enabling robust, sample-specific attribution; and (2) a concept-aware reward function that emphasizes semantic relevance. We evaluate Concept-TRAK on the AbC benchmark, showing substantial improvements over prior methods. Through diverse case studies--ranging from identifying IP-protected and unsafe content to analyzing prompt engineering and compositional learning--we demonstrate how concept-level attribution yields actionable insights for responsible generative AI development and governance.

artificial intelligence, attribution, machine learning, (16 more...)

2507.06547

Country: Europe > Austria (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)