AITopics | cld

Collaborating Authors

cld

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes

Neural Information Processing SystemsJun-22-2026, 22:43:31 GMT

We analyze the generalization gap (gap between the training and test errors) when training a potentially over-parametrized model using a Markovian stochastic training algorithm, initialized from some distribution θ0 p0. We focus on Langevin dynamics with a positive temperature β 1, i.e. gradient descent on a training loss Lwith infinitesimal step size, perturbed with β 1-variances Gaussian noise, and lightly regularized or bounded. There, we bound the generalization gap, at any time during training, by p (βEL(θ0)+ln(1/δ))/N with probability 1 δ over the dataset, where N is the sample size, and EL(θ0) = O(1)with standard initialization scaling. In contrast to previous guarantees, we have no dependence on either training time or reliance on mixing, nor a dependence on dimensionality, gradient norms, or any other properties of the loss or model. This guarantee follows from a general analysis of any Markov process-based training that has a Gibbs-style stationary distribution. The proof is surprisingly simple, once we observe that the marginal distribution divergence from initialization remains bounded, as implied by a generalized second law of thermodynamics.

artificial intelligence, generalization, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
North America > United States > New York (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Add feedback

Coresets from Trajectories: Selecting Data via Correlation of Loss Differences

Nagaraj, Manish, Ravikumar, Deepak, Roy, Kaushik

arXiv.org Artificial IntelligenceNov-20-2025

Deep learning models achieve state-of-the-art performance across domains but face scalability challenges in real-time or resource-constrained scenarios. To address this, we propose Correlation of Loss Differences (CLD), a simple and scalable metric for coreset selection that identifies the most impactful training samples by measuring their alignment with the loss trajectories of a held-out validation set. CLD is highly efficient, requiring only per-sample loss values computed at training checkpoints, and avoiding the costly gradient and curvature computations used in many existing subset selection methods. We develop a general theoretical framework that establishes convergence guarantees for CLD-based coresets, demonstrating that the convergence error is upper-bounded by the alignment of the selected samples and the representativeness of the validation set. On CIFAR-100 and ImageNet-1k, CLD-based coresets typically outperform or closely match state-of-the-art methods across subset sizes, and remain within 1% of more computationally expensive baselines even when not leading. CLD transfers effectively across architectures (ResNet, VGG, DenseNet), enabling proxy-to-target selection with <1% degradation. Moreover, CLD is stable when using only early checkpoints, incurring negligible accuracy loss. Finally, CLD exhibits inherent bias reduction via per-class validation alignment, obviating the need for additional stratified sampling. Together, these properties make CLD a principled, efficient, stable, and transferable tool for scalable dataset optimization.

artificial intelligence, coreset, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.2023

Country: North America (0.27)

Genre: Research Report (1.00)

Industry: Government > Military (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Diagrams-to-Dynamics (D2D): Exploring Causal Loop Diagram Leverage Points under Uncertainty

Uleman, Jeroen F., Crielaard, Loes, Elsenburg, Leonie K., Veldhuis, Guido A., Stronks, Karien, Rod, Naja Hulvej, Quax, Rick, Vasconcelos, Vítor V.

arXiv.org Machine LearningAug-12-2025

Causal loop diagrams (CLDs) are widely used in health and environmental research to represent hypothesized causal structures underlying complex problems. However, as qualitative and static representations, CLDs are limited in their ability to support dynamic analysis and inform intervention strategies. Additionally, quantitative CLD analysis methods like network centrality analysis often lead to false inference. We propose Diagrams-to-Dynamics (D2D), a method for converting CLDs into exploratory system dynamics models (SDMs) in the absence of empirical data. With minimal user input - following a protocol to label variables as stocks, flows or auxiliaries, and constants - D2D leverages the structural information already encoded in CLDs, namely, link existence and polarity, to simulate hypothetical interventions and explore potential leverage points under uncertainty. Results suggest that D2D helps distinguish between high- and low-ranked leverage points. We compare D2D to a data-driven SDM constructed from the same CLD and variable labels. D2D showed greater consistency with the data-driven model than network centrality analysis, while providing uncertainty estimates and guidance for future data collection. The method is implemented in an open-source Python package and a web-based application to support further testing and lower the barrier to dynamic modeling for researchers working with CLDs. We expect additional validation will further establish the approach's utility across a broad range of cases and domains.

artificial intelligence, intervention, machine learning, (16 more...)

arXiv.org Machine Learning

2508.05659

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Public Health (0.68)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

The Lyme Disease Controversy: An AI-Driven Discourse Analysis of a Quarter Century of Academic Debate and Divides

Susnjak, Teo, Palffy, Cole, Zimina, Tatiana, Altynbekova, Nazgul, Garg, Kunal, Gilbert, Leona

arXiv.org Artificial IntelligenceApr-15-2025

The scientific discourse surrounding Chronic Lyme Disease (CLD) and Post-Treatment Lyme Disease Syndrome (PTLDS) has evolved over the past twenty-five years into a complex and polarised debate, shaped by shifting research priorities, institutional influences, and competing explanatory models. This study presents the first large-scale, systematic examination of this discourse using an innovative hybrid AI-driven methodology, combining large language models with structured human validation to analyse thousands of scholarly abstracts spanning 25 years. By integrating Large Language Models (LLMs) with expert oversight, we developed a quantitative framework for tracking epistemic shifts in contested medical fields, with applications to other content analysis domains. Our analysis revealed a progressive transition from infection-based models of Lyme disease to immune-mediated explanations for persistent symptoms. This study offers new empirical insights into the structural and epistemic forces shaping Lyme disease research, providing a scalable and replicable methodology for analysing discourse, while underscoring the value of AI-assisted methodologies in social science and medical research.

classification, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.08777

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Instructional Material (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Leveraging Large Language Models for Automated Causal Loop Diagram Generation: Enhancing System Dynamics Modeling through Curated Prompting Techniques

Liu, Ning-Yuan Georgia, Keith, David R.

arXiv.org Artificial IntelligenceMar-23-2025

T ransforming a dynamic hypothesis into a causal loop diagram (CLD) is crucial for System Dynamics Modelling. Extracting key variables and causal relationships from text to build a CLD is often challenging and time - consuming for novice modelers, limiting SD tool adoption. This paper introduces and tests a method for automating the translation of dynamic hypotheses into CLDs using large language models (LLMs) with curated prompting techniques. We first describe how LLMs work and how they can make the inferences needed to build CLDs using a standard digraph structure. Next, we develop a set of simple dynamic hypothe ses and corresponding CLDs from leading SD textbooks. We then compare the four different combinations of prompting technique s, evaluating their performance against CLD s labeled by expert modelers . Results show that for simple model structures and using curated prompting techniques, LLMs can generate CLDs of a similar quality to expert - built ones, accelerating CLD creation.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.21798

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Measuring Similarity in Causal Graphs: A Framework for Semantic and Structural Analysis

Liu, Ning-Yuan Georgia, Yang, Flower, Jalali, Mohammad S.

arXiv.org Artificial IntelligenceMar-13-2025

Causal graphs are commonly used to understand and model complex systems. Researchers often construct these graphs from different perspectives, leading to significant variations for the same problem. Comparing causal graphs is, therefore, essential for evaluating assumptions, integrating insights, and resolving disagreements. The rise of AI tools has further amplified this need, as they are increasingly used to generate hypothesized causal graphs by synthesizing information from various sources such as prior research and community inputs, providing the potential for automating and scaling causal modeling for complex systems. Similar to humans, these tools also produce inconsistent results across platforms, versions, and iterations. Despite its importance, research on causal graph comparison remains scarce. Existing methods often focus solely on structural similarities, assuming identical variable names, and fail to capture nuanced semantic relationships, which is essential for causal graph comparison. We address these gaps by investigating methods for comparing causal graphs from both semantic and structural perspectives. First, we reviewed over 40 existing metrics and, based on predefined criteria, selected nine for evaluation from two threads of machine learning: four semantic similarity metrics and five learning graph kernels. We discuss the usability of these metrics in simple examples to illustrate their strengths and limitations. We then generated a synthetic dataset of 2,000 causal graphs using generative AI based on a reference diagram. Our findings reveal that each metric captures a different aspect of similarity, highlighting the need to use multiple metrics.

causal graph, graph, similarity, (17 more...)

arXiv.org Artificial Intelligence

2503.11046

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
(9 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Text Attribute Control via Closed-Loop Disentanglement

Sha, Lei, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceNov-30-2023

Changing an attribute of a text without changing the content usually requires to first disentangle the text into irrelevant attributes and content representations. After that, in the inference phase, the representation of one attribute is tuned to a different value, expecting that the corresponding attribute of the text can also be changed accordingly. The usual way of disentanglement is to add some constraints on the latent space of an encoder-decoder architecture, including adversarial-based constraints and mutual-information-based constraints. However, the previous semi-supervised processes of attribute change are usually not enough to guarantee the success of attribute change and content preservation. In this paper, we propose a novel approach to achieve a robust control of attributes while enhancing content preservation. In this approach, we use a semi-supervised contrastive learning method to encourage the disentanglement of attributes in latent spaces. Differently from previous works, we re-disentangle the reconstructed sentence and compare the re-disentangled latent space with the original latent space, which makes a closed-loop disentanglement process. This also helps content preservation. In addition, the contrastive learning method is also able to replace the role of minimizing mutual information and adversarial training in the disentanglement process, which alleviates the computation cost. We conducted experiments on three text datasets, including the Yelp Service review dataset, the Amazon Product review dataset, and the GoEmotions dataset. The experimental results show the effectiveness of our model.

disentanglement, latent space, representation, (13 more...)

arXiv.org Artificial Intelligence

2312.00277

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Austria > Vienna (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification

Zheng, Hao, Wang, Runqi, Liu, Jianzhuang, Kanezaki, Asako

arXiv.org Artificial IntelligenceNov-4-2023

The conventional few-shot classification aims at learning a model on a large labeled base dataset and rapidly adapting to a target dataset that is from the same distribution as the base dataset. However, in practice, the base and the target datasets of few-shot classification are usually from different domains, which is the problem of cross-domain few-shot classification. We tackle this problem by making a small proportion of unlabeled images in the target domain accessible in the training stage. In this setup, even though the base data are sufficient and labeled, the large domain shift still makes transferring the knowledge from the base dataset difficult. We meticulously design a cross-level knowledge distillation method, which can strengthen the ability of the model to extract more discriminative features in the target dataset by guiding the network's shallow layers to learn higher-level information. Furthermore, in order to alleviate the overfitting in the evaluation stage, we propose a feature denoising operation which can reduce the feature redundancy and mitigate overfitting. Our approach can surpass the previous state-of-the-art method, Dynamic-Distillation, by 5.44% on 1-shot and 1.37% on 5-shot classification tasks on average in the BSCD-FSL benchmark. The implementation code will be available at https://github.com/jarucezh/cldfd.

base dataset, dataset, student, (16 more...)

arXiv.org Artificial Intelligence

2311.02392

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Generative Modeling with Phase Stochastic Bridges

Chen, Tianrong, Gu, Jiatao, Dinh, Laurent, Theodorou, Evangelos A., Susskind, Josh, Zhai, Shuangfei

arXiv.org Artificial IntelligenceOct-12-2023

Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs. DMs work by constructing a Stochastic Differential Equation (SDE) in the input space (ie, position space), and using a neural network to reverse it. In this work, we introduce a novel generative modeling framework grounded in \textbf{phase space dynamics}, where a phase space is defined as {an augmented space encompassing both position and velocity.} Leveraging insights from Stochastic Optimal Control, we construct a path measure in the phase space that enables efficient sampling. {In contrast to DMs, our framework demonstrates the capability to generate realistic data points at an early stage of dynamics propagation.} This early prediction sets the stage for efficient data generation by leveraging additional velocity information along the trajectory. On standard image generation benchmarks, our model yields favorable performance over baselines in the regime of small Number of Function Evaluations (NFEs). Furthermore, our approach rivals the performance of diffusion models equipped with efficient sampling techniques, underscoring its potential as a new tool generative modeling.

agm-ode uncurated cifar-10 sample, arxiv preprint arxiv, trajectory, (12 more...)

arXiv.org Artificial Intelligence

2310.07805

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Add feedback

A Complete Recipe for Diffusion Generative Models

Pandey, Kushagra, Mandt, Stephan

arXiv.org Machine LearningOct-11-2023

Score-based Generative Models (SGMs) have demonstrated exceptional synthesis outcomes across various tasks. However, the current design landscape of the forward diffusion process remains largely untapped and often relies on physical heuristics or simplifying assumptions. Utilizing insights from the development of scalable Bayesian posterior samplers, we present a complete recipe for formulating forward processes in SGMs, ensuring convergence to the desired target distribution. Our approach reveals that several existing SGMs can be seen as specific manifestations of our framework. Building upon this method, we introduce Phase Space Langevin Diffusion (PSLD), which relies on score-based modeling within an augmented space enriched by auxiliary variables akin to physical phase space. Empirical results exhibit the superior sample quality and improved speed-quality trade-off of PSLD compared to various competing approaches on established image synthesis benchmarks. Remarkably, PSLD achieves sample quality akin to state-of-the-art SGMs (FID: 2.10 for unconditional CIFAR-10 generation). Lastly, we demonstrate the applicability of PSLD in conditional synthesis using pre-trained score networks, offering an appealing alternative as an SGM backbone for future advancements. Code and model checkpoints can be accessed at \url{https://github.com/mandt-lab/PSLD}.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2303.01748

Country:

North America > United States > California > Orange County > Irvine (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.61)

Add feedback