AITopics | cdi

cdi

Dimension-Uniform Discretization Analysis of Preconditioned Annealed Langevin Dynamics for Multimodal Gaussian Mixtures

Baldassari, Lorenzo, Garnier, Josselin, Solna, Knut, de Hoop, Maarten V.

arXiv.org Machine Learning | May-19-2026

Obtaining stable diffusion-based samplers in high- and infinite-dimensional settings is challenging because errors can accumulate across high-frequency coordinates and make the dynamics unstable under refinement of the finite-dimensional approximation of the underlying function-space problem. Discretization is a typical source of such errors, and preconditioning with a suitable spectral decay is one way to control their accumulation. In this paper, we study this problem for preconditioned annealed Langevin dynamics (ALD) applied to Gaussian mixtures. We first show that Euler-Maruyama (EM) discretization, by treating the stiff linear part of the annealed score with a forward Euler step, imposes a stability constraint coupling the preconditioner with the annealed covariance scale. Together with the conditions ensuring dimension-uniform control of the annealed dynamics, this constraint forces the initial smoothed law to remain uniformly close to the target across dimensions. We then consider an exponential-integrator scheme that integrates the stiff linear part of the annealed score exactly. Under explicit spectral summability conditions coupling the smoothing covariance, the component covariance spectra, and the preconditioner, we prove a dimension-uniform Kullback-Leibler (KL) bound for this scheme. This bound can be made arbitrarily small, uniformly in dimension, by allowing enough time for annealing and then refining the time mesh accordingly. Importantly, these conditions allow regimes in which the KL divergence between the target and the initial smoothed law diverges with dimension, showing that the restrictions imposed by EM are scheme-dependent rather than intrinsic to ALD.
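Illustrative sketch (not taken from the paper): the contrast between a forward Euler treatment of a stiff linear term and an exact (exponential-integrator) treatment can be seen on a single scalar coordinate. The toy SDE dx = -lam * x dt + sqrt(2) dW, the value of lam, and the step size h below are assumptions chosen for illustration only; the paper's preconditioned annealed dynamics for Gaussian mixtures are considerably more general.

```python
import math
import random

def em_step(x, lam, h, xi):
    """Euler-Maruyama step for dx = -lam * x dt + sqrt(2) dW (forward Euler on the drift)."""
    return (1.0 - lam * h) * x + math.sqrt(2.0 * h) * xi

def exp_step(x, lam, h, xi):
    """Exponential-integrator step: the linear drift is integrated exactly over [t, t + h]."""
    decay = math.exp(-lam * h)
    # Exact variance of the noise accumulated over one step of this linear SDE.
    noise_std = math.sqrt((1.0 - decay * decay) / lam)
    return decay * x + noise_std * xi

def simulate(step, lam, h, n_steps, seed=0):
    """Run n_steps of the given scheme from x = 1 with a fixed random seed."""
    rng = random.Random(seed)
    x = 1.0
    for _ in range(n_steps):
        x = step(x, lam, h, rng.gauss(0.0, 1.0))
    return x

lam = 1000.0  # hypothetical stiff (high-frequency) coordinate
h = 0.01      # violates the forward Euler stability bound h < 2 / lam

x_em = simulate(em_step, lam, h, 50)    # multiplier |1 - lam * h| = 9, so iterates blow up
x_exp = simulate(exp_step, lam, h, 50)  # stays near the stationary scale 1 / sqrt(lam)
print(abs(x_em), abs(x_exp))
```

In this toy setting the forward Euler iterate is multiplied by (1 - lam * h) at each step, so it is stable only when h < 2 / lam, whereas the exponential step is stable for any h > 0. This is the scalar analogue of a step-size restriction that depends on the scheme rather than on the dynamics.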

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Machine Learning

2605.16473

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Automatic coherence-driven inference on arguments

Huntsman, Steve

arXiv.org Artificial Intelligence | Sep-24-2025

CDI also offers a plausible approach for automatically making sense of competing arguments in a way that accords with the features enumerated here. This paper is part of an argument that it is now feasible to computationally instantiate a reasonable approximation of a coherence theory of truth [64]: the recent benchmark [12] provides additional quantitative evidence in this direction. By "hard-coding" acceptance of conclusively established propositions, this theory can furthermore be anchored in a correspondence theory of truth [65]. In other words, coherence computations can be required to incorporate privileged information that also coheres with observed reality. While it is easy to imagine attempts to try the same thing with privileged information that does not cohere with observed reality, lies cannot persist when they can easily be unraveled. Even with flawless technology (which this will not be), obstacles will be manifold. For example, in a pluralistic society, legal coherence may actually require sacrificing fairness in some ways [66]. Ultimately, people must decide matters for themselves. It is only reasonable to hope that technology can serve as a reliable tool to help people make their decisions more coherent.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.18523

Country:

North America > United States (1.00)
Europe (0.70)

Genre: Research Report (0.50)

Industry:

Education (1.00)
Law > Government & the Courts (0.94)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)

Coherence-driven inference for cybersecurity

Huntsman, Steve

arXiv.org Artificial Intelligence | Sep-24-2025

Large language models (LLMs) can compile weighted graphs on natural language data to enable automatic coherence-driven inference (CDI) relevant to red and blue team operations in cybersecurity. This represents an early application of automatic CDI that holds near- to medium-term promise for decision-making in cybersecurity and eventually also for autonomous blue team operations.
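Illustrative sketch (an assumption, not the paper's implementation): one common way to pose coherence-driven inference on a weighted graph is as a constraint-satisfaction problem. Propositions are split into accepted and rejected sets; a positive edge counts when both endpoints land on the same side, and a negative edge counts when they land on opposite sides. The proposition names, weights, and the brute-force search below are all hypothetical.

```python
from itertools import product

def coherence(assignment, edges):
    """Total weight of satisfied constraints.

    A positive edge is satisfied when both propositions share a status;
    a negative edge is satisfied when their statuses differ.
    """
    total = 0.0
    for (u, v), w in edges.items():
        same = assignment[u] == assignment[v]
        if (w > 0 and same) or (w < 0 and not same):
            total += abs(w)
    return total

def best_partition(nodes, edges, accepted=()):
    """Brute-force search over accept/reject labelings (only viable for small graphs)."""
    best, best_score = None, float("-inf")
    for labels in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, labels))
        # Skip labelings that reject a proposition we require to be accepted.
        if any(not assignment[n] for n in accepted):
            continue
        score = coherence(assignment, edges)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

# Hypothetical propositions a blue team might weigh against each other.
nodes = ["phishing_email", "credential_theft", "benign_login"]
edges = {
    ("phishing_email", "credential_theft"): 0.9,   # mutually supporting
    ("credential_theft", "benign_login"): -0.8,    # mutually exclusive
    ("phishing_email", "benign_login"): -0.6,
}

best, score = best_partition(nodes, edges, accepted=("phishing_email",))
print(best, score)
```

Anchoring `phishing_email` as accepted breaks the symmetry between a labeling and its complement. Without an anchor, both partitions score the same.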

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.1852

Genre: Research Report (0.51)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)

The Einstein Test: Towards a Practical Test of a Machine's Ability to Exhibit Superintelligence

Benrimoh, David, Mikus, Nace, Rosenfeld, Ariel

arXiv.org Artificial Intelligence | Jan-12-2025

Creative and disruptive insights (CDIs), such as the development of the theory of relativity, have punctuated human history, marking pivotal shifts in our intellectual trajectory. Recent advancements in artificial intelligence (AI) have sparked debates over whether state-of-the-art models possess the capacity to generate CDIs. We argue that the ability to create CDIs should be regarded as a significant feature of machine superintelligence (SI). To this end, we propose a practical test to evaluate whether an approach to AI targeting SI can yield novel insights of this kind. We propose the Einstein test: given the data available prior to the emergence of a known CDI, can an AI independently reproduce that insight (or one that is formally equivalent)? By achieving such a milestone, a machine can be considered to at least match humanity's past top intellectual achievements, and therefore to have the potential to surpass them.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.06948

Country:

Europe > United Kingdom > England (0.46)
North America > Canada > Quebec (0.28)

Genre: Research Report (0.84)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.30)

CDI: Copyrighted Data Identification in Diffusion Models

Dubiński, Jan, Kowalczuk, Antoni, Boenisch, Franziska, Dziedzic, Adam

arXiv.org Artificial Intelligence | Nov-24-2024

Diffusion Models (DMs) benefit from large and diverse datasets for their training. Since this data is often scraped from the Internet without permission from the data owners, this raises concerns about copyright and intellectual property protections. While (illicit) use of data is easily detected for training samples perfectly re-created by a DM at inference time, it is much harder for data owners to verify if their data was used for training when the outputs from the suspect DM are not close replicas. Conceptually, membership inference attacks (MIAs), which detect if a given data point was used during training, present themselves as a suitable tool to address this challenge. However, we demonstrate that existing MIAs are not strong enough to reliably determine the membership of individual images in large, state-of-the-art DMs. To overcome this limitation, we propose CDI, a framework for data owners to identify whether their dataset was used to train a given DM. CDI relies on dataset inference techniques, i.e., instead of using the membership signal from a single data point, CDI leverages the fact that most data owners, such as providers of stock photography, visual media companies, or even individual artists, own datasets with multiple publicly exposed data points which might all be included in the training of a given DM. By selectively aggregating signals from existing MIAs and using new handcrafted methods to extract features for these datasets, feeding them to a scoring model, and applying rigorous statistical testing, CDI allows data owners with as little as 70 data points to identify with a confidence of more than 99% whether their data was used to train a given DM. Thereby, CDI represents a valuable tool for data owners to claim illegitimate use of their copyrighted data.
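Illustrative sketch (not the CDI pipeline itself): the core idea of dataset inference, aggregating many weak per-sample membership signals into one statistical test, can be shown with a generic one-sided permutation test. The scores below are synthetic Gaussian draws, the sample size of 70 merely echoes the figure quoted in the abstract, and the permutation test stands in for the paper's scoring model and testing procedure, whose details are not given here.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_pvalue(suspect, control, n_perm=2000, seed=0):
    """One-sided permutation test: is mean(suspect) larger than mean(control)?"""
    rng = random.Random(seed)
    observed = mean(suspect) - mean(control)
    pooled = list(suspect) + list(control)
    k = len(suspect)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count relabelings whose mean gap is at least as large as the observed one.
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# Synthetic membership scores (hypothetical): the owner's suspected samples score
# slightly higher on average than samples known not to be in the training set.
rng = random.Random(42)
suspect_scores = [rng.gauss(1.0, 1.0) for _ in range(70)]
control_scores = [rng.gauss(0.0, 1.0) for _ in range(70)]

p_value = permutation_pvalue(suspect_scores, control_scores)
print(p_value)
```

Any single score from these two overlapping distributions is ambiguous, but the test over all 70 samples can still reject the hypothesis that the two sets behave alike.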

cdi, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.12858

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Orange County > Anaheim (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Limits to classification performance by relating Kullback-Leibler divergence to Cohen's Kappa

Crow, L., Watts, S. J.

arXiv.org Machine Learning | Mar-3-2024

The performance of machine learning classification algorithms is evaluated by estimating metrics, often from the confusion matrix, using training data and cross-validation. However, these do not prove that the best possible performance has been achieved. Fundamental limits to error rates can be estimated using information distance measures. To this end, the confusion matrix has been formulated to comply with the Chernoff-Stein Lemma. This links the error rates to the Kullback-Leibler divergences between the probability density functions describing the two classes. This leads to a key result that relates Cohen's Kappa to the Resistor Average Distance, which is the parallel resistor combination of the two Kullback-Leibler divergences. The Resistor Average Distance has units of bits and is estimated from the same training data used by the classification algorithm, using kNN estimates of the Kullback-Leibler divergences. The classification algorithm gives the confusion matrix and Kappa. Theory and methods are discussed in detail and then applied to Monte Carlo data and real datasets. Four very different real datasets - Breast Cancer, Coronary Heart Disease, Bankruptcy, and Particle Identification - are analysed, with both continuous and discrete values, and their classification performance compared to the expected theoretical limit. In all cases this analysis shows that the algorithms could not have performed any better due to the underlying probability density functions for the two classes. Important lessons are learnt on how to predict the performance of algorithms for imbalanced data using training datasets that are approximately balanced. Machine learning is very powerful but classification performance ultimately depends on the quality of the data and the relevance of the variables to the problem.
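Illustrative sketch (not from the paper): the two quantities the abstract relates can each be computed in a few lines. Cohen's Kappa comes from a 2x2 confusion matrix, and the Resistor Average Distance is the parallel combination 1 / (1/D(p||q) + 1/D(q||p)) of the two Kullback-Leibler divergences. The paper estimates the divergences with kNN; the closed-form univariate Gaussian divergence and the example numbers below are assumptions made to keep the sketch self-contained. The paper's formula linking Kappa to the distance is not reproduced here.

```python
import math

def cohens_kappa(tp, fn, fp, tn):
    """Cohen's Kappa for a 2x2 confusion matrix (rows: actual, columns: predicted)."""
    n = tp + fn + fp + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the row and column marginals.
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)

def kl_gaussian_bits(mu0, s0, mu1, s1):
    """KL( N(mu0, s0^2) || N(mu1, s1^2) ) in bits, for univariate Gaussians."""
    nats = math.log(s1 / s0) + (s0 ** 2 + (mu0 - mu1) ** 2) / (2.0 * s1 ** 2) - 0.5
    return nats / math.log(2.0)

def resistor_average(d_pq, d_qp):
    """Parallel-resistor combination of the two directed divergences."""
    return 1.0 / (1.0 / d_pq + 1.0 / d_qp)

# Hypothetical confusion matrix and class densities.
kappa = cohens_kappa(tp=40, fn=10, fp=5, tn=45)
d_pq = kl_gaussian_bits(0.0, 1.0, 2.0, 2.0)
d_qp = kl_gaussian_bits(2.0, 2.0, 0.0, 1.0)
rad = resistor_average(d_pq, d_qp)
print(kappa, d_pq, d_qp, rad)
```

As with resistors in parallel, the combined distance is always smaller than the smaller of the two divergences, so it is dominated by the direction in which the classes are harder to tell apart.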

algorithm, classification algorithm, divergence, (12 more...)

arXiv.org Machine Learning

2403.01571

Country:

Europe > United Kingdom (0.14)
Africa > South Africa (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.89)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.76)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Viewpoint: Regulatory Interest in Big Data, AI More Than a Carrier Problem - Carrier Management

#artificialintelligence | Nov-15-2022, 15:28:20 GMT

The California Insurance Commissioner and the California Department of Insurance (CDI) recently issued a bulletin regarding industry bias and discrimination. The bulletin acknowledged allegations of bias and discrimination in the industry and gave notice to insurance players that the CDI is watching and that "bias and discrimination in any form will be investigated and will not be tolerated." The bulletin is addressed to "All Admitted and Non-Admitted Insurance Companies, Licensees, and Other Interested Parties" -- clearly intending to cause awareness and attention beyond the carrier ecosystem. So, what does this mean? California has been a leader in following Europe regarding consumer protection laws.

bias and discrimination, discrimination, regulation, (11 more...)

#artificialintelligence

Country:

North America > United States > California (0.75)
Europe (0.26)

Industry: Banking & Finance > Insurance (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.42)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.34)

Pivoting CDI: The World of Healthcare Watches

#artificialintelligence | Mar-30-2022, 03:55:05 GMT

Is CDI about to embark on a long journey to reinvent itself? There is no arguing that artificial intelligence (AI) and natural language processing (NLP) are making inroads in the healthcare revenue cycle, creating better efficiencies with the automation of a multitude of historically manually performed tasks, thereby reducing positions that were once performed by staff. AI is clearly beginning to take hold and make significant inroads in the clinical documentation integrity (CDI) space. I have noticed several posts on LinkedIn, as well as in Becker's Healthcare e-newsletters, discussing the role of AI in the revenue cycle. Just recently, there was a blog post published in KevinMD titled "How an AI bot transformed my EHR experience (KevinMD blog)" centering on how AI streamlined the provider's documentation and charting in the electronic health record (EHR) by scanning through the documentation as the note is being completed, providing suggested diagnoses with associated ICD-10 codes.

documentation, physician documentation, profession, (16 more...)

#artificialintelligence

Country: North America > United States (0.15)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.71)
Health & Medicine > Health Care Providers & Services > Reimbursement (0.48)

Technology:

Information Technology > Communications > Social Media (0.90)
Information Technology > Artificial Intelligence > Natural Language (0.90)

Job: CDD (6 months), Linguist, Yseop, 6 academic posts, Job: CDI, Young doctor in data science / ML / DL / NLP, Post-doc (CEA List and LISN), CIFRE thesis proposal

#artificialintelligence | Oct-18-2021, 11:53:21 GMT

Scientific context: The ambition of the CATCH project is to propose artificial intelligence and deep learning tools to take into account and automatically exploit the multitude of human testimonies related to an industrial accident and its consequences on the environment and health. By involving the population in the collection and analysis of data, particularly through social networks, and by providing effective means for interpreting this data, the proposed solution should contribute to providing answers to the worrying problem of industrial accidents and their consequences.

cea list and lisn, cifre thesis proposal, data science ml dl nlp, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Industry Voices--Not all automation is created equally for clinical documentation improvement

#artificialintelligence | Sep-24-2021, 22:50:45 GMT

Healthcare system survival pivots on many metrics, but the ability to generate revenue and to evidence high quality of care are two of the most essential. At the center of both metrics is the clinical documentation process, where an accurate representation of every patient's clinical experience while in a provider's care must be recorded. As simple as it may sound, achieving that accurate reflection of diagnoses, interventions and the clinical picture is anything but simple. Medicine is as much science as it is art, and complex definitions, levels of specificity and complex medical terminology mean that most hospitals struggle to document everything properly, leading to significant lost revenues and under-reporting on quality metrics. Health systems have answered this challenge by standing up clinical documentation integrity (CDI) programs, staffed with clinicians.

automation, cdi, clinician, (13 more...)

#artificialintelligence

Industry: Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.55)
Information Technology > Artificial Intelligence > Natural Language (0.49)
