 Generative AI


On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

Neural Information Processing Systems

Generative networks have experienced great empirical successes in distribution learning. Many existing experiments have demonstrated that generative networks can generate high-dimensional complex data from a low-dimensional, easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. We prove statistical guarantees of generative networks under the Wasserstein-1 loss. We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension. Our theory leverages the low-dimensional geometric structures in data sets and justifies the practical power of generative networks. We require no smoothness assumptions on the data distribution, which is desirable in practice.
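To make the claimed dependence on intrinsic dimension concrete, the display below is a schematic of what such a rate looks like. It is illustrative only: the notation (data distribution μ on a d-dimensional manifold embedded in R^D, learned distribution μ̂_n from n samples) is assumed here, and the paper's exact exponents, assumptions, and logarithmic factors may differ.

```latex
% Schematic intrinsic-dimension rate (illustrative, not the paper's exact theorem):
W_1\!\left(\hat{\mu}_n, \mu\right) \;\lesssim\; n^{-1/d}
\quad \text{rather than} \quad n^{-1/D},
% where d is the intrinsic (manifold) dimension, D \gg d is the ambient dimension,
% n is the sample size, and logarithmic factors are omitted.
```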


Out-of-Distribution Detection with a Single Unconditional Diffusion Model. Alvin Heng, Harold Soh

Neural Information Processing Systems

Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples. Traditionally, unsupervised methods utilize a deep generative model for OOD detection. However, such approaches require a new model to be trained for each inlier dataset. This paper explores whether a single model can perform OOD detection across diverse tasks. To that end, we introduce Diffusion Paths (DiffPath), which uses a single diffusion model, originally trained for unconditional generation, to perform OOD detection. We introduce a novel technique that measures the rate of change and curvature of the diffusion paths connecting samples to the standard normal distribution. Extensive experiments show that with a single model, DiffPath is competitive with prior work that uses individual models on a variety of OOD tasks involving different distributions.
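As a rough illustration of the path statistics described above, the sketch below integrates a probability-flow-style ODE from a sample toward the standard normal and records the speed and a finite-difference curvature along the trajectory. The `velocity` function is a toy stand-in for a pretrained diffusion model's probability-flow drift, and the aggregation into summary statistics is one reading of the idea, not the authors' implementation.

```python
import numpy as np

def velocity(x, t):
    # Toy drift pulling samples toward N(0, I); in practice this would come
    # from the single pretrained unconditional diffusion model.
    return -x

def diffusion_path_statistics(x0, n_steps=100, t_max=1.0):
    """Mean rate of change and mean curvature along the ODE path from x0."""
    dt = t_max / n_steps
    x = np.array(x0, dtype=float)
    rates, curvatures = [], []
    v_prev = None
    for k in range(n_steps):
        v = velocity(x, k * dt)
        rates.append(np.linalg.norm(v))                         # speed along the path
        if v_prev is not None:
            curvatures.append(np.linalg.norm(v - v_prev) / dt)  # finite-difference curvature
        x = x + dt * v                                          # Euler step toward N(0, I)
        v_prev = v
    return float(np.mean(rates)), float(np.mean(curvatures))

sample = np.random.randn(16)          # stand-in for an image or feature vector
rate, curvature = diffusion_path_statistics(sample)
print(f"mean rate of change: {rate:.3f}, mean curvature: {curvature:.3f}")
```

In a DiffPath-style setup, such per-sample statistics would then be compared against their typical values on the inlier data to flag OOD samples.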


Debiasing Synthetic Data Generated by Deep Generative Models. Paloma Rabaey (Ghent University Hospital - SYNDARA)

Neural Information Processing Systems

While synthetic data hold great promise for privacy protection, their statistical analysis poses significant challenges that necessitate innovative solutions. The use of deep generative models (DGMs) for synthetic data generation is known to induce considerable bias and imprecision into synthetic data analyses, compromising their inferential utility relative to analyses of the original data. This bias and uncertainty can be substantial enough to degrade statistical convergence rates, even in seemingly straightforward analyses such as estimating a mean.
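To see why even mean estimation can go wrong, here is a toy simulation (an illustration of the problem, not the paper's debiasing method, with made-up bias and shrinkage values): the generator's bias shifts the synthetic mean, and its under-dispersion makes the naive confidence interval overconfident.

```python
import numpy as np

rng = np.random.default_rng(0)
original = rng.normal(loc=1.0, scale=1.0, size=500)        # "real" data

# Pretend the DGM learned the distribution imperfectly: a biased mean and
# shrunken variance (both purely illustrative values).
synthetic = rng.normal(original.mean() + 0.3, 0.7 * original.std(), size=500)

for name, sample in [("original", original), ("synthetic", synthetic)]:
    m = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(len(sample))
    print(f"{name}: mean = {m:.3f}, naive 95% CI = ({m - 1.96 * se:.3f}, {m + 1.96 * se:.3f})")
```

Treating the synthetic sample as if it were real data ignores both sources of error, which is the inferential problem the paper sets out to correct.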


How ChatGPT could replace the internet as we know it

ZDNet

As I watched the internet debate over who would win between 100 men and 1 gorilla (yes, this is an actual debate online), I couldn't help but become invested in the outcome. I laughed at the ridiculous comments and marveled at the stunning AI videos. Then something hit me like a ton of bricks, or in this case, like a gorilla. "Self, OpenAI will probably be the biggest company we have ever seen." I know, I know -- that's a crazy thing to say, and it's incredibly random.


A Geometric View of Data Complexity: Efficient Local Intrinsic Dimension Estimation with Diffusion Models

Neural Information Processing Systems

High-dimensional data commonly lies on low-dimensional submanifolds, and estimating the local intrinsic dimension (LID) of a datum, i.e. the dimension of the submanifold it belongs to, is a longstanding problem. LID can be understood as the number of local factors of variation: the more factors of variation a datum has, the more complex it tends to be. Estimating this quantity has proven useful in contexts ranging from generalization in neural networks to detection of out-of-distribution data, adversarial examples, and AI-generated text. The recent successes of deep generative models present an opportunity to leverage them for LID estimation, but current methods based on generative models produce inaccurate estimates, require more than a single pre-trained model, are computationally intensive, or do not exploit the best available deep generative models: diffusion models (DMs). In this work, we show that the Fokker-Planck equation associated with a DM can provide an LID estimator which addresses the aforementioned deficiencies.
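For context on the Fokker-Planck connection, the identity below is a standard background relation for a variance-preserving diffusion (drift -½β(t)x, score s_t = ∇_x log p_t); it shows that the time evolution of log p_t(x) is computable from the score network alone. Heuristically, the small-t behavior of log p_t(x) reflects how many directions around x carry probability mass, which is what ties this kind of quantity to LID. This is a background sketch under the stated SDE assumption, not the paper's final estimator.

```latex
% Fokker-Planck identity for the VP-SDE  dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,dW_t,
% with ambient dimension D and score s_t(x) = \nabla_x \log p_t(x):
\partial_t \log p_t(x)
  \;=\; \tfrac{1}{2}\beta(t)\Big[\, D + x^{\top} s_t(x) + \nabla \cdot s_t(x) + \lVert s_t(x) \rVert^2 \,\Big]
% The divergence term \nabla \cdot s_t(x) is typically approximated with a
% Hutchinson trace estimator when s_t is a neural network.
```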


Barking up the right tree: an approach to search over molecule synthesis DAGs

Neural Information Processing Systems

When designing new molecules with particular properties, it is important not only what to make but, crucially, how to make it. The instructions for making a molecule form a synthesis directed acyclic graph (DAG), describing how a large vocabulary of simple building blocks can be recursively combined through chemical reactions to create more complicated molecules of interest. In contrast, many current deep generative models for molecules ignore synthesizability. We therefore propose a deep generative model that better represents the real-world process by directly outputting molecule synthesis DAGs. We argue that this provides sensible inductive biases, ensuring that our model searches over the same chemical space that chemists would also have access to, as well as interpretability. We show that our approach is able to model chemical space well, producing a wide range of diverse molecules, and allows for unconstrained optimization of an inherently constrained problem: maximizing certain chemical properties subject to the discovered molecules being synthesizable.
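To make the synthesis-DAG idea concrete, here is a minimal, hypothetical representation (an illustration, not the paper's model or data format): leaves are purchasable building blocks, internal nodes are reactions that combine previously produced molecules, and shared intermediates give the graph its DAG structure.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    smiles: str                           # molecule produced at this node
    reaction: Optional[str] = None        # None for purchasable building blocks
    parents: list["Node"] = field(default_factory=list)  # reactant nodes

def synthesis_order(node: Node) -> list[Node]:
    """Topological order: building blocks and intermediates before the product."""
    seen, order = set(), []
    def visit(n: Node) -> None:
        if id(n) in seen:
            return                        # a shared intermediate is made once (DAG, not tree)
        seen.add(id(n))
        for parent in n.parents:
            visit(parent)
        order.append(n)
    visit(node)
    return order

# Example with a made-up reaction label: two building blocks combined into a product.
acid = Node("CC(=O)O")
amine = Node("NCc1ccccc1")
amide = Node("CC(=O)NCc1ccccc1", reaction="amide_coupling", parents=[acid, amine])
print([n.smiles for n in synthesis_order(amide)])
```

A generative model over such objects has to emit valid building blocks, reaction choices, and wiring, which is where the inductive bias toward synthesizable chemical space comes from.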


Estimating the Hallucination Rate of Generative AI. Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu

Neural Information Processing Systems

This paper presents a method for estimating the hallucination rate for in-context learning (ICL) with generative AI. In ICL, a conditional generative model (CGM) is prompted with a dataset and a prediction question and asked to generate a response. One interpretation of ICL assumes that the CGM computes the posterior predictive of an unknown Bayesian model, which implicitly defines a joint distribution over observable datasets and latent mechanisms. This joint distribution factorizes into two components: the model prior over mechanisms and the model likelihood of datasets given a mechanism. With this perspective, we define a hallucination as a generated response to the prediction question with low model likelihood given the mechanism. We develop a new method that takes an ICL problem and estimates the probability that a CGM will generate a hallucination. Our method only requires generating prediction questions and responses from the CGM and evaluating its response log probability. We empirically evaluate our method using large language models for synthetic regression and natural language ICL tasks.
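As a schematic of the kind of estimate the abstract describes (generate questions and responses from the CGM, then evaluate response log probabilities), the sketch below computes a Monte Carlo estimate of the probability that a generated response falls below a log-probability threshold. The helper callables and the fixed threshold are hypothetical placeholders; the paper's actual estimator is defined through the posterior-predictive interpretation of ICL rather than a hand-picked cutoff.

```python
def estimate_hallucination_rate(sample_question, sample_response, log_prob,
                                icl_prompt, n_samples=100, log_prob_threshold=-5.0):
    """Fraction of CGM-generated responses whose log probability is below a threshold.

    sample_question(prompt) -> question             # CGM generates a prediction question
    sample_response(prompt, question) -> response   # CGM generates a candidate response
    log_prob(prompt, question, response) -> float   # CGM's own log probability of the response
    """
    low_likelihood = 0
    for _ in range(n_samples):
        question = sample_question(icl_prompt)
        response = sample_response(icl_prompt, question)
        if log_prob(icl_prompt, question, response) < log_prob_threshold:
            low_likelihood += 1
    return low_likelihood / n_samples
```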


The OpenAI empire - podcast

The Guardian

In 2019, before most of the world had heard of the company, the technology journalist Karen Hao spent three days embedded in the offices of OpenAI. What she saw, she tells Michael Safi, was a company vastly at odds with its public image: that of a transparent non-profit developing artificial intelligence technology purely for the benefit of humanity. "They said that they were transparent. They said that they were collaborative. They were actually very secretive."


Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation

Neural Information Processing Systems

Generative AI and large language models hold great promise in enhancing programming education by generating individualized feedback and hints for learners. Recent works have primarily focused on improving the quality of generated feedback to match that of human tutors. While quality is an important performance criterion, it is not the only criterion to optimize for in real-world educational deployments.

