AITopics | Generative AI

On the Identifiability of Hybrid Deep Generative Models: Meta-Learning as a Solution

Neural Information Processing SystemsMay-28-2025, 11:08:39 GMT

The interest in leveraging physics-based inductive bias in deep learning has resulted in recent development of hybrid deep generative models (hybrid-DGMs) that integrates known physics-based mathematical expressions in neural generative models. To identify these hybrid-DGMs requires inferring parameters of the physics-based component along with their neural component. The identifiability of these hybrid-DGMs, however, has not yet been theoretically probed or established. How does the existing theory of the un-identifiability of general DGMs apply to hybrid-DGMs? What may be an effective approach to consutrct a hybrid-DGM with theoretically-proven identifiability? This paper provides the first theoretical probe into the identifiability of hybrid-DGMs, and present meta-learning as a novel solution to construct identifiable hybrid-DGMs. On synthetic and real-data benchmarks, we provide strong empirical evidence for the un-identifiability of existing hybrid-DGMs using unconditional priors, and strong identifiability results of the presented meta-formulations of hybrid-DGMs.

artificial intelligence, identifiability, machine learning, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.60)

Add feedback

OpenAI: The power and the pride

MIT Technology ReviewMay-28-2025, 10:07:09 GMT

There is no question that OpenAI pulled off something historic with its release of ChatGPT 3.5 in 2022. It set in motion an AI arms race that has already changed the world in a number of ways and seems poised to have an even greater long-term effect than the short-term disruptions to things like education and employment that we are already beginning to see. How that turns out for humanity is something we are still reckoning with and may be for quite some time. But a pair of recent books both attempt to get their arms around it with accounts of what two leading technology journalists saw at the OpenAI revolution. In Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI, Karen Hao tells the story of the company's rise to power and its far-reaching impact all over the world.

large language model, machine learning, natural language, (9 more...)

MIT Technology Review

Industry: Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Add feedback

From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion Robin San Roman, Antoine Deleforge

Neural Information Processing SystemsMay-28-2025, 07:37:37 GMT

Deep generative models can generate high-fidelity audio conditioned on various types of representations (e.g., mel-spectrograms, Mel-frequency Cepstral Coefficients (MFCC)). Recently, such models have been used to synthesize audio waveforms conditioned on highly compressed representations. Although such methods produce impressive results, they are prone to generate audible artifacts when the conditioning is flawed or imperfect. An alternative modeling approach is to use diffusion models. However, these have mainly been used as speech vocoders (i.e., conditioned on mel-spectrograms) or generating relatively low sampling rate signals. In this work, we propose a high-fidelity multi-band diffusion-based framework that generates any type of audio modality (e.g., speech, music, environmental sounds) from low-bitrate discrete representations. At equal bit rate, the proposed approach outperforms state-of-the-art generative techniques in terms of perceptual quality. Training and evaluation code are available on the facebookresearch/audiocraft github project. Samples are available on the following link.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe > France (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Diffusion4Audio (12)

Robin San Roman

Neural Information Processing SystemsMay-28-2025, 07:37:33 GMT

Deep generative models can generate high-fidelity audio conditioned on various types of representations (e.g., mel-spectrograms, Mel-frequency Cepstral Coefficients (MFCC)). Recently, such models have been used to synthesize audio waveforms conditioned on highly compressed representations. Although such methods produce impressive results, they are prone to generate audible artifacts when the conditioning is flawed or imperfect. An alternative modeling approach is to use diffusion models. However, these have mainly been used as speech vocoders (i.e., conditioned on mel-spectrograms) or generating relatively low sampling rate signals. In this work, we propose a high-fidelity multi-band diffusion-based framework that generates any type of audio modality (e.g., speech, music, environmental sounds) from low-bitrate discrete representations. At equal bit rate, the proposed approach outperforms state-of-the-art generative techniques in terms of perceptual quality. Training and evaluation code are available on the facebookresearch/audiocraft github project. Samples are available on the following link.

arxiv preprint arxiv, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Europe > France (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Learning Disentangled Representations with Semi-Supervised Deep Generative Models

Siddharth N, Brooks Paige, Jan-Willem van de Meent, Alban Desmaison, Noah Goodman, Pushmeet Kohli, Frank Wood, Philip Torr

Neural Information Processing SystemsMay-28-2025, 03:18:44 GMT

Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network.

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)

Add feedback

Apple's triple threat: tariffs, AI troubles and a Fortnite fail

The GuardianMay-27-2025, 14:05:01 GMT

This week in tech: Apple struggles on multiple fronts, OpenAI grows increasingly ambitious, and Trump helps some of his fans lose money on cryptocurrency. Long dominant and unassailable, Apple is showing signs of weakness. The CEO, Tim Cook, can't tame Donald Trump's threats of tariffs that would spike the price of an iPhone; Apple's AI offerings pale against its competitors; and the company can't win a Fortnite match – or a single battle in its legal war with Epic Games – to save its life. On Friday, the president threatened to levy a 25% tariff on any iPhone not made in the US. Trump said in the post: "I have long ago informed Tim Cook of Apple that I expect their iPhones that will be sold in the United States of America will be manufactured and built in the United States, not India, or anyplace else. If that is not the case, a Tariff of at least 25% must be paid by Apple to the US."

artificial intelligence, machine learning, natural language, (20 more...)

The Guardian

Country: North America > United States (1.00)

Industry:

Information Technology (0.96)
Leisure & Entertainment > Games > Computer Games (0.73)
Banking & Finance > Trading (0.70)
Law (0.69)

Technology:

Information Technology > Communications > Mobile (0.81)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Add feedback

OpenAI's 6.5B new acquisition signals Apple's biggest AI crisis yet

FOX NewsMay-26-2025, 12:00:44 GMT

OpenAI CEO Sam Altman sits down with Shannon Bream to discuss the positives and potential negatives of artificial intelligence and the importance of maintaining a lead in the AI industry over China. OpenAI has just made a move that's turning heads across the tech world. The company is acquiring io, the AI device startup founded by Jony Ive, for nearly 6.5 billion. It's a collaboration between Sam Altman, who leads OpenAI, and the designer responsible for some of Apple's most iconic products, including the iPhone and Apple Watch. Together, they want to create a new generation of AI-powered devices that could completely change how we use technology.

large language model, machine learning, natural language, (16 more...)

FOX News

Country: Asia > China (0.25)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.77)

Add feedback

Deep Generative Models with Learnable Knowledge Constraints

Zhiting Hu, Zichao Yang, Russ R. Salakhutdinov, LIANHUI Qin, Xiaodan Liang, Haoye Dong, Eric P. Xing

Neural Information Processing SystemsMay-26-2025, 10:17:39 GMT

Neural Information Processing Systems http://nips.cc/

constraint, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)

Add feedback

Learning semantic similarity in a continuous space

Michel Deudon

Neural Information Processing SystemsMay-26-2025, 08:33:23 GMT

We address the problem of learning semantic representation of questions to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover's Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to "travel" to match the intent of another question. We first learn to repeat, reformulate questions to infer intents as normal distributions with a deep generative model [2] (variational auto encoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework. Among known models that can read sentences individually, our proposed framework achieves competitive results on Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from Information Theory.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Europe > Middle East > Malta (0.14)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

Add feedback

Flexible and accurate inference and learning for deep generative models

Neural Information Processing SystemsMay-26-2025, 08:27:40 GMT

We introduce a new approach to learning in hierarchical latent-variable generative models called the "distributed distributional code Helmholtz machine", which emphasises flexibility and accuracy in the inferential process. Like the original Helmholtz machine and later variational autoencoder algorithms (but unlike adversarial methods) our approach learns an explicit inference or "recognition" model to approximate the posterior distribution over the latent variables. Unlike these earlier methods, it employs a posterior representation that is not limited to a narrow tractable parametrised form (nor is it represented by samples). To train the generative and recognition models we develop an extended wake-sleep algorithm inspired by the original Helmholtz machine. This makes it possible to learn hierarchical latent models with both discrete and continuous variables, where an accurate posterior representation is essential. We demonstrate that the new algorithm outperforms current state-of-the-art methods on synthetic, natural image patch and the MNIST data sets.

artificial intelligence, generative model, machine learning, (19 more...)

Neural Information Processing Systems

Country: