AITopics | generative foundation model

Collaborating Authors

generative foundation model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild Can Qin

Neural Information Processing SystemsFeb-15-2026, 15:59:00 GMT

Achieving machine autonomy and human control often represent divergent objectives in the design of interactive AI systems.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > India > Uttar Pradesh (0.04)
Asia > Cambodia (0.04)

Genre: Research Report (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Add feedback

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

Neural Information Processing SystemsDec-26-2025, 06:52:51 GMT

Achieving machine autonomy and human control often represent divergent objectives in the design of interactive AI systems. Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages. However, they often fall short in generating images with spatial, structural, or geometric controls. The integration of such controls, which can accommodate various visual conditions in a single unified model, remains an unaddressed challenge. In response, we introduce UniControl, a new generative foundation model that consolidates a wide array of controllable condition-to-image (C2I) tasks within a singular framework, while still allowing for arbitrary language prompts. UniControl enables pixel-level-precise image generation, where visual conditions primarily influence the generated structures and language prompts guide the style and context. To equip UniControl with the capacity to handle diverse visual conditions, we augment pretrained text-to-image diffusion models and introduce a task-aware HyperNet to modulate the diffusion models, enabling the adaptation to different C2I tasks simultaneously. Trained on nine unique C2I tasks, UniControl demonstrates impressive zero-shot generation abilities with unseen visual conditions. Experimental results show that UniControl often surpasses the performance of single-task-controlled methods of comparable model sizes.

controllable visual generation, unicontrol, unified diffusion model, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Add feedback

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

Neural Information Processing SystemsJan-19-2025, 12:40:21 GMT

controllable visual generation, unicontrol, unified diffusion model, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)

Add feedback

MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model

Li, Junjie, Liu, Yang, Liu, Weiqing, Fang, Shikai, Wang, Lewen, Xu, Chang, Bian, Jiang

arXiv.org Artificial IntelligenceSep-4-2024

Generative models aim to simulate realistic effects of various actions across different contexts, from text generation to visual effects. Despite efforts to build real-world simulators, leveraging generative models for virtual worlds, like financial markets, remains underexplored. In financial markets, generative models can simulate market effects of various behaviors, enabling interaction with market scenes and players, and training strategies without financial risk. This simulation relies on the finest structured data in financial market like orders thus building the finest realistic simulation. We propose Large Market Model (LMM), an order-level generative foundation model, for financial market simulation, akin to language modeling in the digital world. Our financial Market Simulation engine (MarS), powered by LMM, addresses the need for realistic, interactive and controllable order generation. Key objectives of this paper include evaluating LMM's scaling law in financial markets, assessing MarS's realism, balancing controlled generation with market impact, and demonstrating MarS's potential applications. We showcase MarS as a forecast tool, detection system, analysis platform, and agent training environment. Our contributions include pioneering a generative model for financial markets, designing MarS to meet domain-specific needs, and demonstrating MarS-based applications' industry potential.

market impact, mars, simulation, (15 more...)

arXiv.org Artificial Intelligence

2409.07486

Country:

Asia > India > West Bengal > Kolkata (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Add feedback

A generative foundation model for five-class sleep staging with arbitrary sensor input

van Gorp, Hans, van Gilst, Merel M., Fonseca, Pedro, van Meulen, Fokke B., van Dijk, Johannes P., Overeem, Sebastiaan, van Sloun, Ruud J. G.

arXiv.org Artificial IntelligenceAug-9-2024

Gold-standard sleep scoring as performed by human technicians is based on a subset of PSG signals, namely the EEG, EOG, and EMG. The PSG, however, consists of many more signal derivations that could potentially be used to perform sleep staging, including cardiac and respiratory modalities. Leveraging this variety in signals would offer advantages, for example by increasing reliability, resilience to signal loss, and application to long-term non-obtrusive recordings. This paper proposes a deep generative foundation model for fully automatic sleep staging from a plurality of sensors and any combination thereof. We trained a score-based diffusion model with a transformer backbone using a dataset of 1947 expert-labeled overnight sleep recordings with 36 different signals, including neurological, cardiac, and respiratory signals. We achieve zero-shot inference on any sensor set by using a novel Bayesian factorization of the score function across the sensors, i.e., it does not require retraining on specific combinations of signals. On single-channel EEG, our method reaches the performance limit in terms of PSG inter-rater agreement (5-class accuracy 85.6%, kappa 0.791). At the same time, the method offers full flexibility to use any sensor set derived from other modalities, for example, as typically used in home recordings that include finger PPG, nasal cannula and thoracic belt (5-class accuracy 79.0%, kappa of 0.697), or by combining derivations not typically used for sleep staging such as the tibialis and sternocleidomastoid EMG (5-class accuracy 71.0%, kappa of 0.575). Additionally, we propose a novel interpretability metric in terms of information gain per sensor and show that this is linearly correlated with classification performance. Lastly, our foundation model allows for post-hoc addition of entirely new sensor modalities by merely training a score estimator on the novel input.

arbitrary sensor input, five-class sleep staging, generative foundation model

arXiv.org Artificial Intelligence

2408.15253

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Add feedback

On the Benefit of Generative Foundation Models for Human Activity Recognition

Leng, Zikang, Kwon, Hyeokhyen, Plötz, Thomas

arXiv.org Artificial IntelligenceOct-18-2023

In human activity recognition (HAR), the limited availability of annotated data presents a significant challenge. Drawing inspiration from the latest advancements in generative AI, including Large Language Models (LLMs) and motion synthesis models, we believe that generative AI can address this data scarcity by autonomously generating virtual IMU data from text descriptions. Beyond this, we spotlight several promising research pathways that could benefit from generative AI for the community, including the generating benchmark datasets, the development of foundational models specific to HAR, the exploration of hierarchical structures within HAR, breaking down complex activities, and applications in health sensing and activity summarization.

generative foundation model, human activity recognition

arXiv.org Artificial Intelligence

2310.12085

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.64)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)

Add feedback

Generative Semi-supervised Learning with Meta-Optimized Synthetic Samples

Yamaguchi, Shin'ya

arXiv.org Machine LearningSep-27-2023

Semi-supervised learning (SSL) is a promising approach for training deep classification models using labeled and unlabeled datasets. However, existing SSL methods rely on a large unlabeled dataset, which may not always be available in many real-world applications due to legal constraints (e.g., GDPR). In this paper, we investigate the research question: Can we train SSL models without real unlabeled datasets? Instead of using real unlabeled datasets, we propose an SSL method using synthetic datasets generated from generative foundation models trained on datasets containing millions of samples in diverse domains (e.g., ImageNet). Our main concepts are identifying synthetic samples that emulate unlabeled samples from generative foundation models and training classifiers using these synthetic samples. To achieve this, our method is formulated as an alternating optimization problem: (i) meta-learning of generative foundation models and (ii) SSL of classifiers using real labeled and synthetic unlabeled samples. For (i), we propose a meta-learning objective that optimizes latent variables to generate samples that resemble real labeled samples and minimize the validation loss. For (ii), we propose a simple unsupervised loss function that regularizes the feature extractors of classifiers to maximize the performance improvement obtained from synthetic samples. We confirm that our method outperforms baselines using generative foundation models on SSL. We also demonstrate that our methods outperform SSL using real unlabeled datasets in scenarios with extremely small amounts of labeled datasets. This suggests that synthetic samples have the potential to provide improvement gains more efficiently than real unlabeled data.

dataset, mp-ssl, synthetic sample, (11 more...)

arXiv.org Machine Learning

2309.16143

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Law (0.86)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback