Generative AI
Can Artificial Intelligence Accelerate Technological Progress? Researchers' Perspectives on AI in Manufacturing and Materials Science
Nelson, John P., Olugbade, Olajide, Shapira, Philip, Biddle, Justin B.
Applications of artificial intelligence or machine learning in research Modes of use Surrogate modeling for physics - based models Modeling of poorly understood phenomena Data preprocessing Large language model use Applications AI/ML as research tool Production process design, monitoring, & output prediction Part design & properties prediction Materials design & properties prediction AI/ML as research product Generative AI design tool for consumers Generic research tasks Large language models for coding Large language models for literature review Benefits of artificial intelligence or machine learning in research Reduction in accuracy/cost/speed trade - off in research, especially computer modeling Reduced computation time Replacing experimentation Reducing need for computationally intensive, physics - based models Saving research labor Exploring larger design spaces Address of previously unsolvable problems Model poorly understood relationships between variables Identify human - unidentifiable patterns or phenomena Downsides of artificial intelligence or machine learning in research Accuracy weaknesses Predict poorly outside regions of dense, high - quality training data Interpretability weaknesses Bounds of accuracy can be unclear Accuracy assessment can be difficult Long - run scientific progress concerns AI/ML cannot develop novel scientific theory AI/ML may bypass opportunities to identify empirical or theoretical novelties Resource issues Data acquisition and cleaning is time - intensive AI/ML models are computation - and energy - intensive to develop Inappropriate use issues Easy to over - trust May be inappropriately used to address problems soluble with simpler methods 8 Second, AI/ML models can be trained on input and output data for phenomena (e.g., complex production processes) which lack robust theoretical models, developing novel predictive capabilities in the absence of explicit, human - designed theory. This is somet imes referred to as "phenomenological modeling," as it attempts to model phenomena in the absence of mechanistic, explanatory understanding: [T]he first reason we choose to use AI is because we don't have a good model of what our system is. . . I get a bunch of data coming in and I have a bunch of sensor readings, you know. . . And I use the AI to map the bunch of sensor readings to the process health or process status or machine status that I have.
A First-Principles Based Risk Assessment Framework and the IEEE P3396 Standard
Tong, Richard J., Cortês, Marina, DeFalco, Jeanine A., Underwood, Mark, Zalewski, Janusz
Generative Artificial Intelligence (AI) is enabling unprecedented automation in content creation and decision support, but it also raises novel risks. This paper presents a first-principles risk assessment framework underlying the IEEE P3396 Recommended Practice for AI Risk, Safety, Trustworthiness, and Responsibility. We distinguish between process risks (risks arising from how AI systems are built or operated) and outcome risks (risks manifest in the AI system's outputs and their real-world effects), arguing that generative AI governance should prioritize outcome risks. Central to our approach is an information-centric ontology that classifies AI-generated outputs into four fundamental categories: (1) Perception-level information, (2) Knowledge-level information, (3) Decision/Action plan information, and (4) Control tokens (access or resource directives). This classification allows systematic identification of harms and more precise attribution of responsibility to stakeholders (developers, deployers, users, regulators) based on the nature of the information produced. We illustrate how each information type entails distinct outcome risks (e.g. deception, misinformation, unsafe recommendations, security breaches) and requires tailored risk metrics and mitigations. By grounding the framework in the essence of information, human agency, and cognition, we align risk evaluation with how AI outputs influence human understanding and action. The result is a principled approach to AI risk that supports clear accountability and targeted safeguards, in contrast to broad application-based risk categorizations. We include example tables mapping information types to risks and responsibilities. This work aims to inform the IEEE P3396 Recommended Practice and broader AI governance with a rigorous, first-principles foundation for assessing generative AI risks while enabling responsible innovation.
Toward Valid Generative Clinical Trial Data with Survival Endpoints
Chassat, Perrine, Nguyen, Van Tuan, Ducrot, Lucas, Lanoy, Emilie, Guilloux, Agathe
Clinical trials face mounting challenges: fragmented patient populations, slow enrollment, and unsustainable costs, particularly for late phase trials in oncology and rare diseases. While external control arms built from real-world data have been explored, a promising alternative is the generation of synthetic control arms using generative AI. A central challenge is the generation of time-to-event outcomes, which constitute primary endpoints in oncology and rare disease trials, but are difficult to model under censoring and small sample sizes. Existing generative approaches, largely GAN-based, are data-hungry, unstable, and rely on strong assumptions such as independent censoring. We introduce a variational autoencoder (VAE) that jointly generates mixed-type covariates and survival outcomes within a unified latent variable framework, without assuming independent censoring. Across synthetic and real trial datasets, we evaluate our model in two realistic scenarios: (i) data sharing under privacy constraints, where synthetic controls substitute for original data, and (ii) control-arm augmentation, where synthetic patients mitigate imbalances between treated and control groups. Our method outperforms GAN baselines on fidelity, utility, and privacy metrics, while revealing systematic miscalibration of type I error and power. We propose a post-generation selection procedure that improves calibration, highlighting both progress and open challenges for generative survival modeling.
Deep Generative Models with Learnable Knowledge Constraints
The broad set of deep generative models (DGMs) has achieved remarkable advances. However, it is often difficult to incorporate rich structured domain knowledge with the end-to-end DGMs. Posterior regularization (PR) offers a principled framework to impose structured constraints on probabilistic models, but has limited applicability to the diverse DGMs that can lack a Bayesian formulation or even explicit density evaluation. PR also requires constraints to be fully specified {\it a priori}, which is impractical or suboptimal for complex knowledge with learnable uncertain parts. In this paper, we establish mathematical correspondence between PR and reinforcement learning (RL), and, based on the connection, expand PR to learn constraints as the extrinsic reward in RL. The resulting algorithm is model-agnostic to apply to any DGMs, and is flexible to adapt arbitrary constraints with the model jointly. Experiments on human image generation and templated sentence generation show models with learned knowledge constraints by our algorithm greatly improve over base generative models.
Learning semantic similarity in a continuous space
We address the problem of learning semantic representation of questions to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover's Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to travel to match the intent of another question. We first learn to repeat, reformulate questions to infer intents as normal distributions with a deep generative model [2] (variational auto encoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework. Among known models that can read sentences individually, our proposed framework achieves competitive results on Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from Information Theory.
Bias and Generalization in Deep Generative Models: An Empirical Study
In high dimensional settings, density estimation algorithms rely crucially on their inductive bias. Despite recent empirical success, the inductive bias of deep generative models is not well understood. In this paper we propose a framework to systematically investigate bias and generalization in deep generative models of images by probing the learning algorithm with carefully designed training datasets. By measuring properties of the learned distribution, we are able to find interesting patterns of generalization. We verify that these patterns are consistent across datasets, common models and architectures.
Semi-crowdsourced Clustering with Deep Generative Models
We consider the semi-supervised clustering problem where crowdsourcing provides noisy information about the pairwise comparisons on a small subset of data, i.e., whether a sample pair is in the same cluster. We propose a new approach that includes a deep generative model (DGM) to characterize low-level features of the data, and a statistical relational model for noisy pairwise annotations on its subset. The two parts share the latent variables. To make the model automatically trade-off between its complexity and fitting data, we also develop its fully Bayesian variant. The challenge of inference is addressed by fast (natural-gradient) stochastic variational inference algorithms, where we effectively combine variational message passing for the relational part and amortized learning of the DGM under a unified framework. Empirical results on synthetic and real-world datasets show that our model outperforms previous crowdsourced clustering methods.