Materials
Sample-Efficient Training for Diffusion
Gupta, Shivam, Parulekar, Aditya, Price, Eric, Xun, Zhiyang
Score-based diffusion models have become the most popular approach to deep generative modeling of images, largely due to their empirical performance and reliability. Recently, a number of theoretical works \citep{chen2022, Chen2022ImprovedAO, Chenetal23flowode, benton2023linear} have shown that diffusion models can efficiently sample, assuming $L^2$-accurate score estimates. The score-matching objective naturally approximates the true score in $L^2$, but the sample complexity of existing bounds depends \emph{polynomially} on the data radius and desired Wasserstein accuracy. By contrast, the time complexity of sampling is only logarithmic in these parameters. We show that estimating the score in $L^2$ \emph{requires} this polynomial dependence, but that a number of samples that scales polylogarithmically in the Wasserstein accuracy actually does suffice for sampling. We show that with a polylogarithmic number of samples, the ERM of the score-matching objective is $L^2$-accurate on all but a probability $\delta$ fraction of the true distribution, and that this weaker guarantee is sufficient for efficient sampling.
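The two accuracy notions contrasted in this abstract can be written out as follows. This is one common way to state them, not necessarily the paper's exact definitions: the symbols $q_t$ (the noised data distribution at time $t$), $\hat{s}_t$ (the learned score), and $\varepsilon$ (the error level) are assumed notation, and the paper may normalize the error differently. The usual $L^2$ guarantee bounds the error on average,

$$
\mathbb{E}_{x \sim q_t}\left[ \lVert \hat{s}_t(x) - \nabla \log q_t(x) \rVert^2 \right] \le \varepsilon^2 ,
$$

while the weaker guarantee only asks for a small error outside a set of probability at most $\delta$,

$$
\Pr_{x \sim q_t}\left[ \lVert \hat{s}_t(x) - \nabla \log q_t(x) \rVert > \varepsilon \right] \le \delta .
$$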
Density of States Prediction of Crystalline Materials via Prompt-guided Multi-Modal Transformer
Lee, Namkyeong, Noh, Heewoong, Kim, Sungwon, Hyun, Dongmin, Na, Gyoung S., Park, Chanyoung
The density of states (DOS) is a spectral property of crystalline materials, which provides fundamental insights into various characteristics of the materials. While previous works mainly focus on obtaining high-quality representations of crystalline materials for DOS prediction, we focus on predicting the DOS from the obtained representations by reflecting the nature of DOS: DOS determines the general distribution of states as a function of energy. That is, DOS is not solely determined by the crystalline material but also by the energy levels, which has been neglected in previous works. In this paper, we propose to integrate heterogeneous information obtained from the crystalline materials and the energies via a multi-modal transformer, thereby modeling the complex relationships between the atoms in the crystalline materials and various energy levels for DOS prediction. Moreover, we propose to utilize prompts to guide the model to learn the crystal structural system-specific interactions between crystalline materials and energies. Extensive experiments on two types of DOS, i.e., Phonon DOS and Electron DOS, with various real-world scenarios demonstrate the superiority of DOSTransformer.
From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design
Picard, Cyril, Edwards, Kristen M., Doris, Anna C., Man, Brandon, Giannone, Giorgio, Alam, Md Ferdous, Ahmed, Faez
Engineering Design is undergoing a transformative shift with the advent of AI, marking a new era in how we approach product, system, and service planning. Large language models have demonstrated impressive capabilities in enabling this shift. Yet, with text as their only input modality, they cannot leverage the large body of visual artifacts that engineers have used for centuries and are accustomed to. This gap is addressed with the release of multimodal vision language models, such as GPT-4V, enabling AI to impact many more types of tasks. In light of these advancements, this paper presents a comprehensive evaluation of GPT-4V, a vision language model, across a wide spectrum of engineering design tasks, categorized into four main areas: Conceptual Design, System-Level and Detailed Design, Manufacturing and Inspection, and Engineering Education Tasks. Our study assesses GPT-4V's capabilities in design tasks such as sketch similarity analysis, concept selection using Pugh Charts, material selection, engineering drawing analysis, CAD generation, topology optimization, design for additive and subtractive manufacturing, spatial reasoning challenges, and textbook problems. Through this structured evaluation, we not only explore GPT-4V's proficiency in handling complex design and manufacturing challenges but also identify its limitations in complex engineering design applications. Our research establishes a foundation for future assessments of vision language models, emphasizing their immense potential for innovating and enhancing the engineering design and manufacturing landscape. It also contributes a set of benchmark testing datasets, with more than 1000 queries, for ongoing advancements and applications in this field.
From Microbes to Methane: AI-Based Predictive Modeling of Feed Additive Efficacy in Dairy Cows
Altshuler, Yaniv, Chebach, Tzruya Calvao, Cohen, Shalom
In an era of increasing pressure to achieve sustainable agriculture, the optimization of livestock feed for enhancing yield and minimizing environmental impact is a paramount objective. This study presents a pioneering approach towards this goal, using rumen microbiome data to predict the efficacy of feed additives in dairy cattle. We collected an extensive dataset that includes methane emissions from 2,190 Holstein cows distributed across 34 distinct sites. The cows were divided into control and experimental groups in a double-blind, unbiased manner, accounting for variables such as age, days in lactation, and average milk yield. The experimental groups were administered one of four leading commercial feed additives: Agolin, Kexxtone, Allimax, and Relyon. Methane emissions were measured individually both before the administration of additives and over a subsequent 12-week period. To develop our predictive model for additive efficacy, rumen microbiome samples were collected from 510 cows from the same herds prior to the study's onset. These samples underwent deep metagenomic shotgun sequencing, yielding an average of 15.7 million reads per sample. Utilizing innovative artificial intelligence techniques, we successfully estimated the efficacy of these feed additives across different farms. The model's robustness was further confirmed through validation with independent cohorts, affirming its generalizability and reliability. Our results underscore the transformative capability of using targeted feed additive strategies both to optimize dairy yield and milk composition and to significantly reduce methane emissions. Specifically, our predictive model demonstrates a scenario where its application could guide the assignment of additives to farms where they are most effective. In doing so, we could achieve an average potential reduction of over 27\% in overall emissions.
Exploring the Trie of Rules: a fast data structure for the representation of association rules
Kudriavtsev, Mikhail, Bezbradica, Marija, McCarren, Andrew
Association rule mining techniques can generate a large volume of sequential data when implemented on transactional databases. Extracting insights from a large set of association rules has been found to be a challenging process. When examining a ruleset, the fundamental question is how to summarise and represent meaningful mined knowledge efficiently. Many algorithms and strategies have been developed to address the issue of knowledge extraction; however, the effectiveness of this process can be limited by the data structures. A better data structure can significantly affect the speed of the knowledge extraction process. This paper proposes a novel data structure, called the Trie of rules, for storing a ruleset that is generated by association rule mining. The resulting data structure is a prefix-tree graph structure made of pre-mined rules. This graph stores the rules as paths within the prefix-tree in a way that similar rules overlay each other. Each node in the tree represents a rule where the consequent is this node and the antecedent is the path from this node to the root of the tree. The evaluation showed that the proposed representation technique is promising. It compresses a ruleset with almost no data loss and saves time on basic operations such as searching for a specific rule and sorting, which are the basis for many knowledge discovery methods. Moreover, our method demonstrated a significant improvement in traversing time, achieving an 8-fold speed-up compared to traditional data structures.
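As a rough illustration of the structure this abstract describes, the Python sketch below stores each rule as a path in a prefix tree: the final node is the consequent and the path back to the root is the antecedent. The class names, the alphabetical item ordering, and the single `confidence` metric are assumptions made for illustration; the abstract does not specify the ordering or the stored metrics, and this is not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class Node:
    item: Optional[str] = None          # None marks the root
    confidence: Optional[float] = None  # set only when a rule ends at this node
    children: Dict[str, "Node"] = field(default_factory=dict)


class TrieOfRules:
    """Prefix tree in which rules with a shared antecedent prefix share nodes."""

    def __init__(self) -> None:
        self.root = Node()

    def insert(self, antecedent: Iterable[str], consequent: str,
               confidence: float) -> None:
        node = self.root
        # Assumed canonical order: alphabetical. Any fixed order would work.
        for item in sorted(antecedent):
            node = node.children.setdefault(item, Node(item))
        leaf = node.children.setdefault(consequent, Node(consequent))
        leaf.confidence = confidence

    def search(self, antecedent: Iterable[str],
               consequent: str) -> Optional[float]:
        """Return the stored confidence, or None if the rule is absent."""
        node = self.root
        for item in [*sorted(antecedent), consequent]:
            node = node.children.get(item)
            if node is None:
                return None
        return node.confidence
```

With this layout, inserting `{bread} -> butter` and `{bread, butter} -> milk` reuses the `bread` and `butter` nodes, which is the overlaying of similar rules the abstract refers to. A lookup walks one path, so its cost grows with the rule length rather than with the size of the ruleset.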
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
Rein, David, Hou, Betty Li, Stickland, Asa Cooper, Petty, Jackson, Pang, Richard Yuanzhe, Dirani, Julien, Michael, Julian, Bowman, Samuel R.
We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the corresponding domains reach 65% accuracy (74% when discounting clear mistakes the experts identified in retrospect), while highly skilled non-expert validators only reach 34% accuracy, despite spending on average over 30 minutes with unrestricted access to the web (i.e., the questions are "Google-proof"). The questions are also difficult for state-of-the-art AI systems, with our strongest GPT-4 based baseline achieving 39% accuracy. If we are to use future AI systems to help us answer very hard questions, for example, when developing new scientific knowledge, we need to develop scalable oversight methods that enable humans to supervise their outputs, which may be difficult even if the supervisors are themselves skilled and knowledgeable. The difficulty of GPQA both for skilled non-experts and frontier AI systems should enable realistic scalable oversight experiments, which we hope can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.
Sustainable Concrete via Bayesian Optimization
Ament, Sebastian, Witte, Andrew, Garg, Nishant, Kusuma, Julius
Eight percent of global carbon dioxide emissions can be attributed to the production of cement, the main component of concrete, which is also the dominant source of CO2 emissions in the construction of data centers. The discovery of lower-carbon concrete formulae is therefore of high significance for sustainability. However, experimenting with new concrete formulae is time-consuming and labor-intensive, as one usually has to wait to record the concrete's 28-day compressive strength, a quantity whose measurement by definition cannot be accelerated. This provides an opportunity for experimental design methodology like Bayesian Optimization (BO) to accelerate the search for strong and sustainable concrete formulae. Herein, we 1) propose modeling steps that make concrete strength amenable to accurate prediction by a Gaussian process model with relatively few measurements, 2) formulate the search for sustainable concrete as a multi-objective optimization problem, and 3) leverage the proposed model to carry out multi-objective BO with real-world strength measurements of the algorithmically proposed mixes. Our experimental results show improved trade-offs between the mixtures' global warming potential (GWP) and their associated compressive strengths, compared to mixes based on current industry practices. Our methods are open-sourced at github.com/facebookresearch/SustainableConcrete.
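The trade-off between GWP and compressive strength can be made concrete with a Pareto filter: a mix is kept if no other mix is at least as good on both objectives and strictly better on one. The sketch below shows only that filtering step on hypothetical numbers. It is not the authors' Gaussian process model or BO loop, and the function name and data layout are assumptions.

```python
def pareto_front(mixes):
    """Return the mixes that are not dominated by any other mix.

    Each mix is a (gwp, strength) pair: lower GWP is better,
    higher strength is better.
    """
    front = []
    for i, (gwp_i, strength_i) in enumerate(mixes):
        dominated = any(
            gwp_j <= gwp_i
            and strength_j >= strength_i
            and (gwp_j < gwp_i or strength_j > strength_i)
            for j, (gwp_j, strength_j) in enumerate(mixes)
            if j != i
        )
        if not dominated:
            front.append((gwp_i, strength_i))
    return front
```

In a multi-objective setting, "improved trade-offs" typically means that the Pareto front of the proposed mixes moves toward lower GWP at equal strength, or higher strength at equal GWP.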
Online Two-stage Thermal History Prediction Method for Metal Additive Manufacturing of Thin Walls
Tang, Yifan, Dehaghani, M. Rahmani, Sajadi, Pouyan, Balani, Shahriar Bakrani, Dhalpe, Akshay, Panicker, Suraj, Wu, Di, Coatanea, Eric, Wang, G. Gary
This paper proposes an online two-stage thermal history prediction method, which could be integrated into a metal AM process for performance control. Based on the similarity of temperature curves (curve segments of a temperature profile of one point) between any two successive layers, the first stage of the proposed method designs a layer-to-layer prediction model to estimate the temperature curves of the yet-to-print layer from measured temperatures of certain points on the previously printed layer. With measured/predicted temperature profiles of several points on the same layer, the second stage proposes a reduced order model (ROM) (intra-layer prediction model) to decompose and construct the temperature profiles of all points on the same layer, which could be used to build the temperature field of the entire layer. The training of the ROM is performed with an extreme learning machine (ELM) for computational efficiency. Fifteen wire arc AM experiments and nine simulations are designed for thin walls with a fixed length and unidirectional printing of each layer. The test results indicate that the proposed prediction method could construct the thermal history of a yet-to-print layer within 0.1 seconds on a low-cost desktop computer. Meanwhile, the method has acceptable generalization capability in most cases from lower layers to higher layers in the same simulation, as well as from one simulation to a new simulation with different AM process parameters. More importantly, after fine-tuning the proposed method with limited experimental data, the relative errors of all predicted temperature profiles on a new experiment are smaller than 0.09, which demonstrates the applicability and generalization of the proposed two-stage thermal history prediction method in online applications for metal AM.
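For readers unfamiliar with extreme learning machines: an ELM is a single-hidden-layer network whose hidden weights are drawn at random and left fixed, so training reduces to one linear least-squares solve for the output weights. That is the source of the computational efficiency mentioned above. The NumPy sketch below shows the idea in generic form; the layer size, the tanh activation, and the function names are assumptions and are unrelated to the paper's ROM or thermal data.

```python
import numpy as np


def elm_fit(X, y, n_hidden=50, seed=0):
    """Fit an ELM: random fixed hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # hidden weights, never trained
    b = rng.normal(size=n_hidden)                # hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    # The only "training" step: solve H @ beta ~= y in the least-squares sense.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed hidden layer, then the fitted output weights."""
    return np.tanh(X @ W + b) @ beta
```

Because no gradient descent is involved, refitting on new data costs a single least-squares solve.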
A Fingertip Sensor and Algorithms for Pre-touch Distance Ranging and Material Detection in Robotic Grasping
Fang, Cheng, Wang, Di, Guo, Fengzhi, Zou, Jun, Song, Dezhen
To enhance robotic grasping capabilities, we are developing new contactless fingertip sensors to measure distance in close proximity and simultaneously detect the type of material and the interior structure. These sensors are referred to as pre-touch dual-modal and dual-mechanism (PDM$^2$) sensors, and they operate using both pulse-echo ultrasound (US) and optoacoustic (OA) modalities. We present the design of a PDM$^2$ sensor that utilizes a pulsed laser beam and a customized ultrasound transceiver with a wide acoustic bandwidth for ranging and sensing. Both US and OA signals are collected simultaneously, triggered by the same laser pulse. To validate our design, we have fabricated a prototype of the PDM$^2$ sensor and integrated it into an object scanning system. We have also developed algorithms to enable the sensor, including time-of-flight (ToF) auto estimation, ranging rectification, sensor and system calibration, distance ranging, material/structure detection, and object contour detection and reconstruction. The experimental results demonstrate that the new PDM$^2$ sensor and its algorithms effectively enable the object scanning system to achieve satisfactory ranging and contour reconstruction performance, along with satisfactory material/structure detection capabilities. In conclusion, the PDM$^2$ sensor offers a practical and powerful solution to improve grasping of unknown objects with the robotic gripper by providing advanced perception capabilities.
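The ToF-based ranging mentioned above rests on a simple relation between acoustic travel time and distance. The sketch below gives the textbook versions only: a pulse-echo signal covers the sensor-to-target distance twice, whereas a sound wave generated at the target surface covers it once. The speed of sound, the function names, and the idealized geometry (no sensor-internal delays or calibration offsets) are assumptions; the paper's own ToF estimation and rectification algorithms are not reproduced here.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s at roughly 20 degrees C; assumed, not from the paper


def pulse_echo_distance(tof_s, c=SPEED_OF_SOUND_AIR):
    """Round trip: the pulse travels to the target and back, so halve the path."""
    return c * tof_s / 2.0


def one_way_distance(tof_s, c=SPEED_OF_SOUND_AIR):
    """One-way acoustic path, e.g. sound generated at the target surface."""
    return c * tof_s
```

For example, under these assumptions a pulse-echo ToF of 100 microseconds corresponds to about 17 mm, which is the close-proximity regime the abstract targets.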
Accurate and Fast Fischer-Tropsch Reaction Microkinetics using PINNs
Patel, Harshil, Panda, Aniruddha, Nikolaienko, Tymofii, Jaso, Stanislav, Lopez, Alejandro, Kalyanaraman, Kaushic
Microkinetics allows detailed modelling of chemical transformations occurring in many industrially relevant reactions. The traditional way of solving the microkinetics model for Fischer-Tropsch synthesis (FTS) becomes inefficient when it comes to more advanced real-time applications. In this work, we address these challenges by using physics-informed neural networks (PINNs) for modelling FTS microkinetics. We propose a computationally efficient and accurate method, enabling the ultra-fast solution of the existing microkinetics models in realistic process conditions. The proposed PINN model computes the fraction of vacant catalytic sites, a key quantity in FTS microkinetics, with a median relative error (MRE) of 0.03%, and the FTS product formation rates with an MRE of 0.1%. Compared to conventional equation solvers, the model achieves a speed-up of up to 1E+06 times when running on GPUs, thus being fast enough for multi-scale and multi-physics reactor modelling and enabling its applications in real-time process control and optimization.
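The abstract reports accuracy as a median relative error without spelling out the formula. A plausible reading, assumed here, is the median over test points of |predicted - reference| / |reference|. The helper below is a sketch of that metric under that assumption, with a hypothetical function name.

```python
from statistics import median


def median_relative_error(predicted, reference):
    """Median of |p - r| / |r| over paired values.

    Assumes every reference value is non-zero; a zero reference
    raises ZeroDivisionError.
    """
    return median(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
```

Under this definition, an MRE of 0.03% means that half of the test points have a relative error of at most 0.0003.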