 Materials


DDIM sampling for Generative AIBIM, a faster intelligent structural design framework

arXiv.org Artificial Intelligence

Generative AIBIM, a successful structural design pipeline, has proven its ability to intelligently generate high-quality, diverse, and creative shear wall designs that are tailored to specific physical conditions. However, the current module of Generative AIBIM that generates designs, known as the physics-based conditional diffusion model (PCDM), requires 1000 iterations for each generation because it relies on the denoising diffusion probabilistic model (DDPM) sampling process. This makes generation time-consuming and computationally demanding. To address this issue, this study introduces the denoising diffusion implicit model (DDIM), an accelerated generation method that replaces the DDPM sampling process in PCDM. Because the original DDIM was designed for DDPM and the optimization process of PCDM differs from that of DDPM, this paper designs "DDIM sampling for PCDM," which modifies the original DDIM formulations to adapt to the optimization process of PCDM. Experimental results demonstrate that DDIM sampling for PCDM can accelerate the generation process of the original PCDM by a factor of 100 while maintaining the same visual quality in the generated results. This study thus demonstrates the effectiveness of DDIM sampling for PCDM in expediting intelligent structural design. Furthermore, this paper reorganizes the contents of DDIM, focusing on its practical usage. This change is particularly meaningful for researchers who may not possess a strong background in machine learning theory but are interested in utilizing the tool effectively.
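As a rough illustration of where the speed-up described above comes from, the sketch below implements the standard deterministic DDIM update (eta = 0) on a 10-step subsequence of a 1000-step schedule. It is not the paper's "DDIM sampling for PCDM": the linear beta schedule, the function names, and the `predict_noise` callback are all assumptions made for this example, and the PCDM-specific modifications are not reproduced.

```python
import numpy as np


def make_alpha_bar(num_train_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_train_steps)
    return np.cumprod(1.0 - betas)


def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly spaced, descending subsequence of the training timesteps."""
    stride = num_train_steps // num_sample_steps
    return np.arange(num_train_steps - 1, -1, -stride)


def ddim_sample(predict_noise, x_T, alpha_bar, timesteps):
    """Deterministic DDIM sampling (eta = 0).

    predict_noise(x, t) is a hypothetical stand-in for a trained
    noise-prediction network.
    """
    x = x_T
    for i, t in enumerate(timesteps):
        a_t = alpha_bar[t]
        # After the last sampled timestep we jump to the clean sample (alpha_bar = 1).
        a_prev = alpha_bar[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = predict_noise(x, t)
        # Estimate the clean sample implied by the current noise prediction.
        x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        # Move to the previous sampled timestep without injecting fresh noise.
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x
```

With 10 sampled timesteps instead of 1000, the network is evaluated 100 times less often, which matches the factor quoted in the abstract; whether quality is preserved depends on the trained model and is not something this sketch can show.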


A Self-Supervised Robotic System for Autonomous Contact-Based Spatial Mapping of Semiconductor Properties

arXiv.org Artificial Intelligence

Integrating robotically driven contact-based material characterization techniques into self-driving laboratories can enhance measurement quality, reliability, and throughput. While deep learning models support robust autonomy, current methods lack reliable pixel-precision positioning and require extensive labeled data. To overcome these challenges, we propose an approach for building self-supervised autonomy into contact-based robotic systems that teaches the robot to follow domain expert measurement principles at high throughput. Firstly, we design a vision-based, self-supervised convolutional neural network (CNN) architecture that uses differentiable image priors to optimize domain-specific objectives, refining the pixel precision of predicted robot contact poses by 20.0% relative to existing approaches. Secondly, we design a reliable graph-based planner for generating distance-minimizing paths to accelerate the robot measurement throughput and decrease planning variance by 6x. We demonstrate the performance of this approach by autonomously driving a 4-degree-of-freedom robotic probe for 24 hours to characterize semiconductor photoconductivity at 3,025 uniquely predicted poses across a gradient of drop-casted perovskite film compositions, achieving throughputs over 125 measurements per hour. Spatially mapping photoconductivity onto each drop-casted film reveals compositional trends and regions of inhomogeneity, valuable for identifying manufacturing process defects. With this self-supervised CNN-driven robotic system, we enable high-precision and reliable automation of contact-based characterization techniques at high throughputs, thereby allowing the measurement of previously inaccessible yet important semiconductor properties for self-driving laboratories.


Industrial-scale Prediction of Cement Clinker Phases using Machine Learning

arXiv.org Artificial Intelligence

Cement production, exceeding 4.1 billion tonnes and contributing 2.4 tonnes of CO2 annually, faces critical challenges in quality control and process optimization. While traditional process models for cement manufacturing are confined to steady-state conditions with limited predictive capability for mineralogical phases, modern plants operate under dynamic conditions that demand real-time quality assessment. Here, exploiting a comprehensive two-year operational dataset from an industrial cement plant, we present a machine learning framework that accurately predicts clinker mineralogy from process data. Our model achieves unprecedented prediction accuracy for major clinker phases while requiring minimal input parameters, demonstrating robust performance under varying operating conditions. Through post-hoc explainable algorithms, we interpret the hierarchical relationships between clinker oxides and phase formation, providing insights into the functioning of an otherwise black-box model. This digital twin framework can potentially enable real-time optimization of cement production, thereby providing a route toward reducing material waste and ensuring quality while reducing the associated emissions under real plant conditions. Our approach represents a significant advancement in industrial process control, offering a scalable solution for sustainable cement manufacturing.


From Generalist to Specialist: A Survey of Large Language Models for Chemistry

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have significantly transformed our daily life and established a new paradigm in natural language processing (NLP). However, the predominant pretraining of LLMs on extensive web-based texts remains insufficient for advanced scientific discovery, particularly in chemistry. The scarcity of specialized chemistry data, coupled with the complexity of multi-modal data such as 2D graphs, 3D structures and spectra, presents distinct challenges. Although several studies have reviewed Pretrained Language Models (PLMs) in chemistry, there is a conspicuous absence of a systematic survey specifically focused on chemistry-oriented LLMs. In this paper, we outline methodologies for incorporating domain-specific chemistry knowledge and multi-modal information into LLMs. We also conceptualize chemistry LLMs as agents using chemistry tools and investigate their potential to accelerate scientific research. Additionally, we summarize the existing benchmarks for evaluating the chemistry ability of LLMs. Finally, we critically examine the current challenges and identify promising directions for future research. Through this comprehensive survey, we aim to assist researchers in staying at the forefront of developments in chemistry LLMs and to inspire innovative applications in the field.


5 coolest engineering innovations of 2024

Popular Science

To keep global temperatures from rising more than 1.5 degrees Celsius, we need to cut emissions in half by 2035--even as we will likely hit another record for burning fossil fuels this year. Still, the brilliant engineering demonstrated in this year's winning projects provides hope that we can rise to the challenge. A new kind of thermal battery will allow us to decarbonize the heat that powers the industrial processes behind everything from cement to chemicals. Newly inexpensive lasers are helping turn ore into pure iron for steelmaking using renewable electricity. Food challenges have generated different types of innovation: Instead of hauling agricultural waste to decompose in the dump, why not create a harvester-style robot that can process it into carbon-sequestering, soil-enriching biochar? To fight pests, a technique called mRNA interference allows bioengineers to create a precision poison for a particularly troublesome beetle.


Time Series Foundational Models: Their Role in Anomaly Detection and Prediction

arXiv.org Artificial Intelligence

Time series foundational models (TSFM) have gained prominence in time series forecasting, promising state-of-the-art performance across various applications. However, their application in anomaly detection and prediction remains underexplored, with growing concerns regarding their black-box nature, lack of interpretability and applicability. This paper critically evaluates the efficacy of TSFM in anomaly detection and prediction tasks. We systematically analyze TSFM across multiple datasets, including those characterized by the absence of discernible patterns, trends and seasonality. Our analysis shows that while TSFMs can be extended for anomaly detection and prediction, traditional statistical and deep learning models often match or outperform TSFM in these tasks. Additionally, TSFMs require high computational resources but fail to capture sequential dependencies effectively or improve performance in few-shot or zero-shot scenarios. The preprocessed datasets, code to reproduce the results and supplementary materials are available at https://github.com/smtmnfg/TSFM.


Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs

arXiv.org Artificial Intelligence

This paper proposes a Clustering, Labeling, then Augmenting framework that significantly enhances performance in Semi-Supervised Text Classification (SSTC) tasks, effectively addressing the challenge of vast datasets with limited labeled examples. Unlike traditional SSTC approaches that rely on a predefined small set of labeled data to generate pseudo-labels for the unlabeled data, this framework employs clustering to select representative "landmarks" for labeling. These landmarks subsequently act as intermediaries in an ensemble of augmentation techniques, including Retrieval-Augmented Generation (RAG), Large Language Model (LLM)-based rewriting, and synonym substitution, to generate synthetic labeled data without making pseudo-labels for the unlabeled data. Empirical results show that even in complex text document classification scenarios involving over 100 categories, our method achieves state-of-the-art accuracies of 95.41% on the Reuters dataset and 82.43% on the Web of Science dataset. Our approach significantly reduces the reliance on human labeling efforts and the associated expenses, while simultaneously ensuring high data quality and minimizing privacy risks. The fine-tuning results further show the efficiency of fine-tuning LLMs for text classification tasks, highlighting a robust solution for leveraging limited labeled data.
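The landmark-selection idea can be sketched as follows, under loudly stated assumptions: the abstract does not say which clustering algorithm or text representation is used, so this example assumes plain k-means over precomputed document embeddings and picks, for each cluster, the document closest to the centroid as the one to send for labeling. The function names are hypothetical.

```python
import numpy as np


def kmeans(X, k, num_iters=50, seed=0):
    """Plain Lloyd's k-means; returns the final centroids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Initialise centroids from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(num_iters):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid where it is
                centers[j] = members.mean(axis=0)
    return centers


def select_landmarks(X, k, seed=0):
    """Indices of the documents nearest to each centroid (assumed 'landmarks')."""
    X = np.asarray(X, dtype=float)
    centers = kmeans(X, k, seed=seed)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # argmin over points for each centroid; duplicates are merged.
    return sorted({int(i) for i in dists.argmin(axis=0)})
```

Only the returned landmark documents would need human labels; the augmentation steps named in the abstract (RAG, LLM-based rewriting, synonym substitution) would then operate on those labeled landmarks and are not shown here.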


Label-free SERS Discrimination of Proline from Hydroxylated Proline at Single-molecule Level Assisted by a Deep Learning Model

arXiv.org Artificial Intelligence

Discriminating the low-abundance hydroxylated proline from proline is crucial for monitoring diseases and evaluating therapeutic outcomes that require single-molecule sensors. While the plasmonic nanopore sensor can detect the hydroxylation with single-molecule sensitivity by surface-enhanced Raman spectroscopy (SERS), it suffers from intrinsic fluctuations of single-molecule signals as well as strong interference from citrates. Here, we used the occurrence frequency histogram of the single-molecule SERS peaks to extract overall dataset spectral features, overcome the signal fluctuations and investigate the citrate-replaced plasmonic nanopore sensors for clean and distinguishable signals of proline and hydroxylated proline. By ligand exchange of the citrates by analyte molecules, the representative peaks of citrates decreased with incubation time, proving occupation of the plasmonic hot spot by the analytes. As a result, the discrimination of the single-molecule SERS signals of proline and hydroxylated proline was possible with the convolutional neural network model with 96.6% accuracy.


DOFEN: Deep Oblivious Forest ENsemble

arXiv.org Machine Learning

Deep Neural Networks (DNNs) have revolutionized artificial intelligence, achieving impressive results on diverse data types, including images, videos, and texts. However, DNNs still lag behind Gradient Boosting Decision Trees (GBDT) on tabular data, a format extensively utilized across various domains. In this paper, we propose DOFEN, short for Deep Oblivious Forest ENsemble, a novel DNN architecture inspired by oblivious decision trees. DOFEN constructs relaxed oblivious decision trees (rODTs) by randomly combining conditions for each column and further enhances performance with a two-level rODT forest ensembling process. By employing this approach, DOFEN achieves state-of-the-art results among DNNs and further narrows the gap between DNNs and tree-based models on the well-recognized Tabular Benchmark (Grinsztajn et al., 2022), which includes 73 total datasets spanning a wide array of domains. The code of DOFEN is available at: https://github.com/Sinopac-Digital-Technology-Division/DOFEN.


Unveiling Secrets of Brain Function With Generative Modeling: Motion Perception in Primates & Cortical Network Organization in Mice

arXiv.org Artificial Intelligence

This dissertation comprises two main projects, addressing questions in neuroscience through applications of generative modeling. Project #1 (Chapter 4) explores how neurons encode features of the external world. I combine Helmholtz's "Perception as Unconscious Inference" -- paralleled by modern generative models like variational autoencoders (VAE) -- with the hierarchical structure of the visual cortex. This combination leads to the development of a hierarchical VAE model, which I test for its ability to mimic neurons from the primate visual cortex in response to motion stimuli. Results show that the hierarchical VAE perceives motion similarly to the primate brain. Additionally, the model identifies causal factors of retinal motion inputs, such as object- and self-motion, in a completely unsupervised manner. Collectively, these results suggest that hierarchical inference underlies the brain's understanding of the world, and hierarchical VAEs can effectively model this understanding. Project #2 (Chapter 5) investigates the spatiotemporal structure of spontaneous brain activity and its reflection of brain states like rest. Using simultaneous fMRI and wide-field Ca2+ imaging data, this project demonstrates that the mouse cortex can be decomposed into overlapping communities, with around half of the cortical regions belonging to multiple communities. Comparisons reveal similarities and differences between networks inferred from fMRI and Ca2+ signals. The introduction (Chapter 1) is divided similarly to this abstract: sections 1.1 to 1.8 provide background information about Project #1, and sections 1.9 to 1.13 are related to Project #2. Chapter 2 includes historical background, Chapter 3 provides the necessary mathematical background, and finally, Chapter 6 contains concluding remarks and future directions.