 Materials


Demystifying the Potential of ChatGPT-4 Vision for Construction Progress Monitoring

arXiv.org Artificial Intelligence

The integration of Large Vision-Language Models (LVLMs) such as OpenAI's GPT-4 Vision into various sectors has marked a significant evolution in the field of artificial intelligence, particularly in the analysis and interpretation of visual data. This paper explores the practical application of GPT-4 Vision in the construction industry, focusing on its capabilities in monitoring and tracking the progress of construction projects. Utilizing high-resolution aerial imagery of construction sites, the study examines how GPT-4 Vision performs detailed scene analysis and tracks developmental changes over time. The findings demonstrate that while GPT-4 Vision is proficient in identifying construction stages, materials, and machinery, it faces challenges with precise object localization and segmentation. Despite these limitations, the potential for future advancements in this technology is considerable. This research not only highlights the current state and opportunities of using LVLMs in construction but also discusses future directions for enhancing the model's utility through domain-specific training and integration with other computer vision techniques and digital twins.


AI beats human experts at distinguishing American whiskey from Scotch

New Scientist

Artificial intelligence can tell Scotch whisky from American whiskey and identify its strongest constituent aromas more reliably than human experts – by using data rather than tasting the drinks. Andreas Grasskamp at the Fraunhofer Institute for Process Engineering and Packaging IVV in Germany and his colleagues trained an AI molecular odour prediction algorithm called OWSum on descriptions of different whiskies. Then, in a study involving 16 samples – nine types of Scotch whisky and seven types of American bourbon or whiskey – they tasked OWSum with telling drinks from the two nations apart based on keyword descriptions of their flavours, such as flowery, fruity, woody or smoky. Using these alone, the AI could tell which country a drink came from with almost 94 per cent accuracy. Because the complex aroma of these spirits is determined by the absence or presence of many chemical compounds, the researchers also fed the AI a reference dataset of 390 molecules commonly found in whiskies.


Interview with Andrews Ata Kangah: Localising illegal mining sites using machine learning and geospatial data

AIHub

Andrews Ata Kangah is a team leader and researcher working on democratizing AI and AI solutions for environmental problems. We spoke to him about his research, attending the AfriClimate AI workshop at the Deep Learning Indaba, and what inspired him to work in AI and on climate-related projects. My name is Andrews Ata Kangah. I also double as a researcher at Armtos, which is a non-profit. At Armtos, our current goal is to build a solution to solve the illegal mining problem that's going on in Ghana. The mining is destroying the lands that are within mining areas.


TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation

arXiv.org Artificial Intelligence

In this paper, we propose the Text-based Open Molecule Generation Benchmark (TOMG-Bench), the first benchmark to evaluate the open-domain molecule generation capability of LLMs. TOMG-Bench encompasses a dataset of three major tasks: molecule editing (MolEdit), molecule optimization (MolOpt), and customized molecule generation (MolCustom). Each task further contains three subtasks, with each subtask comprising 5,000 test samples. Given the inherent complexity of open molecule generation, we have also developed an automated evaluation system that helps measure both the quality and the accuracy of the generated molecules. Our comprehensive benchmarking of 25 LLMs reveals the current limitations and potential areas for improvement in text-guided molecule discovery. Furthermore, with the assistance of OpenMolIns, a specialized instruction tuning dataset proposed for solving challenges raised by TOMG-Bench, Llama3.1-8B could outperform all the open-source general LLMs, even surpassing GPT-3.5-turbo by 46.5% on TOMG-Bench. Our code and datasets are available through https://github.com/phenixace/TOMG-Bench.


Joint Models for Handling Non-Ignorable Missing Data using Bayesian Additive Regression Trees: Application to Leaf Photosynthetic Traits Data

arXiv.org Machine Learning

Dealing with missing data poses significant challenges in predictive analysis, often leading to biased conclusions when oversimplified assumptions about the missing data process are made. In cases where the data are missing not at random (MNAR), jointly modeling the data and missing data indicators is essential. Motivated by a real data application with partially missing multivariate outcomes related to leaf photosynthetic traits and several environmental covariates, we propose two methods under a selection model framework for handling data with missingness in the response variables suitable for recovering various missingness mechanisms. Both approaches use a multivariate extension of Bayesian additive regression trees (BART) to flexibly model the outcomes. The first approach simultaneously uses a probit regression model to jointly model the missingness. In scenarios where the relationship between the missingness and the data is more complex or non-linear, we propose a second approach using a probit BART model to characterize the missing data process, thereby employing two BART models simultaneously. Both models also effectively handle ignorable covariate missingness. The efficacy of both models compared to existing missing data approaches is demonstrated through extensive simulations, in both univariate and multivariate settings, and through the aforementioned application to the leaf photosynthetic trait data.
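The selection model framework mentioned above can be written out in a few lines. The following is a sketch in assumed notation (a single outcome $y_i$, covariates $x_i$, missingness indicator $m_i$, and coefficients $\alpha_0, \alpha_x, \alpha_y$ are illustrative symbols, not taken from the paper), showing the generic factorization with a BART outcome model and a probit missingness model:

```latex
% Selection model: joint density factorizes into outcome model and missingness model
p(y_i, m_i \mid x_i) = p(y_i \mid x_i)\, p(m_i \mid y_i, x_i)

% Outcome model: BART sum of J regression trees (T_j = tree structure, M_j = leaf values)
y_i = \sum_{j=1}^{J} g(x_i; T_j, M_j) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)

% Missingness model: probit regression on covariates and the (possibly unobserved) outcome
\Pr(m_i = 1 \mid y_i, x_i) = \Phi\!\left(\alpha_0 + x_i^{\top}\alpha_x + \alpha_y\, y_i\right)
```

In this notation the missingness is non-ignorable whenever $\alpha_y \neq 0$, because the probability that $y_i$ is missing then depends on the value of $y_i$ itself. The second approach described in the abstract would replace the linear predictor inside $\Phi$ with a second sum of trees.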


Time-Reversible Bridges of Data with Machine Learning

arXiv.org Machine Learning

The analysis of dynamical systems is a fundamental tool in the natural sciences and engineering. It is used to understand the evolution of systems as large as entire galaxies and as small as individual molecules. With predefined conditions on the evolution of dynamical systems, the underlying differential equations have to fulfill specific constraints in time and space. This class of problems is known as boundary value problems. This thesis presents novel approaches to learn time-reversible deterministic and stochastic dynamics constrained by initial and final conditions. The dynamics are inferred by machine learning algorithms from observed data, which is in contrast to the traditional approach of solving differential equations by numerical integration. The work in this thesis examines a set of problems of increasing difficulty, each of which is concerned with learning a different aspect of the dynamics. Initially, we consider learning deterministic dynamics from ground truth solutions which are constrained by deterministic boundary conditions. Secondly, we study a boundary value problem in discrete state spaces, where the forward dynamics follow a stochastic jump process and the boundary conditions are discrete probability distributions. In particular, the stochastic dynamics of a specific jump process, the Ehrenfest process, are considered and the reverse-time dynamics are inferred with machine learning. Finally, we investigate the problem of inferring the dynamics of a continuous-time stochastic process between two probability distributions without any reference information. Here, we propose a novel criterion to learn time-reversible dynamics of two stochastic processes to solve the Schrödinger Bridge Problem.
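For readers unfamiliar with the Ehrenfest process named above, here is a minimal simulation sketch. It assumes the textbook continuous-time version (each of `n_particles` particles hops between two urns at unit rate); the thesis may use a different parametrisation, and the function name `simulate_ehrenfest` is hypothetical. It covers only the forward dynamics, not the learned reverse-time dynamics.

```python
import numpy as np

def simulate_ehrenfest(n_particles, k0, t_end, rng):
    """Gillespie-style simulation; the state k is the number of particles in urn A."""
    t, k = 0.0, k0
    times, states = [t], [k]
    while True:
        # Each particle hops at unit rate, so the total jump rate is n_particles.
        t += rng.exponential(1.0 / n_particles)
        if t > t_end:
            break
        # The hopping particle is uniform over all particles:
        # it leaves urn A with probability k / n_particles, else it enters urn A.
        if rng.random() < k / n_particles:
            k -= 1
        else:
            k += 1
        times.append(t)
        states.append(k)
    return np.array(times), np.array(states)

rng = np.random.default_rng(0)
times, states = simulate_ehrenfest(n_particles=20, k0=20, t_end=5.0, rng=rng)
```

Started with all particles in urn A, the trajectory drifts toward the balanced state; the stationary distribution of this chain is Binomial(`n_particles`, 1/2).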


FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis

arXiv.org Artificial Intelligence

Machine learning algorithms are increasingly being applied to fault detection and diagnosis (FDD) in chemical processes. However, existing data-driven FDD platforms often lack interpretability for process operators and struggle to identify root causes of previously unseen faults. This paper presents FaultExplainer, an interactive tool designed to improve fault detection, diagnosis, and explanation in the Tennessee Eastman Process (TEP). FaultExplainer integrates real-time sensor data visualization, Principal Component Analysis (PCA)-based fault detection, and identification of top contributing variables within an interactive user interface powered by large language models (LLMs). We evaluate the LLMs' reasoning capabilities in two scenarios: one where historical root causes are provided, and one where they are not, to mimic the challenge of previously unseen faults. Experimental results using GPT-4o and o1-preview models demonstrate the system's strengths in generating plausible and actionable explanations, while also highlighting its limitations, including reliance on PCA-selected features and occasional hallucinations.
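The PCA-based detection and "top contributing variables" steps can be illustrated with a small sketch. This is not FaultExplainer's implementation: the abstract does not say which monitoring statistic it uses, so the example assumes the squared prediction error (SPE) with an empirical 99th-percentile threshold, and it runs on synthetic six-sensor data rather than TEP data. The names `fit_pca` and `spe_contributions` are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on fault-free data; return scaling statistics and loadings."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_variables, n_components)
    return mean, std, P

def spe_contributions(x, mean, std, P):
    """Per-variable squared residual left after projecting onto the PCA subspace."""
    z = (x - mean) / std
    residual = z - P @ (P.T @ z)
    return residual ** 2

# Synthetic "normal operation" data: six sensors driven by two latent factors.
rng = np.random.default_rng(0)
f = rng.normal(size=(500, 1))
g = rng.normal(size=(500, 1))
X_normal = np.hstack([f, f, f, g, g, g]) + 0.1 * rng.normal(size=(500, 6))

mean, std, P = fit_pca(X_normal, n_components=2)

# Assumed detection rule: flag samples whose SPE exceeds the 99th percentile
# of the SPE observed on the training data.
train_spe = np.array([spe_contributions(x, mean, std, P).sum() for x in X_normal])
threshold = np.percentile(train_spe, 99)

# Hypothetical fault: a large bias on sensor 4.
x_fault = X_normal[0].copy()
x_fault[4] += 5.0
contrib = spe_contributions(x_fault, mean, std, P)

is_fault = contrib.sum() > threshold
top_variables = np.argsort(contrib)[::-1][:3]  # indices of the largest contributors
print(is_fault, top_variables)
```

In a tool of this kind, the ranked variable indices would then be handed to the LLM as context for its explanation. Contribution plots can smear a fault across correlated sensors, which is one reason a dependence on PCA-selected features can be a limitation.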


A Survey on Large Language Model-based Agents for Statistics and Data Science

arXiv.org Artificial Intelligence

In recent years, data science agents powered by Large Language Models (LLMs), known as "data agents," have shown significant potential to transform the traditional data analysis paradigm. This survey provides an overview of the evolution, capabilities, and applications of LLM-based data agents, highlighting their role in simplifying complex data tasks and lowering the entry barrier for users without related expertise. We explore current trends in the design of LLM-based frameworks, detailing essential features such as planning, reasoning, reflection, multi-agent collaboration, user interface, knowledge integration, and system design, which enable agents to address data-centric problems with minimal human intervention. Furthermore, we analyze several case studies to demonstrate the practical applications of various data agents in real-world scenarios. Finally, we identify key challenges and propose future research directions to advance the development of data agents into intelligent statistical analysis software.


Machine Learning Co-pilot for Screening of Organic Molecular Additives for Perovskite Solar Cells

arXiv.org Artificial Intelligence

Machine learning (ML) has been extensively employed in planar perovskite photovoltaics to screen effective organic molecular additives, but it encounters predictive biases for novel materials due to small datasets and reliance on predefined descriptors. The present work thus proposes an effective approach, Co-Pilot for Perovskite Additive Screener (Co-PAS), an ML-driven framework designed to accelerate additive screening for perovskite solar cells (PSCs). Co-PAS overcomes predictive biases by integrating the Molecular Scaffold Classifier (MSC) for scaffold-based pre-screening and utilizing Junction Tree Variational Autoencoder (JTVAE) latent vectors to improve molecular structure representation, thereby enhancing the accuracy of power conversion efficiency (PCE) predictions. Leveraging Co-PAS, we integrate domain knowledge to screen an extensive dataset of 250,000 molecules from PubChem, prioritizing candidates based on predicted PCE values and key molecular properties such as donor number, dipole moment, and hydrogen bond acceptor count. This workflow leads to the identification of several promising passivating molecules, including the novel Boc-L-threonine N-hydroxysuccinimide ester (BTN), which, to our knowledge, has not been explored as an additive in PSCs and achieves a device PCE of 25.20%. Our results underscore the potential of Co-PAS in advancing additive discovery for high-performance PSCs.


Cutting Sequence Diffuser: Sim-to-Real Transferable Planning for Object Shaping by Grinding

arXiv.org Artificial Intelligence

Automating object shaping by grinding with a robot is a crucial industrial process that involves removing material with a rotating grinding belt. This process generates removal resistance depending on such process conditions as material type, removal volume, and robot grinding posture, all of which complicate the analytical modeling of shape transitions. Additionally, a data-driven approach based on real-world data is challenging due to high data collection costs and the irreversible nature of the process. This paper proposes a Cutting Sequence Diffuser (CSD) for object shaping by grinding. The CSD, which only requires simple simulation data for model learning, offers an efficient way to plan long-horizon action sequences transferable to the real world. Our method designs a smooth action space with constrained small removal volumes to suppress the complexity of the shape transitions caused by removal resistance, thus reducing the reality gap in simulations. Moreover, by using a diffusion model to generate long-horizon action sequences, our approach reduces the planning time and allows for grinding the target shape while adhering to the constraints of a small removal volume per step. Through evaluations in both simulation and real robot experiments, we confirmed that our CSD was effective for grinding different materials into various target shapes in a short time.