Goto

Collaborating Authors

 Materials


The Engineer's Dilemma: A Review of Establishing a Legal Framework for Integrating Machine Learning in Construction by Navigating Precedents and Industry Expectations

arXiv.org Artificial Intelligence

Despite the widespread interest in machine learning (ML), the engineering industry has not yet fully adopted ML-based methods, which has left engineers and stakeholders uncertain about the legal and regulatory frameworks that govern their decisions. This gap remains unaddressed as an engineer's decision-making process, typically governed by professional ethics and practical guidelines, now intersects with complex algorithmic outputs. To bridge this gap, this paper explores how engineers can navigate legal principles and legislative justifications that support and/or contest the deployment of ML technologies. Drawing on recent precedents and experiences gained from other fields, this paper argues that analogical reasoning can provide a basis for embedding ML within existing engineering codes while maintaining professional accountability and meeting safety requirements. In exploring these issues, the discussion focuses on established liability doctrines, such as negligence and product liability, and highlights how courts have evaluated the use of predictive models. We further analyze how legislative bodies and standard-setting organizations can furnish explicit guidance equivalent to prior endorsements of emergent technologies. This exploration stresses the vitality of understanding the interplay between technical justifications and legal precedents for shaping an informed stance on ML's legitimacy in engineering practice. Finally, our analysis catalyzes a legal framework for integrating ML through which stakeholders can critically assess the responsibilities, liabilities, and benefits inherent in ML-driven engineering solutions.


Assessing the Chemical Intelligence of Large Language Models

arXiv.org Artificial Intelligence

Large Language Models are versatile, general-purpose tools with a wide range of applications. Recently, the advent of "reasoning models" has led to substantial improvements in their abilities in advanced problem-solving domains such as mathematics and software engineering. In this work, we assessed the ability of reasoning models to perform chemistry tasks directly, without any assistance from external tools. We created a novel benchmark, called ChemIQ, consisting of 816 questions assessing core concepts in organic chemistry, focused on molecular comprehension and chemical reasoning. Unlike previous benchmarks, which primarily use multiple choice formats, our approach requires models to construct short-answer responses, more closely reflecting real-world applications. The reasoning models, OpenAI's o3-mini, Google's Gemini Pro 2.5, and DeepSeek R1, answered 50%-57% of questions correctly in the highest reasoning modes, with higher reasoning levels significantly increasing performance on all tasks. These models substantially outperformed the non-reasoning models which achieved only 3%-7% accuracy. We found that Large Language Models can now convert SMILES strings to IUPAC names, a task earlier models were unable to perform. Additionally, we show that the latest reasoning models can elucidate structures from 1D and 2D 1H and 13C NMR data, with Gemini Pro 2.5 correctly generating SMILES strings for around 90% of molecules containing up to 10 heavy atoms, and in one case solving a structure comprising 25 heavy atoms. For each task, we found evidence that the reasoning process mirrors that of a human chemist. Our results demonstrate that the latest reasoning models can, in some cases, perform advanced chemical reasoning.


Red Teaming Large Language Models for Healthcare

arXiv.org Artificial Intelligence

We present the design process and findings of the pre-conference workshop at the Machine Learning for Healthcare Conference (2024) entitled Red Teaming Large Language Models for Healthcare, which took place on August 15, 2024. Conference participants, comprising a mix of computational and clinical expertise, attempted to discover vulnerabilities -- realistic clinical prompts for which a large language model (LLM) outputs a response that could cause clinical harm. Red-teaming with clinicians enables the identification of LLM vulnerabilities that may not be recognised by LLM developers lacking clinical expertise. We report the vulnerabilities found, categorise them, and present the results of a replication study assessing the vulnerabilities across all LLMs provided.


Joint Optimization-based Targetless Extrinsic Calibration for Multiple LiDARs and GNSS-Aided INS of Ground Vehicles

arXiv.org Artificial Intelligence

Accurate extrinsic calibration between multiple LiDAR sensors and a GNSS-aided inertial navigation system (GINS) is essential for achieving reliable sensor fusion in intelligent mining environments. Such calibration enables vehicle-road collaboration by aligning perception data from vehicle-mounted sensors to a unified global reference frame. However, existing methods often depend on artificial targets, overlapping fields of view, or precise trajectory estimation, which are assumptions that may not hold in practice. Moreover, the planar motion of mining vehicles leads to observability issues that degrade calibration performance. This paper presents a targetless extrinsic calibration method that aligns multiple onboard LiDAR sensors to the GINS coordinate system without requiring overlapping sensor views or external targets. The proposed approach introduces an observation model based on the known installation height of the GINS unit to constrain unobservable calibration parameters under planar motion. A joint optimization framework is developed to refine both the extrinsic parameters and GINS trajectory by integrating multiple constraints derived from geometric correspondences and motion consistency. The proposed method is applicable to heterogeneous LiDAR configurations, including both mechanical and solid-state sensors. Extensive experiments on simulated and real-world datasets demonstrate the accuracy, robustness, and practical applicability of the approach under diverse sensor setups.


DAVID MARCUS: Musk's Nazi AI glitch a flaming canary in our national coal mine

FOX News

The CyberGuy Kurt Knutsson gives his take on Elon Musk's claims that Grok 3 outperforms every AI rival on'Fox & Friends.' On July 4th, eccentric billionaire and owner of X Elon Musk took to his social media platform to make an announcement about its Artificial Intelligence bot named Grok. "We have improved Grok significantly," Musk told the world. "You should notice a difference when you ask Grok questions." Just a few days later, Grok had to have features shut down after it started answering questions by going full-Nazi and espousing antisemitic conspiracy theories. All that was missing was digital goosestepping and armbands.


Extracting ORR Catalyst Information for Fuel Cell from Scientific Literature

arXiv.org Artificial Intelligence

The oxygen reduction reaction (ORR) catalyst plays a critical role in enhancing fuel cell efficiency, making it a key focus in material science research. However, extracting structured information about ORR catalysts from vast scientific literature remains a significant challenge due to the complexity and diversity of textual data. In this study, we propose a named entity recognition (NER) and relation extraction (RE) approach using DyGIE++ with multiple pre-trained BERT variants, including MatSciBERT and PubMedBERT, to extract ORR catalyst-related information from the scientific literature, which is compiled into a fuel cell corpus for materials informatics (FC-CoMIcs). A comprehensive dataset was constructed manually by identifying 12 critical entities and two relationship types between pairs of the entities. Our methodology involves data annotation, integration, and fine-tuning of transformer-based models to enhance information extraction accuracy. We assess the impact of different BERT variants on extraction performance and investigate the effects of annotation consistency. Experimental evaluations demonstrate that the fine-tuned PubMedBERT model achieves the highest NER F1-score of 82.19% and the MatSciBERT model attains the best RE F1-score of 66.10%. Furthermore, the comparison with human annotators highlights the reliability of fine-tuned models for ORR catalyst extraction, demonstrating their potential for scalable and automated literature analysis. The results indicate that domain-specific BERT models outperform general scientific models like BlueBERT for ORR catalyst extraction.


Robotic System with AI for Real Time Weed Detection, Canopy Aware Spraying, and Droplet Pattern Evaluation

arXiv.org Artificial Intelligence

Uniform and excessive herbicide application in modern agriculture contributes to increased input costs, environmental pollution, and the emergence of herbicide resistant weeds. To address these challenges, we developed a vision guided, AI-driven variable rate sprayer system capable of detecting weed presence, estimating canopy size, and dynamically adjusting nozzle activation in real time. The system integrates lightweight YOLO11n and YOLO11n-seg deep learning models, deployed on an NVIDIA Jetson Orin Nano for onboard inference, and uses an Arduino Uno-based relay interface to control solenoid actuated nozzles based on canopy segmentation results. Indoor trials were conducted using 15 potted Hibiscus rosa sinensis plants of varying canopy sizes to simulate a range of weed patch scenarios. The YOLO11n model achieved a mean average precision (mAP@50) of 0.98, with a precision of 0.99 and a recall close to 1.0. The YOLO11n-seg segmentation model achieved a mAP@50 of 0.48, precision of 0.55, and recall of 0.52. System performance was validated using water sensitive paper, which showed an average spray coverage of 24.22% in zones where canopy was present. An upward trend in mean spray coverage from 16.22% for small canopies to 21.46% and 21.65% for medium and large canopies, respectively, demonstrated the system's capability to adjust spray output based on canopy size in real time. These results highlight the potential of combining real time deep learning with low-cost embedded hardware for selective herbicide application. Future work will focus on expanding the detection capabilities to include three common weed species in South Dakota: water hemp (Amaranthus tuberculatus), kochia (Bassia scoparia), and foxtail (Setaria spp.), followed by further validation in both indoor and field trials within soybean and corn production systems.


GAF-Guard: An Agentic Framework for Risk Management and Governance in Large Language Models

arXiv.org Artificial Intelligence

As Large Language Models (LLMs) continue to be increasingly applied across various domains, their widespread adoption necessitates rigorous monitoring to prevent unintended negative consequences and ensure robustness. Furthermore, LLMs must be designed to align with human values, like preventing harmful content and ensuring responsible usage. The current automated systems and solutions for monitoring LLMs in production are primarily centered on LLM-specific concerns like hallucination etc, with little consideration given to the requirements of specific use-cases and user preferences. This paper introduces GAF-Guard, a novel agentic framework for LLM governance that places the user, the use-case, and the model itself at the center. The framework is designed to detect and monitor risks associated with the deployment of LLM based applications. The approach models autonomous agents that identify risks, activate risk detection tools, within specific use-cases and facilitate continuous monitoring and reporting to enhance AI safety, and user expectations. The code is available at https://github.com/IBM/risk-atlas-nexus-demos/tree/main/gaf-guard.


Explainable Hierarchical Deep Learning Neural Networks (Ex-HiDeNN)

arXiv.org Artificial Intelligence

Data-driven science and computation have advanced immensely to construct complex functional relationships using trainable parameters. However, efficiently discovering interpretable and accurate closed-form expressions from complex dataset remains a challenge. The article presents a novel approach called Explainable Hierarchical Deep Learning Neural Networks or Ex-HiDeNN that uses an accurate, frugal, fast, separable, and scalable neural architecture with symbolic regression to discover closed-form expressions from limited observation. The article presents the two-step Ex-HiDeNN algorithm with a separability checker embedded in it. The accuracy and efficiency of Ex-HiDeNN are tested on several benchmark problems, including discerning a dynamical system from data, and the outcomes are reported. Ex-HiDeNN generally shows outstanding approximation capability in these benchmarks, producing orders of magnitude smaller errors compared to reference data and traditional symbolic regression. Later, Ex-HiDeNN is applied to three engineering applications: a) discovering a closed-form fatigue equation, b) identification of hardness from micro-indentation test data, and c) discovering the expression for the yield surface with data. In every case, Ex-HiDeNN outperformed the reference methods used in the literature. The proposed method is built upon the foundation and published works of the authors on Hierarchical Deep Learning Neural Network (HiDeNN) and Convolutional HiDeNN. The article also provides a clear idea about the current limitations and future extensions of Ex-HiDeNN.


Combining Graph Neural Networks and Mixed Integer Linear Programming for Molecular Inference under the Two-Layered Model

arXiv.org Artificial Intelligence

Recently, a novel two-phase framework named mol-infer for inference of chemical compounds with prescribed abstract structures and desired property values has been proposed. The framework mol-infer is primarily based on using mixed integer linear programming (MILP) to simulate the computational process of machine learning methods and describe the necessary and sufficient conditions to ensure such a chemical graph exists. The existing approaches usually first convert the chemical compounds into handcrafted feature vectors to construct prediction functions, but because of the limit on the kinds of descriptors originated from the need for tractability in the MILP formulation, the learning performances on datasets of some properties are not good enough. A lack of good learning performance can greatly lower the quality of the inferred chemical graphs, and thus improving learning performance is of great importance. On the other hand, graph neural networks (GNN) offer a promising machine learning method to directly utilize the chemical graphs as the input, and many existing GNN-based approaches to the molecular property prediction problem have shown that they can enjoy better learning performances compared to the traditional approaches that are based on feature vectors. In this study, we develop a molecular inference framework based on mol-infer, namely mol-infer-GNN, that utilizes GNN as the learning method while keeping the great flexibility originated from the two-layered model on the abstract structure of the chemical graph to be inferred. We conducted computational experiments on the QM9 dataset to show that our proposed GNN model can obtain satisfying learning performances for some properties despite its simple structure, and can infer small chemical graphs comprising up to 20 non-hydrogen atoms within reasonable computational time.