GeoT: A Geometry-aware Transformer for Reliable Molecular Property Prediction and Chemically Interpretable Representation Learning

arXiv.org Artificial Intelligence

In recent years, molecular representation learning has emerged as a key area of focus in various chemical tasks. However, many existing models fail to fully consider the geometric information of molecular structures, resulting in less intuitive representations. Moreover, the widely used message-passing mechanism is limited in its ability to support the interpretation of experimental results from a chemical perspective. To address these challenges, we introduce a novel Transformer-based framework for molecular representation learning, named the Geometry-aware Transformer (GeoT). GeoT learns molecular graph structures through attention-based mechanisms specifically designed to offer reliable interpretability, as well as molecular property prediction. Consequently, GeoT can generate attention maps of interatomic relationships associated with training objectives. In addition, GeoT demonstrates comparable performance to MPNN-based models while achieving reduced computational complexity. Our comprehensive experiments, including an empirical simulation, reveal that GeoT effectively learns chemical insights into molecular structures, bridging the gap between artificial intelligence and molecular sciences.


Length Generalization in Arithmetic Transformers

arXiv.org Artificial Intelligence

We examine how transformers cope with two challenges: learning basic integer arithmetic, and generalizing to longer sequences than seen during training. We find that relative position embeddings enable length generalization for simple tasks, such as addition: models trained on $5$-digit numbers can perform $15$-digit sums. However, this method fails for multiplication, and we propose train set priming: adding a few ($10$ to $50$) long sequences to the training set. We show that priming allows models trained on $5$-digit $\times$ $3$-digit multiplications to generalize to $35\times 3$ examples. We also show that models can be primed for different generalization lengths, and that the priming sample size scales as the logarithm of the training set size. Finally, we discuss potential applications of priming beyond arithmetic.
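The train set priming idea described above can be illustrated with a minimal data-construction sketch. This is not the authors' code: the function names, the default sizes, and the choice to use operands of exactly 5 and 35 digits are assumptions made for illustration. It only shows how a handful of long examples might be mixed into a training set of short ones.

```python
import random


def make_example(rng, n_digits_a, n_digits_b):
    """Return (a, b, a * b) with operands of exactly the requested digit lengths."""
    a = rng.randrange(10 ** (n_digits_a - 1), 10 ** n_digits_a)
    b = rng.randrange(10 ** (n_digits_b - 1), 10 ** n_digits_b)
    return a, b, a * b


def build_primed_train_set(n_short, n_primers, short_len=5, long_len=35,
                           b_len=3, seed=0):
    """Mix many short multiplications with a few long "priming" examples.

    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    short = [make_example(rng, short_len, b_len) for _ in range(n_short)]
    primers = [make_example(rng, long_len, b_len) for _ in range(n_primers)]
    data = short + primers
    rng.shuffle(data)  # primers are scattered through the set, not appended
    return data


train = build_primed_train_set(n_short=5000, n_primers=50)
n_long = sum(1 for a, _, _ in train if len(str(a)) == 35)
print(len(train), n_long)
```

Here the primers make up about 1% of the data. The abstract reports that the number of primers needed grows only logarithmically with the training set size, so the fraction would shrink for larger sets.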


Exploring the Design Space of Extra-Linguistic Expression for Robots

arXiv.org Artificial Intelligence

In this paper, we explore the new design space of extra-linguistic cues inspired by graphical tropes used in graphic novels and animation to enhance the expressiveness of social robots. To achieve this, we identified a set of cues that can be used to generate expressions, including smoke/steam/fog, water droplets, and bubbles. We prototyped devices that can generate these fluid expressions for a robot and conducted design sessions where eight designers explored the use and utility of the cues in conveying the robot's internal states in various design scenarios. Our analysis of the 22 designs, the associated design justifications, and the interviews with designers revealed patterns in how each cue was used, how they were combined with nonverbal cues, and where the participants drew their inspiration from. These findings informed the design of an integrated module called EmoPack, which can be used to augment the expressive capabilities of any robot platform.


Toward a Millimeter-Scale Tendon-Driven Continuum Wrist with Integrated Gripper for Microsurgical Applications

arXiv.org Artificial Intelligence

Microsurgery is a particularly impactful yet challenging form of surgery. Robot-assisted microsurgery has the potential to improve surgical dexterity and enable precise operation on such small scales in ways not previously possible. Intraocular microsurgery is a particularly challenging domain, in part due to the lack of dexterity that is achievable with rigid instruments inserted through the eye. In this work, we present a new design for a millimeter-scale, dexterous wrist intended for microsurgery applications. The wrist is created via a state-of-the-art two-photon-polymerization (2PP) microfabrication technique, enabling the wrist to be constructed of flexible material with complex internal geometries and critical features at the micron scale. The wrist features a square cross section with a side length of 1.25 mm and a total length of 3.75 mm. The wrist has three tendons routed down its length which, when actuated by small-scale linear actuators, enable bending in any plane. We present an integrated gripper actuated by a fourth tendon routed down the center of the robot. We evaluate the wrist and gripper by characterizing the bend angle. We achieve more than 90 degrees of bending in both axes. We demonstrate out-of-plane bending as well as the robot's ability to grip while actuated. Our integrated gripper/tendon-driven continuum robot design and meso-scale assembly techniques have the potential to enable small-scale wrists with more dexterity than has been previously demonstrated. Such a wrist could improve surgeon capabilities during teleoperation with the potential to improve patient outcomes in a variety of surgical applications, including intraocular surgery.


Conformal link prediction to control the error rate

arXiv.org Machine Learning

Most link prediction methods return estimates of the connection probability of missing edges in a graph. Such output can be used to rank the missing edges, from most to least likely to be a true edge, but it does not directly provide a classification into true and non-existent. In this work, we consider the problem of identifying a set of true edges with a control of the false discovery rate (FDR). We propose a novel method based on high-level ideas from the literature on conformal inference. The graph structure induces intricate dependence in the data, which we carefully take into account, as this makes the setup different from the usual setup in conformal inference, where exchangeability is assumed. The FDR control is empirically demonstrated for both simulated and real data.
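For orientation, the standard recipe that conformal FDR control builds on can be sketched as follows: compute conformal p-values from calibration scores of known non-edges, then apply the Benjamini-Hochberg procedure. This is a generic sketch, not the method proposed in the paper. In particular it assumes exchangeable scores, which is exactly the assumption the abstract says does not hold for graph data. The function names and toy scores are hypothetical.

```python
def conformal_p_values(null_scores, test_scores):
    """Conformal p-value per test score; a higher score means "more likely an edge".

    null_scores are calibration scores of pairs known to be non-edges.
    """
    n = len(null_scores)
    return [
        (1 + sum(1 for c in null_scores if c >= s)) / (n + 1)
        for s in test_scores
    ]


def benjamini_hochberg(p_values, alpha):
    """Return the sorted indices rejected by the BH procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])


# Toy scores (made up for illustration).
null_scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.31, 0.35, 0.40]
test_scores = [0.95, 0.90, 0.33, 0.15, 0.50]

p_values = conformal_p_values(null_scores, test_scores)
selected = benjamini_hochberg(p_values, alpha=0.2)
print(p_values, selected)
```

The selected indices are the candidate pairs declared to be true edges. Unlike a plain ranking, the selection comes with a target bound on the expected fraction of false discoveries, provided the assumptions of the procedure hold.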


Automatic Truss Design with Reinforcement Learning

arXiv.org Artificial Intelligence

Truss layout design, namely finding a lightweight truss layout satisfying all the physical constraints, is a fundamental problem in the building industry. Generating the optimal layout is a challenging combinatorial optimization problem, which can be extremely expensive to solve by exhaustive search. Directly applying end-to-end reinforcement learning (RL) methods to truss layout design is also infeasible, since only a tiny portion of the entire layout space is valid under the physical constraints, leading to particularly sparse rewards for RL training. In this paper, we develop AutoTruss, a two-stage framework to efficiently generate both lightweight and valid truss layouts. AutoTruss first adopts Monte Carlo tree search to discover a diverse collection of valid layouts. Then RL is applied to iteratively refine the valid solutions. We conduct experiments and ablation studies in popular truss layout design test cases in both 2D and 3D settings. AutoTruss outperforms the best-reported layouts by 25.1% in the most challenging 3D test cases, resulting in the first effective deep-RL-based approach in the truss layout design literature.


Spectral Analysis of Marine Debris in Simulated and Observed Sentinel-2/MSI Images using Unsupervised Classification

arXiv.org Artificial Intelligence

Marine litter poses significant threats to marine and coastal environments, with its impacts ever-growing. Remote sensing provides an advantageous supplement to traditional mitigation techniques, such as local cleaning operations and trawl net surveys, due to its capabilities for extensive coverage and frequent observation. In this study, we used Radiative Transfer Model (RTM) simulated data and data from the Multispectral Instrument (MSI) of the Sentinel-2 mission in combination with machine learning algorithms. Our aim was to study the spectral behavior of marine plastic pollution and evaluate the applicability of RTMs within this research area. The results from the exploratory analysis and unsupervised classification using the KMeans algorithm indicate that the spectral behavior of pollutants is influenced by factors such as the type of polymer and pixel coverage percentage. The findings also reveal spectral characteristics and trends of association and differentiation among elements. The applied methodology is strongly dependent on the data, and if reapplied in new, more diverse, and detailed datasets, it can potentially generate even better results. These insights can guide future research in remote sensing applications for detecting marine plastic pollution.


Segmentation of Industrial Burner Flames: A Comparative Study from Traditional Image Processing to Machine and Deep Learning

arXiv.org Artificial Intelligence

In many industrial processes, such as power generation, chemical production, and waste management, accurately monitoring industrial burner flame characteristics is crucial for safe and efficient operation. A key step involves separating the flames from the background through binary segmentation. Decades of machine vision research have produced a wide range of possible solutions, from traditional image processing to traditional machine learning and modern deep learning methods. In this work, we present a comparative study of multiple segmentation approaches, namely Global Thresholding, Region Growing, Support Vector Machines, Random Forest, Multilayer Perceptron, U-Net, and DeepLabV3+, that are evaluated on a public benchmark dataset of industrial burner flames. We provide helpful insights and guidance for researchers and practitioners aiming to select an appropriate approach for the binary segmentation of industrial burner flames and beyond. For the highest accuracy, deep learning is the leading approach, while for fast and simple solutions, traditional image processing techniques remain a viable option.
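The simplest of the compared approaches, global thresholding, can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the image is a nested list of grayscale values, the threshold of 128 is arbitrary, and intersection over union (IoU) is used as the score even though the abstract does not say which metric the study reports.

```python
def global_threshold(image, threshold):
    """Label a pixel as flame (1) if its intensity reaches the threshold, else 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def iou(pred, target):
    """Intersection over union of two binary masks of equal shape."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t
            union += p | t
    return inter / union if union else 1.0


# Toy 3x4 grayscale image and ground-truth mask (made up for illustration).
image = [
    [10, 20, 200, 210],
    [15, 180, 220, 30],
    [12, 25, 190, 40],
]
target = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

mask = global_threshold(image, threshold=128)
score = iou(mask, target)
print(mask, score)
```

A single fixed threshold misses the dim flame pixel in the bottom-right corner. Failures of this kind are one plausible reason learned models reach higher accuracy, at a higher computational cost.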


Segmentation and Tracking of Vegetable Plants by Exploiting Vegetable Shape Feature for Precision Spray of Agricultural Robots

arXiv.org Artificial Intelligence

With the increasing deployment of agricultural robots, the traditional manual spraying of liquid fertilizer and pesticide is gradually being automated. For robotic precision spraying in vegetable farms, accurate plant phenotyping through instance segmentation and robust plant tracking are of great importance and a prerequisite for the subsequent spray action. To solve the challenging problem of associating vegetables with similar color and texture across consecutive images, this paper proposes a novel Multiple Object Tracking and Segmentation (MOTS) method for instance segmentation and tracking of multiple vegetable plants. In our approach, contour and blob features are extracted to describe the unique features of each individual vegetable and to associate the same vegetables across different images. By assigning a unique ID to each vegetable, the method ensures that the robot sprays each vegetable exactly once while traversing the farm rows. Comprehensive experiments, including ablation studies, show superior performance over two state-of-the-art (SOTA) MOTS methods. Compared to conventional MOTS methods, the proposed method is able to re-identify objects that have left the camera's field of view and later reappear, using the proposed data association strategy. This is important to ensure that each vegetable is sprayed only once when the robot travels back and forth. Although the method is tested on a lettuce farm, it can be applied to other similar vegetables such as broccoli and canola. Both the code and the dataset of this paper are publicly released for the benefit of the community: https://github.com/NanH5837/LettuceMOTS.


Learning Transductions and Alignments with RNN Seq2seq Models

arXiv.org Artificial Intelligence

The paper studies the capabilities of Recurrent-Neural-Network sequence-to-sequence (RNN seq2seq) models in learning four transduction tasks: identity, reversal, total reduplication, and quadratic copying. These transductions are traditionally well studied in terms of finite-state transducers and are considered to be of increasing complexity. We find that RNN seq2seq models are only able to approximate a mapping that fits the training or in-distribution data, instead of learning the underlying functions. Although attention makes learning more efficient and robust, it does not overcome the out-of-distribution generalization limitation. We establish a novel complexity hierarchy for learning the four tasks for attention-less RNN seq2seq models, which may be understood in terms of the complexity hierarchy of formal languages, instead of string transductions. RNN variants also play a role in the results. In particular, we show that Simple RNN seq2seq models cannot count the input length.
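The four transduction tasks named above can be written as plain string functions, which makes the target mappings concrete. The abstract does not define them, so the definitions below are assumptions: in particular, quadratic copying is taken to mean repeating the input once per input symbol, so the output length is the square of the input length.

```python
def identity(w):
    """w -> w"""
    return w


def reversal(w):
    """w -> w reversed"""
    return w[::-1]


def total_reduplication(w):
    """w -> ww"""
    return w + w


def quadratic_copying(w):
    """w -> w repeated len(w) times (assumed definition)."""
    return w * len(w)


for task in (identity, reversal, total_reduplication, quadratic_copying):
    print(task.__name__, task("abc"))
```

Under these definitions the output length grows as n, n, 2n, and n squared for an input of length n. Producing the last one requires tracking the input length, which relates to the abstract's remark that Simple RNN seq2seq models cannot count it.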