Collaborating Authors

 Swanson, Kyle


Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

arXiv.org Machine Learning

Selecting input features of top relevance has become a popular method for building self-explaining models. In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. Our approach employs optimal transport (OT) to find a minimal cost alignment between the inputs. However, directly applying OT often produces dense and therefore uninterpretable alignments. To overcome this limitation, we introduce novel constrained variants of the OT problem that result in highly sparse alignments with controllable sparsity. Our model is end-to-end differentiable using the Sinkhorn algorithm for OT and can be trained without any alignment annotations. We evaluate our model on the StackExchange, MultiNews, e-SNLI, and MultiRC datasets. Our model achieves very sparse rationale selections with high fidelity while preserving prediction accuracy compared to strong attention baseline models.
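The alignment machinery described above builds on entropy-regularized optimal transport computed with Sinkhorn iterations. As a rough illustration (not the paper's constrained, sparsity-controlled variants), a minimal Sinkhorn solver for a dense transport plan between two sets of text pieces might look like this; the cost matrix and marginals here are toy placeholders:

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.1, n_iters=500):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    C: cost matrix (n x m); a, b: marginal distributions over the
    two sets of items. Returns a transport plan P whose row and
    column sums match a and b. This is the generic algorithm, not
    the constrained variants introduced in the paper.
    """
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)               # rescale to match row marginals
        v = b / (K.T @ u)             # rescale to match column marginals
    return u[:, None] * K * v[None, :]

# Toy alignment between 3 "source" and 4 "target" text pieces.
rng = np.random.default_rng(0)
C = rng.random((3, 4))                # hypothetical pairwise costs
a = np.full(3, 1 / 3)
b = np.full(4, 1 / 4)
P = sinkhorn(C, a, b)
```

Note that, as the abstract observes, the unconstrained plan `P` is generally dense; the paper's contribution is modifying the transport problem so the resulting alignment is sparse.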


Deep Learning for Automated Classification and Characterization of Amorphous Materials

arXiv.org Machine Learning

The characterization of amorphous materials is especially challenging because their lack of long-range order makes it difficult to define structural metrics. In this work, we apply deep learning algorithms to accurately classify amorphous materials and characterize their structural features. Specifically, we show that convolutional neural networks and message passing neural networks can classify two-dimensional liquids and liquid-cooled glasses from molecular dynamics simulations with greater than 0.98 AUC, with no a priori assumptions about local particle relationships, even when the liquids and glasses are prepared at the same inherent structure energy. Furthermore, we demonstrate that message passing neural networks surpass convolutional neural networks in this context in both accuracy and interpretability. We extract a clear interpretation of how message passing neural networks evaluate liquid and glass structures by using a self-attention mechanism. Using this interpretation, we derive three novel structural metrics that accurately characterize glass formation. The methods presented here provide us with a procedure to identify important structural features in materials that could be missed by standard techniques and give us a unique insight into how these neural networks process data.

I. INTRODUCTION

Classifying material structures and predicting their properties are important tasks in materials science. The behavior of materials often depends strongly on their underlying structure, and understanding these structure-property relationships relies on accurately describing the structural features of a material. However, quantifying structure-property relationships and identifying structural features in complex materials are difficult tasks. A variety of standard techniques have been developed to analyze material structures.
Some of the most common techniques include the Steinhardt bond order parameters [1], Bond Angle Analysis (BAA) [2], and Common Neighbor Analysis (CNA) [3], which are useful for detecting order-disorder transitions and differentiating between crystal structures in ordered samples. As discussed in Reinhardt et al. [4], the Steinhardt bond order parameters can be stymied by thermal fluctuations, and BAA relies on a small set of crystalline reference structures that may not be present in amorphous samples. CNA is more flexible than BAA, but it cannot provide accurate information about particles that do not exhibit known symmetries, making analysis of irregular structures challenging.
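The message passing networks discussed above operate directly on a graph of particles and their neighbors, with no hand-crafted order parameters. A minimal sketch of one generic message passing update (the weight names and the ring topology below are illustrative, not the architecture used in the paper) could look like:

```python
import numpy as np

def message_passing_step(H, A, W_self, W_msg):
    """One generic message passing update on a particle graph.

    H: node features (n x d); A: adjacency matrix (n x n);
    W_self, W_msg: hypothetical weight matrices (d x d).
    Each node sums its neighbors' transformed features, adds its
    own transformed features, and applies a ReLU nonlinearity.
    """
    messages = A @ H @ W_msg              # aggregate neighbor messages
    return np.maximum(0.0, H @ W_self + messages)

# Tiny 4-particle example with a ring topology.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(1)
H = rng.standard_normal((4, 3))           # per-particle input features
W_self = rng.standard_normal((3, 3))
W_msg = rng.standard_normal((3, 3))
H_new = message_passing_step(H, A, W_self, W_msg)
```

Because the update only ever references a node and its neighbors, it makes no a priori assumptions about global structure, which is what makes this family of models suited to disordered samples.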


Are Learned Molecular Representations Ready For Prime Time?

arXiv.org Machine Learning

Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 15 proprietary industrial datasets spanning a wide variety of chemical endpoints. In addition, we introduce a graph convolutional model that consistently outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary datasets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.
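The graph convolutional model above passes messages over the graph structure of the molecule. One common edge-centered scheme, sketched here under assumed details (the exact update in the paper's model may differ), keeps a feature vector per directed bond and aggregates messages from incoming bonds, excluding the reverse bond:

```python
import numpy as np

def bond_message_step(h, h0, incoming, W):
    """One edge-centered message passing step over directed bonds.

    h: current directed-bond features (n_bonds x d); h0: initial
    bond features; incoming[e]: indices of directed bonds feeding
    into bond e, excluding e's own reverse bond; W: a hypothetical
    (d x d) weight matrix. A sketch only, not the exact model.
    """
    msgs = np.stack([h[idx].sum(axis=0) for idx in incoming])
    return np.maximum(0.0, h0 + msgs @ W)

# Toy 3-atom chain A-B-C with 4 directed bonds:
# 0: A->B, 1: B->A, 2: B->C, 3: C->B
incoming = [[], [3], [0], []]             # reverse bonds excluded
rng = np.random.default_rng(2)
h0 = rng.standard_normal((4, 5))          # initial bond features
W = rng.standard_normal((5, 5))
h1 = bond_message_step(h0, h0, incoming, W)
```

Excluding the reverse bond prevents a message from immediately echoing back to its source, a design choice often used to reduce redundant information flow in molecular graphs.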