
Molecular Machine Learning Using Euler Characteristic Transforms

Toscano-Duran, Victor, Rottach, Florian, Rieck, Bastian

arXiv.org Artificial Intelligence

The shape of a molecule determines its physicochemical and biological properties. However, it is often underrepresented in standard molecular representation learning approaches. Here, we propose using the Euler Characteristic Transform (ECT) as a geometrical-topological descriptor. Computed directly on a molecular graph derived from handcrafted atomic features, the ECT enables the extraction of multiscale structural features, offering a novel way to represent and encode molecular shape in the feature space. We assess the predictive performance of this representation across nine benchmark regression datasets, all centered around predicting the inhibition constant $K_i$. In addition, we compare our proposed ECT-based representation against traditional molecular representations and methods, such as molecular fingerprints/descriptors and graph neural networks (GNNs). Our results show that our ECT-based representation achieves competitive performance, ranking among the best-performing methods on several datasets. More importantly, its combination with traditional representations, particularly with the AVALON fingerprint, significantly enhances predictive performance, outperforming other methods on most datasets. These findings highlight the complementary value of multiscale topological information and its potential for being combined with established techniques. Our study suggests that hybrid approaches incorporating explicit shape information can lead to more informative and robust molecular representations, enhancing and opening new avenues in molecular machine learning tasks. To support reproducibility and foster open biomedical research, we provide open access to all experiments and code used in this work.
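The core computation behind the ECT can be illustrated in a few lines. The sketch below (plain NumPy, with a hypothetical toy graph rather than a real molecular graph with atomic features) builds Euler characteristic curves of an embedded graph over several directions and thresholds; it is a minimal illustration of the idea, not the authors' implementation.

```python
import numpy as np

# Toy graph: nodes with 2-D coordinates (standing in for atomic features),
# plus an edge list. A triangle, for illustration only.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
edges = [(0, 1), (1, 2), (0, 2)]

def ect(coords, edges, n_dirs=8, n_thresh=16):
    """Euler Characteristic Transform of an embedded graph.

    For each direction v and threshold t, count the vertices and edges
    whose height <x, v> lies below t; the Euler characteristic of that
    sublevel set is #vertices - #edges.
    """
    angles = np.linspace(0, 2 * np.pi, n_dirs, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    heights = coords @ dirs.T                      # (n_nodes, n_dirs)
    # An edge appears once both endpoints do, i.e. at the max endpoint height.
    edge_h = np.maximum(heights[[e[0] for e in edges]],
                        heights[[e[1] for e in edges]])
    thresholds = np.linspace(heights.min(), heights.max(), n_thresh)
    out = np.zeros((n_dirs, n_thresh), dtype=int)
    for j, t in enumerate(thresholds):
        out[:, j] = (heights <= t).sum(axis=0) - (edge_h <= t).sum(axis=0)
    return out

E = ect(coords, edges)
# At the largest threshold the whole triangle is present: chi = 3 - 3 = 0.
assert (E[:, -1] == 0).all()
```

Flattening the resulting direction-by-threshold matrix yields the kind of fixed-length feature vector that can be concatenated with fingerprints for downstream regression.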


Identifying Heterogeneity in Distributed Learning

Xiao, Zelin, Gu, Jia, Chen, Song Xi

arXiv.org Machine Learning

We study methods for identifying heterogeneous parameter components in distributed M-estimation with minimal data transmission. The first is based on a re-normalized Wald test, which is shown to be consistent as long as the number of distributed data blocks $K$ is of a smaller order than the minimum block sample size and the heterogeneity is dense. The second is an extreme contrast test (ECT) based on the difference between the largest and smallest component-wise estimated parameters across data blocks. By introducing a sample-splitting procedure, the ECT avoids the bias accumulation arising from the M-estimation procedures and remains consistent even when $K$ is much larger than the block sample size, provided the heterogeneity is sparse. The ECT procedure is easy to implement and communication-efficient. A combination of the Wald and extreme contrast tests is formulated to attain more robust power under varying levels of sparsity of the heterogeneity. We also conduct extensive numerical experiments to compare the family-wise error rate (FWER) and the power of the proposed methods, and present a case study demonstrating their implementation and validity.
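As a schematic illustration of the extreme contrast idea (not the paper's exact procedure): estimate the parameter within each block, take the spread between the largest and smallest estimates, and compare it to a cutoff. The block data, the simple mean parameter, and the crude Bonferroni-style normal calibration below are all illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist

def extreme_contrast_test(blocks, alpha=0.05):
    """Schematic range-based heterogeneity test across K data blocks.

    Compares the spread of per-block mean estimates against a crude
    Bonferroni-style cutoff for the range of K standard normals.
    """
    K = len(blocks)
    est = np.array([b.mean() for b in blocks])         # per-block estimates
    n = min(len(b) for b in blocks)
    se = np.array([b.std(ddof=1) for b in blocks]).mean() / np.sqrt(n)
    T = (est.max() - est.min()) / se                   # extreme contrast statistic
    cutoff = 2 * NormalDist().inv_cdf(1 - alpha / (2 * K))
    return T, cutoff

rng = np.random.default_rng(0)
K, n = 50, 200
blocks = [rng.normal(0.0, 1.0, n) for _ in range(K)]
blocks[0] = rng.normal(1.0, 1.0, n)    # one heterogeneous block, shifted by 1

T, cutoff = extreme_contrast_test(blocks)
assert T > cutoff                      # the shifted block is detected
```

Note that only the per-block estimates (one number per block), not the raw data, enter the statistic, which is what makes the approach communication-efficient.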


Causal Decomposition Analysis with Synergistic Interventions: A Triply-Robust Machine Learning Approach to Addressing Multiple Dimensions of Social Disparities

Park, Soojin, Kim, Su Yeon, Zheng, Xinyao, Lee, Chioun

arXiv.org Machine Learning

Educational disparities are rooted in and perpetuate social inequalities across multiple dimensions such as race, socioeconomic status, and geography. To reduce disparities, most intervention strategies focus on a single domain and frequently evaluate their effectiveness by using causal decomposition analysis. However, a growing body of research suggests that single-domain interventions may be insufficient for individuals marginalized on multiple fronts. While interventions across multiple domains are increasingly proposed, there is limited guidance on appropriate methods for evaluating their effectiveness. To address this gap, we develop an extended causal decomposition analysis that simultaneously targets multiple causally ordered intervening factors, allowing for the assessment of their synergistic effects. These scenarios often involve challenges related to model misspecification due to complex interactions among group categories, intervening factors, and their confounders with the outcome. To mitigate these challenges, we introduce a triply robust estimator that leverages machine learning techniques to address potential model misspecification. We apply our method to a cohort of students from the High School Longitudinal Study, focusing on math achievement disparities between Black, Hispanic, and White high schoolers. Specifically, we examine how two sequential interventions - equalizing the proportion of students who attend high-performing schools and equalizing enrollment in Algebra I by 9th grade across racial groups - may reduce these disparities.


Simulation-Based Sensitivity Analysis in Optimal Treatment Regimes and Causal Decomposition with Individualized Interventions

Park, Soojin, Kang, Suyeon, Lee, Chioun

arXiv.org Machine Learning

Causal decomposition analysis aims to assess the effect of modifying risk factors on reducing social disparities in outcomes. Recently, this analysis has incorporated individual characteristics when modifying risk factors by utilizing optimal treatment regimes (OTRs). Since the newly defined individualized effects rely on the no omitted confounding assumption, developing sensitivity analyses to account for potential omitted confounding is essential. Moreover, OTRs and individualized effects are primarily based on binary risk factors, and no formal approach currently exists to benchmark the strength of omitted confounding using observed covariates for binary risk factors. To address this gap, we extend a simulation-based sensitivity analysis that simulates unmeasured confounders, addressing two sources of bias emerging from deriving OTRs and estimating individualized effects. Additionally, we propose a formal bounding strategy that benchmarks the strength of omitted confounding for binary risk factors. Using the High School Longitudinal Study 2009 (HSLS:09), we demonstrate this sensitivity analysis and benchmarking method.


Topology meets Machine Learning: An Introduction using the Euler Characteristic Transform

Rieck, Bastian

arXiv.org Artificial Intelligence

Machine learning is shaping up to be the transformative technology of our times: Many of us have played with (and marveled at) models like ChatGPT, new breakthroughs in applications like healthcare research are announced on an almost daily basis, and new avenues for integrating these tools into scientific research are opening up, with some mathematicians already using large language models as proof assistants. This article aims to lift the veil and dispel some myths about machine learning; along the way, it will also show how machine learning itself can benefit from mathematical concepts. Indeed, from the outside, machine learning might look like a homogeneous entity, but in fact, the field is fractured and highly diverse. While the main thrust of the field arises from the undeniable engineering advances, with bigger and better models, there is also a strong community of applied mathematicians. Alongside the classical drivers of machine-learning architectures, i.e., linear algebra and statistics, topology recently started to provide novel insights into the foundations of machine learning: Point-set topology, harnessing concepts like neighborhoods, can be used to extend existing algorithms from graphs to cell complexes [4]. Algebraic topology, making use of effective invariants like homology, improves the results of models for volume reconstruction [13]. Finally, differential topology, providing tools to study smooth properties of data, results in efficient methods for analyzing embedded (simplicial) complexes [6]. These (and many more) methods have now found a home in the nascent field of topological deep learning [8]. Before diving into concrete examples, let us first take a step back and discuss machine learning as such.


Generative Topology for Shape Synthesis

Röell, Ernst, Rieck, Bastian

arXiv.org Artificial Intelligence

The Euler Characteristic Transform (ECT) is a powerful invariant for assessing geometrical and topological characteristics of a large variety of objects, including graphs and embedded simplicial complexes. Although the ECT is invertible in theory, no explicit algorithm for general data sets exists. In this paper, we address this lack and demonstrate that it is possible to learn the inversion, permitting us to develop a novel framework for shape generation tasks on point clouds. Our model exhibits high quality in reconstruction and generation tasks, affords efficient latent-space interpolation, and is orders of magnitude faster than existing methods. Understanding shapes requires understanding their geometrical and topological properties in tandem. Given the large variety of different representations of such data, ranging from point clouds over graphs to simplicial complexes, a general framework for handling such inputs is beneficial. The Euler Characteristic Transform (ECT) provides such a framework based on the idea of studying a shape from multiple directions--sampled from a sphere of appropriate dimensionality--and at multiple scales. In fact, the ECT is an injective map, serving as a unique characterisation of a shape (Ghrist et al., 2018; Turner et al., 2014). Somewhat surprisingly, this even holds when using a finite number of directions (Curry et al., 2022). Hence, while it is known that the ECT can be inverted, i.e. it is possible to reconstruct input data from an ECT, only algorithms for special cases such as planar graphs are currently known (Fasy et al., 2018).


Diss-l-ECT: Dissecting Graph Data with local Euler Characteristic Transforms

von Rohrscheidt, Julius, Rieck, Bastian

arXiv.org Artificial Intelligence

The Euler Characteristic Transform (ECT) is an efficiently computable geometrical-topological invariant that characterizes the global shape of data. In this paper, we introduce the Local Euler Characteristic Transform ($\ell$-ECT), a novel extension of the ECT particularly designed to enhance expressivity and interpretability in graph representation learning. Unlike traditional Graph Neural Networks (GNNs), which may lose critical local details through aggregation, the $\ell$-ECT provides a lossless representation of local neighborhoods. This approach addresses key limitations in GNNs by preserving nuanced local structures while maintaining global interpretability. Moreover, we construct a rotation-invariant metric based on $\ell$-ECTs for spatial alignment of data spaces. Our method outperforms standard GNNs on a variety of node classification tasks, particularly in graphs with high heterophily.


Consistency Models Made Easy

Geng, Zhengyang, Pokle, Ashwini, Luo, William, Lin, Justin, Kolter, J. Zico

arXiv.org Artificial Intelligence

Consistency models (CMs) are an emerging class of generative models that offer faster sampling than traditional diffusion models. CMs enforce that all points along a sampling trajectory are mapped to the same initial point. But this target leads to resource-intensive training: for example, as of 2024, training a SoTA CM on CIFAR-10 takes one week on 8 GPUs. In this work, we propose an alternative scheme for training CMs, vastly improving the efficiency of building such models. Specifically, by expressing CM trajectories via a particular differential equation, we argue that diffusion models can be viewed as a special case of CMs with a specific discretization. We can thus fine-tune a consistency model starting from a pre-trained diffusion model and progressively approximate the full consistency condition to stronger degrees over the training process. Our resulting method, which we term Easy Consistency Tuning (ECT), achieves vastly improved training times while indeed improving upon the quality of previous methods: for example, ECT achieves a 2-step FID of 2.73 on CIFAR-10 within 1 hour on a single A100 GPU, matching Consistency Distillation trained for hundreds of GPU hours. Owing to this computational efficiency, we investigate the scaling law of CMs under ECT, showing that they seem to obey classic power law scaling, hinting at their ability to improve efficiency and performance at larger scales. Code (https://github.com/locuslab/ect) is available.


Instruction-Guided Bullet Point Summarization of Long Financial Earnings Call Transcripts

Khatuya, Subhendu, Sinha, Koushiki, Ganguly, Niloy, Ghosh, Saptarshi, Goyal, Pawan

arXiv.org Artificial Intelligence

While automatic summarization techniques have made significant advancements, their primary focus has been on summarizing short news articles or documents with clear structural patterns, such as scientific articles or government reports. There has been little exploration into developing efficient methods for summarizing financial documents, which often contain complex facts and figures. Here, we study the problem of bullet point summarization of long Earnings Call Transcripts (ECTs) using the recently released ECTSum dataset. We leverage an unsupervised question-based extractive module followed by a parameter-efficient instruction-tuned abstractive module to solve this task. Our proposed model FLAN-FinBPS achieves a new state of the art, outperforming the strongest baseline with a 14.88% average ROUGE score gain, and is capable of generating factually consistent bullet point summaries that capture the important facts discussed in the ECTs.


Learning Evaluation Models from Large Language Models for Sequence Generation

Wang, Chenglong, Zhou, Hang, Chang, Kaiyan, Liu, Tongran, Zhang, Chunliang, Du, Quan, Xiao, Tong, Zhu, Jingbo

arXiv.org Artificial Intelligence

Large language models achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters, which makes applying their evaluation capability at scale computationally challenging. To overcome this challenge, we propose ECT, an evaluation capability transfer method, to transfer the evaluation capability from LLMs to relatively lightweight language models. Based on the proposed ECT, we learn various evaluation models from ChatGPT and employ them as reward models to improve sequence generation models via reinforcement learning and reranking approaches. Experimental results on machine translation, text style transfer, and summarization tasks demonstrate the effectiveness of ECT. Notably, applying the learned evaluation models to sequence generation models yields better generated sequences, as evaluated by commonly used metrics and ChatGPT.
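The transfer idea can be sketched generically as score distillation: fit a lightweight student model to reproduce teacher (LLM) quality scores, then use the student as a cheap evaluator. The sketch below uses synthetic sequence features and synthetic teacher scores, with ridge regression as the student; it is a toy stand-in for the idea, not the paper's reward-model training pipeline.

```python
import numpy as np

# Toy "evaluation capability transfer": fit a lightweight model to mimic
# teacher (LLM) quality scores. Features and scores here are synthetic.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 16))            # candidate-sequence features
w_true = rng.normal(size=16)
teacher_scores = X @ w_true + rng.normal(scale=0.1, size=500)

# Ridge regression as the lightweight "student" evaluator.
lam = 1e-2
w = np.linalg.solve(X.T @ X + lam * np.eye(16), X.T @ teacher_scores)

student_scores = X @ w
corr = np.corrcoef(student_scores, teacher_scores)[0, 1]
assert corr > 0.95   # student tracks the teacher closely on this toy data
```

In the paper's setting the student would be a small language model trained on ChatGPT-produced judgments, and its scores would then serve as the reward signal for reinforcement learning or reranking.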