Tensor clustering with algebraic constraints gives interpretable groups of crosstalk mechanisms in breast cancer

Seigal, Anna; Beguerisse-Díaz, Mariano; Schoeberl, Birgit; Niepel, Mario; Harrington, Heather A.

arXiv.org Machine Learning 

Multi-dimensional datasets are now prevalent across the sciences; their ubiquity and importance will only continue to grow [1-4]. Analyzing such data demands methods that both preserve and exploit its multidimensional structure. We introduce a versatile data clustering framework based on tensors (high-dimensional arrays) and algebra to analyze multidimensional datasets. One key feature of this method is that it can incorporate general, application-specific constraints on the composition of a cluster, and it is guaranteed to find optimal partitions. The flexibility of the method allows it to be used directly on a dataset (i.e., as a standalone clustering tool) or in combination with other clustering methods. We apply our method to an extensive set of time-course measurements of the activation levels of the mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase (PI3K) pathways, which are involved in cellular decisions and fates [10-13] and are known to malfunction in cancer [10-13, 16]. The key signaling proteins and subtype responses in breast cancer cells are known; however, the dysfunction varies among genetically diverse cell lines and is not well understood [1, 15, 16]. Our objective is to find groups of cell lines whose signal transduction networks have similar dynamics. High similarity suggests that the cell lines share pathway features that may be relevant to their responses to the ligands.
