We tested most of the 6000 genes in the yeast Saccharomyces cerevisiae for all possible pairwise genetic interactions, identifying nearly 1 million interactions, including 550,000 negative and 350,000 positive interactions, spanning 90% of all yeast genes. Essential genes were network hubs, displaying five times as many interactions as nonessential genes. The set of genetic interactions for a gene, its genetic interaction profile, provides a quantitative measure of function, and a global network based on genetic interaction profile similarity revealed a hierarchy of modules reflecting the functional architecture of a cell. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections associated with defects in cell cycle progression or cellular proteostasis. Importantly, the global network illustrates how coherent sets of negative or positive genetic interactions connect protein complexes and pathways to map a functional wiring diagram of the cell.
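The profile-similarity network described above is built by comparing, for each pair of genes, their vectors of quantitative interaction scores against all other genes. A minimal sketch of that comparison, using Pearson correlation as the similarity measure (the gene names and scores below are toy values, not data from the study):

```python
import numpy as np

def profile_similarity(profile_a, profile_b):
    """Pearson correlation between two genes' interaction-score profiles.

    A gene's profile is its vector of quantitative interaction scores
    against all other genes; highly similar profiles suggest that the
    two genes act in the same bioprocess.
    """
    a = np.asarray(profile_a, dtype=float)
    b = np.asarray(profile_b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])

# Toy profiles: gene1 and gene2 have correlated interaction scores,
# gene3 has an anti-correlated profile.
gene1 = [-0.4, 0.1, -0.3, 0.2, -0.5]
gene2 = [-0.5, 0.2, -0.2, 0.1, -0.4]
gene3 = [0.3, -0.4, 0.5, -0.1, 0.2]

sim_12 = profile_similarity(gene1, gene2)  # high positive similarity
sim_13 = profile_similarity(gene1, gene3)  # negative similarity
```

Thresholding such pairwise similarities yields the edges of a global similarity network from which functional modules can be read off.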
Measurement studies of online social networks (OSNs) show that not all social links are equal: the strength of each link is best characterized by the frequency of interactions between the linked users. To date, few studies have been able to examine detailed interaction data over time. In this paper, we first analyze the interaction dynamics in a large online social network. We find that users invite new friends to interact at a nearly constant rate, that they prefer to continue interacting with friends with whom they have a larger number of historical interactions, and that most social links drop in interaction frequency over time. We then use the insights from this analysis to derive a generative model of social interactions that captures the fundamental processes underlying user interactions.
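The three observed processes can be combined into a toy generative simulation: a constant-rate invitation step plus a rich-get-richer choice over historical interaction counts. This is an illustrative sketch of those processes, not the paper's actual model (the parameter names and values are assumptions):

```python
import random

def simulate_interactions(steps=1000, invite_prob=0.05, seed=0):
    """Toy generative model of interaction dynamics.

    At each step a user either invites a brand-new friend with a constant
    probability (constant invitation rate), or interacts with an existing
    friend chosen with probability proportional to their historical
    interaction count (preferential continuation).
    """
    rng = random.Random(seed)
    history = {}   # friend id -> number of past interactions
    next_id = 0
    for _ in range(steps):
        if not history or rng.random() < invite_prob:
            history[next_id] = 1   # invite a new friend
            next_id += 1
        else:
            # weighted choice: friends with more history attract more interactions
            friends = list(history)
            weights = [history[f] for f in friends]
            chosen = rng.choices(friends, weights=weights)[0]
            history[chosen] += 1
    return history

hist = simulate_interactions()
```

Because every step produces exactly one interaction, the total interaction count equals the number of steps, while the number of distinct friends grows only at the (much lower) invitation rate.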
Interpreting neural networks is a crucial and challenging task in machine learning. In this paper, we develop a novel framework for detecting statistical interactions captured by a feedforward multilayer neural network by directly interpreting its learned weights. Depending on the desired interactions, our method achieves interaction detection performance comparable to, and in some settings significantly better than, the state-of-the-art, without searching an exponential space of candidate interactions. We obtain this accuracy and efficiency by observing that interactions between input features are created by the non-additive effect of nonlinear activation functions, and that interacting paths are encoded in weight matrices. We demonstrate the performance of our method and the importance of the discovered interactions via experimental results on both synthetic datasets and real-world application datasets.
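The key observation, that interactions arise when features jointly feed a nonlinear hidden unit, suggests a simple weight-based score. The sketch below is a loose illustration of that idea, not the paper's exact algorithm: a feature pair is scored at each hidden unit by the weaker of its two incoming weights, scaled by that unit's downstream influence (the min-aggregation and the `w_out_influence` input are illustrative assumptions):

```python
import numpy as np

def pairwise_interaction_scores(W1, w_out_influence):
    """Score candidate feature interactions from learned weights.

    A hidden unit can only create a non-additive (interaction) effect among
    features that feed into it with substantial weight, so each pair (i, j)
    is scored by min(|W1[h, i]|, |W1[h, j]|), weighted by the unit's
    aggregate downstream influence and summed over hidden units h.

    W1: (hidden, features) first-layer weight matrix
    w_out_influence: (hidden,) nonnegative influence of each unit on the output
    """
    H, p = W1.shape
    scores = np.zeros((p, p))
    absW = np.abs(W1)
    for h in range(H):
        for i in range(p):
            for j in range(i + 1, p):
                scores[i, j] += w_out_influence[h] * min(absW[h, i], absW[h, j])
    return scores

# Toy network: unit 0 mixes features 0 and 1; unit 1 uses feature 2 alone.
W1 = np.array([[2.0, 1.5, 0.0],
               [0.0, 0.0, 3.0]])
influence = np.array([1.0, 1.0])
S = pairwise_interaction_scores(W1, influence)
```

On this toy network the pair (0, 1) receives a nonzero score while pairs involving the isolated feature 2 score zero, matching the intuition that only jointly-weighted features can interact.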
Protein interaction networks are a promising type of data for studying complex biological systems. However, despite the rich information embedded in these networks, they face important data quality challenges of noise and incompleteness that adversely affect the results obtained from their analysis. Here, we explore the use of the concept of common neighborhood similarity (CNS), which is a form of local structure in networks, to address these issues. Although several CNS measures have been proposed in the literature, an understanding of their relative efficacies for the analysis of interaction networks has been lacking. We follow the framework of graph transformation to convert the given interaction network into a transformed network corresponding to each of a variety of CNS measures. The effectiveness of each measure is then estimated by comparing the quality of protein function predictions obtained from its corresponding transformed network with those from the original network. Using a large set of S. cerevisiae interactions and a set of 136 GO terms, we find that several of the transformed networks produce more accurate predictions than those obtained from the original network. The $HC.cont$ measure proposed here performs especially well for this task. Further investigation reveals that the two major factors contributing to this improvement are the abilities of CNS measures, especially $HC.cont$, to prune out noisy edges and to introduce new links between functionally related proteins.
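The graph-transformation framework re-weights each edge of the interaction network by a CNS measure computed on the endpoints' neighborhoods. A minimal sketch using Jaccard similarity, one of the simplest CNS measures, as a stand-in for the measures the paper evaluates (it is not the $HC.cont$ measure itself):

```python
from collections import defaultdict

def jaccard_transform(edges):
    """Transform an interaction network into a CNS-weighted network.

    The new weight of each edge (u, v) is the Jaccard similarity of the
    neighbor sets of u and v: edges between proteins that share many
    interaction partners are strengthened, while edges between proteins
    with disjoint neighborhoods (often noise) get weight zero.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    transformed = {}
    for u, v in edges:
        inter = nbrs[u] & nbrs[v]
        union = nbrs[u] | nbrs[v]
        transformed[(u, v)] = len(inter) / len(union)
    return transformed

# Toy network: a triangle A-B-C plus a pendant node D attached to C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
w = jaccard_transform(edges)
```

Here the triangle edges keep positive weight because their endpoints share a neighbor, while the pendant edge (C, D) is down-weighted to zero; a full transformation would also add new weighted links between non-adjacent proteins with similar neighborhoods.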
Multi-omic data provides multiple views of the same patients. Integrative analysis of multi-omic data is crucial to elucidate the molecular underpinnings of disease etiology. However, multi-omic data suffers from the "big p, small N" problem (the number of features is large, but the number of samples is small), making it challenging to train a complicated machine learning model from the multi-omic data alone and have it generalize well. Here we propose a framework termed Multi-view Factorization AutoEncoder with network constraints to integrate multi-omic data with domain knowledge (biological interaction networks). Our framework employs deep representation learning to learn feature embeddings and patient embeddings simultaneously, enabling us to integrate feature interaction network and patient view similarity network constraints into the training objective. The whole framework is end-to-end differentiable. We applied our approach to the TCGA Pan-cancer dataset and achieved satisfactory results in predicting disease progression-free interval (PFI) and patient overall survival (OS) events. Code will be made publicly available.
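The "network constraint" in such objectives is commonly a graph Laplacian penalty that pulls the embeddings of interacting features toward each other. A minimal NumPy sketch of that term (the function name and toy data are assumptions; the paper's full model adds this kind of penalty to an autoencoder loss):

```python
import numpy as np

def laplacian_penalty(Z, A):
    """Network-constraint regularizer on feature embeddings.

    Z: (p, d) matrix of feature embeddings.
    A: (p, p) symmetric adjacency matrix of the prior interaction network.
    Returns trace(Z^T L Z) with L = D - A, which equals
    (1/2) * sum_{ij} A_ij * ||z_i - z_j||^2: interacting features with
    distant embeddings are penalized.
    """
    D = np.diag(A.sum(axis=1))
    L = D - A
    return float(np.trace(Z.T @ L @ Z))

# Toy check: 3 features, where features 0 and 1 interact.
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
Z_close = np.array([[1., 0.], [1., 0.], [5., 5.]])  # 0 and 1 identical
Z_far = np.array([[1., 0.], [-1., 0.], [5., 5.]])   # 0 and 1 far apart
pen_close = laplacian_penalty(Z_close, A)
pen_far = laplacian_penalty(Z_far, A)
```

Because the term is a smooth function of `Z`, it can be added directly to a reconstruction or survival loss and optimized end-to-end by gradient descent, which is what makes the whole framework differentiable.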