Tree++: Truncated Tree Based Graph Kernels Machine Learning

Graph-structured data arise ubiquitously in many application domains. A fundamental problem is to quantify their similarities. Graph kernels, which decompose graphs into substructures and compare these substructures, are often used for this purpose. However, most existing graph kernels lack scale-adaptivity, i.e., they cannot compare graphs at multiple levels of granularity. Many real-world graphs, such as molecules, exhibit structure at varying levels of granularity. To tackle this problem, we propose a new graph kernel called Tree++. At the heart of Tree++ is a graph kernel called the path-pattern graph kernel. The path-pattern graph kernel first builds a truncated BFS tree rooted at each vertex and then uses paths from the root to every vertex in the truncated BFS tree as features to represent graphs. The path-pattern graph kernel can only capture graph similarity at fine granularities. To capture graph similarity at coarse granularities, we incorporate a new concept called super path into it. The super path contains truncated BFS trees rooted at the vertices in a path. Our evaluation on a variety of real-world graphs demonstrates that Tree++ achieves the best classification accuracy compared with previous graph kernels.
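The path-pattern idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the adjacency-dict input format, the use of vertex labels as path symbols, and the sorted-neighbor tie-breaking are all assumptions made for illustration, and the super path extension is not covered.

```python
from collections import Counter, deque

def truncated_bfs_paths(adj, labels, root, depth):
    """Label sequences of root-to-vertex paths in a BFS tree cut off at `depth`.

    `adj` maps each vertex to its neighbors; `labels` maps vertices to labels.
    Neighbors are visited in sorted order, an assumed tie-breaking rule.
    """
    parent = {root: None}
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == depth:  # do not expand beyond the truncation depth
            continue
        for v in sorted(adj[u]):
            if v not in parent:
                parent[v] = u
                dist[v] = dist[u] + 1
                queue.append(v)
    paths = []
    for v in parent:  # every vertex reached by the truncated BFS
        seq = []
        node = v
        while node is not None:  # walk back up to the root
            seq.append(labels[node])
            node = parent[node]
        paths.append(tuple(reversed(seq)))
    return paths

def path_features(adj, labels, depth):
    """Count path patterns over the truncated BFS trees of all vertices."""
    feats = Counter()
    for root in adj:
        feats.update(truncated_bfs_paths(adj, labels, root, depth))
    return feats

def path_kernel(f1, f2):
    """Inner product of two path-pattern count vectors."""
    return sum(count * f2[pattern] for pattern, count in f1.items())
```

Two graphs are compared by building `path_features` for each with the same depth and passing the two counters to `path_kernel`. Larger depths yield longer paths and therefore more specific features.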

Efficient Structure-preserving Support Tensor Train Machine Machine Learning

Exploiting the multi-relational tensor structure of a high-dimensional feature space can improve the performance of machine learning algorithms more efficiently. However, one encounters the curse of dimensionality, and working with vectorized data fails to preserve the data structure. To handle the nonlinear relationships in tensor data more economically, we propose the Tensor Train Multi-way Multi-level Kernel (TT-MMK). This technique combines kernel filtering of the initial input data (Kernelized Tensor Train (KTT)), stable reparametrization of the KTT in the Canonical Polyadic (CP) format, and the Dual Structure-preserving Support Vector Machine (SVM) Kernel for revealing nonlinear relationships. We demonstrate numerically that the TT-MMK method is more reliable computationally, is less sensitive to tuning parameters, and gives higher prediction accuracy in SVM classification than similar tensorised SVM methods.

Can x2vec Save Lives? Integrating Graph and Language Embeddings for Automatic Mental Health Classification Artificial Intelligence

Graph and language embedding models are becoming commonplace in large-scale analyses given their ability to represent complex sparse data densely in low-dimensional space. Integrating these models' complementary relational and communicative data may be especially helpful when predicting rare events or classifying members of hidden populations, tasks that require huge and sparse datasets for generalizable analyses. For example, due to social stigma and comorbidities, mental health support groups often form in amorphous online groups. Predicting suicidality among individuals in these settings using standard network analyses is prohibitive due to resource limits (e.g., memory), and adding auxiliary data like text to such models exacerbates complexity- and sparsity-related issues. Here, I show how merging graph and language embedding models (metapath2vec and doc2vec) avoids these limits and extracts unsupervised clustering data without domain expertise or feature engineering. Graph and language distances to a suicide support group have little correlation (ρ < 0.23), implying the two models are not embedding redundant information. When used separately to predict suicidality among individuals, graph and language data generate relatively accurate results (69% and 76%, respectively); however, when integrated, the two data sources produce highly accurate predictions (90%, with 10% false positives and 12% false negatives). Visualizing graph embeddings annotated with predictions of potentially suicidal individuals shows the integrated model could classify such individuals even if they are positioned far from the support group. These results extend research on the importance of simultaneously analyzing behavior and language in massive networks and efforts to integrate embedding models for different kinds of data when predicting and classifying, particularly when they involve rare events.

ISLET: Fast and Optimal Low-rank Tensor Regression via Importance Sketching Machine Learning

In this paper, we develop a novel procedure for low-rank tensor regression, namely Importance Sketching Low-rank Estimation for Tensors (ISLET). The central idea behind ISLET is importance sketching, i.e., carefully designed sketches based on both the responses and low-dimensional structure of the parameter of interest. We show that the proposed method is sharply minimax optimal in terms of the mean-squared error under low-rank Tucker assumptions and under randomized Gaussian ensemble design. In addition, if a tensor is low-rank with group sparsity, our procedure also achieves minimax optimality. Further, we show through numerical studies that ISLET achieves mean-squared error performance comparable to or better than existing state-of-the-art methods whilst having substantial storage and run-time advantages, including capabilities for parallel and distributed computing. In particular, our procedure performs reliable estimation with tensors of dimension $p = O(10^8)$ and is one or two orders of magnitude faster than baseline methods.

Tensor Regression Using Low-rank and Sparse Tucker Decompositions Machine Learning

This paper studies a tensor-structured linear regression model with a scalar response variable and tensor-structured predictors, such that the regression parameters form a tensor of order $d$ (i.e., a $d$-fold multiway array) in $\mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$. This work focuses on the task of estimating the regression tensor from $m$ realizations of the response variable and the predictors where $m \ll n := \prod_i n_i$. Despite the ill-posedness of this estimation problem, it can still be solved if the parameter tensor belongs to the space of sparse, low Tucker-rank tensors. Accordingly, the estimation procedure is posed as a non-convex optimization program over the space of sparse, low Tucker-rank tensors, and a tensor variant of projected gradient descent is proposed to solve the resulting non-convex problem. In addition, mathematical guarantees are provided that establish the proposed method converges to the correct solution under the right set of conditions. Further, an upper bound on sample complexity of tensor parameter estimation for the model under consideration is characterized for the special case when the individual (scalar) predictors independently draw values from a sub-Gaussian distribution. The sample complexity bound is shown to have a polylogarithmic dependence on $\bar{n} := \max\{n_i : i \in \{1, 2, \ldots$
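To make the "tensor variant of projected gradient descent" more concrete, here is a rough NumPy sketch of one iteration. It rests on several assumptions not stated in the abstract: the loss is taken to be the mean squared residual, the low Tucker-rank projection is approximated by a truncated higher-order SVD (which is only quasi-optimal), and the sparsity part of the constraint is omitted entirely. All function names are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: rows are indexed by that mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tucker_truncate(T, ranks):
    """Truncated HOSVD: project each mode onto its top-r left singular vectors.

    This stands in for the projection step; it is an approximation, and it
    ignores the sparsity constraint considered in the paper.
    """
    out = T
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        P = U[:, :r] @ U[:, :r].T  # orthogonal projector for this mode
        out = np.moveaxis(np.tensordot(P, out, axes=(1, mode)), 0, mode)
    return out

def pgd_step(B, X, y, step, ranks):
    """One gradient step on the squared loss, followed by rank truncation.

    X has shape (m, n_1, ..., n_d), B has shape (n_1, ..., n_d), y has shape (m,).
    """
    resid = np.tensordot(X, B, axes=B.ndim) - y          # <X_j, B> - y_j for each j
    grad = np.tensordot(resid, X, axes=(0, 0)) / len(y)  # gradient w.r.t. B
    return tucker_truncate(B - step * grad, ranks)
```

Iterating `pgd_step` from a zero initial tensor gives a basic estimator under these assumptions. The step size and target ranks are tuning choices that the sketch does not address.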

Prediction of Reaction Time and Vigilance Variability from Spatiospectral Features of Resting-State EEG in a Long Sustained Attention Task Machine Learning

Resting-state brain networks represent the intrinsic state of the brain during the majority of cognitive and sensorimotor tasks. However, no study has yet presented concise predictors of task-induced vigilance variability from spectrospatial features of the pre-task, resting-state electroencephalograms (EEG). We asked ten healthy volunteers (6 females, 4 males) to participate in 105-minute fixed-sequence-varying-duration sessions of the sustained attention to response task (SART). A novel and adaptive vigilance scoring scheme was designed based on the performance and response time in consecutive trials, and demonstrated large inter-participant variability in terms of maintaining consistent tonic performance. Multiple linear regression using feature relevance analysis obtained significant predictors (p < 0.05) of the mean cumulative vigilance score (CVS), mean response time, and variabilities of these scores from the resting-state band-power ratios of EEG signals. Single-layer neural networks trained with cross-validation also captured different associations for the beta sub-bands. An increase in the gamma (28-48 Hz) and upper beta ratios from the left central and temporal regions predicted slower reactions and more inconsistent vigilance, as explained by the increased activation of the default mode network (DMN) and differences between the high- and low-attention networks at temporal regions. Higher ratios of parietal alpha from Brodmann areas 18, 19, and 37 during the eyes-open states predicted slower responses but more consistent CVS and reactions, associated with superior ability in vigilance maintenance. The proposed framework and these findings on the most stable and significant attention predictors from the intrinsic EEG power ratios can be used to model attention variations during the calibration sessions of BCI applications and vigilance monitoring systems.
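As a rough illustration of what a band-power ratio feature could look like, the following NumPy sketch computes the share of a single channel's spectral power that falls inside a given band. The abstract does not define how its ratios are formed, so several choices here are assumptions: the ratio is band power over broadband power, the broadband reference range is 1-48 Hz, and power is estimated with a plain FFT periodogram. Only the 28-48 Hz gamma range comes from the text.

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Sum of periodogram power for frequencies in [lo, hi) Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    mask = (freqs >= lo) & (freqs < hi)
    return power[mask].sum()

def band_power_ratio(signal, fs, band, reference=(1.0, 48.0)):
    """Power in `band` divided by power in the assumed broadband `reference`."""
    return band_power(signal, fs, *band) / band_power(signal, fs, *reference)

# Gamma band as given in the abstract; sampling rate and signal are made up.
GAMMA = (28.0, 48.0)
fs = 256
t = np.arange(2 * fs) / fs
demo = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 40 * t)
gamma_ratio = band_power_ratio(demo, fs, GAMMA)
```

Ratios computed this way per channel and per band could then serve as the columns of a design matrix for a linear regression against vigilance scores, though the exact feature construction used in the study is not specified here.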

Towards automated symptoms assessment in mental health Machine Learning

Activity and motion analysis has the potential to be used as a diagnostic tool for mental disorders. However, to date, little work has been performed in turning stratification measures of activity into useful symptom markers. The research presented in this thesis has focused on the identification of objective activity and behaviour metrics that could be useful for the analysis of mental health symptoms in the above-mentioned dimensions. Particular attention is given to the analysis of objective differences between disorders, as well as identification of clinical episodes of mania and depression in bipolar patients, and deterioration in borderline personality disorder patients. A principled framework is proposed for mHealth monitoring of psychiatric patients, based on measurable changes in behaviour, represented in physical activity time series, collected via mobile and wearable devices. The framework defines methods for direct computational analysis of symptoms in disorganisation and psychomotor dimensions, as well as measures for indirect assessment of mood, using patterns of physical activity, sleep and circadian rhythms. The approach of computational behaviour analysis proposed in this thesis has the potential for early identification of clinical deterioration in ambulatory patients, and allows for the specification of distinct and measurable behavioural phenotypes, thus enabling better understanding and treatment of mental disorders.

Big Data Analytics and AI in Mental Healthcare Artificial Intelligence

Mental health conditions cause a great deal of distress or impairment; depression alone will affect 11% of the world's population. The application of Artificial Intelligence (AI) and big-data technologies to mental health has great potential for personalizing treatment selection, prognosticating, monitoring for relapse, detecting and helping to prevent mental health conditions before they reach clinical-level symptomatology, and even delivering some treatments. However, unlike similar applications in other fields of medicine, there are several unique challenges in mental health applications which currently pose barriers to the implementation of these technologies. Specifically, there are very few widely used or validated biomarkers in mental health, leading to a heavy reliance on patient- and clinician-derived questionnaire data as well as interpretation of new signals such as digital phenotyping. In addition, diagnosis lacks the objective 'gold standard' available in other fields such as oncology, where clinicians and researchers can often rely on pathological analysis for confirmation of diagnosis. In this chapter we discuss the major opportunities, limitations and techniques used for improving mental healthcare through AI and big data. We explore the computational, clinical and ethical considerations and best practices, and lay out the major research directions for the near future.

Machine learning in resting-state fMRI analysis Machine Learning

Machine learning techniques have gained prominence for the analysis of resting-state functional Magnetic Resonance Imaging (rsfMRI) data. Here, we present an overview of various unsupervised and supervised machine learning applications to rsfMRI. We present a methodical taxonomy of machine learning methods in resting-state fMRI. We identify three major divisions of unsupervised learning methods with regard to their applications to rsfMRI, based on whether they discover principal modes of variation across space, time or population. Next, we survey the algorithms and rsfMRI feature representations that have driven the success of supervised subject-level predictions. The goal is to provide a high-level overview of the burgeoning field of rsfMRI from the perspective of machine learning applications. Keywords: Machine learning, resting-state, functional MRI, intrinsic networks, brain connectivity 1. Introduction Resting-state fMRI (rsfMRI) is a widely used neuroimaging tool that ...

Spatiotemporal transcriptomic divergence across human and macaque brain development


Improved understanding of how the developing human nervous system differs from that of closely related nonhuman primates is fundamental for teasing out human-specific aspects of behavior, cognition, and disorders. The shared and unique functional properties of the human nervous system are rooted in the complex transcriptional programs governing the development of distinct cell types, neural circuits, and regions. However, the precise molecular mechanisms underlying shared and unique features of the developing human nervous system have been only minimally characterized. We generated complementary tissue-level and single-cell transcriptomic datasets from up to 16 brain regions covering prenatal and postnatal development in humans and rhesus macaques (Macaca mulatta), a closely related species and the most commonly studied nonhuman primate. We created and applied TranscriptomeAge and TempShift algorithms to age-match developing specimens between the species and to more rigorously ...