AITopics | Perros, Ioakeim

Collaborating Authors

Perros, Ioakeim

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

TASTE: Temporal and Static Tensor Factorization for Phenotyping Electronic Health Records

Afshar, Ardavan, Perros, Ioakeim, Park, Haesun, deFilippi, Christopher, Yan, Xiaowei, Stewart, Walter, Ho, Joyce, Sun, Jimeng

arXiv.org Machine LearningNov-13-2019

Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., heart failure group and diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool for phenotyping. Most of the existing works assume either a static patient representation with aggregate data or only model temporal data. However, real EHR data contain both temporal (e.g., longitudinal clinical visits) and static information (e.g., patient demographics), which are difficult to model simultaneously. In this paper, we propose Temporal And Static TEnsor factorization (TASTE) that jointly models both static and temporal information to extract phenotypes. TASTE combines the PARAFAC2 model with non-negative matrix factorization to model a temporal and a static tensor. To fit the proposed model, we transform the original problem into simpler ones which are optimally solved in an alternating fashion. For each of the sub-problems, our proposed mathematical reformulations lead to efficient sub-problem solvers. Comprehensive experiments on large EHR data from a heart failure (HF) study confirmed that TASTE is up to 14x faster than several baselines and the resulting phenotypes were confirmed to be clinically meaningful by a cardiologist. Using 80 phenotypes extracted by TASTE, a simple logistic regression can achieve the same level of area under the curve (AUC) for HF prediction compared to a deep learning model using recurrent neural networks (RNN) with 345 features.

deep learning, phenotype, vascular disease, (25 more...)

arXiv.org Machine Learning

1911.05843

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.55)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

COPA: Constrained PARAFAC2 for Sparse & Large Datasets

Afshar, Ardavan, Perros, Ioakeim, Papalexakis, Evangelos E., Searles, Elizabeth, Ho, Joyce, Sun, Jimeng

arXiv.org Machine LearningMar-12-2018

PARAFAC2 has demonstrated success in modeling irregular tensors, where the tensor dimensions vary across one of the modes. An example scenario is jointly modeling treatments across a set of patients with varying number of medical encounters, where the alignment of events in time bears no clinical meaning, and it may also be impossible to align them due to their varying length. Despite recent improvements on scaling up unconstrained PARAFAC2, its model factors are usually dense and sensitive to noise which limits their interpretability. As a result, the following open challenges remain: a) various modeling constraints, such as temporal smoothness, sparsity and non-negativity, are needed to be imposed for interpretable temporal modeling and b) a scalable approach is required to support those constraints efficiently for large datasets. To tackle these challenges, we propose a COnstrained PARAFAC2 (COPA) method, which carefully incorporates optimization constraints such as temporal smoothness, sparsity, and non-negativity in the resulting factors. To efficiently support all those constraints, COPA adopts a hybrid optimization framework using alternating optimization and alternating direction method of multiplier (AO-ADMM). As evaluated on large electronic health record (EHR) datasets with hundreds of thousands of patients, COPA achieves significant speedups (up to 36x faster) over prior PARAFAC2 approaches that only attempt to handle a subset of the constraints that COPA enables. Overall, our method outperforms all the baselines attempting to handle a subset of the constraints in terms of speed, while achieving the same level of accuracy.

constraint, oncology, optimization problem, (21 more...)

arXiv.org Machine Learning

1803.04572

Country: North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

SPALS: Fast Alternating Least Squares via Implicit Leverage Scores Sampling

Cheng, Dehua, Peng, Richard, Liu, Yan, Perros, Ioakeim

Neural Information Processing SystemsDec-31-2016

Tensor CANDECOMP/PARAFAC (CP) decomposition is a powerful but computationally challengingtool in modern data analytics. In this paper, we show ways of sampling intermediate steps of alternating minimization algorithms for computing low rank tensor CP decompositions, leading to the sparse alternating least squares (SPALS) method. Specifically, we sample the Khatri-Rao product, which arises as an intermediate object during the iterations of alternating least squares. This product captures the interactions between different tensor modes, and form the main computational bottleneck for solving many tensor related tasks. By exploiting the spectral structures of the matrix Khatri-Rao product, we provide efficient access to its statistical leverage scores. When applied to the tensor CP decomposition, our method leads to the first algorithm that runs in sublinear time per-iteration and approximates the output of deterministic alternating least squares algorithms.

algorithm, artificial intelligence, data mining, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Add feedback

Scalable Latent Tree Model and its Application to Health Analytics

Huang, Furong, N., Niranjan U., Perros, Ioakeim, Chen, Robert, Sun, Jimeng, Anandkumar, Anima

arXiv.org Machine LearningMar-16-2015

We present an integrated approach to structure and parameter estimation in latent tree graphical models, where some nodes are hidden. Our overall approach follows a "divide-and-conquer" strategy that learns models over small groups of variables and iteratively merges into a global solution. The structure learning involves combinatorial operations such as minimum spanning tree construction and local recursive grouping; the parameter learning is based on the method of moments and on tensor decompositions. Our method is guaranteed to correctly recover the unknown tree structure and the model parameters with low sample complexity for the class of linear multivariate latent tree models which includes discrete and Gaussian distributions, and Gaussian mixtures. Our bulk asynchronous parallel algorithm is implemented in parallel using the OpenMP framework and scales logarithmically with the number of variables and linearly with dimensionality of each variable. Our experiments confirm a high degree of efficiency and accuracy on large datasets of electronic health records. The proposed algorithm also generates intuitive and clinically meaningful disease hierarchies.

node, oncology, us government, (21 more...)

arXiv.org Machine Learning

1406.4566

Country: North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

Add feedback