AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Seigal, Anna, Beguerisse-Díaz, Mariano, Schoeberl, Birgit, Niepel, Mario, Harrington, Heather A.

Tensor clustering with algebraic constraints gives interpretable groups of crosstalk mechanisms in breast cancer

arXiv.org Machine Learning Apr-28-2017

Multi-dimensional datasets are now prevalent across the sciences; their ubiquity and importance will only continue to grow [1-4]. The analysis of such data demands methods that preserve multidimensional structures and exploit them. We introduce a versatile data clustering framework based on tensors (high dimensional arrays) and algebra to analyze multidimensional datasets. One key feature of this method is that it can incorporate general, application-specific constraints on the composition of a cluster, and is guaranteed to find optimal partitions. The flexibility of the method allows it to be used directly on a dataset (i.e., as a standalone clustering tool), or in combination with other clustering methods. We apply our method to an extensive set of timecourse measurements of the activation levels of the mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase (PI3K) pathways, which are involved in cellular decisions and fates [10-13] and are known to dysfunction in cancer [10-13, 16]. The key signaling proteins and subtype responses in breast cancer cells are known; however, among genetically diverse cell lines the dysfunction varies and is not well understood [1, 15, 16]. Our objective is to find groups of cell lines whose signal transduction networks have similar dynamics. A high similarity suggests that the cell lines share pathway features that can be relevant for the responses to the ligands.

artificial intelligence, cell line, machine learning, (16 more...)

1612.08116

Country: North America > United States > California (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)

#artificialintelligence Apr-27-2017, 16:07:43 GMT

How to use machine learning to identify "good" customers vs "bad" customers - BDO Canada - IT Solutions

Good, profitable customers rarely become unprofitable. It is more likely that they were unprofitable from the outset. Determining an approach to define customer value can be a complex decision. Traditionally, we use gross margin to identify good and bad customers. For example, if your overhead costs are 25% of gross revenue, a good customer is anyone with a gross margin over 25%.
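The margin rule in this example can be sketched as a small Python check. The helper names, the customer labels, and the revenue and cost figures are hypothetical illustrations, not taken from the article; only the 25% threshold follows its example.

```python
def gross_margin(revenue, cost_of_goods):
    """Return gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_goods) / revenue


def is_good_customer(revenue, cost_of_goods, overhead_rate=0.25):
    """A customer is 'good' when gross margin exceeds the overhead rate."""
    return gross_margin(revenue, cost_of_goods) > overhead_rate


# Hypothetical customers: (gross revenue, cost of goods sold)
customers = {"A": (1000.0, 600.0), "B": (1000.0, 800.0)}
labels = {name: is_good_customer(r, c) for name, (r, c) in customers.items()}
```

Customer A has a 40% margin and clears the threshold; customer B has a 20% margin and does not.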

artificial intelligence, customer, machine learning, (7 more...)

Country: North America > Canada (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.38)

Zahavy, Tom, Zrihem, Nir Ben, Mannor, Shie

Graying the black box: Understanding DQNs

arXiv.org Artificial Intelligence Apr-24-2017

In recent years there has been growing interest in using deep representations for reinforcement learning. In this paper, we present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. Moreover, we propose a new model, the Semi Aggregated Markov Decision Process (SAMDP), and an algorithm that learns it automatically. The SAMDP model allows us to identify spatio-temporal abstractions directly from features and may be used as a sub-goal detector in future work. Using our tools we reveal that the features learned by DQNs aggregate the state space in a hierarchical fashion, explaining their success. Moreover, we are able to understand and describe the policies learned by DQNs for three different Atari2600 games and suggest ways to interpret, debug and optimize deep neural networks in reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1602.02658

Country: Asia > Middle East (0.28)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Industry:

Leisure & Entertainment > Games (1.00)
Education (0.93)
Transportation (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

#artificialintelligence Apr-17-2017, 08:35:19 GMT

How Machines Make Sense of Big Data: an Introduction to Clustering Algorithms

While there's not necessarily a "correct" answer here, it's most likely you split the bugs into four clusters. That wasn't too bad, was it? You could probably do the same with twice as many bugs, right? If you had a bit of time to spare -- or a passion for entomology -- you could probably even do the same with a hundred bugs. For a machine though, grouping ten objects into however many meaningful clusters is no small task, thanks to a mind-bending branch of maths called combinatorics, which tells us that there are 115,975 different possible ways you could have grouped those ten insects together. Had there been twenty bugs, there would have been over fifty trillion possible ways of clustering them. With a hundred bugs, there'd be many times more solutions than there are particles in the known universe. In fact, there are more than four million billion googol solutions (what's a googol?).
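The counts quoted here are Bell numbers: the number of ways to partition a set of n labeled items into non-empty groups. A minimal sketch using the Bell triangle reproduces them; the function name `bell_number` is an illustrative choice, not something from the article.

```python
def bell_number(n):
    """Count the ways to partition n labeled items into non-empty groups."""
    row = [1]  # first row of the Bell triangle
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every later entry adds the entry to its left and the one above-left.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


ten_bugs = bell_number(10)     # 115975
twenty_bugs = bell_number(20)  # 51724158235372, over fifty trillion
```

Because Python integers have arbitrary precision, `bell_number(100)` can also be evaluated exactly to compare against the "four million billion googol" figure.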

data mining, machine learning, vertex, (20 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Communications > Social Media (0.94)

#artificialintelligence Apr-14-2017, 03:45:22 GMT

Machine Learning With Python - Hierarchical Clustering Advantages & Disadvantages

Enroll in the course for free at: https://bigdatauniversity.com/courses... Machine Learning can be an incredibly beneficial tool to uncover hidden insights and predict future trends. This free Machine Learning with Python course will give you all the tools you need to get started with supervised and unsupervised learning. This #MachineLearning with #Python course dives into the basics of machine learning using an approachable, and well-known, programming language. You'll learn about Supervised vs Unsupervised Learning, look into how Statistical Modeling relates to Machine Learning, and do a comparison of each. Look at real-life examples of Machine learning and how it affects society in ways you may not have guessed!

artificial intelligence, hierarchical clustering advantage & disadvantage, machine learning, (2 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.40)

Li, Le, Guedj, Benjamin, Loustau, Sébastien

A Quasi-Bayesian Perspective to Online Clustering

arXiv.org Machine Learning Apr-8-2017

When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls. We introduce a new and adaptive online clustering algorithm relying on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove that our approach is supported by minimax regret bounds. We also provide an RJMCMC-flavored implementation (called PACBO) for which we give a convergence guarantee. Finally, numerical experiments illustrate the potential of our procedure.

algorithm, artificial intelligence, machine learning, (18 more...)

1602.00522

Country: Europe (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Banijamali, Ershad, Ghodsi, Ali

Fast Spectral Clustering Using Autoencoders and Landmarks

arXiv.org Machine Learning Apr-7-2017

In this paper, we introduce an algorithm for performing spectral clustering efficiently. Spectral clustering is a powerful clustering algorithm that suffers from high computational complexity, due to eigendecomposition. In this work, we first build the adjacency matrix of the corresponding graph of the dataset. To build this matrix, we only consider a limited number of points, called landmarks, and compute the similarity of all data points with the landmarks. Then, we present a definition of the Laplacian matrix of the graph that enables us to perform eigendecomposition efficiently, using a deep autoencoder. The overall complexity of the algorithm for eigendecomposition is $O(np)$, where $n$ is the number of data points and $p$ is the number of landmarks. Finally, we evaluate the performance of the algorithm in different experiments.
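The landmark step can be illustrated with a short sketch that builds the n-by-p similarity matrix between all points and p landmarks. This is not the authors' implementation: the Gaussian kernel, the uniform random choice of landmarks, the `sigma` parameter, and the function name are all assumptions made for illustration, and the Laplacian and autoencoder stages are omitted.

```python
import math
import random


def landmark_similarity(points, num_landmarks, sigma=1.0, seed=0):
    """Return (landmarks, matrix) where matrix[i][j] is the similarity of
    point i to landmark j. Kernel and landmark selection are assumptions."""
    rng = random.Random(seed)
    landmarks = rng.sample(points, num_landmarks)  # assumed: uniform sampling
    matrix = []
    for x in points:
        row = []
        for lm in landmarks:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, lm))
            # Assumed Gaussian (RBF) similarity
            row.append(math.exp(-sq_dist / (2 * sigma ** 2)))
        matrix.append(row)
    return landmarks, matrix


# Hypothetical toy data: two loose groups in the plane
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2), (5.0, 4.9)]
landmarks, W = landmark_similarity(points, num_landmarks=2)
```

Only n times p similarities are computed and stored, rather than the n times n entries of a full adjacency matrix.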

artificial intelligence, machine learning, spectral, (19 more...)

1704.02345

Country: North America > Canada (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.36)

arXiv.org Machine Learning Apr-6-2017

DIMM-SC: A Dirichlet mixture model for clustering droplet-based single cell transcriptomic data

Sun, Zhe, Wang, Ting, Deng, Ke, Wang, Xiao-Feng, Lafyatis, Robert, Ding, Ying, Hu, Ming, Chen, Wei

Motivation: Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifiers (UMIs). Despite these technological advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. In particular, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. Methods: We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. An expectation-maximization algorithm is used for parameter inference. Results: We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods.

artificial intelligence, machine learning, scrna-seq data, (15 more...)

1704.02007

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Machine Learning Apr-6-2017

Massive Data Clustering in Moderate Dimensions from the Dual Spaces of Observation and Attribute Data Clouds

Murtagh, Fionn

Cluster analysis of very high dimensional data can benefit from the properties of such high dimensionality. Informally expressed, in this work, our focus is on the analogous situation when the dimensionality is moderate to small, relative to a massively sized set of observations. Mathematically expressed, these are the dual spaces of observations and attributes. The point cloud of observations is in attribute space, and the point cloud of attributes is in observation space. In this paper, we begin by summarizing various perspectives related to methodologies that are used in multivariate analytics. We draw on these to establish an efficient clustering processing pipeline, covering both partitioning and hierarchical clustering.

artificial intelligence, machine learning, projection, (19 more...)

1704.01871

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)