AITopics | Boullé, Marc

Plotting

Boullé, Marc

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Cats & Co: Categorical Time Series Coclustering

Gay, Dominique, Guigourès, Romain, Boullé, Marc, Clérot, Fabrice

arXiv.org Machine LearningMay-6-2015

We suggest a novel method of clustering and exploratory analysis of temporal event sequences data (also known as categorical time series) based on three-dimensional data grid models. A data set of temporal event sequences can be represented as a data set of three-dimensional points, each point is defined by three variables: a sequence identifier, a time value and an event value. Instantiating data grid models to the 3D-points turns the problem into 3D-coclustering. The sequences are partitioned into clusters, the time variable is discretized into intervals and the events are partitioned into clusters. The cross-product of the univariate partitions forms a multivariate partition of the representation space, i.e., a grid of cells and it also represents a nonparametric estimator of the joint distribution of the sequences, time and events dimensions. Thus, the sequences are grouped together because they have similar joint distribution of time and events, i.e., similar distribution of events along the time dimension. The best data grid is computed using a parameter-free Bayesian model selection approach. We also suggest several criteria for exploiting the resulting grid through agglomerative hierarchies, for interpreting the clusters of sequences and characterizing their components through insightful visualizations. Extensive experiments on both synthetic and real-world data sets demonstrate that data grid models are efficient, effective and discover meaningful underlying patterns of categorical time series data.

bayesian inference, health & medicine, sequence, (19 more...)

arXiv.org Machine Learning

1505.013

Country:

Europe (0.67)
North America > United States > Nevada (0.14)
North America > Canada > British Columbia (0.14)

Genre: Research Report (0.84)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Country-scale Exploratory Analysis of Call Detail Records through the Lens of Data Grid Models

Guigourès, Romain, Gay, Dominique, Boullé, Marc, Clérot, Fabrice, Rossi, Fabrice

arXiv.org Machine LearningMar-20-2015

Call Detail Records (CDRs) are data recorded by telecommunications companies, consisting of basic informations related to several dimensions of the calls made through the network: the source, destination, date and time of calls. CDRs data analysis has received much attention in the recent years since it might reveal valuable information about human behavior. It has shown high added value in many application domains like e.g., communities analysis or network planning. In this paper, we suggest a generic methodology for summarizing information contained in CDRs data. The method is based on a parameter-free estimation of the joint distribution of the variables that describe the calls. We also suggest several well-founded criteria that allows one to browse the summary at various granularities and to explore the summary by means of insightful visualizations. The method handles network graph data, temporal sequence data as well as user mobility data stemming from original CDRs data. We show the relevance of our methodology for various case studies on real-world CDRs data from Ivory Coast.

antenna, artificial intelligence, télécommunications, (16 more...)

arXiv.org Machine Learning

1503.0606

Country:

Africa > Côte d'Ivoire (0.92)
Europe (0.68)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.68)
Information Technology > Networks (0.48)
Telecommunications > Networks (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Communications > Mobile (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science (0.88)

Add feedback

Nonparametric Hierarchical Clustering of Functional Data

Boullé, Marc, Guigourès, Romain, Rossi, Fabrice

arXiv.org Machine LearningJul-2-2014

In this paper, we deal with the problem of curves clustering. We propose a nonparametric method which partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data-grid which is obtained using a Bayesian model selection approach while making no assumptions regarding the curves. Finally, a post-processing technique, aiming at reducing the number of clusters in order to improve the interpretability of the clustering, is proposed. It consists in optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly this measure is none other than the sum of the Kullback-Leibler divergences between clusters distributions before and after the merges. The practical interest of the approach for functional data exploratory analysis is presented and compared with an alternative approach on an artificial and a real world data set.

artificial intelligence, machine learning, power consumption, (15 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-319-02999-3_2

1407.0612

Country: North America > United States > New York (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

A Triclustering Approach for Time Evolving Graphs

Guigourès, Romain, Boullé, Marc, Rossi, Fabrice

arXiv.org Machine LearningJan-12-2013

This paper introduces a novel technique to track structures in time evolving graphs. The method is based on a parameter free approach for three-dimensional co-clustering of the source vertices, the target vertices and the time. All these features are simultaneously segmented in order to build time segments and clusters of vertices whose edge distributions are similar and evolve in the same way over the time segments. The main novelty of this approach lies in that the time segments are directly inferred from the evolution of the edge distribution between the vertices, thus not requiring the user to make an a priori discretization. Experiments conducted on a synthetic dataset illustrate the good behaviour of the technique, and a study of a real-life dataset shows the potential of the proposed approach for exploratory data analysis.

artificial intelligence, data mining, graph, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/ICDMW.2012.61

1301.2659

Country: Europe > France (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback