
Collaborating Authors

 Rossi, Fabrice


Lasso based feature selection for malaria risk exposure prediction

arXiv.org Machine Learning

In the life sciences, experts generally use empirical knowledge to recode variables, choose interactions and perform selection with classical approaches. The aim of this work is to design an automatic learning algorithm for variable selection, in order to determine whether experts can be helped in their decisions, or even replaced by the machine, and whether their knowledge and results can be improved. Under some conditions, the Lasso can recover the optimal subset of variables for estimation and prediction. In this paper, we propose a novel approach that automatically uses all available variables and all their interactions. A double cross-validation combined with the Lasso selects the best subset of variables, and a GLM fitted through a simple cross-validation then performs the predictions. The algorithm ensures the stability and the consistency of the estimators.
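As a rough illustration of the kind of pipeline described (a minimal sketch assuming scikit-learn and statsmodels; the function and variable names are made up for the example and this is not the authors' code):

# Sketch of a double cross-validation: the outer loop measures prediction
# quality, the inner cross-validation (LassoCV) picks the penalty and hence
# the selected variables; an unpenalised GLM is then refitted on the
# selected variables only.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold
import statsmodels.api as sm

def select_and_refit(X, y):
    outer = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = []
    for train, test in outer.split(X):
        # inner CV: choose the regularisation strength on the training fold
        lasso = LassoCV(cv=5).fit(X[train], y[train])
        selected = np.flatnonzero(lasso.coef_)   # variables kept by the Lasso
        # refit an unpenalised GLM on the selected variables only
        glm = sm.GLM(y[train], sm.add_constant(X[train][:, selected])).fit()
        pred = glm.predict(sm.add_constant(X[test][:, selected]))
        scores.append(np.mean(np.abs(pred - y[test])))
    return np.mean(scores)

The point of the two levels is that selection and penalty choice happen entirely inside each training fold, so the outer estimate of predictive performance does not leak information from the test folds.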


Co-Clustering Network-Constrained Trajectory Data

arXiv.org Machine Learning

Clustering moving-object trajectories has recently been gaining interest from both the data mining and machine learning communities. This problem, however, has mainly been studied in the setting where moving objects can move freely in Euclidean space. In this paper, we study the problem of clustering trajectories of vehicles whose movement is restricted by the underlying road network. We model the relations between these trajectories and road segments as a bipartite graph and cluster its vertices. We demonstrate our approach on synthetic data and show how it can be used to infer knowledge about the flow dynamics and the behavior of the drivers using the road network.
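One way to make the bipartite formulation concrete (a toy sketch assuming scikit-learn; spectral co-clustering is used here as a stand-in and is not necessarily the method developed in the paper):

# Rows are trajectories, columns are road segments, entries count traversals;
# co-clustering groups both sides of the bipartite graph at once.
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)
traj_segment = rng.poisson(0.3, size=(200, 50))      # toy trajectory x segment counts
traj_segment[traj_segment.sum(axis=1) == 0, 0] = 1   # avoid all-zero rows

model = SpectralCoclustering(n_clusters=4, random_state=0).fit(traj_segment)
trajectory_clusters = model.row_labels_     # cluster of each trajectory
segment_clusters = model.column_labels_     # cluster of each road segment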


A Study of the Spatio-Temporal Correlations in Mobile Calls Networks

arXiv.org Machine Learning

Over the last few years, the amount of data held by companies has increased significantly, which is why data analysis methods have to evolve to meet new demands. In this article, we present a practical analysis of a large database from a telecommunication operator. The problem is to segment a territory and characterize the resulting areas according to their inhabitants' behavior in terms of mobile telephony. The data are call detail records collected over five months in France. We propose a two-stage analysis. The first stage groups source antennas whose originating calls are similarly distributed over target antennas, and conversely groups target antennas with respect to source antennas. A geographic projection of the data is used to display the results on a map of France. The second stage discretizes time into periods between which we observe changes in the distributions of calls emerging from the clusters of source antennas. This enables an analysis of temporal changes in inhabitants' behavior in every area of the country.
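A minimal sketch of the first stage as described (assuming scikit-learn; the data shapes, the number of clusters and the use of K-means are illustrative choices, not the method of the article):

# Each source antenna is represented by the distribution of its outgoing
# calls over target antennas; antennas with similar distributions are grouped.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
calls = rng.poisson(2.0, size=(300, 300))              # toy source x target call counts
profiles = calls / calls.sum(axis=1, keepdims=True)    # outgoing-call distributions
source_groups = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(profiles)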


Sélection de variables par le GLM-Lasso pour la prédiction du risque palustre (Variable selection with the GLM-Lasso for malaria risk prediction)

arXiv.org Machine Learning

In this study, we propose an automatic learning method for variable selection based on the Lasso in an epidemiological context. One aim of this approach is to avoid the manual pretreatment of the collected data by experts in medicine and epidemiology, which consists in recoding some variables and choosing some interactions based on expertise. The proposed approach uses all available explanatory variables without pretreatment and automatically generates all interactions between them, which leads to a high-dimensional problem. We use the Lasso, one of the robust variable selection methods in high dimension. To avoid overfitting, a two-level cross-validation is used. Because the target variable is a count variable and the Lasso estimators are biased, the variables selected by the Lasso are debiased with a GLM and used to predict the distribution of the main vector of malaria, the Anopheles mosquito. Results show that only a few climatic and environmental variables are the main factors associated with malaria risk exposure.
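The two ingredients emphasised above can be sketched as follows (assuming scikit-learn and statsmodels; this is an illustration of the idea, not the authors' implementation):

# Generate all pairwise interactions without expert recoding, select with the
# Lasso, then debias the selected coefficients with an unpenalised Poisson GLM
# suitable for a count target (e.g. Anopheles counts).
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LassoCV
import statsmodels.api as sm

def interactions_lasso_poisson(X, y):
    # main effects plus all pairwise interactions
    Z = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False).fit_transform(X)
    selected = np.flatnonzero(LassoCV(cv=5).fit(Z, y).coef_)
    # debiasing step: unpenalised Poisson regression on the selected columns
    glm = sm.GLM(y, sm.add_constant(Z[:, selected]), family=sm.families.Poisson()).fit()
    return selected, glm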


Modelling time evolving interactions in networks through a non stationary extension of stochastic block models

arXiv.org Machine Learning

In this paper, we focus on the stochastic block model (SBM), a probabilistic tool describing interactions between the nodes of a network using latent clusters. The SBM assumes that the network has a stationary structure, in which connections of time-varying intensity are not taken into account. In other words, interactions between two groups are forced to have the same features during the whole observation time. To overcome this limitation, we propose a partition of the whole time horizon in which interactions are observed, and develop a non-stationary extension of the SBM that simultaneously clusters the nodes of a network and the fixed time intervals in which the interactions take place. The number of clusters (K for nodes, D for time intervals) as well as the class memberships are finally obtained through maximizing the complete-data integrated likelihood by means of a greedy search approach. After showing that the model works properly on simulated data, we focus on a real data set: the three-day ACM Hypertext conference held in Turin, June 29th - July 1st 2009. Proximity interactions between attendees during the first day are modelled and an interesting clustering of the daily hours is obtained, with times of social gathering (e.g. coffee breaks) recovered by the approach. Applications to large networks are limited by the computational complexity of the greedy search, which is dominated by the numbers $K_{max}$ and $D_{max}$ of clusters used in the initialization. Therefore, advanced clustering tools are considered to reduce the number of clusters expected in the data, making the greedy search applicable to large networks.
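As a schematic illustration of the non-stationarity involved (the Poisson form below is an assumption made for the sketch, with notation introduced here rather than taken from the paper): in a stationary SBM, the interactions between a node $i$ in cluster $k$ and a node $j$ in cluster $l$ are governed by a single parameter $\lambda_{kl}$ over the whole observation window, whereas in the non-stationary extension each time interval $I$ is assigned to a time class $c_I \in \{1,\dots,D\}$ and the count observed on $I$ follows $X_{ij}(I) \sim \mathcal{P}(\lambda_{k l c_I})$, so that the interaction intensity between two groups of nodes is allowed to change from one class of time intervals to another.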


Empirical risk minimization is consistent with the mean absolute percentage error

arXiv.org Machine Learning

We study in this paper the consequences of using the Mean Absolute Percentage Error (MAPE) as a measure of quality for regression models. We show that finding the best model under the MAPE is equivalent to doing weighted Mean Absolute Error (MAE) regression. We also show that, under some assumptions, universal consistency of Empirical Risk Minimization remains possible using the MAPE.
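For a single observation, the link between the two losses is immediate: $\ell_{\mathrm{MAPE}}(y,\hat y) = \frac{|y-\hat y|}{|y|} = \frac{1}{|y|}\,\ell_{\mathrm{MAE}}(y,\hat y)$ for $y \neq 0$, so minimizing the empirical MAPE over a sample $(x_i, y_i)_{i=1}^n$ amounts to minimizing the empirical MAE in which observation $i$ receives the weight $1/|y_i|$.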


Graphs in machine learning: an introduction

arXiv.org Machine Learning

Graphs are commonly used to characterise interactions between objects of interest. Because they rely on a straightforward formalism, they are used in many scientific fields, from computer science to historical sciences. In this paper, we give an introduction to some graph-based learning methods, covering both unsupervised and supervised settings. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes; both focus on extracting knowledge from graph topologies. While most existing techniques are only applicable to static graphs, where edges do not evolve through time, recent developments have shown that they can be extended to deal with evolving networks. In a supervised context, one generally aims at inferring labels or numerical values attached to nodes using both the graph and, when they are available, node characteristics. Balancing the two sources of information can be challenging, especially as they can disagree locally or globally. In both contexts, supervised and unsupervised, data can be relational (augmented with one or several global graphs) as described above, or graph valued. In the latter case, each object of interest is given as a full graph (possibly completed by other characteristics). Natural tasks then include graph clustering (producing clusters of graphs rather than clusters of nodes in a single graph), graph classification, etc.

1 Real networks

One of the first practical studies on graphs dates back to the original work of Moreno [51] in the 30s. Since then, there has been a growing interest in graph analysis, associated with strong developments in the modelling and the processing of these data. Graphs are now used in many scientific fields. In biology [54, 2, 7], for instance, metabolic networks can describe pathways of biochemical reactions [41], while in the social sciences networks are used to represent relational ties between actors [66, 56, 36, 34]. Other examples include power grids [71] and the web [75]. Recently, networks have also been considered in other areas such as geography [22] and history [59, 39]. In machine learning, networks are seen as powerful tools to model problems in order to extract information from data and for prediction purposes. This is the object of this paper. For more complete surveys, we refer to [28, 62, 49, 45]. In this section, we introduce notations and highlight properties shared by most real networks. In Section 2, we then consider methods aiming at extracting information from a unique network. We will particularly focus on clustering methods where the goal is to find clusters of vertices. Finally, in Section 3, techniques that take a series of networks into account, where each network is
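As a toy illustration of the node-clustering task mentioned above (assuming networkx and scikit-learn; the graph and the number of clusters are arbitrary choices for the example):

# Spectral clustering applied to the adjacency matrix of Zachary's karate club.
import networkx as nx
from sklearn.cluster import SpectralClustering

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)                    # adjacency matrix used as an affinity
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(A)
clusters = {node: int(label) for node, label in zip(G.nodes(), labels)}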


Search Strategies for Binary Feature Selection for a Naive Bayes Classifier

arXiv.org Machine Learning

In this paper, we compare several feature selection methods for the Naive Bayes Classifier (NBC) when the data under study are described by a large number of redundant binary indicators. Wrapper approaches guided by the NBC estimation of the classification error probability outperform filter approaches while retaining a reasonable computational cost.
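A minimal sketch of a wrapper strategy of this kind (assuming scikit-learn; the greedy forward search below is one possible search strategy, not necessarily one of those retained in the paper, and it uses cross-validated accuracy rather than an explicit error-probability estimate):

# Features are added one at a time, guided by the cross-validated score of a
# Bernoulli naive Bayes classifier on binary indicators.
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.model_selection import cross_val_score

def forward_wrapper(X, y, max_features=10):
    selected, remaining = [], list(range(X.shape[1]))
    best_score = -np.inf
    while remaining and len(selected) < max_features:
        scores = [(cross_val_score(BernoulliNB(), X[:, selected + [j]], y, cv=5).mean(), j)
                  for j in remaining]
        score, j = max(scores)
        if score <= best_score:      # stop when no candidate improves the CV score
            break
        best_score = score
        selected.append(j)
        remaining.remove(j)
    return selected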


Exact ICL maximization in a non-stationary time extension of the latent block model for dynamic networks

arXiv.org Machine Learning

The latent block model (LBM) is a flexible probabilistic tool for describing interactions between node sets in bipartite networks, but it does not account for interactions of time-varying intensity between nodes in unknown classes. In this paper we propose a non-stationary temporal extension of the LBM that simultaneously clusters the two node sets of a bipartite network and constructs classes of time intervals on which interactions are stationary. The number of clusters as well as the class memberships are obtained by maximizing the exact complete-data integrated likelihood using a greedy search approach. Experiments on simulated and real data are carried out in order to assess the proposed methodology.


Reducing offline evaluation bias of collaborative filtering algorithms

arXiv.org Machine Learning

Recommendation systems have been integrated into the majority of large online systems to filter and rank information according to user profiles. They thus influence the way users interact with the system and, as a consequence, bias the evaluation of the performance of a recommendation algorithm computed from historical data (offline evaluation). This paper presents a new application of weighted offline evaluation to reduce this bias for collaborative filtering algorithms.
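A schematic of what a weighted offline evaluation can look like (the inverse-exposure weights below are an assumption made for the sketch, not the weighting scheme of the paper):

# Held-out ratings are re-weighted so that items over-exposed by the historical
# recommender count less in the error estimate.
import numpy as np

def weighted_offline_error(errors, item_ids, exposure_counts):
    # errors[i]: absolute error on held-out rating i
    # exposure_counts[j]: how often item j was recommended historically
    errors = np.asarray(errors, dtype=float)
    weights = np.array([1.0 / (1.0 + exposure_counts[j]) for j in item_ids])
    weights /= weights.sum()
    return float(np.sum(weights * errors))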