Baratchi, Mitra
Automated classification of pre-defined movement patterns: A comparison between GNSS and UWB technology
Laanen, Rodi, Nasri, Maedeh, van Dijk, Richard, Baratchi, Mitra, Koutamanis, Alexander, Rieffe, Carolien
Advanced real-time location systems (RTLS) allow for collecting spatio-temporal data on human movement behaviours. Tracking individuals in small areas such as schoolyards or nursing homes may pose difficulties for RTLS in terms of positioning accuracy. However, to date, few studies have investigated the performance of different localisation systems in classifying human movement patterns in small areas. The current study aims to design and evaluate an automated framework to classify human movement trajectories obtained from two different RTLS, Global Navigation Satellite System (GNSS) and Ultra-wideband (UWB), in areas of approximately 100 square meters. Specifically, we designed a versatile framework which takes GNSS or UWB data as input, extracts features from these data and classifies them according to the annotated spatial patterns. The automated framework offers three options for noise removal: (i) no noise removal, (ii) a Savitzky-Golay filter on the raw location data or (iii) a Savitzky-Golay filter on the extracted features, as well as three options for the classification algorithm: Decision Tree (DT), Random Forest (RF) or Support Vector Machine (SVM). We integrated the different stages of the framework with Sequential Model-Based Algorithm Configuration (SMAC) to perform automated hyperparameter optimisation. For GNSS, the best performance is achieved with a pipeline that applies noise removal to the raw location data and uses an RF model; for UWB, it is achieved with no noise removal and an SVM model. We further demonstrate through statistical analysis that UWB achieves significantly better results than GNSS in classifying movement patterns.
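The "smooth the raw locations, then extract features" option described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the fixed 5-point quadratic Savitzky-Golay window, the untouched edge samples, and the two features (`mean_speed`, `straightness`) are assumptions chosen for illustration, and the classifier and SMAC stages are omitted.

```python
import math

# Standard Savitzky-Golay smoothing weights for a 5-point window, quadratic fit.
SG5_QUADRATIC = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]


def savgol5(values):
    """Smooth a 1-D sequence; the two samples at each end are left unchanged
    (a simplification, real implementations fit the edges as well)."""
    out = list(values)
    for i in range(2, len(values) - 2):
        out[i] = sum(w * values[i + k - 2] for k, w in enumerate(SG5_QUADRATIC))
    return out


def smooth_track(track):
    """Apply the filter to the x and y coordinates of a list of (x, y) points."""
    xs = savgol5([p[0] for p in track])
    ys = savgol5([p[1] for p in track])
    return list(zip(xs, ys))


def extract_features(track, dt=1.0):
    """Hypothetical trajectory features: mean speed and straightness
    (net displacement divided by travelled path length)."""
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    path_length = sum(steps)
    net = math.dist(track[0], track[-1])
    return {
        "mean_speed": path_length / (dt * len(steps)),
        "straightness": net / path_length if path_length else 0.0,
    }


# A straight walk with alternating lateral jitter, as a stand-in for noisy fixes.
noisy = [(float(i), 0.5 * (-1) ** i) for i in range(9)]
features = extract_features(smooth_track(noisy))
```

The resulting feature dictionaries would then be passed to a DT, RF or SVM classifier; swapping the order (features first, filter second) gives the third noise-removal option.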
A Systematic Analysis on the Impact of Contextual Information on Point-of-Interest Recommendation
Rahmani, Hossein A., Aliannejadi, Mohammad, Baratchi, Mitra, Crestani, Fabio
As the popularity of Location-based Social Networks (LBSNs) increases, designing accurate models for Point-of-Interest (POI) recommendation receives more attention. POI recommendation is often performed by incorporating contextual information into previously designed recommendation algorithms. The major types of contextual information considered in POI recommendation include the location attributes (i.e., exact coordinates of a location, category, and check-in time), the user attributes (i.e., comments, reviews, tips, and check-ins made at the locations), and other information, such as the distance of the POI from the user's main activity location and the social ties between users. The right selection of such factors can significantly impact the performance of POI recommendation. However, previous research does not consider the impact of combining these different factors. In this paper, we propose different contextual models and analyze the fusion of the major types of contextual information in POI recommendation. The major contributions of this paper are: (i) providing an extensive survey of context-aware location recommendation; (ii) quantifying and analyzing the impact of different contextual information (e.g., social, temporal, spatial, and categorical) in POI recommendation on available baselines and on two new models, one linear and one non-linear, that can incorporate all the major contextual information into a single recommendation model; and (iii) evaluating the considered models using two well-known real-world datasets. Our results indicate that while modeling geographical and temporal influences can improve recommendation quality, fusing all other contextual information into a recommendation model is not always the best strategy.
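One simple way to picture a linear fusion of contextual signals is a weighted sum of per-context scores. The sketch below is only an illustration under assumed inputs: the function names, the context keys (`geo`, `temporal`, `social`, `categorical`), the weights and the scores are all hypothetical and are not taken from the paper's models or datasets.

```python
def fuse_linear(context_scores, weights):
    """Weighted sum of per-context scores for one candidate POI.
    Contexts missing from context_scores contribute 0."""
    return sum(w * context_scores.get(name, 0.0) for name, w in weights.items())


def rank_pois(candidates, weights, k=2):
    """Return the k candidate POIs with the highest fused score."""
    scored = {poi: fuse_linear(scores, weights) for poi, scores in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Hypothetical per-context scores for one user and three candidate POIs.
candidates = {
    "cafe":   {"geo": 0.9, "temporal": 0.8, "social": 0.1, "categorical": 0.2},
    "museum": {"geo": 0.2, "temporal": 0.3, "social": 0.9, "categorical": 0.9},
    "park":   {"geo": 0.7, "temporal": 0.4, "social": 0.5, "categorical": 0.5},
}

all_contexts = {"geo": 0.5, "temporal": 0.3, "social": 0.1, "categorical": 0.1}
geo_temporal_only = {"geo": 0.6, "temporal": 0.4}

top_all = rank_pois(candidates, all_contexts)
top_geo_temporal = rank_pois(candidates, geo_temporal_only)
```

Setting a weight to zero (or omitting the key) removes that context from the model, which is how the effect of adding or dropping a context can be compared on the same candidates.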
Unsupervised Discretization by Two-dimensional MDL-based Histogram
Yang, Lincen, Baratchi, Mitra, van Leeuwen, Matthijs
Unsupervised discretization is a crucial step in many knowledge discovery tasks. The state-of-the-art method for one-dimensional data infers locally adaptive histograms using the minimum description length (MDL) principle, but the multi-dimensional case is far less studied: current methods consider the dimensions one at a time (if not independently), which results in discretizations based on rectangular cells of adaptive size. Unfortunately, this approach is unable to adequately characterize dependencies among dimensions and/or results in discretizations consisting of more cells (or bins) than is desirable. To address this problem, we propose an expressive model class that allows for far more flexible partitions of two-dimensional data. We extend the state of the art for the one-dimensional case to obtain a model selection problem based on the normalised maximum likelihood, a form of refined MDL. As the flexibility of our model class comes at the cost of a vast search space, we introduce a heuristic algorithm, named PALM, which partitions each dimension alternately and then merges neighbouring regions, all using the MDL principle. Experiments on synthetic data show that PALM 1) accurately reveals ground truth partitions that are within the model class (i.e., the search space), given a large enough sample size; 2) closely approximates a wide range of partitions outside the model class; 3) converges, in contrast to its closest competitor IPD; and 4) is self-adaptive with regard to both sample size and local density structure of the data despite being parameter-free. Finally, we apply our algorithm to two geographic datasets to demonstrate its real-world potential.
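The intuition behind merging neighbouring regions can be sketched with the data-cost part of a histogram code length. This is a simplified illustration, not PALM itself: it uses only the maximum-likelihood cost of a piecewise-constant density and leaves out the normalised maximum likelihood complexity term and the cost of encoding the partition; the `(area, count)` representation and the function names are assumptions.

```python
import math


def data_cost_bits(regions):
    """Negative log2-likelihood of the data under a piecewise-constant density.
    regions is a list of (area, count) pairs; the density of a region is
    count / (n * area), with n the total number of points."""
    n = sum(count for _, count in regions)
    cost = 0.0
    for area, count in regions:
        if count > 0:
            cost -= count * math.log2(count / (n * area))
    return cost


def merge(regions, i, j):
    """Replace regions i and j by a single region with summed area and count."""
    area = regions[i][0] + regions[j][0]
    count = regions[i][1] + regions[j][1]
    rest = [r for k, r in enumerate(regions) if k not in (i, j)]
    return rest + [(area, count)]


# Hypothetical partition: two low-density cells with equal density, one dense cell.
regions = [(1.0, 10), (1.0, 10), (2.0, 80)]

same_density = merge(regions, 0, 1)   # data cost is unchanged
diff_density = merge(regions, 0, 2)   # data cost goes up
```

Merging two regions of equal density leaves the data cost unchanged while the model becomes simpler, so an MDL criterion favours the merge; merging regions of different density raises the data cost, and the merge is only worthwhile if the saving in model cost outweighs it.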