Statistically Discriminative Sub-trajectory Mining

Duy, Vo Nguyen Le; Sakuma, Takuto; Ishiyama, Taiju; Toda, Hiroki; Nishi, Kazuya; Karasuyama, Masayuki; Okubo, Yuta; Sunaga, Masayuki; Tabei, Yasuo; Takeuchi, Ichiro

May-5-2019–arXiv.org Machine Learning

We study the problem of discriminative sub-trajectory mining. Given two groups of trajectories, the goal is to extract moving patterns in the form of sub-trajectories that are more similar to sub-trajectories of one group and less similar to those of the other. We propose a new method for this problem called Statistically Discriminative Sub-trajectory Mining (SDSM). An advantage of SDSM is that the statistical significance of the extracted sub-trajectories is properly controlled, in the sense that the probability of finding a false positive sub-trajectory is smaller than a specified significance threshold alpha (e.g., 0.05). Such control is indispensable when the method is used in scientific or social studies in noisy environments. Finding statistically discriminative sub-trajectories in a massive trajectory dataset is both computationally and statistically challenging. SDSM resolves these difficulties by introducing a tree representation over sub-trajectories and running an efficient permutation-based statistical inference method on the tree. To the best of our knowledge, SDSM is the first method that can efficiently extract statistically discriminative sub-trajectories from a massive trajectory dataset. We illustrate the effectiveness and scalability of SDSM by applying it to a real-world dataset of 1,000,000 trajectories containing 16,723,602,505 sub-trajectories.
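
To make the idea of permutation-based control of false positives concrete, the following is a minimal sketch of a generic max-statistic permutation test over a small, fixed list of candidate patterns. It is not the SDSM algorithm: it has no tree representation or pruning, and the function names, the binary occurrence encoding, and the support-difference statistic are all assumptions chosen for illustration.

```python
import random


def support_difference(occurs, labels):
    """Absolute difference in a pattern's occurrence rate between the two groups.

    occurs: list of 0/1 flags, one per trajectory (assumed encoding).
    labels: list of 0/1 group labels, one per trajectory.
    """
    n1 = sum(labels)
    n0 = len(labels) - n1
    hits1 = sum(o for o, y in zip(occurs, labels) if y == 1)
    hits0 = sum(o for o, y in zip(occurs, labels) if y == 0)
    return abs(hits1 / n1 - hits0 / n0)


def max_stat_permutation_test(occurrence, labels, alpha=0.05, n_perm=1000, seed=0):
    """Return indices of patterns whose statistic exceeds a permutation threshold.

    The threshold is an upper quantile of the maximum statistic over all
    candidate patterns under random relabeling of the groups, so the chance
    that any pattern exceeds it by accident is approximately at most alpha.
    """
    rng = random.Random(seed)
    observed = [support_difference(col, labels) for col in occurrence]

    shuffled = list(labels)
    max_null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between pattern and group
        max_null.append(max(support_difference(col, shuffled) for col in occurrence))

    max_null.sort()
    k = min(n_perm - 1, int((1 - alpha) * n_perm))
    threshold = max_null[k]

    significant = [j for j, s in enumerate(observed) if s > threshold]
    return significant, threshold


# Toy data: 20 trajectories per group (hypothetical, for illustration only).
labels = [1] * 20 + [0] * 20
pattern_a = list(labels)                 # occurs only in group 1
pattern_b = [i % 2 for i in range(40)]   # occurs equally often in both groups
significant, threshold = max_stat_permutation_test([pattern_a, pattern_b], labels)
print(significant, threshold)
```

Enumerating every candidate in each permutation, as this sketch does, would not scale to billions of sub-trajectories; that cost is the computational difficulty the abstract says the tree representation is meant to address.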

artificial intelligence, dataset, machine learning


Country:
- Europe (0.28)

Genre:
- Research Report > Experimental Study (0.55)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning > Performance Analysis
      - Accuracy (0.49)
