Density based Spatial Clustering of Lines via Probabilistic Generation of Neighbourhood
Das, Akanksha, Bhattacharyya, Malay
arXiv.org Artificial Intelligence
Density-based spatial clustering of points in $\mathbb{R}^n$ has many applications across industries. We generalise this problem to the density-based clustering of lines in high-dimensional spaces, keeping in mind that no valid distance measure satisfying the triangle inequality exists for lines. In this paper, we design a clustering algorithm that generates a customised neighbourhood of fixed volume (given as a parameter) for each line, based on an optional parameter in the form of a continuous probability density function. The algorithm is not sensitive to outliers and can effectively identify noise in the data using a cardinality parameter. One of its pivotal applications is clustering data points in $\mathbb{R}^n$ with missing entries while utilising domain knowledge of the respective data. In particular, the proposed algorithm can cluster $n$-dimensional data points that contain at least $(n-1)$-dimensional information. We illustrate the neighbourhoods for standard probability distributions with continuous probability density functions and demonstrate the effectiveness of our algorithm on various synthetic and real-world datasets (e.g., rail and road networks). The experimental results also highlight its application in clustering incomplete data.
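The link between incomplete data and lines can be made concrete with a small sketch. It assumes one natural reading of the abstract: a record in $\mathbb{R}^n$ with exactly one missing entry corresponds to a line parallel to the axis of the missing coordinate. The helper names `record_to_line` and `min_line_distance` are hypothetical, and the minimum Euclidean distance between lines is used here only for illustration; it is not the probabilistic neighbourhood construction proposed in the paper.

```python
import numpy as np

def record_to_line(record):
    """Map a record with exactly one missing entry (NaN) to a line (point, direction).

    Assumption: the missing coordinate is free, so the record spans a line
    parallel to that coordinate axis.
    """
    x = np.asarray(record, dtype=float)
    missing = np.flatnonzero(np.isnan(x))
    if missing.size != 1:
        raise ValueError("expected exactly one missing entry")
    point = np.where(np.isnan(x), 0.0, x)   # anchor point on the line
    direction = np.zeros_like(x)
    direction[missing[0]] = 1.0              # unit vector along the missing axis
    return point, direction

def min_line_distance(line_a, line_b):
    """Minimum Euclidean distance between the lines a + t*u and b + s*v."""
    (a, u), (b, v) = line_a, line_b
    A = np.column_stack([u, -v])
    # Least squares minimises ||a + t*u - (b + s*v)|| over (t, s);
    # it also handles parallel lines, where A is rank deficient.
    params, *_ = np.linalg.lstsq(A, b - a, rcond=None)
    return float(np.linalg.norm(A @ params - (b - a)))

# Three records in R^3, each missing one coordinate.
l1 = record_to_line([1.0, np.nan, 3.0])
l2 = record_to_line([np.nan, 2.0, 3.0])
l3 = record_to_line([5.0, 2.0, np.nan])

d12 = min_line_distance(l1, l2)
d23 = min_line_distance(l2, l3)
d13 = min_line_distance(l1, l3)
print(d12, d23, d13)
```

In this example the first and second lines intersect, as do the second and third, so both of those distances are zero up to rounding, while the first and third lines stay about 4 units apart. The minimum distance therefore violates the triangle inequality, which is consistent with the abstract's remark that lines admit no valid distance measure of that kind.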
Oct-3-2024
- Country:
- North America > United States (0.29)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Asia > India > West Bengal > Kolkata (0.04)
- Genre:
- Research Report (0.64)
- Industry:
- Transportation > Ground (0.49)
- Technology: