[D] Is the "curse of dimensionality" still as relevant as it was 20 years ago?
I have been reading some good examples that explain, in layman's terms, what the curse of dimensionality is. These examples first consider a circle inscribed in a square (2 dimensions: example 1), and then a sphere inscribed in a cube (3 dimensions: example 2). The point is that the cube in example 2 is "emptier" than the square in example 1: the sphere fills a smaller fraction of the cube's volume than the circle fills of the square's area. As the number of dimensions increases (e.g. the cube becomes a hypercube in 4 dimensions), it can be shown mathematically that this empty fraction keeps growing. In this analogy, the sphere represents the data and the cube represents the space the data belongs to. These examples show that in higher dimensions we need exponentially more data to fill the space. Data therefore becomes more "sparse" in higher dimensions, and this sparsity makes it harder to fit machine learning algorithms. (I understand this intuitively, but I don't know if there is a mathematical explanation for why sparsity gives machine learning algorithms a hard time. Perhaps sparsity makes some of the matrix calculations harder?)
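To put numbers on the circle-in-square and sphere-in-cube examples, here is a minimal sketch that computes the fraction of a unit hypercube filled by its inscribed ball. It assumes the standard volume formula for a d-dimensional ball, V_d(r) = pi^(d/2) / Gamma(d/2 + 1) * r^d, with r = 1/2. The function name `inscribed_ball_fraction` is just something I made up for illustration.

```python
import math


def inscribed_ball_fraction(d: int) -> float:
    """Fraction of a unit hypercube's volume taken up by the inscribed d-ball.

    Assumes the standard d-ball volume formula with radius 1/2, so the ball
    touches every face of the cube. The cube has side 1, hence volume 1.
    """
    radius = 0.5
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d
    cube_volume = 1.0
    return ball_volume / cube_volume


if __name__ == "__main__":
    # Print how the filled fraction shrinks as the dimension grows.
    for d in (2, 3, 4, 10, 20):
        print(f"d={d:2d}  filled fraction = {inscribed_ball_fraction(d):.3e}")
```

For d=2 this gives pi/4 (about 0.785) and for d=3 it gives pi/6 (about 0.524), matching the two examples. The fraction keeps dropping toward zero as d grows, which is the "emptiness" the analogy is pointing at.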
Apr-22-2021, 09:00:27 GMT