Fighting doppelgängers. How to rid data of evil twins reducing…
When working with data produced by sensors that record machinery events, datasets with hundreds or thousands of variables are common. In such cases, many variables are candidate predictors of a target measure. However, especially in industrial contexts, the data can include variables that are fully linearly dependent or highly correlated. For example, a sensor can extract several features from the same process that are linear transformations of the same basis (such as the sum of a set of records, their mean, etc.). In other cases, the measures are genuinely different but related by nature, or they represent two opposite facets of the same phenomenon (imagine two complementary elements of a chemical mixture).
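To make the two situations concrete, here is a minimal Python sketch that builds a small synthetic dataset and flags highly correlated column pairs. Everything in it is an assumption made for illustration and not taken from the article: the helper name `correlated_pairs`, the 0.95 threshold, the column names, and the random data standing in for sensor records.

```python
import numpy as np

def correlated_pairs(X, names, threshold=0.95):
    """Return (name_a, name_b, r) for column pairs with |Pearson r| >= threshold.

    Hypothetical helper for illustration; the threshold is an arbitrary choice.
    """
    corr = np.corrcoef(X, rowvar=False)  # columns are treated as variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

rng = np.random.default_rng(0)

# Synthetic stand-in for raw sensor records: 500 events, 4 readings each.
records = rng.normal(size=(500, 4))

# Two features derived from the same records by linear transformations.
total = records.sum(axis=1)
mean = records.mean(axis=1)  # equals total / 4, so it carries no new information

# Two complementary fractions of a mixture: one fully determines the other.
frac_a = rng.uniform(0.1, 0.9, size=500)
frac_b = 1.0 - frac_a

# An unrelated variable, included as a control.
noise = rng.normal(size=500)

X = np.column_stack([total, mean, frac_a, frac_b, noise])
names = ["total", "mean", "frac_a", "frac_b", "noise"]

pairs = correlated_pairs(X, names)
for a, b, r in pairs:
    print(f"{a} ~ {b}: r = {r:+.3f}")
```

The sum/mean pair shows up with a correlation of about +1 and the complementary fractions with about -1, while the control column is not flagged. A pairwise check like this only catches twins that come in pairs; a dependency that involves three or more columns at once would need a different test, such as examining the rank of the data matrix.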
Dec-23-2022, 20:40:57 GMT