Evaluating Synthetically Generated Data from Small Sample Sizes: An Experimental Study
–arXiv.org Artificial Intelligence
In this paper, we propose a method for measuring the similarity between low-sample tabular data and synthetically generated data containing more samples than the original. Generating such additional samples is also known as data augmentation. However, significance levels obtained from non-parametric tests are suspect when the sample size is small. Our method combines geometry, topology, and robust statistics for hypothesis testing in order to assess the "validity" of the generated data. We also compare the results with common global metric methods available in the literature for large-sample data.
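As a small illustration of why significance levels from non-parametric tests can be suspect at small sample sizes, the sketch below runs an exact two-sample permutation test on the difference in means. It is not the method proposed in the paper; the function names `exact_permutation_pvalue` and `permutation_pvalue_floor` and the toy data are assumptions made for this example only.

```python
from itertools import combinations
from math import comb


def exact_permutation_pvalue(x, y):
    """Two-sided exact permutation p-value for the difference in means.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    total = sum(pooled)

    def abs_mean_diff(sum_first):
        # |mean of first group - mean of second group| for a given split
        return abs(sum_first / n - (total - sum_first) / m)

    observed = abs_mean_diff(sum(x))
    as_extreme = 0
    splits = 0
    # Enumerate every way to relabel n of the pooled points as "group x"
    for idx in combinations(range(n + m), n):
        splits += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / splits


def permutation_pvalue_floor(n, m):
    """Lower bound on any exact permutation p-value: one split out of C(n+m, n)."""
    return 1 / comb(n + m, n)


# Toy data (made up): two perfectly separated groups of three points each
original = [1, 2, 3]
synthetic = [10, 11, 12]
p_value = exact_permutation_pvalue(original, synthetic)
floor = permutation_pvalue_floor(len(original), len(synthetic))
print(p_value, floor)
```

Even though the two toy groups are completely separated, only 20 relabelings exist for two groups of three points, so the test cannot report a p-value below 1/20, and the two-sided p-value here is 2/20. A conventional 0.05 threshold is therefore unreachable regardless of how different the samples are.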
Jan-21-2023
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Industry:
- Health & Medicine (0.46)
- Technology: