Reviews: Processing of missing data by neural networks
–Neural Information Processing Systems
The paper provides theoretical and practical justification for using a density function to represent missing data while training a neural network. An obvious upside is that training can be done with incomplete data, unlike with a denoising autoencoder, for example; this can be very helpful in many applications. My comments are:
- It is stated that if all the attributes are complete then the density is not used. If we have access to a huge amount of complete training data and a relatively small amount of training data with missing values, how trustworthy is our estimate of the density function? Can't we benefit from the complete data? Do we really have to remove attributes, as is done in the ESR task?
- In the above case, would a denoising autoencoder outperform the proposed method?
- How would the generalized activation impact the training time?
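To make the training-time question more concrete, here is a minimal sketch of what a generalized activation could cost per unit. It assumes, purely for illustration, that the pre-activation of a neuron with missing inputs is modeled as a single Gaussian with mean `mean` and standard deviation `std`; the paper's actual density model and implementation may differ. Under that assumption the expected ReLU has the closed form E[max(0, x)] = mean * Phi(mean / std) + std * phi(mean / std), where Phi and phi are the standard normal CDF and PDF. The function name `expected_relu` is hypothetical.

```python
import math


def expected_relu(mean: float, std: float) -> float:
    """Expected value of max(0, x) for x ~ N(mean, std**2).

    Illustrative only: assumes a single Gaussian over the pre-activation,
    which is a simplification and not necessarily the paper's density model.
    """
    if std == 0.0:
        # No uncertainty: falls back to the ordinary ReLU.
        return max(0.0, mean)
    z = mean / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF at z
    return mean * cdf + std * pdf


# A fully observed input (std = 0) behaves like a plain ReLU,
# while an uncertain input near zero still yields a positive expected activation.
observed = expected_relu(2.0, 0.0)
uncertain = expected_relu(0.0, 1.0)
```

Under this assumption the extra work relative to a plain ReLU is one `erf` and one `exp` per unit (per mixture component, if a mixture is used), so any slowdown would depend mainly on the number of components and on how often inputs are incomplete.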
Oct-7-2024, 08:54:48 GMT