

What shapes feature representations? Exploring datasets, architectures, and training

Neural Information Processing Systems

In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not. Of the useful features, which ones does the model use? Of the task-irrelevant features, which ones does the model represent? Answers to these questions are important for understanding the basis of models' decisions, as well as for building models that learn versatile, adaptable representations useful beyond the original training task. We study these questions using synthetic datasets in which the task-relevance of input features can be controlled directly.


Review for NeurIPS paper: What shapes feature representations? Exploring datasets, architectures, and training

Neural Information Processing Systems

Weaknesses: One of the main concerns is the size of the dataset used for training (4,900 images). This is very small for training a deep architecture. It is well known that deep models trained on small datasets often achieve lower test accuracy, perhaps because the training set is not sufficiently representative of the problem and the model may overfit. To address this, researchers have often used transfer learning (pre-trained base CNNs trained on large, diverse datasets), fine-tuning them on the smaller dataset. Given the size of AlexNet (61M parameters), I suspect the model is overfitting under this particular experimental design and evaluation.
