Towards flexible perception with visual memory
Geirhos, Robert, Jaini, Priyank, Stone, Austin, Medapati, Sourabh, Yi, Xi, Toderici, George, Ogale, Abhijit, Shlens, Jonathon
arXiv.org Artificial Intelligence
Training a neural network is a monolithic endeavor, akin to carving knowledge into stone: once training is complete, editing the knowledge in a network is nearly impossible, since all information is distributed across the network's weights. Here we explore a simple, compelling alternative that marries the representational power of deep neural networks with the flexibility of a database. Decomposing the task of image classification into image similarity (from a pre-trained embedding) and search (via fast nearest neighbor retrieval from a knowledge database), we build a simple and flexible visual memory with the following key capabilities: (1) the ability to flexibly add data across scales, from individual samples all the way to entire classes and billion-scale data; (2) the ability to remove data through unlearning and memory pruning; (3) an interpretable decision mechanism on which we can intervene to control its behavior. Taken together, these capabilities comprehensively demonstrate the benefits of an explicit visual memory. We hope that it might contribute to a conversation on how knowledge should be represented in deep vision models, beyond carving it in "stone" weights.
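The decomposition described in the abstract (similarity from an embedding, then nearest neighbor search over a database) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the `VisualMemory` class and its method names are hypothetical, the embeddings are hand-written 2-D vectors standing in for the output of a pre-trained model, the search is a brute-force cosine-similarity scan rather than a fast retrieval index, and neighbors are aggregated by a plain majority vote.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class VisualMemory:
    """Hypothetical toy memory: a flat list of (embedding, label) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, embedding, label):
        # Adding knowledge is a database insert; no retraining is involved.
        self.entries.append((list(embedding), label))

    def remove_label(self, label):
        # Removing a class is a database delete (a crude stand-in for unlearning).
        self.entries = [(e, l) for e, l in self.entries if l != label]

    def neighbors(self, query, k=3):
        # Brute-force scan (assumption): score every entry, keep the top k.
        scored = [(cosine_similarity(query, e), l) for e, l in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

    def classify(self, query, k=3):
        # Majority vote over the k nearest neighbors (assumption).
        votes = Counter(label for _, label in self.neighbors(query, k))
        return votes.most_common(1)[0][0] if votes else None


memory = VisualMemory()
memory.add([1.0, 0.1], "cat")
memory.add([0.9, 0.2], "cat")
memory.add([0.1, 1.0], "dog")
memory.add([0.2, 0.9], "dog")

query = [0.95, 0.15]
before_removal = memory.classify(query, k=3)    # "cat"

memory.add([-1.0, 0.0], "bird")                 # add a new class on the fly
new_class = memory.classify([-0.9, 0.1], k=1)   # "bird"

memory.remove_label("cat")                      # remove a whole class
after_removal = memory.classify(query, k=3)     # "dog"
```

Because each prediction is a vote over retrieved entries, the list returned by `neighbors` shows which stored samples drove a decision, which is one simple reading of the interpretable decision mechanism mentioned in the abstract.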
Sep-17-2024
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > Ireland
- Leinster > County Dublin > Dublin (0.04)
- North America > United States
- California > Santa Clara County > Palo Alto (0.04)
- South America > Peru
- Lima Department > Lima Province > Lima (0.04)
- Genre:
- Research Report (0.82)
- Technology: