What can human minimal videos tell us about dynamic recognition models?
