Towards generalizing deep-audio fake detection networks