SALMONN: Towards Generic Hearing Abilities for Large Language Models