Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems