Establishing Trustworthiness: Rethinking Tasks and Model Evaluation