Measuring machine learning harms from stereotypes: requires understanding who is being harmed by which errors in what ways