Towards measuring fairness in AI: the Casual Conversations dataset