On the effectiveness of discrete representations in sparse mixture of experts
