To design an efficient intervention, decision-makers need not only to estimate the total effect of the intervention but also to understand the underlying causal mechanisms driving this effect.
However, there is a noticeable gap in the analysis of multiclass classification, with only a handful of results, which are themselves restricted to the cross-entropy loss.
Visual question answering (VQA) is a challenging task that requires an in-depth understanding of vision and language, as well as multi-modal reasoning.
While several previous works have focused on classifying closed-set samples and detecting open-set samples during testing, it is still essential to be able to classify unknown subjects as human beings.