Education
Glance and Focus: Memory Prompting for Multi-Event Video Question Answering Ziyi Bai
Video Question Answering (VideoQA) has emerged as a vital tool to evaluate agents' ability to understand daily human behaviors. Despite the recent success of large vision language models in many multi-modal tasks, complex situation reasoning over videos involving multiple human-object interaction events remains challenging.
H-nobs: Achieving Certified Fairness and Robustness in Distributed Learning on Heterogeneous Datasets
Fairness and robustness are two important goals in the design of modern distributed learning systems. Despite a few prior works attempting to achieve both fairness and robustness, some key aspects of this direction remain underexplored. In this paper, we try to answer three largely unnoticed and unaddressed questions that are of paramount significance to this topic: (i) What makes jointly satisfying fairness and robustness difficult?