Enhancing Long Video Question Answering with Scene-Localized Frame Grouping
