Constructing Hierarchical Q&A Datasets for Video Story Understanding