Towards holistic scene understanding: Semantic segmentation and beyond