Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding