waterway
FIFA: Unified Faithfulness Evaluation Framework for Text-to-Video and Video-to-Text Generation
Jing, Liqiang, Lai, Viet, Yoon, Seunghyun, Bui, Trung, Du, Xinya
Video Multimodal Large Language Models (VideoMLLMs) have achieved remarkable progress in both Video-to-Text and Text-to-Video tasks. However, they often suffer from hallucinations, generating content that contradicts the visual input. Existing evaluation methods are limited to one task (e.g., V2T) and also fail to assess hallucinations in open-ended, free-form responses. To address this gap, we propose FIFA, a unified FaIthFulness evAluation framework that extracts comprehensive descriptive facts, models their semantic dependencies via a Spatio-Temporal Semantic Dependency Graph, and verifies them using VideoQA models. We further introduce Post-Correction, a tool-based correction framework that revises hallucinated content. Extensive experiments demonstrate that FIFA aligns more closely with human judgment than existing evaluation methods, and that Post-Correction effectively improves factual consistency in both text and video generation.
- Europe > Austria > Vienna (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- (14 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding
Guan, Runwei, Ouyang, Ningwei, Xu, Tianhao, Liang, Shaofeng, Dai, Wei, Sun, Yafeng, Gao, Shang, Lai, Songning, Yao, Shanliang, Hu, Xuming, Liu, Ryan Wen, Yue, Yutao, Xiong, Hui
Automated waterway environment perception is crucial for enabling unmanned surface vessels (USVs) to understand their surroundings and make informed decisions. Most existing waterway perception models primarily focus on instance-level object perception paradigms (e.g., detection, segmentation). However, due to the complexity of waterway environments, current perception datasets and models fail to achieve global semantic understanding of waterways, limiting large-scale monitoring and structured log generation. With the advancement of vision-language models (VLMs), we leverage image captioning to introduce WaterCaption, the first captioning dataset specifically designed for waterway environments. WaterCaption focuses on fine-grained, multi-region long-text descriptions, providing a new research direction for visual geo-understanding and spatial scene cognition. Specifically, it comprises 20.2k image-text pairs with a vocabulary size of 1.8 million. Additionally, we propose Da Yu, an edge-deployable multi-modal large language model for USVs, featuring a novel vision-to-language projector called Nano Transformer Adaptor (NTA). NTA effectively balances computational efficiency with the capacity for both global and fine-grained local modeling of visual features, thereby significantly enhancing the model's ability to generate long-form textual outputs. Da Yu achieves an optimal balance between performance and efficiency, surpassing state-of-the-art models on WaterCaption and several other captioning benchmarks.
- Asia > China > Hong Kong (0.05)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
USVTrack: USV-Based 4D Radar-Camera Tracking Dataset for Autonomous Driving in Inland Waterways
Yao, Shanliang, Guan, Runwei, Ni, Yi, Xu, Sen, Yue, Yong, Zhu, Xiaohui, Liu, Ryan Wen
Object tracking in inland waterways plays a crucial role in safe and cost-effective applications, including waterborne transportation, sightseeing tours, environmental monitoring and surface rescue. Our Unmanned Surface Vehicle (USV), equipped with a 4D radar, a monocular camera, a GPS, and an IMU, delivers robust tracking capabilities in complex waterborne environments. By leveraging these sensors, our USV collected comprehensive object tracking data, which we present as USVTrack, the first 4D radar-camera tracking dataset tailored for autonomous driving in new generation waterborne transportation systems. Our USVTrack dataset presents rich scenarios, featuring diverse waterways, varying times of day, and multiple weather and lighting conditions. Moreover, we present a simple but effective radar-camera matching method, termed RCM, which can be plugged into popular two-stage association trackers. Experimental results utilizing RCM demonstrate the effectiveness of the radar-camera matching in improving object tracking accuracy and reliability for autonomous driving in waterborne environments. The USVTrack dataset is publicly available at https://usvtrack.github.io.
- Asia > China > Hubei Province > Wuhan (0.05)
- Europe > Switzerland (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- (3 more...)
- Automobiles & Trucks (0.92)
- Transportation > Ground > Road (0.83)
- Information Technology > Robotics & Automation (0.83)
Graph Learning-Driven Multi-Vessel Association: Fusing Multimodal Data for Maritime Intelligence
Lu, Yuxu, Yang, Kaisen, Yang, Dong, Ding, Haifeng, Weng, Jinxian, Liu, Ryan Wen
Ensuring maritime safety and optimizing traffic management in increasingly crowded and complex waterways require effective waterway monitoring. However, current methods struggle with challenges arising from multimodal data, such as dimensional disparities, mismatched target counts, vessel scale variations, occlusions, and asynchronous data streams from systems like the automatic identification system (AIS) and closed-circuit television (CCTV). Traditional multi-target association methods often struggle with these complexities, particularly in densely trafficked waterways. To overcome these issues, we propose a graph learning-driven multi-vessel association (GMvA) method tailored for maritime multimodal data fusion. By integrating AIS and CCTV data, GMvA leverages time series learning and graph neural networks to capture the spatiotemporal features of vessel trajectories effectively. To enhance feature representation, the proposed method incorporates temporal graph attention and spatiotemporal attention, effectively capturing both local and global vessel interactions. Furthermore, a multi-layer perceptron-based uncertainty fusion module computes robust similarity scores, and the Hungarian algorithm is adopted to ensure globally consistent and accurate target matching. Extensive experiments on real-world maritime datasets confirm that GMvA delivers superior accuracy and robustness in multi-target association, outperforming existing methods even in challenging scenarios with high vessel density and incomplete or unevenly distributed AIS and CCTV data.
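To illustrate the final matching step described above, the following is a minimal sketch of globally optimal one-to-one assignment between AIS and CCTV tracks. It is not the GMvA implementation: the similarity values are made up, the function name `best_assignment` is hypothetical, and an exhaustive search over permutations stands in for the Hungarian algorithm (both return an assignment with the maximum total similarity, but exhaustive search is only practical for a handful of targets).

```python
from itertools import permutations


def best_assignment(similarity):
    """Return (pairs, total) maximizing the summed similarity.

    similarity[i][j] is a hypothetical score between AIS track i and
    CCTV track j. Assumes there are at least as many columns as rows.
    Exhaustive search is used here as a stand-in for the Hungarian
    algorithm; it finds the same optimum on small inputs.
    """
    n_rows = len(similarity)
    n_cols = len(similarity[0])
    best_total = float("-inf")
    best_cols = None
    # Each permutation assigns row i to the distinct column cols[i].
    for cols in permutations(range(n_cols), n_rows):
        total = sum(similarity[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total = total
            best_cols = cols
    return list(enumerate(best_cols)), best_total


# Illustrative scores only: 2 AIS tracks, 3 CCTV tracks.
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
]
pairs, total = best_assignment(scores)
print(pairs, round(total, 2))
```

In this toy matrix, greedily giving AIS track 0 its best CCTV match (column 0) would leave track 1 with a poor match, whereas the global optimum pairs track 0 with column 1 and track 1 with column 0. This is the kind of globally consistent matching the abstract attributes to the Hungarian step.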
- Asia > China > Hong Kong (0.05)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- Africa > Mozambique > Sofala Province > Beira (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.86)
Deep learning waterways for rural infrastructure development
Pierson, Matthew, Mehrabi, Zia
Surprisingly, a number of Earth's waterways remain unmapped, with a significant number in low and middle income countries. Here we build a computer vision model (WaterNet) to learn the location of waterways in the United States, based on high resolution satellite imagery and digital elevation models, and then deploy this in novel environments in the African continent. Our outputs provide detail of waterway structures hitherto unmapped. When assessed against community needs requests for rural bridge building related to access to schools, health care facilities and agricultural markets, we find these newly generated waterways capture on average 93% (country range: 88-96%) of these requests whereas Open Street Map, and the state of the art data from TDX-Hydro, capture only 36% (5-72%) and 62% (37-85%), respectively. Because these new machine learning enabled maps are built on public and operational data acquisition, this approach offers promise for capturing humanitarian needs and planning for social development in places where cartographic efforts have so far failed to deliver. The improved performance in identifying community needs missed by existing data suggests significant value for rural infrastructure development and better targeting of development interventions.
- Africa > Ethiopia (0.05)
- Africa > Rwanda (0.05)
- Africa > Côte d'Ivoire (0.05)
- (16 more...)
- Social Sector (0.66)
- Education (0.54)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)
2-Level Reinforcement Learning for Ships on Inland Waterways
Waltz, Martin, Paulig, Niklas, Okhrin, Ostap
This paper proposes a realistic modularized framework for controlling autonomous surface vehicles (ASVs) on inland waterways (IWs) based on deep reinforcement learning (DRL). The framework comprises two levels: a high-level local path planning (LPP) unit and a low-level path following (PF) unit, each consisting of a DRL agent. The LPP agent is responsible for planning a path that takes into account nearby vessels, traffic rules, and the geometry of the waterway. We thereby transfer a recently proposed spatial-temporal recurrent neural network architecture to continuous action spaces. The LPP agent improves operational safety in comparison to a state-of-the-art artificial potential field method by increasing the minimum distance to other vessels by 65% on average. The PF agent performs low-level actuator control while accounting for shallow water influences and the environmental forces of wind, waves, and currents. Compared with a proportional-integral-derivative (PID) controller, the PF agent yields only 61% of the mean cross-track error while significantly reducing control effort in terms of the required absolute rudder angle. Lastly, both agents are jointly validated in simulation, employing the lower Elbe in northern Germany as an example case and using real automatic identification system (AIS) trajectories to model the behavior of other ships.
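The cross-track error used as the path-following metric above can be sketched as the signed perpendicular distance from the vessel to a straight path segment. This is an illustrative sketch, not the paper's code: the function names, the planar coordinates, and the sign convention (positive to the left of the direction of travel) are assumptions.

```python
import math


def cross_track_error(pos, a, b):
    """Signed perpendicular distance from pos to the line through a and b.

    Assumed convention: positive when pos lies to the left of the
    direction of travel from a to b. Coordinates are planar (x, y).
    """
    ax, ay = a
    bx, by = b
    px, py = pos
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("path segment has zero length")
    # z-component of the 2-D cross product, normalized by segment length.
    return (dx * (py - ay) - dy * (px - ax)) / length


def mean_abs_cte(track, a, b):
    """Mean absolute cross-track error over a list of positions."""
    return sum(abs(cross_track_error(p, a, b)) for p in track) / len(track)


# Hypothetical track along a path segment heading in the +x direction.
start, end = (0.0, 0.0), (10.0, 0.0)
track = [(1.0, 1.0), (5.0, -2.0), (9.0, 0.0)]
print(mean_abs_cte(track, start, end))
```

Under this metric, a controller that "yields only 61% of the mean cross-track error" of a baseline would produce a value of `mean_abs_cte` that is 0.61 times the baseline's on the same path.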
- Atlantic Ocean > North Atlantic Ocean > North Sea > Elbe Estuary (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Geneva > Geneva (0.04)
- (5 more...)
- Transportation > Marine (1.00)
- Information Technology (0.92)
- Transportation > Freight & Logistics Services > Shipping (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Safety Aware Autonomous Path Planning Using Model Predictive Reinforcement Learning for Inland Waterways
Vanneste, Astrid, Vanneste, Simon, Vasseur, Olivier, Janssens, Robin, Billast, Mattias, Anwar, Ali, Mets, Kevin, De Schepper, Tom, Mercelis, Siegfried, Hellinckx, Peter
In recent years, interest in autonomous shipping in urban waterways has increased significantly due to the trend of keeping cars and trucks out of city centers. Classical approaches such as Frenet frame based planning and potential field navigation often require tuning of many configuration parameters and sometimes even require a different configuration depending on the situation. In this paper, we propose a novel path planning approach based on reinforcement learning called Model Predictive Reinforcement Learning (MPRL). MPRL calculates a series of waypoints for the vessel to follow. The environment is represented as an occupancy grid map, allowing us to deal with any shape of waterway and any number and shape of obstacles. We demonstrate our approach on two scenarios and compare the resulting path with path planning using a Frenet frame and path planning based on a proximal policy optimization (PPO) agent. Our results show that MPRL outperforms both baselines in both test scenarios. The PPO based approach was not able to reach the goal in either scenario while the Frenet frame approach failed in the scenario consisting of a corner with obstacles. MPRL was able to safely (collision free) navigate to the goal in both of the test scenarios.
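As a rough illustration of the occupancy grid representation mentioned above, the sketch below checks whether a sequence of waypoints stays in free cells. It is not MPRL and does not reproduce the paper's planner: the grid, the cell encoding (0 = free water, 1 = occupied), and the function names are assumptions, and only the waypoints themselves are checked, not the segments between them.

```python
def is_free(grid, cell):
    """True if cell (row, col) is inside the grid and not occupied.

    Assumed encoding: 0 = free water, 1 = obstacle or bank.
    """
    row, col = cell
    in_bounds = 0 <= row < len(grid) and 0 <= col < len(grid[0])
    return in_bounds and grid[row][col] == 0


def path_is_collision_free(grid, waypoints):
    """True if every waypoint lies in a free cell of the grid."""
    return all(is_free(grid, wp) for wp in waypoints)


# Hypothetical waterway: a wall with a single gap in the middle row.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
safe_path = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 3)]
blocked_path = [(0, 0), (1, 0), (2, 0)]
print(path_is_collision_free(grid, safe_path))
print(path_is_collision_free(grid, blocked_path))
```

Because obstacles are just marked cells, the same check works for any waterway shape and any number of obstacles, which is the property the abstract highlights.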
- Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Puerto Rico > San Juan > San Juan (0.04)
- (5 more...)
Risk-Aware Wasserstein Distributionally Robust Control of Vessels in Natural Waterways
Nadales, Juan Moreno, Hakobyan, Astghik, de la Peña, David Muñoz, Limon, Daniel, Yang, Insoon
In the realm of maritime transportation, autonomous vessel navigation in natural inland waterways faces persistent challenges due to unpredictable natural factors. Existing scheduling algorithms fall short in handling these uncertainties, compromising both safety and efficiency. Moreover, these algorithms are primarily designed for non-autonomous vessels, leading to labor-intensive operations vulnerable to human error. To address these issues, this study proposes a risk-aware motion control approach for vessels that accounts for the dynamic and uncertain nature of tide islands in a distributionally robust manner. Specifically, a model predictive control method is employed to follow the reference trajectory in the time-space map while incorporating a risk constraint to prevent grounding accidents. To address uncertainties in tide islands, a novel modeling technique represents them as stochastic polytopes. Additionally, potential inaccuracies in waterway depth are addressed through a risk constraint that considers the worst-case uncertainty distribution within a Wasserstein ambiguity set around the empirical distribution. Using sensor data collected in the Guadalquivir River, we empirically demonstrate the performance of the proposed method through simulations on a vessel. As a result, the vessel successfully navigates the waterway while avoiding grounding accidents, even with a limited dataset of observations. This stands in contrast to existing non-robust controllers, highlighting the robustness and practical applicability of the proposed approach.
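The idea of a Wasserstein ambiguity set around an empirical distribution can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the paper's controller: it uses made-up depth samples, hypothetical function names, and the type-1 Wasserstein distance between two equally weighted sample sets of equal size (which in 1-D reduces to the mean absolute difference of the sorted samples). The actual method optimizes over the worst-case distribution in the set rather than testing individual candidates.

```python
def wasserstein_1d(samples_p, samples_q):
    """Type-1 Wasserstein distance between two 1-D empirical distributions.

    Assumes both sample lists are non-empty, equally weighted, and of
    the same size, so the optimal coupling pairs sorted samples.
    """
    if not samples_p or len(samples_p) != len(samples_q):
        raise ValueError("sample lists must be non-empty and equal in size")
    pairs = zip(sorted(samples_p), sorted(samples_q))
    return sum(abs(p - q) for p, q in pairs) / len(samples_p)


def in_ambiguity_set(candidate, empirical, radius):
    """True if candidate lies within the Wasserstein ball around empirical."""
    return wasserstein_1d(candidate, empirical) <= radius


# Hypothetical water-depth observations in metres.
empirical_depths = [4.0, 5.0, 6.0]
nearby = [3.5, 5.0, 6.5]
far = [2.0, 3.0, 4.0]
print(in_ambiguity_set(nearby, empirical_depths, radius=0.5))
print(in_ambiguity_set(far, empirical_depths, radius=0.5))
```

A larger radius admits distributions further from the observed data, which makes a risk constraint evaluated over the set more conservative; this is the usual trade-off when only a limited dataset of observations is available.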
- Transportation (1.00)
- Energy > Oil & Gas (0.42)
US Navy sails first drone boat through Strait of Hormuz between Iran, Oman
The U.S. Navy sailed its first drone boat through the strategic Strait of Hormuz on Wednesday, a crucial waterway for global energy supplies where American sailors often face tense encounters with Iranian forces. The trip by the L3 Harris Arabian Fox MAST-13, a 41-foot speedboat carrying sensors and cameras, drew the attention of Iran's Revolutionary Guard, but took place without incident, said Navy spokesman Cmdr. Two U.S. Coast Guard cutters, the USCGC Charles Moulthrope and USCGC John Scheuerman, accompanied the drone.
- North America > United States (1.00)
- Asia > Middle East > Iran (1.00)
- Asia > Middle East > UAE (0.66)
- (11 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military > Navy (1.00)
- Government > Regional Government > Asia Government > Middle East Government > Iran Government (0.61)
Robust Path Following on Rivers Using Bootstrapped Reinforcement Learning
This paper develops a Deep Reinforcement Learning (DRL) agent for navigation and control of autonomous surface vessels (ASVs) on inland waterways. Spatial restrictions due to waterway geometry and the resulting challenges, such as high flow velocities or shallow banks, require controlled and precise movement of the ASV. A state-of-the-art bootstrapped Q-learning algorithm in combination with a versatile training environment generator leads to a robust and accurate rudder controller. To validate our results, we compare the path-following capabilities of the proposed approach to a vessel-specific PID controller on real-world river data from the lower and middle Rhine, indicating that the DRL algorithm generalizes effectively even to previously unseen scenarios while simultaneously attaining high navigational accuracy.
- Europe > Germany (0.46)
- North America > United States (0.46)
- Energy > Oil & Gas (0.46)
- Transportation > Marine (0.46)