Wong, Anthony
Simultaneous emulation and downscaling with physically-consistent deep learning-based regional ocean emulators
Lupin-Jimenez, Leonard, Darman, Moein, Hazarika, Subhashis, Wu, Tianning, Gray, Michael, He, Ruoying, Wong, Anthony, Chattopadhyay, Ashesh
Data-driven models are promising tools for predicting ocean conditions and for adding detail to those predictions. In this study, we applied advanced machine learning methods to model sea surface velocity and height in the Gulf of Mexico. To forecast broad ocean conditions, we used Fourier Neural Operators (FNO), which balance computational efficiency with accuracy through a specialized loss function that combines grid-space and spectral-space information. To create high-resolution details from low-resolution data -- a process called downscaling -- we explored two different neural network architectures and compared their performance against simpler linear interpolation. While using only a limited set of input variables, this combination of forecasting and downscaling methods greatly improves the efficiency of ocean forecasting and downscaling relative to numerical simulation. Our results highlight that these data-driven techniques can provide reliable, physics-aware predictions that are useful for quick, localized analyses and for generating statistical predictions.
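As a rough illustration of a loss that combines grid-space and spectral-space information, the NumPy sketch below adds a Fourier-amplitude mismatch term to an ordinary mean squared error. The function name `grid_spectral_loss`, the `spectral_weight` parameter, and the choice of amplitude-only comparison are assumptions made for this example; the abstract does not specify the form of the loss used in the paper.

```python
import numpy as np


def grid_spectral_loss(pred, target, spectral_weight=0.5):
    """Hypothetical loss: grid-space MSE plus a Fourier-amplitude mismatch.

    pred, target: 2D arrays (for example, sea surface height on a lat/lon grid).
    spectral_weight: assumed weighting between the two terms.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)

    # Grid-space term: mean squared error over all grid points.
    grid_term = np.mean((pred - target) ** 2)

    # Spectral term: mean squared difference of 2D Fourier amplitudes.
    # Comparing amplitudes penalizes missing or excess energy at any
    # wavenumber, including the small scales that plain MSE tends to smooth.
    pred_amp = np.abs(np.fft.rfft2(pred, norm="ortho"))
    target_amp = np.abs(np.fft.rfft2(target, norm="ortho"))
    spectral_term = np.mean((pred_amp - target_amp) ** 2)

    return grid_term + spectral_weight * spectral_term
```

Setting `spectral_weight=0.0` recovers plain MSE, which makes it easy to check how much the spectral term contributes for a given pair of fields.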
ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction
Sun, Jiawei, Yuan, Chengran, Sun, Shuo, Wang, Shanze, Han, Yuhang, Ma, Shuailei, Huang, Zefan, Wong, Anthony, Tee, Keng Peng, Ang, Marcelo H. Jr
The ability to accurately predict feasible multimodal future trajectories of surrounding traffic participants is crucial for behavior planning in autonomous vehicles. The Motion Transformer (MTR), a state-of-the-art motion prediction method, alleviated mode collapse and instability during training and enhanced overall prediction performance by replacing conventional dense future endpoints with a small set of fixed prior motion intention points. However, the fixed prior intention points make the MTR multimodal prediction distribution over-scattered and infeasible in many scenarios. In this paper, we propose the ControlMTR framework to tackle these issues by generating scene-compliant intention points and additionally predicting driving control commands, which are then converted into trajectories by a simple kinematic model with soft constraints. These control-generated trajectories guide the directly predicted trajectories through an auxiliary loss function. Together with our proposed scene-compliant intention points, they effectively restrict the prediction distribution to within the road boundaries and suppress infeasible off-road predictions while enhancing prediction performance. Remarkably, without resorting to additional model ensemble techniques, our method surpasses the baseline MTR model across all performance metrics, achieving a 5.22% improvement in SoftmAP and a 4.15% reduction in MissRate. Our approach also reduces the cross-boundary rate of MTR by 41.85%, effectively ensuring that the prediction distribution is confined within the drivable area.
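To make the idea of converting control commands into a trajectory concrete, the sketch below integrates a sequence of accelerations and yaw rates with a simple unicycle-style kinematic model. The function name `rollout_controls`, the choice of control variables, the time step, and the non-negative speed clip are all assumptions for illustration; the abstract does not state which kinematic model or which soft constraints ControlMTR uses.

```python
import numpy as np


def rollout_controls(x0, y0, heading0, speed0, accels, yaw_rates, dt=0.1):
    """Hypothetical rollout: integrate (acceleration, yaw rate) controls
    into a sequence of (x, y) waypoints with a unicycle-style model."""
    x, y, heading, speed = x0, y0, heading0, speed0
    points = []
    for accel, yaw_rate in zip(accels, yaw_rates):
        # Update speed and heading from the control commands.
        # The clip keeps speed non-negative; this is a simple hard limit
        # used here for illustration, not the paper's soft constraints.
        speed = max(speed + accel * dt, 0.0)
        heading = heading + yaw_rate * dt
        # Advance the position along the updated heading.
        x = x + speed * np.cos(heading) * dt
        y = y + speed * np.sin(heading) * dt
        points.append((x, y))
    return np.array(points)
```

Because every waypoint is produced by integrating bounded controls, a trajectory generated this way cannot jump sideways or reverse instantly, which is the sense in which control-generated trajectories are kinematically feasible.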
GET-DIPP: Graph-Embedded Transformer for Differentiable Integrated Prediction and Planning
Sun, Jiawei, Yuan, Chengran, Sun, Shuo, Liu, Zhiyang, Goh, Terence, Wong, Anthony, Tee, Keng Peng, Ang, Marcelo H. Jr
Accurately predicting the future trajectories of interacting road agents and planning a socially compliant, human-like trajectory accordingly are important for autonomous vehicles. In this paper, we propose a planning-centric prediction neural network that takes surrounding agents' historical states and map context information as input and outputs joint multi-modal predicted trajectories for surrounding agents, as well as a sequence of control commands for the ego vehicle learned by imitation learning. An agent-agent interaction module along the time axis is proposed in our network architecture to better capture the relationships among all the other intelligent agents on the road. To incorporate the map's topological information, a Dynamic Graph Convolutional Neural Network (DGCNN) is employed to process the road network topology. In addition, the whole architecture can serve as a backbone for the Differentiable Integrated motion Prediction with Planning (DIPP) method by providing accurate prediction results and initial planning commands. Experiments on real-world datasets demonstrate that our proposed method improves both planning and prediction accuracy compared to previous state-of-the-art methods.