Cao, Guangzhi
Large Trajectory Models are Scalable Motion Predictors and Planners
Sun, Qiao, Zhang, Shiduo, Ma, Danjiao, Shi, Jingzhe, Li, Derun, Luo, Simian, Wang, Yu, Xu, Ningyi, Cao, Guangzhi, Zhao, Hang
Motion prediction and planning are vital tasks in autonomous driving, and recent efforts have shifted toward machine learning-based approaches. The challenges include understanding diverse road topologies, reasoning about traffic dynamics over a long time horizon, interpreting heterogeneous behaviors, and generating policies in a large continuous state space. Inspired by the success of large language models in addressing similar complexities through model scaling, we introduce a scalable trajectory model called State Transformer (STR). Our approach unifies trajectory generation with other sequence modeling problems, enabling rapid iteration on breakthroughs from neighboring domains such as language modeling. Remarkably, experimental results reveal that large trajectory models (LTMs), such as STR, follow scaling laws, exhibiting outstanding adaptability and learning efficiency. Qualitative results further demonstrate that LTMs are capable of making plausible predictions in scenarios that diverge significantly from the training data distribution. LTMs also learn to perform complex reasoning for long-term planning, without explicit loss designs or costly high-level annotations.

Motion planning and prediction in autonomous driving rely on the ability to semantically understand complex driving environments and the interactions between various road users. Learning-based methods are pivotal to overcoming this complexity, as rule-based and scenario-specific strategies often prove inadequate to cover all possible situations and unexpected events that may occur during operation. Such learning problems can be regarded as conditional sequence-to-sequence tasks, in which models leverage past trajectories to generate future ones, conditioned on the observations. Notably, these problems share structural similarities with other sequence modeling problems, such as language generation.
Recent studies (Mirchandani et al., 2023; Zeng et al., 2023) have demonstrated that LLMs excel not only at natural language generation but also at tackling a wide range of sequence modeling and time series forecasting challenges. Building on these insights, prior research (Chen et al., 2021; Janner et al., 2021; Sun et al., 2023) has effectively utilized conditional causal transformers to address motion planning as a large sequence modeling problem, with both behavior cloning and reinforcement learning. Furthermore, Brohan et al. (2023) replace the transformer backbone with language models, demonstrating the potential to merge motion planning with other modalities within one large sequence for LLMs.
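The sequence-modeling framing above can be illustrated with a minimal sketch. This is not the STR architecture or the authors' code: it is a hypothetical single-head causal self-attention layer over trajectory tokens, showing the one structural property the framing relies on, namely that a token for a predicted future state can attend only to earlier states, exactly as in autoregressive language modeling.

```python
import numpy as np

def causal_attention(tokens):
    """Single-head causal self-attention over a trajectory token sequence.

    Hypothetical sketch: each token is a past agent state (e.g. x, y,
    heading, speed) embedded into d dimensions. The causal mask forbids
    attending to future positions, which is what lets the same machinery
    serve both language generation and trajectory generation.
    """
    T, d = tokens.shape
    rng = np.random.default_rng(0)  # fixed projections, for illustration only
    Wq, Wk, Wv = (rng.normal(scale=d**-0.5, size=(d, d)) for _ in range(3))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(d)
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # positions ahead of t
    scores[mask] = -np.inf                            # forbid attending ahead
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V
```

A quick property check: perturbing the last token changes only the last output row, so earlier predictions are unaffected by future states.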
Covariance Estimation for High Dimensional Data Vectors Using the Sparse Matrix Transform
Cao, Guangzhi, Bouman, Charles
Covariance estimation for high dimensional vectors is a classically difficult problem in statistical analysis and machine learning due to limited sample size. In this paper, we propose a new approach to covariance estimation, which is based on constrained maximum likelihood (ML) estimation of the covariance. Specifically, the covariance is constrained to have an eigen decomposition which can be represented as a sparse matrix transform (SMT). The SMT is formed by a product of pairwise coordinate rotations known as Givens rotations. Using this framework, the covariance can be efficiently estimated using greedy minimization of the log likelihood function, and the number of Givens rotations can be efficiently computed using a cross-validation procedure. The estimator obtained using this method is always positive definite and well-conditioned even with limited sample size. Experiments on hyperspectral data show that SMT covariance estimation results in consistently better estimates of the covariance for a variety of different classes and sample sizes compared to traditional shrinkage estimators.
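The greedy procedure described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: at each of K steps it selects the coordinate pair with the largest normalized cross-correlation and applies the Givens rotation that zeroes that off-diagonal entry (a Jacobi-style step); the accumulated rotations approximate the eigenvector matrix, and the remaining diagonal supplies positive eigenvalue estimates. The selection criterion and the choice of K by cross-validation are as the abstract describes; the helper name `smt_covariance` and the small eigenvalue floor are assumptions.

```python
import numpy as np

def smt_covariance(X, K):
    """Greedy SMT-style covariance estimate from data matrix X (n x p).

    Sketch of the idea: constrain the eigendecomposition to a product of
    K Givens rotations, chosen greedily from the sample covariance.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n          # sample covariance
    E = np.eye(p)              # accumulated product of Givens rotations
    for _ in range(K):
        # Greedy selection: pair (i, j) maximizing S_ij^2 / (S_ii * S_jj).
        C = S**2 / np.outer(np.diag(S), np.diag(S))
        np.fill_diagonal(C, -np.inf)
        i, j = np.unravel_index(np.argmax(C), C.shape)
        # Givens rotation angle that zeroes S_ij (tan 2θ = 2 S_ij / (S_ii - S_jj)).
        theta = 0.5 * np.arctan2(2.0 * S[i, j], S[i, i] - S[j, j])
        G = np.eye(p)
        c, s = np.cos(theta), np.sin(theta)
        G[i, i], G[j, j] = c, c
        G[i, j], G[j, i] = -s, s
        S = G.T @ S @ G
        E = E @ G
    lam = np.clip(np.diag(S), 1e-12, None)  # eigenvalue estimates, kept positive
    return E @ np.diag(lam) @ E.T           # always symmetric positive definite
```

For p = 2 a single rotation diagonalizes the sample covariance exactly, so the estimate coincides with it; for larger p and small K the constraint regularizes the estimate while guaranteeing positive definiteness, which is the property the abstract emphasizes.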