AITopics | Ji, Jingwei

Collaborating Authors

Ji, Jingwei

EMMA: End-to-End Multimodal Model for Autonomous Driving

Hwang, Jyh-Jing, Xu, Runsheng, Lin, Hubert, Hung, Wei-Chih, Ji, Jingwei, Choi, Kristy, Huang, Di, He, Tong, Covington, Paul, Sapp, Benjamin, Zhou, Yin, Guo, James, Anguelov, Dragomir, Tan, Mingxing

arXiv.org Artificial Intelligence, Nov-4-2024

We introduce EMMA, an End-to-end Multimodal Model for Autonomous driving. Built on a multimodal large language model foundation, EMMA directly maps raw camera sensor data into various driving-specific outputs, including planner trajectories, perception objects, and road graph elements. EMMA maximizes the utility of world knowledge from the pre-trained large language models by representing all non-sensor inputs (e.g., navigation instructions and ego vehicle status) and outputs (e.g., trajectories and 3D locations) as natural language text. This approach allows EMMA to jointly process various driving tasks in a unified language space and to generate the outputs for each task using task-specific prompts. Empirically, we demonstrate EMMA's effectiveness by achieving state-of-the-art performance in motion planning on nuScenes as well as competitive results on the Waymo Open Motion Dataset (WOMD). EMMA also yields competitive results for camera-primary 3D object detection on the Waymo Open Dataset (WOD). We show that co-training EMMA with planner trajectories, object detection, and road graph tasks yields improvements across all three domains, highlighting EMMA's potential as a generalist model for autonomous driving applications. However, EMMA also exhibits certain limitations: it can process only a small number of image frames, does not incorporate accurate 3D sensing modalities like LiDAR or radar, and is computationally expensive. We hope that our results will inspire further research to mitigate these issues and to further evolve the state of the art in autonomous driving model architectures.
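To make the idea of representing trajectories and non-sensor inputs as natural language text concrete, here is a minimal sketch. It is not EMMA's actual format: the waypoint syntax, the prompt wording, and the names trajectory_to_text, text_to_trajectory, and build_prompt are all assumptions chosen for illustration.

```python
import re

# Hypothetical text encoding: each waypoint becomes "(x, y)" with fixed precision.
# The abstract does not specify EMMA's real serialization; this is an assumption.
def trajectory_to_text(waypoints, precision=2):
    """Serialize a list of (x, y) waypoints into a plain-text string."""
    return " ".join(f"({x:.{precision}f}, {y:.{precision}f})" for x, y in waypoints)


def text_to_trajectory(text):
    """Parse "(x, y)" pairs back into a list of float tuples."""
    number = r"-?\d+(?:\.\d+)?"
    pairs = re.findall(rf"\(\s*({number})\s*,\s*({number})\s*\)", text)
    return [(float(x), float(y)) for x, y in pairs]


def build_prompt(task, navigation, speed_mps):
    """Assemble a hypothetical task-specific prompt from non-sensor inputs."""
    return (
        f"Task: {task}\n"
        f"Navigation instruction: {navigation}\n"
        f"Ego speed: {speed_mps:.1f} m/s"
    )


if __name__ == "__main__":
    encoded = trajectory_to_text([(1.0, 2.5), (3.25, -0.5)])
    print(build_prompt("motion planning", "turn left", 4.2))
    print(encoded)
    print(text_to_trajectory(encoded))
```

Because inputs and outputs share one text space in this sketch, switching tasks only means changing the prompt string, which mirrors the unified language space the abstract describes.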

large language model, machine learning, trajectory, (18 more...)

arXiv: 2410.23262

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)


Multi-Task Dynamic Pricing in Credit Market with Contextual Information

Javanmard, Adel, Ji, Jingwei, Xu, Renyuan

arXiv.org Artificial Intelligence, Oct-25-2024

We study the dynamic pricing problem faced by a broker that buys and sells a large number of financial securities in the credit market, such as corporate bonds, government bonds, loans, and other credit-related securities. One challenge in pricing these securities is their infrequent trading, which leads to insufficient data for individual pricing. However, many of these securities share structural features that can be utilized. Building on this, we propose a multi-task dynamic pricing framework that leverages these shared structures across securities, enhancing pricing accuracy through learning. In our framework, a security is fully characterized by a $d$-dimensional contextual/feature vector. The customer will buy the security from (sell the security to) the broker if the broker quotes a price lower (higher) than that of the competitors. We assume a linear contextual model for the competitor's pricing, with parameters that are unknown a priori. The parameters for pricing different securities may or may not be similar to each other. The broker's objective is to minimize the expected regret, namely, the expected revenue loss against a clairvoyant policy that knows the parameters of the competitor's pricing model. We show that the regret of our policy is better than both the policy that treats each security individually and the policy that treats all securities as the same. Moreover, the regret is bounded by $\tilde{O}(\delta_{\max} \sqrt{T M d} + M d)$, where $M$ is the number of securities and $\delta_{\max}$ characterizes the overall dissimilarity across securities in the basket.
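The quoting rule and the regret bound above can be illustrated with a small sketch. It assumes the simplest reading of the abstract: the competitor's price is an inner product of an unknown parameter vector with the security's feature vector plus noise, and $T$ is taken to be the time horizon, which the abstract does not define. The function names are hypothetical, and regret_bound_scale drops the constants and logarithmic factors hidden by the $\tilde{O}$ notation.

```python
import math

def competitor_price(theta, x, noise=0.0):
    """Linear contextual model (assumed form): price = <theta, x> + noise."""
    return sum(t * xi for t, xi in zip(theta, x)) + noise


def customer_buys(broker_quote, competitor_quote):
    """The customer buys from the broker only if the broker quotes lower."""
    return broker_quote < competitor_quote


def customer_sells(broker_quote, competitor_quote):
    """The customer sells to the broker only if the broker quotes higher."""
    return broker_quote > competitor_quote


def regret_bound_scale(delta_max, T, M, d):
    """Leading terms of the stated bound: delta_max * sqrt(T*M*d) + M*d.

    Constants and log factors are omitted, so this is a scale, not a guarantee.
    """
    return delta_max * math.sqrt(T * M * d) + M * d


if __name__ == "__main__":
    price = competitor_price([1.0, 2.0], [3.0, 0.5])
    print(price, customer_buys(3.9, price), customer_sells(3.9, price))
    print(regret_bound_scale(0.5, T=100, M=4, d=4))
    print(regret_bound_scale(0.0, T=100, M=4, d=4))
```

When $\delta_{\max}$ is zero (all securities share the same parameters), the first term vanishes and only the $Md$ term remains, which is one way to read the benefit of sharing structure across similar securities.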

artificial intelligence, data mining, machine learning, (19 more...)

arXiv: 2410.14839

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Data Science > Data Mining > Big Data (0.46)


3D Human Keypoints Estimation From Point Clouds in the Wild Without Human Labels

Weng, Zhenzhen, Gorban, Alexander S., Ji, Jingwei, Najibi, Mahyar, Zhou, Yin, Anguelov, Dragomir

arXiv.org Artificial Intelligence, Jun-7-2023

Training a 3D human keypoint detector from point clouds in a supervised manner requires large volumes of high-quality labels. While it is relatively easy to capture large amounts of human point clouds, annotating 3D keypoints is expensive, subjective, error-prone, and especially difficult for long-tail cases (pedestrians with rare poses, scooterists, etc.). In this work, we propose GC-KPL (Geometry Consistency inspired Key Point Learning), an approach for learning 3D human joint locations from point clouds without human labels. We achieve this through novel unsupervised loss formulations that account for the structure and movement of the human body. We show that by training on a large training set from the Waymo Open Dataset without any human-annotated keypoints, we are able to achieve reasonable performance compared with the fully supervised approach. Further, the backbone benefits from the unsupervised training and is useful in downstream few-shot learning of keypoints, where fine-tuning on only 10 percent of the labeled training data gives performance comparable to fine-tuning on the entire set. We demonstrate that GC-KPL outperforms the state of the art by a large margin when trained on the entire dataset and efficiently leverages large volumes of unlabeled data.

artificial intelligence, machine learning, point cloud, (14 more...)

arXiv: 2306.04745

Genre: Research Report (1.00)

Industry: Health & Medicine (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)


Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving

Najibi, Mahyar, Ji, Jingwei, Zhou, Yin, Qi, Charles R., Yan, Xinchen, Ettinger, Scott, Anguelov, Dragomir

arXiv.org Artificial Intelligence, Oct-14-2022

Learning-based perception and prediction modules in modern autonomous driving systems typically rely on expensive human annotation and are designed to perceive only a handful of predefined object categories. This closed-set paradigm is insufficient for the safety-critical autonomous driving task, where the autonomous vehicle needs to process arbitrarily many types of traffic participants and their motion behaviors in a highly dynamic world. To address this difficulty, this paper pioneers a novel and challenging direction, i.e., training perception and prediction models to understand open-set moving objects with no human supervision. Our proposed framework uses self-learned flow to trigger an automated meta-labeling pipeline that provides automatic supervision. 3D detection experiments on the Waymo Open Dataset show that our method significantly outperforms classical unsupervised approaches and is even competitive with its counterpart that uses supervised scene flow. We further show that our approach generates highly promising results in open-set 3D detection and trajectory prediction, confirming its potential in closing the safety gap of fully supervised systems.

artificial intelligence, machine learning, prediction, (19 more...)

arXiv: 2210.08061

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.91)
Information Technology > Robotics & Automation (0.82)
Automobiles & Trucks (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)
