Collaborating Authors

 Li, Guangyu


Global Spatio-Temporal Fusion-based Traffic Prediction Algorithm with Anomaly Aware

arXiv.org Artificial Intelligence

Traffic prediction is an indispensable component of urban planning and traffic management. Achieving accurate traffic prediction hinges on the ability to capture the potential spatio-temporal relationships among road sensors. However, the majority of existing works focus on local short-term spatio-temporal correlations and fail to fully consider the interactions of different sensors in the long-term state. In addition, these works either do not analyze the influence of anomalous factors or lack the ability to extract their personalized features, and therefore cannot effectively capture the spatio-temporal impact of such factors on traffic prediction. To address these issues, we propose a global spatio-temporal fusion-based traffic prediction algorithm that incorporates anomaly awareness. First, based on the designed anomaly detection network, we construct an efficient anomalous factors impacting module (AFIM) to evaluate the spatio-temporal impact of unexpected external events on traffic prediction. Furthermore, we propose a multi-scale spatio-temporal feature fusion module (MTSFFL) based on the transformer architecture to capture both long- and short-term correlations among different sensors in a wide-area traffic environment for accurate prediction of traffic flow. Finally, experiments on the real-scenario public transportation datasets PEMS04 and PEMS08 demonstrate that our approach achieves state-of-the-art performance.
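
To make the fusion idea concrete, below is a minimal sketch (not the authors' code) of a transformer encoder that attends over every sensor/time token at once, which is one way such wide-area long- and short-term correlations could be fused; module names, shapes, and hyperparameters are our assumptions.

```python
# Hedged sketch: global attention over all (sensor, time-step) tokens.
import torch
import torch.nn as nn

class SpatioTemporalFusion(nn.Module):
    def __init__(self, num_sensors, num_steps, in_dim, d_model=64, nhead=4, depth=2):
        super().__init__()
        self.proj = nn.Linear(in_dim, d_model)
        # one learnable embedding per (sensor, time-step) token
        self.pos = nn.Parameter(torch.zeros(1, num_sensors * num_steps, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(d_model, 1)      # predicted flow per token

    def forward(self, x):                      # x: (batch, sensors, steps, features)
        b, n, t, _ = x.shape
        tokens = self.proj(x).reshape(b, n * t, -1) + self.pos
        fused = self.encoder(tokens)           # every sensor/step attends to every other
        return self.head(fused).reshape(b, n, t)
```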


RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation

arXiv.org Artificial Intelligence

Developing robust and general-purpose robotic manipulation policies is a key goal in the field of robotics. To achieve effective generalization, it is essential to construct comprehensive datasets that encompass a large number of demonstration trajectories and diverse tasks. Unlike vision or language data that can be collected from the Internet, robotic datasets require detailed observations and manipulation actions, necessitating significant investment in hardware-software infrastructure and human labor. While existing works have focused on assembling various individual robot datasets, there remains a lack of a unified data collection standard and insufficient diversity in tasks, scenarios, and robot types. In this paper, we introduce RoboMIND (Multi-embodiment Intelligence Normative Data for Robot manipulation), featuring 55k real-world demonstration trajectories across 279 diverse tasks involving 61 different object classes. RoboMIND is collected through human teleoperation and encompasses comprehensive robotic-related information, including multi-view RGB-D images, proprioceptive robot state information, end effector details, and linguistic task descriptions. To ensure dataset consistency and reliability during policy learning, RoboMIND is built on a unified data collection platform and standardized protocol, covering four distinct robotic embodiments. We provide a thorough quantitative and qualitative analysis of RoboMIND across multiple dimensions, offering detailed insights into the diversity of our datasets. In our experiments, we conduct extensive real-world testing with four state-of-the-art imitation learning methods, demonstrating that training with RoboMIND data results in a high manipulation success rate and strong generalization. Our project is at https://x-humanoid-robomind.github.io/.
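
As a rough illustration of the kind of record such a dataset implies, the following is a hypothetical schema (field names are our assumptions, not the released format) for one demonstration step and one trajectory with multi-view RGB-D, proprioception, end-effector state, and a language instruction.

```python
# Hypothetical RoboMIND-style record layout; not the actual release schema.
from dataclasses import dataclass, field
from typing import Dict, List
import numpy as np

@dataclass
class DemoStep:
    rgb: Dict[str, np.ndarray]       # camera name -> (H, W, 3) uint8 image
    depth: Dict[str, np.ndarray]     # camera name -> (H, W) float32 depth map
    joint_positions: np.ndarray      # proprioceptive robot state
    end_effector_pose: np.ndarray    # e.g. xyz + quaternion, shape (7,)
    gripper_open: float              # 0.0 = closed, 1.0 = open

@dataclass
class DemoTrajectory:
    embodiment: str                  # which of the robot types collected it
    task_description: str            # natural-language instruction
    steps: List[DemoStep] = field(default_factory=list)
```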


Design of JiuTian Intelligent Network Simulation Platform

arXiv.org Artificial Intelligence

This paper introduces the JiuTian Intelligent Network Simulation Platform, which provides wireless communication simulation data services for the Open Innovation Platform. The platform contains a series of scalable simulator functionalities, offering open services that enable users to train and run inference with reinforcement learning algorithms based on simulation environments and data. Additionally, it allows users to address optimization tasks in different scenarios by uploading and updating parameter configurations. The platform and its open services are introduced primarily from the perspectives of background, overall architecture, simulator, business scenarios, and future directions.
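
As a purely illustrative sketch of the workflow described above (the SimulatorEnv, RandomAgent, and train names are hypothetical, not the platform's actual API), a user-side reinforcement-learning loop against a parameterized simulation environment could look like this:

```python
# Illustrative only: generic reset/step loop over a configurable simulator stub.
import random

class SimulatorEnv:
    def __init__(self, config: dict):
        self.config = config                 # scenario parameters supplied by the user

    def reset(self):
        return [0.0] * 4                     # placeholder observation

    def step(self, action):
        obs, reward, done = [0.0] * 4, 0.0, True
        return obs, reward, done

class RandomAgent:
    def act(self, obs):
        return random.choice([0, 1])

    def learn(self, obs, action, reward, next_obs, done):
        pass                                 # a real agent would update its policy here

def train(agent, config, episodes=100):
    env = SimulatorEnv(config)
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            action = agent.act(obs)
            next_obs, reward, done = env.step(action)
            agent.learn(obs, action, reward, next_obs, done)
            obs = next_obs

train(RandomAgent(), {"scenario": "coverage_optimization"}, episodes=10)
```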


Point Cloud Registration-Driven Robust Feature Matching for 3D Siamese Object Tracking

arXiv.org Artificial Intelligence

Learning robust feature matching between the template and search area is crucial for 3D Siamese tracking. The core of Siamese feature matching is how to assign high feature similarity to the corresponding points between the template and search area for precise object localization. In this paper, we propose a novel point cloud registration-driven Siamese tracking framework, with the intuition that spatially aligned corresponding points (via 3D registration) tend to achieve consistent feature representations. Specifically, our method consists of two modules: a tracking-specific nonlocal registration module and a registration-aided Sinkhorn template-feature aggregation module. The registration module targets the precise spatial alignment between the template and search area. A tracking-specific spatial distance constraint is proposed to refine the cross-attention weights in the nonlocal module for discriminative feature learning. Then, we use weighted SVD to compute the rigid transformation between the template and search area, and align them to obtain the desired spatially aligned corresponding points. For the feature aggregation module, we formulate the feature matching between the transformed template and search area as an optimal transport problem and utilize Sinkhorn optimization to search for the outlier-robust matching solution. Also, a registration-aided spatial distance map is built to improve the matching robustness in indistinguishable regions (e.g., smooth surfaces). Finally, guided by the obtained feature matching map, we aggregate the target information from the template into the search area to construct the target-specific feature, which is then fed into a CenterPoint-like detection head for object localization. Extensive experiments on the KITTI, NuScenes and Waymo datasets verify the effectiveness of our proposed method.
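
The Sinkhorn step can be illustrated with a short sketch (our assumptions, not the authors' implementation): alternating row and column normalisation of a similarity matrix yields a near doubly-stochastic matching map; the extra "dustbin" row/column often used for outliers is omitted here for brevity.

```python
# Minimal entropic Sinkhorn normalisation of a template/search similarity matrix.
import torch

def sinkhorn_matching(similarity, n_iters=20, eps=0.05):
    # similarity: (M, N) feature similarities between template and search points
    log_p = similarity / eps
    for _ in range(n_iters):
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)  # rows sum to 1
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True)  # columns sum to 1
    return log_p.exp()   # soft matching map used to aggregate template features
```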


Hyperspectral Image Classification With Contrastive Graph Convolutional Network

arXiv.org Artificial Intelligence

Recently, Graph Convolutional Network (GCN) has been widely used in Hyperspectral Image (HSI) classification due to its satisfactory performance. However, the number of labeled pixels is very limited in HSI, and thus the available supervision information is usually insufficient, which will inevitably degrade the representation ability of most existing GCN-based methods. To enhance the feature representation ability, in this paper, a GCN model with contrastive learning is proposed to explore the supervision signals contained in both spectral information and spatial relations, which is termed Contrastive Graph Convolutional Network (ConGCN), for HSI classification. First, in order to mine sufficient supervision signals from spectral information, a semi-supervised contrastive loss function is utilized to maximize the agreement between different views of the same node or the nodes from the same land cover category. Second, to extract the precious yet implicit spatial relations in HSI, a graph generative loss function is leveraged to explore supplementary supervision signals contained in the graph topology. In addition, an adaptive graph augmentation technique is designed to flexibly incorporate the spectral-spatial priors of HSI, which helps facilitate the subsequent contrastive representation learning. The extensive experimental results on four typical benchmark datasets firmly demonstrate the effectiveness of the proposed ConGCN in both qualitative and quantitative aspects.
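
For intuition, a minimal node-level contrastive loss of the kind described above might look as follows (a sketch under our assumptions, not the exact ConGCN objective); labelled nodes of the same land-cover category could additionally be treated as positives in the semi-supervised setting.

```python
# Cross-view InfoNCE-style loss: node i in view 1 is matched to node i in view 2.
import torch
import torch.nn.functional as F

def node_contrastive_loss(z1, z2, temperature=0.5):
    # z1, z2: (N, d) node embeddings from two augmented views of the graph
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature       # (N, N) cross-view similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)  # positive pair sits on the diagonal
```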


Hyperspectral Image Classification With Context-Aware Dynamic Graph Convolutional Network

arXiv.org Machine Learning

In hyperspectral image (HSI) classification, spatial context has demonstrated its significance in achieving promising performance. However, conventional spatial context-based methods simply assume that spatially neighboring pixels should correspond to the same land-cover class, so they often fail to correctly discover the contextual relations among pixels in complex situations, leading to imperfect classification results on irregular or inhomogeneous regions such as class boundaries. To address this deficiency, we develop a new HSI classification method based on the recently proposed Graph Convolutional Network (GCN), as it can flexibly encode the relations among arbitrarily structured non-Euclidean data. Different from traditional GCN, our method adopts two novel strategies to further exploit the contextual relations for accurate HSI classification. First, since the receptive field of traditional GCN is often limited to a fairly small neighborhood, we propose to capture long-range contextual relations in HSI by performing successive graph convolutions on a learned region-induced graph transformed from the original 2D image grid. Second, we refine the graph edge weights and the connective relationships among image regions by learning an improved adjacency matrix and an 'edge filter', so that the graph can be gradually refined to adapt to the representations generated by each graph convolutional layer. Such an updated graph will in turn yield more accurate region representations, and vice versa. Experiments carried out on three real-world benchmark datasets demonstrate that the proposed method yields significant improvements in classification performance compared with state-of-the-art approaches.
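
A minimal sketch of the adaptive-graph idea (our assumptions, not the paper's exact layers): re-estimate the edge weights from the current region representations with a learned 'edge filter', then convolve with the refined adjacency.

```python
# Graph convolution over image regions with a feature-dependent, refined adjacency.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)
        self.edge_mlp = nn.Linear(2 * in_dim, 1)    # learned "edge filter"

    def forward(self, x, base_adj):                 # x: (N, d), base_adj: (N, N)
        n = x.size(0)
        pair = torch.cat([x.unsqueeze(1).expand(n, n, -1),
                          x.unsqueeze(0).expand(n, n, -1)], dim=-1)
        refine = torch.sigmoid(self.edge_mlp(pair)).squeeze(-1)        # (N, N)
        adj = base_adj * refine                      # refined connective relations
        adj = adj / adj.sum(dim=1, keepdim=True).clamp(min=1e-6)       # row-normalise
        return F.relu(adj @ self.weight(x))
```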


D$^2$-City: A Large-Scale Dashcam Video Dataset of Diverse Traffic Scenarios

arXiv.org Machine Learning

Driving datasets accelerate the development of intelligent driving and related computer vision technologies, while substantial and detailed annotations serve as fuel to boost the efficacy of such datasets for improving learning-based models. We propose D$^2$-City, a large-scale comprehensive collection of dashcam videos collected by vehicles on DiDi's platform. D$^2$-City contains more than 10,000 video clips that reflect the diversity and complexity of real-world traffic scenarios in China. We also provide bounding boxes and tracking annotations for 12 classes of objects in all frames of 1,000 videos, and detection annotations on keyframes for the remainder of the videos. Compared with existing datasets, D$^2$-City features data in varying weather, road, and traffic conditions and a large volume of detailed detection and tracking annotations. By bringing a diverse set of challenging cases to the community, we expect the D$^2$-City dataset will advance perception and related areas of intelligent driving.
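
For illustration, one plausible per-frame annotation record (field names are our assumptions, not the released schema) would pair a class label and bounding box with a track identifier that links the same object across frames in the fully annotated videos.

```python
# Hypothetical D^2-City-style annotation record.
from dataclasses import dataclass

@dataclass
class ObjectAnnotation:
    video_id: str
    frame_index: int
    track_id: int        # consistent across frames for tracking annotations
    category: str        # one of the 12 annotated object classes
    x_min: float
    y_min: float
    x_max: float
    y_max: float
```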


Big Data Meet Cyber-Physical Systems: A Panoramic Survey

arXiv.org Machine Learning

The world is witnessing an unprecedented growth of cyber-physical systems (CPS), which are foreseen to revolutionize our world via new services and applications in a variety of sectors such as environmental monitoring, mobile-health systems, and intelligent transportation systems. The information and communication technology (ICT) sector is experiencing significant growth in data traffic, driven by the widespread usage of smartphones, tablets and video streaming, along with the significant growth in sensor deployments anticipated in the near future, which is expected to sharply increase the growth rate of raw sensed data. In this paper, we present a CPS taxonomy via a broad overview of data collection, storage, access, processing and analysis. Compared with other survey papers, this is the first panoramic survey on big data for CPS, where our objective is to provide a panoramic summary of different CPS aspects. Furthermore, CPS require cybersecurity to protect them against malicious attacks and unauthorized intrusion, which becomes a challenge with the enormous amount of data that is continuously being generated in the network. Thus, we also provide an overview of the different security solutions proposed for CPS big data storage, access and analytics. We also discuss how big data meets green challenges in the context of CPS.


On Group Popularity Prediction in Event-Based Social Networks

AAAI Conferences

Although previous work has shown that member and structural features are important to the future popularity of groups in event-based social networks (EBSNs), it is not yet clear how different member roles and the interplay between them contribute to group popularity. In this paper, we study a real-world dataset from Meetup --- a popular EBSN platform --- and propose a deep neural network based method to predict the popularity of new Meetup groups. Our method uses group-level features specific to event-based social networks, such as the time and location of events in a group, as well as structural features internal to a group, such as inferred member roles and social substructures among members. Empirically, our approach reduces the RMSE of the popularity prediction (measured in RSVPs) of a group's future events by up to 12% against state-of-the-art baselines.
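
A minimal sketch of such a predictor (our assumptions, not the authors' architecture) is a small feed-forward regressor over concatenated group-level and structural feature vectors, trained to predict the RSVP count of a group's future events.

```python
# Hedged sketch: MLP regressor over group-level + structural features.
import torch
import torch.nn as nn

class GroupPopularityNet(nn.Module):
    def __init__(self, group_feat_dim, struct_feat_dim, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(group_feat_dim + struct_feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),            # predicted RSVP count
        )

    def forward(self, group_feats, struct_feats):
        return self.mlp(torch.cat([group_feats, struct_feats], dim=-1)).squeeze(-1)
```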


Tensor Regression Meets Gaussian Processes

arXiv.org Machine Learning

Low-rank tensor regression, a new model class that learns high-order correlations from data, has recently received considerable attention. At the same time, Gaussian processes (GP) are well-studied machine learning models for structure learning. In this paper, we demonstrate interesting connections between the two, especially for multi-way data analysis. We show that low-rank tensor regression essentially learns a multi-linear kernel in Gaussian processes, and the low-rank assumption translates to a constrained Bayesian inference problem. We prove an oracle inequality and derive the average-case learning curve for the equivalent GP model. Our findings imply that low-rank tensor regression, though empirically successful, is highly dependent on the eigenvalues of the covariance functions as well as on variable correlations.
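
The regression side of this correspondence can be sketched with a rank-R CP parameterisation (our notation and assumptions): the prediction is the inner product between the input tensor and a weight tensor built from mode-wise factors, and the GP view treats the induced covariance of this function as a multi-linear kernel.

```python
# Minimal CP-structured (low-rank) tensor regression prediction f(X) = <W, X>.
import numpy as np

def cp_weight_tensor(factors):
    # factors: list of (R, I_d) arrays, one per tensor mode
    R = factors[0].shape[0]
    dims = [f.shape[1] for f in factors]
    W = np.zeros(dims)
    for r in range(R):
        outer = factors[0][r]
        for f in factors[1:]:
            outer = np.multiply.outer(outer, f[r])   # rank-one component
        W += outer
    return W

def predict(X, factors):
    return float(np.sum(cp_weight_tensor(factors) * X))
```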