AITopics | Ding, Rui

Plotting

Ding, Rui

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Shen, Xuan, Ma, Weize, Liu, Jing, Yang, Changdi, Ding, Rui, Wang, Quanyi, Ding, Henghui, Niu, Wei, Wang, Yanzhi, Zhao, Pu, Lin, Jun, Gu, Jiuxiang

arXiv.org Artificial IntelligenceMar-20-2025

Monocular Depth Estimation (MDE) has emerged as a pivotal task in computer vision, supporting numerous real-world applications. However, deploying accurate depth estimation models on resource-limited edge devices, especially Application-Specific Integrated Circuits (ASICs), is challenging due to the high computational and memory demands. Recent advancements in foundational depth estimation deliver impressive results but further amplify the difficulty of deployment on ASICs. To address this, we propose QuartDepth which adopts post-training quantization to quantize MDE models with hardware accelerations for ASICs. Our approach involves quantizing both weights and activations to 4-bit precision, reducing the model size and computation cost. To mitigate the performance degradation, we introduce activation polishing and compensation algorithm applied before and after activation quantization, as well as a weight reconstruction method for minimizing errors in weight quantization. Furthermore, we design a flexible and programmable hardware accelerator by supporting kernel fusion and customized instruction programmability, enhancing throughput and efficiency. Experimental results demonstrate that our framework achieves competitive accuracy while enabling fast inference and higher energy efficiency on ASICs, bridging the gap between high-performance depth estimation and practical edge-device applicability. Code: https://github.com/shawnricecake/quart-depth

artificial intelligence, machine learning, quantization, (19 more...)

arXiv.org Artificial Intelligence

2503.16709

Country:

Asia (0.67)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (0.46)
Semiconductors & Electronics (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning

Zhang, Jiaru, Ding, Rui, Fu, Qiang, Huang, Bojun, Deng, Zizhen, Hua, Yang, Guan, Haibing, Han, Shi, Zhang, Dongmei

arXiv.org Artificial IntelligenceFeb-15-2025

Causal discovery is a structured prediction task that aims to predict causal relations among variables based on their data samples. Supervised Causal Learning (SCL) is an emerging paradigm in this field. Existing Deep Neural Network (DNN)-based methods commonly adopt the "Node-Edge approach", in which the model first computes an embedding vector for each variable-node, then uses these variable-wise representations to concurrently and independently predict for each directed causal-edge. In this paper, we first show that this architecture has some systematic bias that cannot be mitigated regardless of model size and data size. We then propose SiCL, a DNN-based SCL method that predicts a skeleton matrix together with a v-tensor (a third-order tensor representing the v-structures). According to the Markov Equivalence Class (MEC) theory, both the skeleton and the v-structures are identifiable causal structures under the canonical MEC setting, so predictions about skeleton and v-structures do not suffer from the identifiability limit in causal discovery, thus SiCL can avoid the systematic bias in Node-Edge architecture, and enable consistent estimators for causal discovery. Moreover, SiCL is also equipped with a specially designed pairwise encoder module with a unidirectional attention layer to model both internal and external relationships of pairs of nodes. Experimental results on both synthetic and real-world benchmarks show that SiCL significantly outperforms other DNN-based SCL approaches.

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.10883

Country: Asia (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Data-and-Semantic Dual-Driven Spectrum Map Construction for 6G Spectrum Management

Liu, Jiayu, Zhou, Fuhui, Liu, Xiaodong, Ding, Rui, Yuan, Lu, Wu, Qihui

arXiv.org Artificial IntelligenceJan-22-2025

Spectrum maps reflect the utilization and distribution of spectrum resources in the electromagnetic environment, serving as an effective approach to support spectrum management. However, the construction of spectrum maps in urban environments is challenging because of high-density connection and complex terrain. Moreover, the existing spectrum map construction methods are typically applied to a fixed frequency, which cannot cover the entire frequency band. To address the aforementioned challenges, a UNet-based data-and-semantic dual-driven method is proposed by introducing the semantic knowledge of binary city maps and binary sampling location maps to enhance the accuracy of spectrum map construction in complex urban environments with dense communications. Moreover, a joint frequency-space reasoning model is exploited to capture the correlation of spectrum data in terms of space and frequency, enabling the realization of complete spectrum map construction without sampling all frequencies of spectrum data. The simulation results demonstrate that the proposed method can infer the spectrum utilization status of missing frequencies and improve the completeness of the spectrum map construction. Furthermore, the accuracy of spectrum map construction achieved by the proposed data-and-semantic dual-driven method outperforms the benchmark schemes, especially in scenarios with low sampling density.

artificial intelligence, machine learning, spectrum map, (17 more...)

arXiv.org Artificial Intelligence

2501.12853

Country: Asia > China > Jiangsu Province (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Telecommunications (0.61)
Information Technology > Networks (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers

Ding, Rui, Yong, Liang, Zhao, Sihuan, Nie, Jing, Chen, Lihui, Liu, Haijun, Zhou, Xichuan

arXiv.org Artificial IntelligenceDec-19-2024

Due to its efficiency, Post-Training Quantization (PTQ) has been widely adopted for compressing Vision Transformers (ViTs). However, when quantized into low-bit representations, there is often a significant performance drop compared to their full-precision counterparts. To address this issue, reconstruction methods have been incorporated into the PTQ framework to improve performance in low-bit quantization settings. Nevertheless, existing related methods predefine the reconstruction granularity and seldom explore the progressive relationships between different reconstruction granularities, which leads to sub-optimal quantization results in ViTs. To this end, in this paper, we propose a Progressive Fine-to-Coarse Reconstruction (PFCR) method for accurate PTQ, which significantly improves the performance of low-bit quantized vision transformers. Specifically, we define multi-head self-attention and multi-layer perceptron modules along with their shortcuts as the finest reconstruction units. After reconstructing these two fine-grained units, we combine them to form coarser blocks and reconstruct them at a coarser granularity level. We iteratively perform this combination and reconstruction process, achieving progressive fine-to-coarse reconstruction. Additionally, we introduce a Progressive Optimization Strategy (POS) for PFCR to alleviate the difficulty of training, thereby further enhancing model performance. Experimental results on the ImageNet dataset demonstrate that our proposed method achieves the best Top-1 accuracy among state-of-the-art methods, particularly attaining 75.61% for 3-bit quantized ViT-B in PTQ. Besides, quantization results on the COCO dataset reveal the effectiveness and generalization of our proposed method on other computer vision tasks like object detection and instance segmentation.

artificial intelligence, machine learning, quantization, (18 more...)

arXiv.org Artificial Intelligence

2412.14633

Country:

Asia > China (0.14)
Europe > Switzerland (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

Relation Learning and Aggregate-attention for Multi-person Motion Prediction

Qu, Kehua, Ding, Rui, Tang, Jin

arXiv.org Artificial IntelligenceNov-6-2024

Multi-person motion prediction is an emerging and intricate task with broad real-world applications. Unlike single person motion prediction, it considers not just the skeleton structures or human trajectories but also the interactions between others. Previous methods use various networks to achieve impressive predictions but often overlook that the joints relations within an individual (intra-relation) and interactions among groups (inter-relation) are distinct types of representations. These methods often lack explicit representation of inter&intra-relations, and inevitably introduce undesired dependencies. To address this issue, we introduce a new collaborative framework for multi-person motion prediction that explicitly modeling these relations:a GCN-based network for intra-relations and a novel reasoning network for inter-relations.Moreover, we propose a novel plug-and-play aggregation module called the Interaction Aggregation Module (IAM), which employs an aggregate-attention mechanism to seamlessly integrate these relations. Experiments indicate that the module can also be applied to other dual-path models. Extensive experiments on the 3DPW, 3DPW-RC, CMU-Mocap, MuPoTS-3D, as well as synthesized datasets Mix1 & Mix2 (9 to 15 persons), demonstrate that our method achieves state-of-the-art performance.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

2411.03729

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

Add feedback

UnityGraph: Unified Learning of Spatio-temporal features for Multi-person Motion Prediction

Qu, Kehua, Ding, Rui, Tang, Jin

arXiv.org Artificial IntelligenceNov-6-2024

Multi-person motion prediction is a complex and emerging field with significant real-world applications. Current state-of-the-art methods typically adopt dual-path networks to separately modeling spatial features and temporal features. However, the uncertain compatibility of the two networks brings a challenge for spatio-temporal features fusion and violate the spatio-temporal coherence and coupling of human motions by nature. To address this issue, we propose a novel graph structure, UnityGraph, which treats spatio-temporal features as a whole, enhancing model coherence and coupling.spatio-temporal features as a whole, enhancing model coherence and coupling. Specifically, UnityGraph is a hypervariate graph based network. The flexibility of the hypergraph allows us to consider the observed motions as graph nodes. We then leverage hyperedges to bridge these nodes for exploring spatio-temporal features. This perspective considers spatio-temporal dynamics unitedly and reformulates multi-person motion prediction into a problem on a single graph. Leveraging the dynamic message passing based on this hypergraph, our model dynamically learns from both types of relations to generate targeted messages that reflect the relevance among nodes. Extensive experiments on several datasets demonstrates that our method achieves state-of-the-art performance, confirming its effectiveness and innovative design.

artificial intelligence, machine learning, prediction, (19 more...)

arXiv.org Artificial Intelligence

2411.04151

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Control the GNN: Utilizing Neural Controller with Lyapunov Stability for Test-Time Feature Reconstruction

Yang, Jielong, Ding, Rui, Ji, Feng, Wang, Hongbin, Xie, Linbo

arXiv.org Artificial IntelligenceOct-12-2024

The performance of graph neural networks (GNNs) is susceptible to discrepancies between training and testing sample distributions. Prior studies have attempted to enhance GNN performance by reconstructing node features during the testing phase without modifying the model parameters. However, these approaches lack theoretical analysis of the proximity between predictions and ground truth at test time. In this paper, we propose a novel node feature reconstruction method grounded in Lyapunov stability theory. Specifically, we model the GNN as a control system during the testing phase, considering node features as control variables. A neural controller that adheres to the Lyapunov stability criterion is then employed to reconstruct these node features, ensuring that the predictions progressively approach the ground truth at test time. We validate the effectiveness of our approach through extensive experiments across multiple datasets, demonstrating significant performance improvements.

artificial intelligence, machine learning, node feature, (17 more...)

arXiv.org Artificial Intelligence

2410.09708

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Add feedback

Scalable Differentiable Causal Discovery in the Presence of Latent Confounders with Skeleton Posterior (Extended Version)

Ma, Pingchuan, Ding, Rui, Fu, Qiang, Zhang, Jiaru, Wang, Shuai, Han, Shi, Zhang, Dongmei

arXiv.org Machine LearningJun-15-2024

Differentiable causal discovery has made significant advancements in the learning of directed acyclic graphs. However, its application to real-world datasets remains restricted due to the ubiquity of latent confounders and the requirement to learn maximal ancestral graphs (MAGs). To date, existing differentiable MAG learning algorithms have been limited to small datasets and failed to scale to larger ones (e.g., with more than 50 variables). The key insight in this paper is that the causal skeleton, which is the undirected version of the causal graph, has potential for improving accuracy and reducing the search space of the optimization procedure, thereby enhancing the performance of differentiable causal discovery. Therefore, we seek to address a two-fold challenge to harness the potential of the causal skeleton for differentiable causal discovery in the presence of latent confounders: (1) scalable and accurate estimation of skeleton and (2) universal integration of skeleton estimation with differentiable causal discovery. To this end, we propose SPOT (Skeleton Posterior-guided OpTimization), a two-phase framework that harnesses skeleton posterior for differentiable causal discovery in the presence of latent confounders. On the contrary to a ``point-estimation'', SPOT seeks to estimate the posterior distribution of skeletons given the dataset. It first formulates the posterior inference as an instance of amortized inference problem and concretizes it with a supervised causal learning (SCL)-enabled solution to estimate the skeleton posterior. To incorporate the skeleton posterior with differentiable causal discovery, SPOT then features a skeleton posterior-guided stochastic optimization procedure to guide the optimization of MAGs. [abridged due to length limit]

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

2406.10537

Country: Asia > China (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(2 more...)

Add feedback

Enhancing Text Authenticity: A Novel Hybrid Approach for AI-Generated Text Detection

Zhang, Ye, Leng, Qian, Zhu, Mengran, Ding, Rui, Wu, Yue, Song, Jintong, Gong, Yulu

arXiv.org Artificial IntelligenceJun-1-2024

The rapid advancement of Large Language Models (LLMs) has ushered in an era where AI-generated text is increasingly indistinguishable from human-generated content. Detecting AI-generated text has become imperative to combat misinformation, ensure content authenticity, and safeguard against malicious uses of AI. In this paper, we propose a novel hybrid approach that combines traditional TF-IDF techniques with advanced machine learning models, including Bayesian classifiers, Stochastic Gradient Descent (SGD), Categorical Gradient Boosting (CatBoost), and 12 instances of Deberta-v3-large models. Our approach aims to address the challenges associated with detecting AI-generated text by leveraging the strengths of both traditional feature extraction methods and state-of-the-art deep learning models. Through extensive experiments on a comprehensive dataset, we demonstrate the effectiveness of our proposed method in accurately distinguishing between human and AI-generated text. Our approach achieves superior performance compared to existing methods. This research contributes to the advancement of AI-generated text detection techniques and lays the foundation for developing robust solutions to mitigate the challenges posed by AI-generated content.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2406.06558

Country: North America > United States (0.48)

Genre: Research Report (1.00)

Industry:

Media > News (0.49)
Health & Medicine (0.48)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries

He, Xinyi, Zhou, Mengyu, Xu, Xinrun, Ma, Xiaojun, Ding, Rui, Du, Lun, Gao, Yan, Jia, Ran, Chen, Xu, Han, Shi, Yuan, Zejian, Zhang, Dongmei

arXiv.org Artificial IntelligenceDec-21-2023

Tabular data analysis is crucial in various fields, and large language models show promise in this area. However, current research mostly focuses on rudimentary tasks like Text2SQL and TableQA, neglecting advanced analysis like forecasting and chart generation. To address this gap, we developed the Text2Analysis benchmark, incorporating advanced analysis tasks that go beyond the SQL-compatible operations and require more in-depth analysis. We also develop five innovative and effective annotation methods, harnessing the capabilities of large language models to enhance data quality and quantity. Additionally, we include unclear queries that resemble real-world user questions to test how well models can understand and tackle such challenges. Finally, we collect 2249 query-result pairs with 347 tables. We evaluate five state-of-the-art models using three different metrics and the results show that our benchmark presents introduces considerable challenge in the field of tabular data analysis, paving the way for more advanced research opportunities.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2312.13671

Country:

Asia > China (0.28)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Transportation (0.32)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback