AITopics | Wang, Xiaotian

Wang, Xiaotian


Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs

Wu, Qizhe, Liang, Huawen, Gui, Yuchen, Zeng, Zhichen, He, Zerong, Tao, Linfeng, Wang, Xiaotian, Zhao, Letian, Zeng, Zhaoxi, Yuan, Wei, Wu, Wei, Jin, Xi

arXiv.org Artificial Intelligence, Mar-8-2025

General matrix-matrix multiplication (GEMM) is a cornerstone of AI computations, making tensor processing engines (TPEs) increasingly critical in GPUs and domain-specific architectures. Existing architectures primarily optimize dataflow or operand reuse strategies. However, considering the interaction between matrix multiplication and multiply-accumulators (MACs) offers greater optimization potential. This work introduces a novel hardware perspective on matrix multiplication, focusing on the bit-weight dimension of MACs. We propose a finer-grained TPE notation using matrix triple loops as an example, introducing new methods for designing and optimizing PE microarchitectures. Based on this notation and its transformations, we propose four optimization techniques that improve timing, area, and power consumption. Implementing our design in RTL using the SMIC-28nm process, we evaluate its effectiveness across four classic TPE architectures: systolic array, 3D-Cube, multiplier-adder tree, and 2D-Matrix. Our techniques achieve area efficiency improvements of 1.27x, 1.28x, 1.56x, and 1.44x, and energy efficiency gains of 1.04x, 1.56x, 1.49x, and 1.20x, respectively. Applied to a bit-slice architecture, our approach achieves a 12.10x improvement in energy efficiency and 2.85x in area efficiency compared to Laconic. Our Verilog HDL code, along with timing, area, and power reports, is available at https://github.com/wqzustc/High-Performance-Tensor-Processing-Engines
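To make the "bit-weight dimension" in the abstract above concrete, here is a minimal Python sketch. It is not the authors' notation or RTL. It assumes unsigned integer operands and uses hypothetical function names. It first writes GEMM as the usual triple loop, then expands each multiplication over the bits of the left operand and moves that bit loop outside the reduction over `k`, so partial products with the same bit weight are summed before a single shift.

```python
def gemm_triple_loop(A, B):
    """Reference GEMM: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C


def gemm_bit_weight(A, B, bits=8):
    """Same result with an explicit bit-weight loop.

    Assumption: every entry of A is an unsigned integer below 2**bits.
    The loop over bit position b sits outside the loop over k, so all
    partial products of weight 2**b are accumulated first and shifted once.
    """
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for b in range(bits):
                same_weight_sum = 0
                for k in range(K):
                    if (A[i][k] >> b) & 1:      # bit b of the left operand
                        same_weight_sum += B[k][j]
                C[i][j] += same_weight_sum << b  # apply weight 2**b once
    return C
```

Both functions return the same matrix for in-range inputs. The second form only illustrates that the loop order over bit weights is a free design choice; the abstract does not state which transformations the paper actually applies.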

dimension, machine learning, natural language, (19 more...)

2503.06342

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Semiconductors & Electronics (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Architecture (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Adaptive Progressive Attention Graph Neural Network for EEG Emotion Recognition

Feng, Tianzhi, Wu, Chennan, Niu, Yi, Li, Fu, Fu, Boxun, Zhao, Zhifu, Wang, Xiaotian, Shi, Guangming

arXiv.org Artificial Intelligence, Jan-24-2025

In recent years, numerous neuroscientific studies have shown that human emotions are closely linked to specific brain regions, with these regions exhibiting variability across individuals and emotional states. To fully leverage these neural patterns, we propose an Adaptive Progressive Attention Graph Neural Network (APAGNN), which dynamically captures the spatial relationships among brain regions during emotional processing. The APAGNN employs three specialized experts that progressively analyze brain topology. The first expert captures global brain patterns, the second focuses on region-specific features, and the third examines emotion-related channels. This hierarchical approach enables increasingly refined analysis of neural activity. Additionally, a weight generator integrates the outputs of all three experts, balancing their contributions to produce the final predictive label. Extensive experiments on three publicly available datasets (SEED, SEED-IV and MPED) demonstrate that the proposed method enhances EEG emotion recognition performance, achieving superior results compared to baseline methods.
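The fusion step described in the abstract above, where a weight generator balances three expert outputs, can be sketched as follows. This is an illustrative assumption rather than the APAGNN implementation: the abstract does not say how the weights are produced or normalized. The sketch assumes softmax-normalized weights and a weighted sum of per-class scores, and the names `softmax` and `fuse_experts` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_experts(expert_scores, gate_logits):
    """Combine per-class scores from several experts.

    expert_scores: one list of class scores per expert (assumed inputs).
    gate_logits:   one unnormalized weight per expert, standing in for
                   the output of a weight generator (assumption).
    Returns (fused class scores, predicted label index, expert weights).
    """
    weights = softmax(gate_logits)
    n_classes = len(expert_scores[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, expert_scores))
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=fused.__getitem__)
    return fused, label, weights
```

With equal gate logits the three experts contribute equally. Raising one logit shifts the prediction toward that expert's scores.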

artificial intelligence, deep learning, machine learning, (13 more...)

2501.14246

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Optimizing the Passenger Flow for Airport Security Check

Wang, Yuxin, Meng, Fanfei, Wang, Xiaotian, Xie, Chaoyu

arXiv.org Artificial Intelligence, Dec-13-2023

For the security of airports and flights, passengers must pass a strict security check before boarding. However, passengers frequently complain about the large amount of time spent waiting for the security check. This paper presents a potential solution for optimizing gate setup procedures, tailored specifically to Chicago O'Hare International Airport. Using queueing theory and Monte Carlo simulations, we propose an approach that significantly reduces the average waiting time to a more manageable level. Additionally, our study examines and identifies the factors that influence this optimization and assesses their impact.
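As a rough illustration of how queueing theory and Monte Carlo simulation combine for this kind of problem, the sketch below simulates a first-come, first-served checkpoint with several identical lanes. It is not the paper's model. The exponential arrival and service times, the rates, and all function names are assumptions chosen only for the example.

```python
import heapq
import random


def simulate_waits(arrivals, services, lanes):
    """Waiting time of each passenger at a first-come, first-served checkpoint.

    arrivals: nondecreasing arrival times; services: matching service times.
    A min-heap tracks when each lane next becomes free.
    """
    free_at = [0.0] * lanes
    heapq.heapify(free_at)
    waits = []
    for arrival, service in zip(arrivals, services):
        lane_free = heapq.heappop(free_at)   # earliest available lane
        start = max(arrival, lane_free)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service)
    return waits


def sample_passengers(n, arrival_rate, service_rate, seed=0):
    """Draw n passengers with exponential inter-arrival and service times."""
    rng = random.Random(seed)
    clock = 0.0
    arrivals, services = [], []
    for _ in range(n):
        clock += rng.expovariate(arrival_rate)
        arrivals.append(clock)
        services.append(rng.expovariate(service_rate))
    return arrivals, services


def average_wait(lanes, n=10000, arrival_rate=5.0, service_rate=2.0, seed=0):
    """Monte Carlo estimate of the mean wait; the rates are illustrative only."""
    arrivals, services = sample_passengers(n, arrival_rate, service_rate, seed)
    waits = simulate_waits(arrivals, services, lanes)
    return sum(waits) / len(waits)
```

Calling `average_wait(lanes)` for a range of lane counts with the same seed gives a simple way to see how the estimated wait responds to opening more lanes.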

artificial intelligence, machine learning, security check, (16 more...)

2312.05259

Country:

Asia (0.69)
North America > United States > Illinois > Cook County > Chicago (0.25)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation > Infrastructure & Services > Airport (1.00)
Transportation > Air (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

BaPipe: Exploration of Balanced Pipeline Parallelism for DNN Training

Zhao, Letian, Xu, Rui, Wang, Tianqi, Tian, Teng, Wang, Xiaotian, Wu, Wei, Ieong, Chio-in, Jin, Xi

arXiv.org Artificial Intelligence, Jan-14-2021

The size of deep neural networks (DNNs) grows rapidly as the complexity of machine learning algorithms increases. To satisfy the computation and memory requirements of DNN training, distributed deep learning based on model parallelism has been widely recognized. We propose a new pipeline parallelism training framework, BaPipe, which can automatically explore pipeline parallelism training methods and balanced partition strategies for distributed DNN training. In BaPipe, each accelerator calculates the forward propagation and backward propagation of a different part of the network to implement the intra-batch pipeline parallelism strategy. BaPipe uses a new automatic load-balancing exploration strategy that considers the parameters of DNN models and the computation, memory, and communication resources of accelerator clusters. We have trained different DNNs such as VGG-16, ResNet-50, and GNMT on GPU clusters and simulated the performance of different FPGA clusters. Compared with state-of-the-art data parallelism and pipeline parallelism frameworks, BaPipe provides up to 3.2x speedup and 4x memory reduction on various platforms.
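The idea of a balanced partition can be illustrated with a small dynamic program that splits a chain of layers into contiguous pipeline stages so that the slowest stage is as fast as possible. This is a simplified sketch, not BaPipe's exploration strategy: it assumes a single hypothetical cost per layer and ignores the memory and communication terms the abstract mentions. The function name `balanced_partition` is made up for the example.

```python
def balanced_partition(layer_costs, stages):
    """Split layers into contiguous stages, minimizing the largest stage cost.

    layer_costs: per-layer cost estimates (hypothetical, e.g. milliseconds).
    stages:      number of pipeline stages, at most len(layer_costs).
    Returns (bottleneck cost, list of (start, end) half-open layer ranges).
    """
    n = len(layer_costs)
    if not 1 <= stages <= n:
        raise ValueError("stages must be between 1 and the number of layers")

    prefix = [0]
    for cost in layer_costs:
        prefix.append(prefix[-1] + cost)

    inf = float("inf")
    # best[s][i]: smallest bottleneck when the first i layers use s stages.
    best = [[inf] * (n + 1) for _ in range(stages + 1)]
    cut = [[0] * (n + 1) for _ in range(stages + 1)]
    best[0][0] = 0
    for s in range(1, stages + 1):
        for i in range(s, n + 1):
            for j in range(s - 1, i):           # last stage covers layers j..i-1
                candidate = max(best[s - 1][j], prefix[i] - prefix[j])
                if candidate < best[s][i]:
                    best[s][i] = candidate
                    cut[s][i] = j

    # Walk the recorded cut points backwards to recover the stage boundaries.
    bounds = []
    end = n
    for s in range(stages, 0, -1):
        start = cut[s][end]
        bounds.append((start, end))
        end = start
    bounds.reverse()
    return best[stages][n], bounds
```

Each returned range could be assigned to one accelerator. A fuller model would add per-stage memory limits and the cost of sending activations between neighboring stages.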

accelerator, deep learning, neural network, (17 more...)

2012.12544

Country: Asia > China (0.29)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Midstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
