Collaborating Authors

Shen, Xipeng


SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile

arXiv.org Artificial Intelligence

This work is motivated by recent developments in Deep Neural Networks, particularly the Transformer architectures underlying applications such as ChatGPT, and the need for performing inference on mobile devices. Focusing on emerging transformers (specifically the ones with computationally efficient Swin-like architectures) and large transformer-based models (e.g., Stable Diffusion and LLMs), we observe that layout transformations between the computational operators cause a significant slowdown in these applications. This paper presents SmartMem, a comprehensive framework for eliminating most layout transformations, based on the idea that multiple operators can use the same tensor layout through careful choice of layout and implementation of operations. Our approach classifies the operators into four groups and considers combinations of producer-consumer edges between the operators. We develop a set of methods for searching such layouts. Another component of our work is developing efficient memory layouts for the 2.5-dimensional memory commonly seen in mobile devices. Our experimental results show that SmartMem outperforms five state-of-the-art DNN execution frameworks on mobile devices across 18 varied neural networks, including CNNs, Transformers with both local and global attention, and LLMs. In particular, compared to DNNFusion, SmartMem achieves an average speedup of 2.8×, and it outperforms TVM and MNN with average speedups of 6.9× and 7.9×, respectively.
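
As a rough illustration of the elimination idea (not SmartMem's actual algorithm), the sketch below drops a layout-transform node on a producer-consumer edge whenever the consumer's kernel can read the producer's output layout directly. The operator classes, layout names, and graph structure are all hypothetical.

    from dataclasses import dataclass, field

    @dataclass
    class Op:
        name: str
        out_layout: str                               # e.g. "NCHW" or "NHWC"
        in_layouts: set = field(default_factory=set)  # layouts the kernel can consume

    def eliminate_transforms(edges):
        """edges: (producer, transform_or_None, consumer) triples of a DNN graph."""
        optimized = []
        for producer, transform, consumer in edges:
            if transform and producer.out_layout in consumer.in_layouts:
                optimized.append((producer, None, consumer))   # transform eliminated
            else:
                optimized.append((producer, transform, consumer))
        return optimized

    conv = Op("conv", "NCHW", {"NCHW"})
    attn = Op("attention", "NHWC", {"NHWC", "NCHW"})
    print(eliminate_transforms([(conv, "layout_transform", attn)]))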


Efficient Large Language Models Fine-Tuning On Graphs

arXiv.org Artificial Intelligence

Learning from Text-Attributed Graphs (TAGs) has attracted significant attention due to its wide range of real-world applications. The rapid evolution of large language models (LLMs) has revolutionized the way we process textual data, which indicates a strong potential to replace the shallow text embeddings generally used in Graph Neural Networks (GNNs). However, we find that existing LLM approaches that exploit text information in graphs suffer from inferior computation and data efficiency. In this work, we introduce a novel and efficient approach for the end-to-end fine-tuning of Large Language Models (LLMs) on TAGs, named LEADING. The proposed approach maintains computation cost and memory overhead comparable to the graph-less fine-tuning of LLMs. Moreover, it transfers the rich knowledge in LLMs to downstream graph learning tasks effectively with limited labeled data in semi-supervised learning. Its superior computation and data efficiency are demonstrated through comprehensive experiments, offering a promising solution for a wide range of LLMs and graph learning tasks on TAGs. Graph neural networks (GNNs) have been widely used for representation learning on graph-structured data (Hamilton, 2020; Ma & Tang, 2021), and they achieve promising state-of-the-art performance on various graph learning tasks, such as node classification, link prediction, and graph classification.
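
The sketch below only illustrates the general setup the abstract refers to: node texts are encoded (here with a mocked stand-in for an LLM encoder) and the resulting features are passed through a single mean-aggregation GNN layer. It is not the LEADING fine-tuning pipeline, and every name in it is hypothetical.

    import numpy as np

    def llm_encode(texts, dim=8):
        # Placeholder for an LLM text encoder: deterministic pseudo-embeddings.
        rng = np.random.default_rng(abs(hash(tuple(texts))) % (2**32))
        return rng.normal(size=(len(texts), dim))

    def gnn_layer(features, adjacency, weight):
        # Mean aggregation over neighbors (self-loops included), then a linear map.
        degree = adjacency.sum(axis=1, keepdims=True)
        aggregated = adjacency @ features / np.maximum(degree, 1)
        return np.tanh(aggregated @ weight)

    texts = ["paper on GNNs", "paper on LLMs", "survey of graph learning"]
    adjacency = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=float)
    features = llm_encode(texts)
    weight = np.random.default_rng(0).normal(size=(features.shape[1], 4))
    print(gnn_layer(features, adjacency, weight).shape)   # (3, 4) node representations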


Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

arXiv.org Artificial Intelligence

Programming Language Processing (PLP) using machine learning has made vast improvements in the past few years. More and more people are interested in exploring this promising field. However, it is challenging for new researchers and developers to find the right components to construct their own machine learning pipelines, given the diverse PLP tasks to be solved, the large number of datasets and models being released, and the set of complex compilers or tools involved. To improve the findability, accessibility, interoperability, and reusability (FAIRness) of machine learning components, we collect and analyze a set of representative papers in the domain of machine learning-based PLP. We then identify and characterize key concepts, including PLP tasks, model architectures, and supportive tools. Finally, we show some example use cases of leveraging the reusable components to construct machine learning pipelines to solve a set of PLP tasks.


BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs

arXiv.org Artificial Intelligence

Recent studies have shown that Binary Graph Neural Networks (GNNs) are promising for saving computation in GNNs through binarized tensors. Prior work, however, mainly focused on algorithm designs or training techniques, leaving open how to fully realize the performance potential on accelerator hardware. This work redesigns the binary GNN inference backend from the efficiency perspective. It fills the gap by proposing a series of abstractions and techniques that map binary GNNs and their computations onto GPUs in a way that best fits the nature of bit manipulation. Results on real-world graphs with GCNs, GraphSAGE, and GraphSAINT show that the proposed techniques outperform state-of-the-art binary GNN implementations by 8-22× with the same accuracy maintained. The BitGNN code is publicly available.
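
The kernel-level trick that makes binarized tensors fast is replacing floating-point dot products with XNOR and popcount over bit-packed vectors. The pure-Python sketch below shows that equivalence for {-1, +1} vectors; it is only illustrative of the idea, not of BitGNN's GPU abstractions.

    def pack_bits(vec):
        # Map +1 -> 1 and -1 -> 0, then pack into a single integer word.
        word = 0
        for i, v in enumerate(vec):
            if v > 0:
                word |= 1 << i
        return word

    def binary_dot(a_bits, b_bits, n):
        # XNOR counts matching bits; dot product = matches - mismatches.
        matches = bin(~(a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
        return 2 * matches - n

    a = [+1, -1, +1, +1, -1, -1, +1, -1]
    b = [+1, +1, +1, -1, -1, -1, -1, -1]
    assert binary_dot(pack_bits(a), pack_bits(b), len(a)) == sum(x * y for x, y in zip(a, b))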


Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card

arXiv.org Artificial Intelligence

Autonomous driving is of great interest in both research and industry. The high cost has been one of the major roadblocks slowing down the development and adoption of autonomous driving in practice. This paper shows, for the first time, that it is possible to run level-4 (i.e., fully autonomous driving) software on a single off-the-shelf card (Jetson AGX Xavier) costing less than $1k, an order of magnitude less than state-of-the-art systems, while meeting all latency requirements. The success comes from resolving, through a series of measures and innovations, several important issues shared by existing practices. The study overturns common perceptions of the computing resources required by level-4 autonomous driving, points out a promising path for the industry to lower the cost, and suggests a number of research opportunities for rethinking the architecture, software design, and optimizations of autonomous driving.


Coarsening Optimization for Differentiable Programming

arXiv.org Artificial Intelligence

A program written with differentiable programming can be differentiated automatically. The differentiation results can then be used for gradient-based optimization (e.g., gradient descent) of the parameters in the program. Differentiable programming has been used in scientific computing, physics simulations, and other domains to help mitigate the burden of manual, error-prone coding of derivative computations. Recent years have witnessed growing interest in differentiable programming in machine learning (ML) [11, 34] and probabilistic programming [30], to accommodate the needs of customized ML operators, user-defined operations in the learning targets (e.g., the physical environment of reinforcement learning), and statistical sampling. The key technique in differentiable programming is automatic differentiation. For a program P that produces output y from given input values X, automatic differentiation computes the derivatives ∂y/∂x for each x ∈ X without requiring users to write the differentiation code. The given program P is called the primal code, and x is called an active input variable. Existing approaches to automatic differentiation fall into two categories: (i) symbolic differentiation, which uses expression manipulation in computer algebra systems, and (ii) algorithmic differentiation, which performs a non-standard interpretation of a given computer program by replacing the domain of the variables to incorporate derivative values and redefining the semantics of the operators to propagate derivatives per the chain rule of differential calculus (elaborated in Section 2). Symbolic differentiation has been commonly regarded as inappropriate for differentiable programming, for several reasons: (i) it results in complex and cryptic expressions plagued with the problem of "expression swell" [5].
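
To make the "non-standard interpretation" concrete, here is a minimal forward-mode (dual-number) sketch: arithmetic operators are redefined to carry (value, derivative) pairs and propagate derivatives via the chain rule. It illustrates generic algorithmic differentiation, not the paper's coarsening optimization; the primal function is an arbitrary example.

    import math

    class Dual:
        def __init__(self, val, dot=0.0):
            self.val, self.dot = val, dot          # value and derivative w.r.t. x
        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val + other.val, self.dot + other.dot)
        __radd__ = __add__
        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val * other.val,
                        self.dot * other.val + self.val * other.dot)  # product rule
        __rmul__ = __mul__

    def sin(d):
        return Dual(math.sin(d.val), math.cos(d.val) * d.dot)   # chain rule

    def primal(x):
        # The primal code: y = x * sin(x) + 3x
        return x * sin(x) + 3 * x

    x = Dual(2.0, 1.0)           # active input variable, seeded with dx/dx = 1
    y = primal(x)
    print(y.val, y.dot)          # y and dy/dx evaluated at x = 2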


Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device

arXiv.org Artificial Intelligence

3D object detection is an important task, especially in the autonomous driving application domain. However, it is challenging to achieve real-time performance with the limited computation and memory resources on the edge-computing devices in self-driving cars. To this end, we propose a compiler-aware unified framework incorporating network enhancement and pruning search with reinforcement learning techniques, to enable real-time inference of 3D object detection on resource-limited edge-computing devices. Specifically, a generator Recurrent Neural Network (RNN) is employed to provide a unified scheme for both network enhancement and pruning search automatically, without human expertise and assistance. The evaluated performance of the unified schemes is then fed back to train the generator RNN. The experimental results demonstrate that the proposed framework is the first to achieve real-time 3D object detection on a mobile device (a Samsung Galaxy S20 phone) with competitive detection performance.
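
As a very rough sketch of the feedback loop described above, the code below replaces the generator RNN and on-device measurements with a simple proposal distribution and a mocked reward, just to show how evaluated schemes can bias a generator toward better per-layer pruning ratios. Every number and reward term here is hypothetical, not the paper's formulation.

    import random

    LAYERS, RATIOS = 4, [0.0, 0.25, 0.5, 0.75]
    # Per-layer probabilities over candidate pruning ratios (uniform at first).
    probs = [[1.0 / len(RATIOS)] * len(RATIOS) for _ in range(LAYERS)]

    def evaluate(scheme):
        # Mocked reward: pruning cuts latency but hurts accuracy past a point.
        latency_gain = sum(scheme) / LAYERS
        accuracy_drop = sum(max(0.0, r - 0.5) for r in scheme)
        return latency_gain - 2.0 * accuracy_drop

    for _ in range(200):
        picks = [random.choices(range(len(RATIOS)), weights=p)[0] for p in probs]
        scheme = [RATIOS[i] for i in picks]
        reward = evaluate(scheme)
        for layer, i in enumerate(picks):            # reward-weighted feedback
            for j in range(len(RATIOS)):
                target = 1.0 if j == i else 0.0
                probs[layer][j] += 0.05 * max(reward, 0.0) * (target - probs[layer][j])

    best = [RATIOS[p.index(max(p))] for p in probs]
    print("selected per-layer pruning ratios:", best)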


In-Place Zero-Space Memory Protection for CNN

Neural Information Processing Systems

Convolutional Neural Networks (CNNs) are being actively explored for safety-critical applications such as autonomous vehicles and aerospace, where it is essential to ensure the reliability of inference results in the presence of possible memory faults. Traditional methods such as error correction codes (ECC) and Triple Modular Redundancy (TMR) are CNN-oblivious and incur substantial memory overhead and energy cost. This paper introduces in-place zero-space ECC, assisted by a new training scheme called weight-distribution-oriented training. The new method provides the first known zero-space-cost memory protection for CNNs without compromising the reliability offered by traditional ECC.
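
A conceptual sketch of the in-place, zero-space idea (not the paper's exact encoding): if training constrains each quantized weight so that one bit per byte is effectively unused, that spare bit can hold a check bit for the remaining bits, giving fault detection with no extra storage. The parity scheme below is a simplified stand-in for the paper's ECC.

    def parity(value, nbits):
        return bin(value & ((1 << nbits) - 1)).count("1") & 1

    def encode(weight7):
        # weight7: a weight already constrained to 7 bits by training.
        assert 0 <= weight7 < 128
        return weight7 | (parity(weight7, 7) << 7)   # check bit stored in the spare MSB

    def check(byte):
        payload, stored = byte & 0x7F, byte >> 7
        return stored == parity(payload, 7)          # False if a bit flip is detected

    w = encode(0b0101101)
    assert check(w)
    assert not check(w ^ 0b00000100)   # a single-bit memory fault is detected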


How to "DODGE" Complex Software Analytics?

arXiv.org Artificial Intelligence

AI software is still software. Software engineers need better tools to make better use of AI software. For example, for software defect prediction and software text mining, the default tunings for software analytics tools can be improved with "hyperparameter optimization" tools that decide (for example) how many trees are needed in a random forest. Hyperparameter optimization is unnecessarily slow when optimizers waste time exploring redundant options (i.e., pairs of tunings with indistinguishable results). By ignoring redundant tunings, the DODGE(ε) hyperparameter optimization tool can run orders of magnitude faster, yet still find better tunings than prior state-of-the-art algorithms (for software defect prediction and software text mining).
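
A simplified sketch of the core idea, under the assumption that "ignoring redundant tunings" means down-weighting options whose results land within ε of something already seen; the tuning space and scoring function below are toy stand-ins, not the tool's actual interface.

    import random

    def dodge_like_search(options, evaluate, budget=30, epsilon=0.05):
        option_weights = {o: 1.0 for o in options}
        seen_scores, best = [], None
        for _ in range(budget):
            choice = random.choices(list(option_weights),
                                    weights=list(option_weights.values()))[0]
            score = evaluate(choice)
            redundant = any(abs(score - s) < epsilon for s in seen_scores)
            option_weights[choice] *= 0.5 if redundant else 2.0   # deprecate or endorse
            seen_scores.append(score)
            if best is None or score > best[1]:
                best = (choice, score)
        return best

    # Toy tuning space: number of trees in a random forest, scored by a mocked metric.
    trees = [10, 20, 40, 80, 160]
    print("best tuning:", dodge_like_search(
        trees, lambda n: min(1.0, 0.6 + 0.001 * n) + random.gauss(0, 0.01)))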