Jiang, Zihan
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation
Sun, Lin, Zhao, Guangxiang, Jian, Xiaoqi, Wu, Yuhan, Lin, Weihong, Zhu, Yongfu, Jia, Change, Zhang, Linglin, Wu, Jinzhu, Ran, Junfeng, Hu, Sai-er, Jiang, Zihan, Zhou, Junting, Liu, Wenrui, Cui, Bin, Yang, Tong, Zhang, Xiangzheng
The challenge of reducing the size of Large Language Models (LLMs) while maintaining their performance has gained significant attention. However, existing methods, such as model distillation and transfer learning, often fail to achieve high accuracy. To address this limitation, we introduce the Branch-Merge distillation approach, which enhances model compression through two phases: (1) the Branch Phase, where knowledge from a large teacher model is \textit{selectively distilled} into specialized student models via domain-specific supervised fine-tuning (SFT); and (2) the Merge Phase, where these student models are merged to enable cross-domain knowledge transfer and improve generalization. We validate our distillation approach using DeepSeek-R1 as the teacher and DeepSeek-R1-Distill-Qwen-32B as the student. The resulting merged model, TinyR1-32B-Preview, outperforms its counterpart DeepSeek-R1-Distill-Qwen-32B across multiple benchmarks, including Mathematics (+5.5 points), Coding (+4.4 points), and Science (+2.9 points), while achieving near-equal performance to DeepSeek-R1 on AIME 2024. The Branch-Merge distillation approach provides a scalable solution for creating smaller, high-performing LLMs with reduced computational cost and time.
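As a hedged illustration of the Merge Phase only, the sketch below combines same-architecture, domain-specialized student checkpoints by weighted parameter averaging; the function name and the uniform weighting are illustrative assumptions, not the paper's exact merging recipe.

    import torch

    def merge_checkpoints(state_dicts, coeffs=None):
        """Merge same-architecture checkpoints by weighted parameter averaging."""
        coeffs = coeffs or [1.0 / len(state_dicts)] * len(state_dicts)
        merged = {}
        for name in state_dicts[0]:
            # Weighted sum of the corresponding tensor from each domain expert.
            merged[name] = sum(c * sd[name].float() for c, sd in zip(coeffs, state_dicts))
        return merged

    # e.g., merged = merge_checkpoints([math_sd, code_sd, science_sd])  # hypothetical names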
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
Chen, Shimao, Liu, Zirui, Wu, Zhiying, Zheng, Ce, Cong, Peizhuang, Jiang, Zihan, Wu, Yuhan, Su, Lei, Yang, Tong
As the foundation of large language models (LLMs), the self-attention module faces the challenge of quadratic time and memory complexity with respect to sequence length. FlashAttention accelerates attention computation and reduces its memory usage by leveraging the GPU memory hierarchy. A promising research direction is to integrate FlashAttention with quantization methods. This paper introduces INT-FlashAttention, the first INT8 quantization architecture compatible with the forward workflow of FlashAttention, which significantly improves the inference speed of FlashAttention on Ampere GPUs. We implement our INT-FlashAttention prototype with fully INT8 activations and general matrix-multiplication (GEMM) kernels, making it the first attention operator with fully INT8 input. As a general token-level post-training quantization framework, INT-FlashAttention is also compatible with other data formats such as INT4. Experimental results show that INT-FlashAttention achieves 72% faster inference speed and 82% smaller quantization error compared with standard FlashAttention using FP16 and FP8 data formats, respectively.
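The paper's GPU kernels are not reproduced here, but a minimal sketch of the token-level post-training quantization they build on may help: each token (row) of an activation matrix receives its own INT8 scale. The function name is an illustrative assumption.

    import torch

    def quantize_per_token(x: torch.Tensor):
        """INT8-quantize each token (row) of x with a per-token scale."""
        scale = x.abs().amax(dim=-1, keepdim=True).clamp_min(1e-8) / 127.0
        q = torch.round(x / scale).clamp(-127, 127).to(torch.int8)
        return q, scale  # recover values approximately via q.float() * scale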
MCNS: Mining Causal Natural Structures Inside Time Series via A Novel Internal Causality Scheme
Liu, Yuanhao, Du, Dehui, Jiang, Zihan, Huang, Anyan, Li, Yiyang
Causal inference allows us to discover hidden relationships among variables in time series. In most existing work, however, these variables are the dimensions of the series, and causality between dimensions can be superficial, which limits both our understanding of the internal relationships and the benefit a causal graph can bring to neural networks (NNs). In this paper, we observe that causality exists not only across but also inside time series, because a series reflects a succession of events in the real world. This inspires us to seek relationships between internal subsequences. The challenges are discovering causality from subsequences and exploiting the resulting causal natural structures to improve NNs. To address these challenges, we propose a novel framework called Mining Causal Natural Structure (MCNS), which is automatic and domain-agnostic and finds the causal natural structures inside time series via an internal causality scheme. We evaluate the MCNS framework, and NNs infused with the structures it mines, on time series classification tasks. Experimental results show that our approach, by refining attention, selecting shapes for classification, and pruning datasets, brings NNs, and even the data itself, better accuracy and interpretability. Besides, MCNS provides an in-depth, solid summary of the time series and datasets.
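As a hedged illustration of subsequence-level (internal) causality, the sketch below scores how often occurrences of one pattern precede occurrences of another within a fixed lag; it conveys the intuition only and is not the MCNS algorithm itself. All names are illustrative.

    import numpy as np

    def occurrences(series, pattern, tol=0.5):
        """Indices where a sliding window of len(pattern) closely matches pattern."""
        w = len(pattern)
        return [i for i in range(len(series) - w + 1)
                if np.linalg.norm(series[i:i + w] - pattern) < tol]

    def precedence_score(series, pat_a, pat_b, lag=10):
        """Fraction of pat_b occurrences preceded by a pat_a occurrence within lag steps."""
        occ_a, occ_b = occurrences(series, pat_a), occurrences(series, pat_b)
        if not occ_b:
            return 0.0
        hits = sum(any(0 < j - i <= lag for i in occ_a) for j in occ_b)
        return hits / len(occ_b)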
CMLCompiler: A Unified Compiler for Classical Machine Learning
Wen, Xu, Gao, Wanling, Li, Anzheng, Wang, Lei, Jiang, Zihan, Zhan, Jianfeng
Classical machine learning (CML) occupies nearly half of the machine learning pipelines in production applications. Unfortunately, it fails to fully utilize state-of-the-practice devices and performs poorly. Without a unified framework, hybrid deployments of deep learning (DL) and CML also suffer from severe performance and portability issues. This paper presents the design of a unified compiler, called CMLCompiler, for CML inference. We propose two unified abstractions: operator representations and extended computational graphs. The CMLCompiler framework performs conversion and graph optimization based on these two abstractions, then outputs an optimized computational graph to DL compilers or frameworks. We implement CMLCompiler on TVM. The evaluation shows CMLCompiler's portability and superior performance: it achieves up to 4.38$\times$ speedup on CPU, 3.31$\times$ speedup on GPU, and 5.09$\times$ speedup on IoT devices, compared to the state-of-the-art solutions -- scikit-learn, Intel sklearn, and hummingbird. Our mixed CML and DL pipelines achieve up to 3.04$\times$ speedup compared with cross-framework implementations. The project documents and source code are available at https://www.computercouncil.org/cmlcompiler.
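To make the unified abstractions concrete, here is a hedged, much-simplified sketch of lowering one CML operator -- a fitted multiclass linear classifier -- to tensor operations that a DL compiler or framework can then optimize; it illustrates the idea and is not CMLCompiler's actual operator representation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def lower_linear_classifier(model: LogisticRegression):
        """Lower a fitted multiclass linear model to a GEMM + argmax graph."""
        W, b = model.coef_.T, model.intercept_      # learned parameters as tensors
        def compiled_predict(X):
            return np.argmax(X @ W + b, axis=1)     # GEMM -> bias add -> argmax
        return compiled_predict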
OpenClinicalAI: enabling AI to diagnose diseases in real-world clinical settings
Huang, Yunyou, Wang, Nana, Tang, Suqin, Ma, Li, Hao, Tianshu, Jiang, Zihan, Zhang, Fan, Kang, Guoxin, Miao, Xiuxia, Guan, Xianglong, Zhang, Ruchang, Zhang, Zhifei, Zhan, Jianfeng
This paper quantitatively reveals that state-of-the-art and state-of-the-practice AI systems achieve acceptable performance only under the stringent condition that all categories of subjects are known in advance, which we call the closed clinical setting, and that they fail to work in real-world clinical settings. Compared to the diagnosis task in the closed setting, real-world clinical settings pose severe challenges and must be treated differently. We build a clinical AI benchmark named Clinical AIBench that sets up real-world clinical settings to facilitate research. We propose an open, dynamic machine learning framework and develop an AI system named OpenClinicalAI to diagnose diseases in real-world clinical settings. The first versions of Clinical AIBench and OpenClinicalAI target Alzheimer's disease. In the real-world clinical setting, OpenClinicalAI significantly outperforms the state-of-the-art AI system. In addition, OpenClinicalAI develops personalized diagnosis strategies to avoid unnecessary testing and seamlessly collaborates with clinicians. It is therefore a promising candidate for embedding in current medical systems to improve medical services.
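One way to picture the open setting is that the system must be able to return "unknown" for subjects outside its known categories instead of forcing a known label. The threshold rule below is a hedged, illustrative stand-in, not OpenClinicalAI's actual diagnosis strategy.

    import numpy as np

    def open_set_predict(probs: np.ndarray, threshold: float = 0.8):
        """Predict a known class only when confident; otherwise flag 'unknown'."""
        labels, conf = probs.argmax(axis=1), probs.max(axis=1)
        return [int(l) if c >= threshold else "unknown"
                for l, c in zip(labels, conf)]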
AIBench Training: Balanced Industry-Standard AI Training Benchmarking
Tang, Fei, Gao, Wanling, Zhan, Jianfeng, Lan, Chuanxin, Wen, Xu, Wang, Lei, Luo, Chunjie, Dai, Jiahui, Cao, Zheng, Xiong, Xingwang, Jiang, Zihan, Hao, Tianshu, Fan, Fanda, Zhang, Fan, Huang, Yunyou, Chen, Jianan, Du, Mengjia, Ren, Rui, Zheng, Chen, Zheng, Daoyi, Tang, Haoning, Zhan, Kunlin, Wang, Biao, Kong, Defei, Yu, Minghe, Tan, Chongkang, Li, Huan, Tian, Xinhui, Li, Yatao, Lu, Gang, Shao, Junchao, Wang, Zhenyu, Wang, Xiaoyu, Ye, Hainan
Early-stage evaluations of a new AI architecture or system need affordable AI benchmarks, while using a few AI component benchmarks alone in the later stages may lead to misleading conclusions. This paper proposes a balanced benchmarking methodology. Based on an exhaustive survey of Internet-service AI domains, we identify and implement seventeen representative AI tasks with state-of-the-art models to guarantee the diversity and representativeness of the benchmarks. Meanwhile, we keep the benchmark subset to a minimum for affordability. Together with seventeen industry partners, we contribute by far the most comprehensive AI training benchmark suite. The evaluations show: (1) AIBench Training outperforms MLPerf Training in terms of the diversity and representativeness of model complexity, computational cost, convergence rate, computation and memory access patterns, and hotspot functions; (2) with respect to the full AIBench benchmarks, the subset shortens the benchmarking cost by 54% while maintaining the primary workload characteristics; (3) the performance ranking shows that a single-purpose AI accelerator such as the TPU, with an optimized TensorFlow framework, performs better than GPUs, while losing the latter's general support for a variety of AI models. The AIBench Training specifications, source code, testbed, and performance numbers are publicly available from the website http://www.benchcouncil.org/AIBench/index.html.
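As a hedged sketch of how a minimal yet representative subset might be chosen, the code below clusters the full suite on workload-characteristic features and keeps the task nearest each centroid; the feature set and number of clusters are assumptions for illustration, not AIBench's published procedure.

    import numpy as np
    from sklearn.cluster import KMeans

    def select_subset(features: np.ndarray, k: int = 5):
        """features: one row of workload metrics per benchmark task."""
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
        subset = []
        for c in range(k):
            members = np.where(km.labels_ == c)[0]
            dist = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
            subset.append(int(members[dist.argmin()]))   # keep the most central task
        return sorted(subset)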
HPC AI500: A Benchmark Suite for HPC AI Systems
Jiang, Zihan, Gao, Wanling, Wang, Lei, Xiong, Xingwang, Zhang, Yuchen, Wen, Xu, Luo, Chunjie, Ye, Hainan, Zhang, Yunquan, Feng, Shengzhong, Li, Kenli, Xu, Weijia, Zhan, Jianfeng
In recent years, with the trend of applying deep learning (DL) to high-performance scientific computing, the unique characteristics of emerging DL workloads in HPC pose great challenges in designing and implementing HPC AI systems. The community needs a new yardstick for evaluating future HPC systems. In this paper, we propose HPC AI500 --- a benchmark suite for evaluating HPC systems that run scientific DL workloads. Covering the most representative scientific fields, each workload in HPC AI500 is based on real-world scientific DL applications. Currently, we choose 14 scientific DL benchmarks, selected from the perspectives of application scenarios, datasets, and software stacks. We propose a set of metrics for comprehensively evaluating HPC AI systems, considering accuracy and performance as well as power and cost. We provide a scalable reference implementation of HPC AI500. HPC AI500 is part of the open-source AIBench project; the specification and source code are publicly available from \url{http://www.benchcouncil.org/AIBench/index.html}.
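The combined metrics are not spelled out above; as a hedged sketch in their spirit, one accuracy-penalized throughput measure scales raw FLOPS by how close a run comes to a target accuracy. The functional form and exponent are illustrative assumptions, not HPC AI500's definition.

    def valid_flops(flops: float, achieved_acc: float, target_acc: float, n: int = 5):
        """Penalize raw FLOPS by the shortfall from the target accuracy."""
        return flops * (achieved_acc / target_acc) ** n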