AITopics | Zhang, Xiaoping

Plotting

Zhang, Xiaoping

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

PLPHP: Per-Layer Per-Head Vision Token Pruning for Efficient Large Vision-Language Models

Meng, Yu, Li, Kaiyuan, Huang, Chenran, Gao, Chen, Chen, Xinlei, Li, Yong, Zhang, Xiaoping

arXiv.org Artificial IntelligenceFeb-20-2025

Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across a range of multimodal tasks. However, their inference efficiency is constrained by the large number of visual tokens processed during decoding. To address this challenge, we propose Per-Layer Per-Head Vision Token Pruning (PLPHP), a two-level fine-grained pruning method including Layer-Level Retention Rate Allocation and Head-Level Vision Token Pruning. Motivated by the Vision Token Re-attention phenomenon across decoder layers, we dynamically adjust token retention rates layer by layer. Layers that exhibit stronger attention to visual information preserve more vision tokens, while layers with lower vision attention are aggressively pruned. Furthermore, PLPHP applies pruning at the attention head level, enabling different heads within the same layer to independently retain critical context. Experiments on multiple benchmarks demonstrate that PLPHP delivers an 18% faster decoding speed and reduces the Key-Value Cache (KV Cache) size by over 50%, all at the cost of 0.46% average performance drop, while also achieving notable performance improvements in multi-image tasks. These results highlight the effectiveness of fine-grained token pruning and contribute to advancing the efficiency and scalability of LVLMs. Our source code will be made publicly available.

artificial intelligence, natural language, vision token, (16 more...)

arXiv.org Artificial Intelligence

2502.14504

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Learning Green's Functions of Linear Reaction-Diffusion Equations with Application to Fast Numerical Solver

Teng, Yuankai, Zhang, Xiaoping, Wang, Zhu, Ju, Lili

arXiv.org Machine LearningMay-23-2021

Partial differential equations are often used to model various physical phenomena, such as heat diffusion, wave propagation, fluid dynamics, elasticity, electrodynamics and image processing, and many analytic approaches or traditional numerical methods have been developed and widely used for their solutions. Inspired by rapidly growing impact of deep learning on scientific and engineering research, in this paper we propose a novel neural network, GF-Net, for learning the Green's functions of linear reaction-diffusion equations in an unsupervised fashion. The proposed method overcomes the challenges for finding the Green's functions of the equations on arbitrary domains by utilizing physics-informed approach and the symmetry of the Green's function. As a consequence, it particularly leads to an efficient way for solving the target equations under different boundary conditions and sources. We also demonstrate the effectiveness of the proposed approach by experiments in square, annular and L-shape domains.

deep learning, equation, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

2105.11045

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback