AITopics | pcformer

pcformer

2432eb0ddcf3bb630b5bcf96ca7e592d-Paper-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 13:35:24 GMT

coefficient, pcformer, proc, (16 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Liaoning Province (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method

Liu, Xinyu, Li, Bei, Liu, Jiahao, Ruan, Junhao, Jiao, Kechen, Tang, Hongyin, Wang, Jingang, Tong, Xiao, Zhu, Jingbo

arXiv.org Artificial Intelligence, Oct-14-2025

High-order numerical methods enhance Transformer performance in tasks like NLP and CV, but introduce a performance-efficiency trade-off due to increased computational overhead. Our analysis reveals that conventional efficiency techniques, such as distillation, can be detrimental to the performance of these models, exemplified by PCformer. To explore more optimizable ODE-based Transformer architectures, we propose the Iterative Implicit Euler Transformer (IIET), which simplifies high-order methods using an iterative implicit Euler approach. This simplification not only leads to superior performance but also facilitates model compression compared to PCformer. To enhance inference efficiency, we introduce Iteration Influence-Aware Distillation (IIAD). Through a flexible threshold, IIAD allows users to effectively balance the performance-efficiency trade-off. On lm-evaluation-harness, IIET boosts average accuracy by 2.65% over vanilla Transformers and 0.8% over PCformer. Its efficient variant, E-IIET, significantly cuts inference overhead by 55% while retaining 99.4% of the original task accuracy. Moreover, the most efficient IIET variant achieves an average performance gain exceeding 1.6% over vanilla Transformer with comparable speed.
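As a rough illustration of the numerical idea named in the abstract, the sketch below solves one implicit Euler step, y_next = y + h * f(y_next), by fixed-point iteration on a scalar ODE. It is a textbook toy under stated assumptions, not the IIET architecture: the function names, the explicit Euler initial guess, and the iteration count are all hypothetical choices made for this example.

```python
def implicit_euler_step(f, y, h, iterations=3):
    """One implicit Euler step, y_next = y + h * f(y_next).

    The implicit equation is solved approximately by fixed-point
    iteration. The iteration count is an illustrative assumption.
    """
    y_next = y + h * f(y)  # explicit Euler step as the initial guess
    for _ in range(iterations):
        # Re-evaluate f at the current estimate of y_next.
        y_next = y + h * f(y_next)
    return y_next


def integrate(f, y0, h, steps, iterations=3):
    """Apply the iterated implicit Euler step `steps` times from y0."""
    y = y0
    for _ in range(steps):
        y = implicit_euler_step(f, y, h, iterations)
    return y
```

Calling integrate(lambda y: -y, 1.0, 0.1, 10) integrates dy/dt = -y from t = 0 to t = 1. Raising the iterations argument moves each step closer to the exact implicit Euler solution at the cost of more evaluations of f, which is a small-scale analogue of the performance-efficiency trade-off the abstract describes.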

iiet, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.22463

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning

Li, Bei

Neural Information Processing Systems, Oct-9-2025, 21:04:23 GMT

On the WMT'14 English-German and English-French tasks, our model achieved BLEU scores of 30.95 and 44.27, respectively.

coefficient, pcformer, proc, (16 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Liaoning Province (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning

Li, Bei, Zheng, Tong, Wang, Rui, Liu, Jiahao, Guo, Qingyan, Guo, Junliang, Tan, Xu, Xiao, Tong, Zhu, Jingbo, Wang, Jingang, Cai, Xunliang

arXiv.org Artificial Intelligence, Nov-5-2024

Residual networks, as discrete approximations of Ordinary Differential Equations (ODEs), have inspired significant advancements in neural network design, including multistep methods, high-order methods, and multi-particle dynamical systems. The precision of the solution to ODEs significantly affects parameter optimization, thereby impacting model performance. In this work, we present a series of advanced explorations of Transformer architecture design to minimize the error compared to the true "solution." First, we introduce a predictor-corrector learning framework to minimize truncation errors, which consists of a high-order predictor and a multistep corrector. Second, we propose an exponential moving average-based coefficient learning method to strengthen our higher-order predictor. Extensive experiments on large-scale machine translation, abstractive summarization, language modeling, and natural language understanding benchmarks demonstrate the superiority of our approach. On the WMT'14 English-German and English-French tasks, our model achieved BLEU scores of 30.95 and 44.27, respectively. Furthermore, on the OPUS multilingual machine translation task, our model surpasses a robust 3.8B DeepNet by an average of 2.9 SacreBLEU, using only one third of the parameters. Notably, it also beats LLaMA models by 5.7 accuracy points on the LM Harness Evaluation.
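The predictor-corrector idea mentioned in the abstract can be sketched on a scalar ODE. The example below is not the paper's model: it swaps in a textbook explicit midpoint (second-order Runge-Kutta) predictor and a trapezoidal corrector, omits the exponential moving average coefficient learning entirely, and uses hypothetical function names chosen for this illustration.

```python
import math


def predictor_corrector_step(f, y, h):
    """One predictor-corrector step for the autonomous ODE dy/dt = f(y)."""
    # Predictor: explicit midpoint rule gives a first estimate of y(t + h).
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    y_pred = y + h * k2
    # Corrector: trapezoidal rule, using the slope at the predicted point.
    return y + 0.5 * h * (f(y) + f(y_pred))


def integrate(f, y0, h, steps):
    """Apply the predictor-corrector step `steps` times from y0."""
    y = y0
    for _ in range(steps):
        y = predictor_corrector_step(f, y, h)
    return y


def euler_integrate(f, y0, h, steps):
    """Plain explicit Euler baseline, analogous to a residual update y + F(y)."""
    y = y0
    for _ in range(steps):
        y = y + h * f(y)
    return y


def decay(y):
    """Right-hand side of dy/dt = -y, whose exact solution is exp(-t)."""
    return -y


exact = math.exp(-1.0)
pc_error = abs(integrate(decay, 1.0, 0.1, 10) - exact)
euler_error = abs(euler_integrate(decay, 1.0, 0.1, 10) - exact)
```

On this toy problem pc_error comes out smaller than euler_error, which mirrors the abstract's motivation: a more precise solver step reduces the truncation error relative to the first-order update that a plain residual connection corresponds to.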

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2411.03042

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Liaoning Province (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)