AITopics | selection layer

PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference

Jin, Weisheng, Song, Maojia, Pala, Tej Deep, Chia, Yew Ken, Zadeh, Amir, Li, Chuan, Poria, Soujanya

arXiv.org Artificial Intelligence, Mar-29-2025

As large language models (LLMs) tackle increasingly complex tasks and longer documents, their computational and memory costs during inference become a major bottleneck. To address this, we propose PromptDistill, a novel, training-free method that improves inference efficiency while preserving generation quality. PromptDistill identifies and retains the most informative tokens by leveraging attention interactions in early layers, preserving their hidden states while reducing the computational burden in later layers. This allows the model to focus on essential contextual information without fully processing all tokens. Unlike previous methods such as H2O and SnapKV, which perform compression only after processing the entire input, or GemFilter, which selects a fixed portion of the initial prompt without considering contextual dependencies, PromptDistill dynamically allocates computational resources to the most relevant tokens while maintaining a global awareness of the input. Experiments using our method and baseline approaches with base models such as LLaMA 3.1 8B Instruct, Phi 3.5 Mini Instruct, and Qwen2 7B Instruct on benchmarks including LongBench, InfBench, and Needle in a Haystack demonstrate that PromptDistill significantly improves efficiency while having minimal impact on output quality compared to the original models. With a single-stage selection strategy, PromptDistill effectively balances performance and efficiency, outperforming prior methods like GemFilter, H2O, and SnapKV due to its superior ability to retain essential information. Specifically, compared to GemFilter, PromptDistill achieves an overall 1% to 5% performance improvement while also offering better time efficiency. Additionally, we explore multi-stage selection, which further improves efficiency while maintaining strong generation performance.
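The abstract above describes retaining the most informative tokens based on attention in early layers. A minimal sketch of that general idea follows. It assumes attention weights from a few query tokens to every prompt token are already available, and it assumes a simple scoring rule (summing attention per prompt token). The function name `select_tokens`, the scoring rule, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
def select_tokens(attn, k):
    """Return indices of the k prompt tokens with the highest total attention.

    attn: list of rows, one per query token; each row holds one attention
          weight per prompt token (assumed to come from an early layer).
    The summed-attention score is an assumption made for this sketch.
    """
    n = len(attn[0])
    # Total attention each prompt token receives across all query tokens.
    totals = [sum(row[j] for row in attn) for j in range(n)]
    # Rank token positions by total attention and keep the top k.
    top = sorted(range(n), key=lambda j: totals[j], reverse=True)[:k]
    # Restore original order so later layers see tokens in sequence.
    return sorted(top)


# Toy example: 2 query tokens attending over 5 prompt tokens.
attn = [
    [0.05, 0.40, 0.05, 0.30, 0.20],
    [0.10, 0.35, 0.05, 0.10, 0.40],
]
kept = select_tokens(attn, k=3)
print(kept)
```

Only the hidden states at the returned positions would be passed on to later layers in a scheme of this kind; how those states are cached and reused is outside this sketch.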

large language model, machine learning, natural language

2503.23274

Country:

North America > United States (0.14)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

CoSS: Co-optimizing Sensor and Sampling Rate for Data-Efficient AI in Human Activity Recognition

Liu, Mengxi, Zhao, Zimin, Geißler, Daniel, Zhou, Bo, Suh, Sungho, Lukowicz, Paul

arXiv.org Artificial Intelligence, Jan-3-2024

Recent advancements in Artificial Neural Networks have significantly improved human activity recognition using multiple time-series sensors. While employing numerous sensors with high-frequency sampling rates usually improves the results, it often leads to data inefficiency and unnecessary expansion of the ANN, posing a challenge for their practical deployment on edge devices. Addressing these issues, our work introduces a pragmatic framework for data-efficient utilization in HAR tasks, considering the optimization of both sensor modalities and sampling rate simultaneously. Central to our approach are the designed trainable parameters, termed 'Weight Scores,' which assess the significance of each sensor modality and sampling rate during the training phase. These scores guide the sensor modalities and sampling rate selection. The pruning method allows users to make a trade-off between computational budgets and performance by selecting the sensor modalities and sampling rates according to the weight score ranking. We tested our framework's effectiveness in optimizing sensor modality and sampling rate selection using three public HAR benchmark datasets. The results show that the sensor and sampling rate combination selected via CoSS achieves similar classification performance to configurations using the highest sampling rate with all sensors but at a reduced hardware cost.
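The abstract above describes selecting sensor modalities and sampling rates according to a ranking of learned weight scores. A minimal sketch of the ranking step follows. It assumes the scores have already been trained, and it treats each (sensor, sampling rate) pair as one candidate. The function name `prune_by_score`, the candidate layout, and the score values are hypothetical and are not taken from the paper.

```python
def prune_by_score(weight_scores, keep):
    """Keep the `keep` highest-scoring candidates.

    weight_scores: dict mapping a candidate (here a hypothetical
                   (sensor, sampling_rate_hz) pair) to its learned score.
    """
    ranked = sorted(weight_scores, key=weight_scores.get, reverse=True)
    return ranked[:keep]


# Hypothetical scores, assumed to come from a finished training run.
scores = {
    ("accelerometer", 50): 0.91,
    ("gyroscope", 50): 0.34,
    ("accelerometer", 25): 0.78,
    ("magnetometer", 25): 0.12,
}
selected = prune_by_score(scores, keep=2)
print(selected)
```

Raising or lowering `keep` is one way to express the trade-off between computational budget and performance that the abstract mentions.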

dataset, selection, sensor

2401.05426

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.05)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Only 5% Attention Is All You Need: Efficient Long-range Document-level Neural Machine Translation

Liu, Zihan, Sun, Zewei, Cheng, Shanbo, Huang, Shujian, Wang, Mingxuan

arXiv.org Artificial Intelligence, Sep-25-2023

Document-level Neural Machine Translation (DocNMT) has been proven crucial for handling discourse phenomena by introducing document-level context information. One of the most important directions is to input the whole document directly to the standard Transformer model. In this case, efficiency becomes a critical concern due to the quadratic complexity of the attention module. Existing studies either focus on the encoder part, which cannot be deployed on sequence-to-sequence generation tasks, e.g., Machine Translation (MT), or suffer from a significant performance drop. In this work, we keep the translation performance while gaining a 20% speed-up by introducing an extra selection layer based on lightweight attention that selects a small portion of tokens to be attended. It takes advantage of the original attention to ensure performance and of dimension reduction to accelerate inference. Experimental results show that our method achieves up to approximately 95% sparsity (only 5% of tokens attended) and saves 93% of the computation cost of the attention module compared with the original Transformer, while maintaining performance.

computational linguistic, proceedings, translation

2309.14174

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A distributed neural network architecture for dynamic sensor selection with application to bandwidth-constrained body-sensor networks

Strypsteen, Thomas, Bertrand, Alexander

arXiv.org Artificial Intelligence, Aug-16-2023

We propose a dynamic sensor selection approach for deep neural networks (DNNs), which is able to derive an optimal sensor subset selection for each specific input sample instead of a fixed selection for the entire dataset. This dynamic selection is jointly learned with the task model in an end-to-end way, using the Gumbel-Softmax trick to allow the discrete decisions to be learned through standard backpropagation. We then show how we can use this dynamic selection to increase the lifetime of a wireless sensor network (WSN) by imposing constraints on how often each node is allowed to transmit. We further improve performance by including a dynamic spatial filter that makes the task-DNN more robust against the fact that it now needs to be able to handle a multitude of possible node subsets. Finally, we explain how the selection of the optimal channels can be distributed across the different nodes in a WSN. We validate this method on a use case in the context of body-sensor networks, where we use real electroencephalography (EEG) sensor data to emulate an EEG sensor network. We analyze the resulting trade-offs between transmission load and task accuracy.
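The abstract above relies on the Gumbel-Softmax trick to make discrete selection decisions learnable. A minimal, framework-free sketch of the sampling step follows. It only illustrates how a relaxed (soft) one-hot sample is drawn from a vector of logits. The function name, the example logits, and the temperature value are assumptions made here, and the gradient flow that makes the trick useful in training is not shown.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed one-hot sample over len(logits) choices."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # standard Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical per-sensor logits for a single input sample.
rng = random.Random(0)
probs = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
print(probs)
```

Lower temperatures push the sample closer to a hard one-hot choice, while higher temperatures give smoother vectors. In a deep-learning framework the same computation would be written with differentiable tensor operations.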

artificial intelligence, machine learning, selection

2308.08379

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

HanoiT: Enhancing Context-aware Translation via Selective Context

Yang, Jian, Yin, Yuwei, Ma, Shuming, Yang, Liqun, Guo, Hongcheng, Huang, Haoyang, Zhang, Dongdong, Zeng, Yutao, Li, Zhoujun, Wei, Furu

arXiv.org Artificial Intelligence, Jan-17-2023

Context-aware neural machine translation aims to use the document-level context to improve translation quality. However, not all words in the context are helpful. The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context. To mitigate this problem, we propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context. To verify the effectiveness of our method, extensive experiments and extra quantitative analysis are conducted on four document-level machine translation benchmarks. The experimental results demonstrate that our model significantly outperforms previous models on all datasets via the soft selection mechanism.

artificial intelligence, natural language, translation

doi: 10.1007/978-3-031-30675-4_34

2301.06825

Country:

Europe > Austria > Vienna (0.14)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

LouisFoucard/w-net

#artificialintelligence, Aug-16-2017, 10:30:15 GMT

See the included notebook for a detailed explanation and implementation. The model is implemented in Keras/TensorFlow and is trained on data from 22 3D movies, sampled at 1 fps. Validation is performed on 3 held-out movies. The total number of stereo frames is about 125K; training took 4 days on a GTX 1070 with batches of 6 stereo images at a resolution of 192x336 per eye.

disparity map, selection layer, stereo image

Industry: Media > Film (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.99)

Filter Selection Model for Generating Visual Motion Signals

Nowlan, Steven J., Sejnowski, Terrence J.

Neural Information Processing Systems, Dec-31-1993

We present a model of how MT cells aggregate responses from V1 to form such a velocity representation. Two different sets of units, with local receptive fields, receive inputs from motion energy filters. One set of units forms estimates of local motion, while the second set computes the utility of these estimates. Outputs from this second set of units "gate" the outputs from the first set through a gain control mechanism. This active process for selecting only a subset of local motion responses to integrate into more global responses distinguishes our model from previous models of velocity estimation.
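The gating described above, where one set of units weights the outputs of another, can be sketched as a normalized weighted sum. The sketch below assumes scalar local motion estimates and assumes the utilities are turned into gains with a softmax. Both choices, and the function name `gated_estimate`, are illustrative assumptions rather than details taken from the abstract.

```python
import math


def gated_estimate(local_estimates, utilities):
    """Combine local estimates, weighting each by a softmax of its utility."""
    # Numerically stable softmax turns utilities into gains that sum to 1.
    peak = max(utilities)
    weights = [math.exp(u - peak) for u in utilities]
    total = sum(weights)
    gates = [w / total for w in weights]
    # Each gain scales the matching local estimate before integration.
    return sum(g * e for g, e in zip(gates, local_estimates))


# Equal utilities reduce to a plain average of the local estimates.
print(gated_estimate([1.0, 3.0], [0.0, 0.0]))
# A dominant utility selects (almost) only its own local estimate.
print(gated_estimate([1.0, 3.0], [100.0, 0.0]))
```

With a sharply peaked utility vector, the combination effectively selects a subset of local responses, which is the behavior the abstract contrasts with models that integrate all responses equally.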

artificial intelligence, filter selection model, receptive field location

Country: