pruning
- Europe > Austria > Vienna (0.14)
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Communications > Networks (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Austria (0.04)
- North America > Dominican Republic (0.04)
- Asia (0.04)
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Despite several works trying to reduce their computational cost, most LLMs still adopt attention layers between all pairs of tokens in the sequence, thus incurring a quadratic cost. In this study, we present a novel approach that dynamically prunes contextual information while preserving the model's expressiveness, resulting in reduced memory and computational requirements during inference.
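A minimal sketch of the idea behind this abstract: full causal attention scores every (query, past-key) pair, which is quadratic in sequence length, while pruning the context restricts each query to a subset of past tokens. The top-k rule below is a stand-in illustration, not the paper's learned pruning mechanism, and all names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 16                        # sequence length, head dimension
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))

def causal_attention(Q, K, V, keep_mask=None):
    """Causal attention; keep_mask[i, j] = False drops past token j
    from token i's context (a stand-in for learned pruning)."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    mask = np.tril(np.ones((n, n), dtype=bool))   # attend to the past only
    if keep_mask is not None:
        mask &= keep_mask
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

# Full context: every (query, past key) pair is scored -> O(n^2).
full = causal_attention(Q, K, V)

# Toy pruning rule: each query keeps only its 4 highest-scoring past
# tokens, shrinking the memory and compute needed per step.
scores = Q @ K.T
keep = np.zeros((n, n), dtype=bool)
for i in range(n):
    past = np.arange(i + 1)
    top = past[np.argsort(scores[i, past])[-4:]]
    keep[i, top] = True
pruned = causal_attention(Q, K, V, keep_mask=keep)

print(full.shape, pruned.shape)   # both (8, 16)
```

The output shape is unchanged; only the set of keys each query attends to shrinks, which is where the memory savings come from at inference time.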
- Asia > Middle East > Lebanon (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Oregon > Linn County > Lebanon (0.04)
- Research Report > Promising Solution (0.87)
- Research Report > New Finding (0.87)
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Asia (0.04)
Pruning vs Quantization: Which is Better?
Andrey Kuzmin, Markus Nagel, Mart van Baalen, Arash Behboodi, Tijmen Blankevoort (Qualcomm AI Research)
In this paper, we set out to answer the question of which is better: neural network quantization or pruning? By answering this question, we hope to inform design decisions made on neural network hardware going forward. We provide an extensive comparison between the two techniques for compressing deep neural networks.
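The comparison this abstract describes can be sketched on a toy weight matrix: both techniques replace weights with cheaper approximations, and one can measure the resulting reconstruction error. The 75%-sparsity-vs-4-bit pairing below is an illustrative choice, not the paper's evaluation protocol, and the helper names are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(256, 256))   # a toy weight matrix

def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(W.size * sparsity)
    thresh = np.sort(np.abs(W).ravel())[k]
    return np.where(np.abs(W) >= thresh, W, 0.0)

def uniform_quantize(W, bits):
    """Symmetric uniform quantization onto 2**bits - 1 levels."""
    scale = np.abs(W).max() / (2 ** (bits - 1) - 1)
    return np.round(W / scale) * scale

# Roughly matched compression: 75% sparsity vs 4-bit quantization.
for name, W_hat in [("prune 75%", magnitude_prune(W, 0.75)),
                    ("quant 4-bit", uniform_quantize(W, 4))]:
    err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
    print(f"{name}: relative error {err:.3f}")
```

Relative Frobenius error is only one proxy for accuracy loss; the paper's point is that such design decisions deserve a systematic comparison rather than a rule of thumb.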
- Telecommunications (0.41)
- Semiconductors & Electronics (0.41)
How a student becomes a teacher: learning and forgetting through Spectral methods
The above scheme proves particularly relevant when the student network is overparameterized (namely, when larger layer sizes are employed) compared to the underlying teacher network. Under these operating conditions, it is tempting to speculate that the student's ability to handle the given task could eventually be stored in a sub-portion of the whole network.
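A minimal sketch of the teacher-student setup this abstract refers to: a small teacher network generates the labels, and a wider (overparameterized) student is meant to fit them. All sizes and names below are illustrative assumptions, not the paper's configuration. The spectral observation is that the teacher's input-to-hidden map has rank at most the teacher's hidden width, so only that many directions of the data actually matter for the labels.

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(x, W1, W2):
    """One-hidden-layer ReLU network."""
    return np.maximum(x @ W1, 0.0) @ W2

# Teacher: narrow hidden layer; student: much wider (overparameterized).
d_in, h_teacher, h_student = 10, 4, 64
W1_t = rng.normal(size=(d_in, h_teacher))
W2_t = rng.normal(size=(h_teacher, 1))

X = rng.normal(size=(1000, d_in))
y = mlp(X, W1_t, W2_t)            # labels produced by the teacher

# The teacher's first-layer map is (d_in x h_teacher), so its rank is at
# most h_teacher: only 4 singular directions of the input feed the labels.
sv = np.linalg.svd(W1_t, compute_uv=False)
print(len(sv))                    # 4 singular values
```

This is only the setup; the paper's contribution concerns how a trained student's weights, examined through such spectral quantities, reveal a teacher-sized sub-network.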
- North America > United States > California (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Belgium > Wallonia > Namur Province > Namur (0.04)