linear approximation
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Europe > Austria (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
- Information Technology > Software Engineering (0.68)
- Information Technology > Security & Privacy (0.67)
d921c3c762b1522c475ac8fc0811bb0f-AuthorFeedback.pdf
We wish to thank all of the reviewers for their time and thorough reading of our paper! We appreciate the reviewer's suggestions regarding clarity. We have added the suggested summary sentence ("the key […]"). We started with binary sentiment classification, but are actively working on more tasks. [We project the] RNN hidden states onto the top two PCs for two different input sequences that differ only by two tokens (replacing '[…]'). The trajectories start out the same as the initial tokens are identical. We have added a footnote noting this in the main text.
What can linearized neural networks actually say about generalization?
For certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization, but for the networks used in practice, the empirical NTK only provides a rough first-order approximation. Still, a growing body of work keeps leveraging this approximation to successfully analyze important deep learning phenomena and design algorithms for new applications. In our work, we provide strong empirical evidence to determine the practical validity of such an approximation by conducting a systematic comparison of the behavior of different neural networks and their linear approximations on different tasks. We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks, even when they achieve very different performances. However, in contrast to what was previously reported, we discover that neural networks do not always perform better than their kernel approximations, and reveal that the performance gap heavily depends on architecture, dataset size, and training task. Finally, we find that networks overfit to these tasks mostly due to the evolution of their kernel during training, thus revealing a new type of implicit bias.
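The linear approximation at issue is the first-order Taylor expansion of the network around its initialization, whose training dynamics are governed by the empirical NTK. Below is a minimal, self-contained sketch of that construction (not the paper's code; the toy `mlp` and all parameter shapes are illustrative assumptions):

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Toy two-layer network standing in for "the networks used in practice".
    w1, b1, w2, b2 = params
    return jnp.tanh(x @ w1 + b1) @ w2 + b2

key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
params0 = (jax.random.normal(k1, (3, 16)) / jnp.sqrt(3.0), jnp.zeros(16),
           jax.random.normal(k2, (16, 1)) / jnp.sqrt(16.0), jnp.zeros(1))

def f_lin(params, x):
    # f_lin(x; p) = f(x; p0) + J_p f(x; p0) . (p - p0), computed exactly
    # with a Jacobian-vector product instead of materializing the Jacobian.
    delta = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
    f0, jvp = jax.jvp(lambda p: mlp(p, x), (params0,), (delta,))
    return f0 + jvp

def empirical_ntk(x1, x2):
    # k(x1, x2) = J(x1) @ J(x2).T at initialization: the kernel that
    # governs the training dynamics of f_lin.
    flat = lambda j: jnp.concatenate(
        [a.reshape(a.shape[0], -1) for a in jax.tree_util.tree_leaves(j)], axis=1)
    j1 = jax.jacobian(lambda p: mlp(p, x1))(params0)
    j2 = jax.jacobian(lambda p: mlp(p, x2))(params0)
    return flat(j1) @ flat(j2).T

x = jax.random.normal(kx, (4, 3))
print(jnp.allclose(f_lin(params0, x), mlp(params0, x)))  # True: identical at init
print(empirical_ntk(x, x).shape)                          # (4, 4) Gram matrix
```

The comparison the paper runs is between training `mlp` directly and training `f_lin`, whose model is linear in the parameters even though it is nonlinear in the input.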
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Efficient Large Language Model Inference with Neural Block Linearization
Erdogan, Mete, Tonin, Francesco, Cevher, Volkan
The high inference demands of transformer-based Large Language Models (LLMs) pose substantial challenges for their deployment. To address this, we introduce Neural Block Linearization (NBL), a novel framework for accelerating transformer model inference by replacing self-attention layers with linear approximations derived from Linear Minimum Mean Squared Error estimators. NBL leverages Canonical Correlation Analysis to compute a theoretical upper bound on the approximation error. We then use this bound as a criterion for substitution, selecting the LLM layers with the lowest linearization error. NBL can be applied efficiently to pre-trained LLMs without fine-tuning. In experiments, NBL achieves notable computational speed-ups while preserving competitive accuracy on multiple reasoning benchmarks. For instance, applying NBL to 12 self-attention layers in DeepSeek-R1-Distill-Llama-8B increases inference speed by 32% with less than a 1% accuracy trade-off, making it a flexible and promising solution for improving the inference efficiency of LLMs. The implementation is available at: https://github.com/LIONS-EPFL/NBL.
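As a rough illustration of the LMMSE step (a sketch under assumptions, not the released implementation linked above): given inputs X and outputs Y recorded from a self-attention layer on calibration data, the minimum-mean-squared-error affine substitute has a closed form, and its residual error is what the paper's CCA-based bound controls.

```python
import jax
import jax.numpy as jnp

def lmmse_substitute(X, Y, ridge=1e-6):
    """Closed-form affine map minimizing E||Y - (X @ W + b)||^2.
    X: (n, d_in) layer inputs, Y: (n, d_out) layer outputs on calibration data."""
    mu_x, mu_y = X.mean(0), Y.mean(0)
    Xc, Yc = X - mu_x, Y - mu_y
    # Normal equations C_xx @ W = C_xy; small ridge term for numerical stability.
    C_xx = Xc.T @ Xc / X.shape[0] + ridge * jnp.eye(X.shape[1])
    C_xy = Xc.T @ Yc / X.shape[0]
    W = jnp.linalg.solve(C_xx, C_xy)
    return W, mu_y - mu_x @ W

key = jax.random.PRNGKey(0)
kx, kw = jax.random.split(key)
X = jax.random.normal(kx, (4096, 64))              # stand-in calibration inputs
Y = jnp.tanh(X @ jax.random.normal(kw, (64, 64)))  # stand-in attention outputs
W, b = lmmse_substitute(X, Y)
# NBL ranks layers by a CCA-derived upper bound on this error; the empirical
# residual below plays the same role in this toy version.
print(jnp.mean((Y - (X @ W + b)) ** 2))
```

Layers whose residual (or bound) is lowest are the cheapest to linearize, which is why substitution there costs little accuracy.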
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval
Li, Siting, Gao, Xiang, Du, Simon Shaolei
While an image is worth more than a thousand words, only a few provide crucial information for a given task and thus should be focused on. In light of this, ideal text-to-image (T2I) retrievers should prioritize the specific visual attributes relevant to a query. To evaluate current retrievers on attribute-focused queries, we build COCO-Facet, a COCO-based benchmark with 9,112 queries about diverse attributes of interest. We find that CLIP-like retrievers, which are widely adopted for their efficiency and zero-shot ability, perform poorly and unevenly, possibly because their image embeddings focus on global semantics and subjects while leaving out other details. Notably, we reveal that even recent, stronger Multimodal Large Language Model (MLLM)-based retrievers with a larger output dimension struggle with this limitation. Hence, we hypothesize that retrieving with general image embeddings is suboptimal for handling such queries. As a solution, we propose promptable image embeddings enabled by these multimodal retrievers, which boost performance by highlighting the required attributes. Our pipeline for deriving such embeddings generalizes across query types, image pools, and base retriever architectures. To enhance real-world applicability, we offer two acceleration strategies: pre-processing promptable embeddings and using linear approximations. We show that the former yields a 15% improvement in Recall@5 when prompts are predefined, while the latter achieves an 8% improvement when prompts are only available during inference.
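One plausible reading of the second acceleration strategy, sketched below under assumptions (the ridge-regression fit, `fit_linear_map`, and all shapes are illustrative, not the paper's exact recipe): learn a linear map from cached general embeddings to prompt-conditioned embeddings, so inference-time prompting reduces to a matrix multiply over the image pool instead of a full promptable-encoder pass per image.

```python
import jax
import jax.numpy as jnp

def fit_linear_map(Z_base, Z_prompted, ridge=1e-3):
    """Ridge regression so that Z_prompted ~= Z_base @ M."""
    d = Z_base.shape[1]
    G = Z_base.T @ Z_base + ridge * jnp.eye(d)
    return jnp.linalg.solve(G, Z_base.T @ Z_prompted)

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
Z_base = jax.random.normal(k1, (1024, 256))        # cached general image embeddings
Z_prompted = jnp.tanh(                             # stand-in prompted embeddings
    Z_base @ jax.random.normal(k2, (256, 256)))
M = fit_linear_map(Z_base[:768], Z_prompted[:768]) # fit on a calibration split
approx = Z_base[768:] @ M                          # cheap "prompted" embeddings
# Cosine similarity with the exact prompted embeddings as a sanity check.
cos = jnp.sum(approx * Z_prompted[768:], axis=1) / (
    jnp.linalg.norm(approx, axis=1) * jnp.linalg.norm(Z_prompted[768:], axis=1))
print(cos.mean())
```

In this reading, the pre-processing strategy caches exact prompted embeddings for predefined prompts, while the linear map handles prompts that only arrive at inference time, which matches the relative 15% vs. 8% gains reported above.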
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- (2 more...)
- Leisure & Entertainment > Sports (1.00)
- Information Technology (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)