LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models
Hoffmann, David, Budhathoki, Kailash, Kleindessner, Matthaeus
The evolving capabilities of large language models are accompanied by growing sizes and deployment costs, necessitating effective inference optimisation techniques. We propose a novel pruning method utilising centrality measures from graph theory, reducing both the computational requirements and the memory footprint of these models. Specifically, we devise a method for creating a weighted directed acyclic graph representation of multilayer perceptrons, to which we apply a modified version of the weighted PageRank centrality measure to compute node importance scores. In combination with uniform pruning, this leads to structured sparsity. We call this pruning method MLPRank. Furthermore, we introduce an extension to decoder-only transformer models, which we call LLMRank. Both variants demonstrate strong performance: on average, MLPRank achieves 6.09% higher accuracy retention than three popular baselines, and LLMRank achieves 13.42% higher accuracy retention than two popular baselines. Code is available at https://github.com/amazon-science/llm-rank-pruning.
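The abstract only outlines the approach, but the core idea of propagating weighted-PageRank-style importance scores through an MLP's weight graph and then pruning each layer uniformly can be illustrated with the minimal NumPy sketch below. All function names, the damping constant, and the exact scoring rule are illustrative assumptions based on the description above, not the released implementation; see the linked repository for the actual method.

```python
import numpy as np

def mlprank_scores(weights, damping=0.85):
    """Toy weighted-PageRank-style importance scores for MLP neurons.

    `weights` is a list of weight matrices; weights[l] has shape
    (fan_out, fan_in) and maps layer l to layer l+1. Edge weights are the
    absolute connection weights, and scores are propagated forward through
    the resulting layered DAG.
    """
    n_in = weights[0].shape[1]
    scores = [np.full(n_in, 1.0 / n_in)]          # uniform scores for the input layer
    for W in weights:
        A = np.abs(W)                              # edge weights of the DAG
        col_sums = A.sum(axis=0, keepdims=True)    # total outgoing weight of each source node
        col_sums[col_sums == 0] = 1.0              # avoid division by zero for dead neurons
        P = A / col_sums                           # column-normalised transition matrix
        n_out = W.shape[0]
        scores.append((1 - damping) / n_out + damping * P @ scores[-1])
    return scores

def uniform_prune_masks(scores, sparsity=0.5):
    """Keep the top-(1 - sparsity) neurons in every hidden layer (structured sparsity)."""
    masks = []
    for s in scores[1:-1]:                         # never prune the input or output layer
        k = max(1, int(round(len(s) * (1 - sparsity))))
        keep = np.argsort(s)[-k:]
        mask = np.zeros(len(s), dtype=bool)
        mask[keep] = True
        masks.append(mask)
    return masks
```

The masks can then be used to drop whole rows and the corresponding columns of adjacent weight matrices, which is what makes the sparsity structured rather than unstructured.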
Impact of HPO on AutoML Forecasting Ensembles
Hoffmann, David
Due to this uncertainty over which models will perform best, it is commonplace in the forecasting space that domain experts and data scientists have to experiment with several methods before they find one that works acceptably well on a particular problem. This exploration process can be time- and resource-consuming and is not always practical, due to the plethora of unsolved forecasting problems as well as the scarcity of domain experts and data scientists. In recent years, Automated Machine Learning (AutoML) has become more popular, allowing non-technical users to solve machine learning problems without in-depth knowledge of the underlying methodology and compensating for the lack of available data scientists through automation [5]. In forecasting there are several approaches to AutoML, one of them being the established method of ensemble learning and aggregation of forecasts [6]. This approach has seen a recent increase in attention, with the top-performing models in the M4 Competition [7] being of this nature [8]. Ensembling can be conceptualised as the automation of the previously manual step of exploring the performance of various algorithms on a given problem and selecting the best one or a combination of models. This, however, does not address another important aspect of data science: the selection of good hyperparameters, which improves the performance of a model trained with a particular algorithm. This paper discusses the combination of ensemble learning and hyperparameter tuning in an AutoML forecasting setup.
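As a rough illustration of the two ingredients discussed here, the sketch below tunes a single toy forecaster on a validation split (the hyperparameter optimisation step) and then combines member forecasts with inverse-validation-error weights (the ensembling step). The base model (simple exponential smoothing), all names, and the aggregation rule are hypothetical choices to make the concept concrete; they do not reflect the paper's actual AutoML pipeline.

```python
import numpy as np

def ses_forecast(y, alpha, horizon):
    """Simple exponential smoothing: one hypothetical ensemble member."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return np.full(horizon, level)

def tune_alpha(train, valid, alphas=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Tiny HPO loop: pick the smoothing weight with the lowest validation MAE."""
    errors = {a: np.mean(np.abs(valid - ses_forecast(train, a, len(valid)))) for a in alphas}
    best = min(errors, key=errors.get)
    return best, errors[best]

def weighted_ensemble(forecasts, valid_errors):
    """Combine member forecasts, weighting each by its inverse validation error."""
    w = np.array([1.0 / (e + 1e-9) for e in valid_errors])
    w /= w.sum()
    return np.tensordot(w, np.stack(forecasts), axes=1)
```

In a full setup each ensemble member would be a different forecasting algorithm with its own hyperparameter search, but the division of labour is the same: tuning improves each member, ensembling replaces manual model selection.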
Transformer-based Multi-Modal Learning for Multi Label Remote Sensing Image Classification
Hoffmann, David, Clasen, Kai Norman, Demir, Begüm
In this paper, we introduce a novel Synchronized Class Token Fusion (SCT Fusion) architecture in the framework of multi-modal multi-label classification (MLC) of remote sensing (RS) images. The proposed architecture leverages modality-specific attention-based transformer encoders to process varying input modalities, while exchanging information across modalities by synchronizing the special class tokens after each transformer encoder block. The synchronization involves fusing the class tokens with a trainable fusion transformation, resulting in a synchronized class token that contains information from all modalities. As the fusion transformation is trainable, it allows the model to learn an accurate representation of the features shared among the different modalities. Experimental results show the effectiveness of the proposed architecture over single-modality architectures and an early fusion multi-modal architecture when evaluated on a multi-modal MLC dataset. The code of the proposed architecture is publicly available at https://git.tu-berlin.de/rsim/sct-fusion.
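A minimal PyTorch sketch of the mechanism described above, assuming two modalities whose patch tokens already share a common embedding dimension: each modality has its own stack of encoder blocks, and after every block the per-modality class tokens are fused by a trainable linear layer into a synchronized class token that is written back to all modalities. Class and function names, default sizes, and the choice of a plain linear fusion layer are assumptions for illustration only; the actual architecture is in the linked repository.

```python
import torch
import torch.nn as nn

class SCTFusionSketch(nn.Module):
    """Rough sketch of synchronized class token fusion across modalities."""

    def __init__(self, dim=256, depth=4, heads=8, num_modalities=2, num_classes=19):
        super().__init__()
        block = lambda: nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        # one encoder block per modality at every depth level
        self.blocks = nn.ModuleList(
            [nn.ModuleList([block() for _ in range(num_modalities)]) for _ in range(depth)]
        )
        # trainable fusion transformation applied after every block
        self.fusions = nn.ModuleList([nn.Linear(num_modalities * dim, dim) for _ in range(depth)])
        self.cls_tokens = nn.Parameter(torch.zeros(num_modalities, 1, 1, dim))
        self.head = nn.Linear(dim, num_classes)   # multi-label logits (use BCEWithLogitsLoss)

    def forward(self, token_seqs):
        # token_seqs: list with one (batch, seq_len, dim) tensor of patch tokens per modality
        b = token_seqs[0].size(0)
        xs = [torch.cat([self.cls_tokens[m].expand(b, -1, -1), t], dim=1)
              for m, t in enumerate(token_seqs)]
        for layer_blocks, fuse in zip(self.blocks, self.fusions):
            xs = [blk(x) for blk, x in zip(layer_blocks, xs)]
            # fuse the per-modality class tokens into one synchronized class token
            sct = fuse(torch.cat([x[:, 0] for x in xs], dim=-1))
            # write the synchronized token back into every modality's sequence
            xs = [torch.cat([sct.unsqueeze(1), x[:, 1:]], dim=1) for x in xs]
        return self.head(sct)                      # classify from the final synchronized token
```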