LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models
Hoffmann, David, Budhathoki, Kailash, Kleindessner, Matthaeus
The evolving capabilities of large language models are accompanied by growing sizes and deployment costs, necessitating effective inference optimisation techniques. We propose a novel pruning method utilising centrality measures from graph theory, reducing both the computational requirements and the memory footprint of these models. Specifically, we devise a method for creating a weighted directed acyclic graph representation of multilayer perceptrons, to which we apply a modified version of the weighted PageRank centrality measure to compute node importance scores. In combination with uniform pruning, this leads to structured sparsity. We call this pruning method MLPRank. Furthermore, we introduce an extension to decoder-only transformer models, which we call LLMRank. Both variants demonstrate strong performance: on average, MLPRank achieves 6.09% higher accuracy retention than three popular baselines, and LLMRank achieves 13.42% higher accuracy retention than two popular baselines. Code is available at https://github.com/amazon-science/llm-rank-pruning.
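The abstract only outlines the approach, but the core idea of propagating weighted-PageRank-style importance scores through an MLP's weight graph and then pruning each layer uniformly can be illustrated with the minimal NumPy sketch below. All function names, the damping constant, and the exact scoring rule are illustrative assumptions based on the description above, not the released implementation; see the linked repository for the actual method.

```python
import numpy as np

def mlprank_scores(weights, damping=0.85):
    """Toy weighted-PageRank-style importance scores for MLP neurons.

    `weights` is a list of weight matrices; weights[l] has shape
    (fan_out, fan_in) and maps layer l to layer l+1. Edge weights are the
    absolute connection weights, and scores are propagated forward through
    the resulting layered DAG.
    """
    n_in = weights[0].shape[1]
    scores = [np.full(n_in, 1.0 / n_in)]          # uniform scores for the input layer
    for W in weights:
        A = np.abs(W)                              # edge weights of the DAG
        col_sums = A.sum(axis=0, keepdims=True)    # total outgoing weight of each source node
        col_sums[col_sums == 0] = 1.0              # avoid division by zero for dead neurons
        P = A / col_sums                           # column-normalised transition matrix
        n_out = W.shape[0]
        scores.append((1 - damping) / n_out + damping * P @ scores[-1])
    return scores

def uniform_prune_masks(scores, sparsity=0.5):
    """Keep the top-(1 - sparsity) neurons in every hidden layer (structured sparsity)."""
    masks = []
    for s in scores[1:-1]:                         # never prune the input or output layer
        k = max(1, int(round(len(s) * (1 - sparsity))))
        keep = np.argsort(s)[-k:]
        mask = np.zeros(len(s), dtype=bool)
        mask[keep] = True
        masks.append(mask)
    return masks
```

The masks can then be used to drop whole rows and the corresponding columns of adjacent weight matrices, which is what makes the sparsity structured rather than unstructured.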
Impact of HPO on AutoML Forecasting Ensembles
Hoffmann, David
Due to this uncertainty over which models will perform best, it is commonplace in the forecasting space that domain experts and data scientists have to experiment with several methods before they find one that works acceptably well on a particular problem. This exploration process can be time- and resource-consuming and is not always practical, due to the plethora of unsolved forecasting problems as well as the scarcity of domain experts and data scientists. In recent years, Automated Machine Learning (AutoML) has become more popular, allowing non-technical users to solve machine learning problems without in-depth knowledge of the underlying methodology and compensating for the lack of available data scientists through automation [5]. In forecasting there are several approaches to AutoML, one of them being the established method of ensemble learning and aggregation of forecasts [6]. This approach has seen a recent increase in attention, with the top-performing models in the M4 Competition [7] being of this nature [8]. Ensembling can be conceptualised as the automation of the previously manual step of exploring the performance of various algorithms on a given problem and selecting the best one or a combination of models. This, however, does not address another important aspect of data science: the selection of good hyperparameters, which improves the performance of a model trained with a particular algorithm. This paper discusses the combination of ensemble learning and hyperparameter tuning in an AutoML forecasting setup.
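As a rough illustration of the two ingredients discussed here, the sketch below tunes a single toy forecaster on a validation split (the hyperparameter optimisation step) and then combines member forecasts with inverse-validation-error weights (the ensembling step). The base model (simple exponential smoothing), all names, and the aggregation rule are hypothetical choices to make the concept concrete; they do not reflect the paper's actual AutoML pipeline.

```python
import numpy as np

def ses_forecast(y, alpha, horizon):
    """Simple exponential smoothing: one hypothetical ensemble member."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return np.full(horizon, level)

def tune_alpha(train, valid, alphas=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Tiny HPO loop: pick the smoothing weight with the lowest validation MAE."""
    errors = {a: np.mean(np.abs(valid - ses_forecast(train, a, len(valid)))) for a in alphas}
    best = min(errors, key=errors.get)
    return best, errors[best]

def weighted_ensemble(forecasts, valid_errors):
    """Combine member forecasts, weighting each by its inverse validation error."""
    w = np.array([1.0 / (e + 1e-9) for e in valid_errors])
    w /= w.sum()
    return np.tensordot(w, np.stack(forecasts), axes=1)
```

In a full setup each ensemble member would be a different forecasting algorithm with its own hyperparameter search, but the division of labour is the same: tuning improves each member, ensembling replaces manual model selection.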
Transformer-based Multi-Modal Learning for Multi Label Remote Sensing Image Classification
Hoffmann, David, Clasen, Kai Norman, Demir, Begüm
In this paper, we introduce a novel Synchronized Class Token Fusion (SCT Fusion) architecture in the framework of multi-modal multi-label classification (MLC) of remote sensing (RS) images. The proposed architecture leverages modality-specific attention-based transformer encoders to process varying input modalities, while exchanging information across modalities by synchronizing the special class tokens after each transformer encoder block. The synchronization involves fusing the class tokens with a trainable fusion transformation, resulting in a synchronized class token that contains information from all modalities. As the fusion transformation is trainable, it allows the model to learn an accurate representation of the features shared among the different modalities. Experimental results show the effectiveness of the proposed architecture over single-modality architectures and an early fusion multi-modal architecture when evaluated on a multi-modal MLC dataset. The code of the proposed architecture is publicly available at https://git.tu-berlin.de/rsim/sct-fusion.
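A minimal PyTorch sketch of the mechanism described above, assuming two modalities whose patch tokens already share a common embedding dimension: each modality has its own stack of encoder blocks, and after every block the per-modality class tokens are fused by a trainable linear layer into a synchronized class token that is written back to all modalities. Class and function names, default sizes, and the choice of a plain linear fusion layer are assumptions for illustration only; the actual architecture is in the linked repository.

```python
import torch
import torch.nn as nn

class SCTFusionSketch(nn.Module):
    """Rough sketch of synchronized class token fusion across modalities."""

    def __init__(self, dim=256, depth=4, heads=8, num_modalities=2, num_classes=19):
        super().__init__()
        block = lambda: nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        # one encoder block per modality at every depth level
        self.blocks = nn.ModuleList(
            [nn.ModuleList([block() for _ in range(num_modalities)]) for _ in range(depth)]
        )
        # trainable fusion transformation applied after every block
        self.fusions = nn.ModuleList([nn.Linear(num_modalities * dim, dim) for _ in range(depth)])
        self.cls_tokens = nn.Parameter(torch.zeros(num_modalities, 1, 1, dim))
        self.head = nn.Linear(dim, num_classes)   # multi-label logits (use BCEWithLogitsLoss)

    def forward(self, token_seqs):
        # token_seqs: list with one (batch, seq_len, dim) tensor of patch tokens per modality
        b = token_seqs[0].size(0)
        xs = [torch.cat([self.cls_tokens[m].expand(b, -1, -1), t], dim=1)
              for m, t in enumerate(token_seqs)]
        for layer_blocks, fuse in zip(self.blocks, self.fusions):
            xs = [blk(x) for blk, x in zip(layer_blocks, xs)]
            # fuse the per-modality class tokens into one synchronized class token
            sct = fuse(torch.cat([x[:, 0] for x in xs], dim=-1))
            # write the synchronized token back into every modality's sequence
            xs = [torch.cat([sct.unsqueeze(1), x[:, 1:]], dim=1) for x in xs]
        return self.head(sct)                      # classify from the final synchronized token
```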