AITopics | Kiefer, Nicholas

Kiefer, Nicholas

Model Fusion via Neuron Transplantation

Öz, Muhammed, Kiefer, Nicholas, Debus, Charlotte, Hörter, Jasmin, Streit, Achim, Götz, Markus

arXiv.org Artificial Intelligence, Feb-7-2025

Ensemble learning is a widespread technique to improve the prediction performance of neural networks. However, it comes at the price of increased memory and inference time. In this work we propose a novel model fusion technique called Neuron Transplantation (NT), in which we fuse an ensemble of models by transplanting important neurons from all ensemble members into the vacant space obtained by pruning insignificant neurons. An initial loss in performance post-transplantation can be quickly recovered via fine-tuning, consistently outperforming individual ensemble members of the same model capacity and architecture. Furthermore, NT enables all the ensemble members to be jointly pruned and jointly trained in a combined model. Compared to alignment-based averaging (such as Optimal-Transport fusion, OT), NT requires less fine-tuning than the corresponding OT-fused model, and the fusion itself is faster and requires less memory, while the resulting model performance is comparable or better. The code is available under the following link: https://github.com/masterbaer/neuron-transplantation.
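The idea of transplanting important neurons into pruned space can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes one-hidden-layer models stored as plain lists, and it assumes the L2 norm of a neuron's incoming weights as the importance score; the function names `neuron_importance`, `top_neurons`, and `transplant` are hypothetical.

```python
import math

def neuron_importance(incoming_weights):
    """Assumed importance score: L2 norm of a neuron's incoming weights."""
    return math.sqrt(sum(w * w for w in incoming_weights))

def top_neurons(w_in, w_out, keep):
    """Return the `keep` most important hidden neurons of one model.

    w_in[j]  : incoming weights of hidden neuron j
    w_out[j] : outgoing weights of hidden neuron j
    """
    ranked = sorted(range(len(w_in)),
                    key=lambda j: neuron_importance(w_in[j]),
                    reverse=True)
    kept = sorted(ranked[:keep])  # keep the original neuron order
    return [w_in[j] for j in kept], [w_out[j] for j in kept]

def transplant(models, width):
    """Fuse several models into one model with `width` hidden neurons.

    Each model contributes an equal share of its most important neurons;
    the remaining (less important) neurons are discarded, i.e. pruned.
    """
    share = width // len(models)
    fused_in, fused_out = [], []
    for w_in, w_out in models:
        kept_in, kept_out = top_neurons(w_in, w_out, share)
        fused_in.extend(kept_in)
        fused_out.extend(kept_out)
    return fused_in, fused_out

# Two toy ensemble members, each with 4 hidden neurons and 2 inputs.
model_a = ([[3.0, 4.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.2]],
           [[1.0], [2.0], [3.0], [4.0]])
model_b = ([[0.0, 0.1], [2.0, 0.0], [0.3, 0.0], [6.0, 8.0]],
           [[5.0], [6.0], [7.0], [8.0]])

fused_in, fused_out = transplant([model_a, model_b], width=4)
print(fused_in)   # two neurons from each member
print(fused_out)
```

The fused model has the same hidden width as a single ensemble member. Because the outgoing weights are simply concatenated, the fused output is not expected to match either member exactly, which is consistent with the abstract's remark that an initial performance loss is recovered by fine-tuning.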

artificial intelligence, deep learning, machine learning, (15 more...)

doi: 10.1007/978-3-031-70359-1_1

arXiv: 2502.06849

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany (0.14)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting

Kiefer, Nicholas, Weyrauch, Arvid, Öz, Muhammed, Streit, Achim, Götz, Markus, Debus, Charlotte

arXiv.org Artificial Intelligence, Dec-17-2024

The current landscape in time-series forecasting is dominated by Transformer-based models. Their high parameter count and corresponding demand in computational resources pose a challenge to real-world deployment, especially for commercial and scientific applications with low-power embedded devices. Pruning is an established approach to reduce neural network parameter count and save compute. However, the implications and benefits of pruning Transformer-based models for time series forecasting are largely unknown. To close this gap, we provide a comparative benchmark study by evaluating unstructured and structured pruning on various state-of-the-art multivariate time series models. We study the effects of these pruning strategies on model predictive performance and computational aspects like model size, operations, and inference time. Our results show that certain models can be pruned even up to high sparsity levels, outperforming their dense counterpart. However, fine-tuning pruned models is necessary. Furthermore, we demonstrate that even with corresponding hardware and software support, structured pruning is unable to provide significant time savings.
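The difference between the two pruning families named above can be shown on a single weight matrix. This is a minimal sketch with magnitude-based criteria, which are an assumption here; the abstract does not say which criteria the study uses. The function names and the toy matrix are hypothetical.

```python
def unstructured_prune(matrix, sparsity):
    """Zero the smallest-magnitude individual weights; the shape is unchanged."""
    positions = [(i, j) for i in range(len(matrix))
                 for j in range(len(matrix[0]))]
    positions.sort(key=lambda p: abs(matrix[p[0]][p[1]]))
    n_prune = int(len(positions) * sparsity)
    pruned = [row[:] for row in matrix]
    for i, j in positions[:n_prune]:
        pruned[i][j] = 0.0
    return pruned

def structured_prune(matrix, sparsity):
    """Drop whole rows (e.g. neurons) with the smallest L1 norm; the matrix shrinks."""
    n_prune = int(len(matrix) * sparsity)
    order = sorted(range(len(matrix)),
                   key=lambda i: sum(abs(w) for w in matrix[i]))
    dropped = set(order[:n_prune])
    return [row[:] for i, row in enumerate(matrix) if i not in dropped]

weights = [[0.9, -0.1, 0.4, 0.05],
           [0.02, 0.03, -0.01, 0.04],
           [-0.7, 0.6, 0.2, -0.8],
           [0.3, -0.5, 0.15, 0.25]]

sparse = unstructured_prune(weights, 0.5)  # still 4 x 4, half the entries are zero
small = structured_prune(weights, 0.5)     # 2 x 4, only the two strongest rows remain
print(sparse)
print(small)
```

Unstructured pruning leaves a matrix of the same shape, so any speedup depends on sparse-aware hardware or kernels. Structured pruning yields a smaller dense matrix. The sketch illustrates only these two patterns, not the benchmark itself.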

data mining, machine learning, pruning, (18 more...)

arXiv: 2412.12883

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Power Industry (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AB-Training: A Communication-Efficient Approach for Distributed Low-Rank Learning

Coquelin, Daniel, Flügel, Katharina, Weiel, Marie, Kiefer, Nicholas, Öz, Muhammed, Debus, Charlotte, Streit, Achim, Götz, Markus

arXiv.org Artificial Intelligence, Jun-30-2024

Communication bottlenecks severely hinder the scalability of distributed neural network training, particularly in high-performance computing (HPC) environments. We introduce AB-training, a novel data-parallel method that leverages low-rank representations and independent training groups to significantly reduce communication overhead. Our experiments demonstrate an average reduction in network traffic of approximately 70.31% across various scaling scenarios, increasing the training potential of communication-constrained systems and accelerating convergence at scale. AB-training also exhibits a pronounced regularization effect at smaller scales, leading to improved generalization while maintaining or even reducing training time. We achieve a remarkable 44.14:1 compression ratio on VGG16 trained on CIFAR-10 with minimal accuracy loss, and outperform traditional data parallel training by 1.55% on ResNet-50 trained on ImageNet-2012. While AB-training is promising, our findings also reveal that large batch effects persist even in low-rank regimes, underscoring the need for further research into optimized update mechanisms for massively distributed training.
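To see why a low-rank representation reduces parameter count, and with it the volume of data to communicate, consider the arithmetic for a single layer. The sketch assumes a dense weight matrix of shape rows x cols replaced by two factors of shapes rows x rank and rank x cols. That factorization is an assumption suggested by the method's name, not a detail stated in the abstract, and the layer size and rank below are hypothetical.

```python
def low_rank_param_count(rows, cols, rank):
    """Parameters in two factors of shapes (rows x rank) and (rank x cols)."""
    return rank * (rows + cols)

def compression_ratio(rows, cols, rank):
    """Dense parameter count divided by low-rank parameter count."""
    return (rows * cols) / low_rank_param_count(rows, cols, rank)

# Hypothetical layer: a 512 x 512 weight matrix factorized at rank 8.
dense_params = 512 * 512                          # 262144
factor_params = low_rank_param_count(512, 512, 8)  # 8192
ratio = compression_ratio(512, 512, 8)
print(f"{ratio:.2f} : 1")  # prints "32.00 : 1"
```

The ratio for a whole network depends on the rank chosen per layer and on which layers are factorized, so this single-layer figure is not comparable to the 44.14:1 result reported for VGG16.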

artificial intelligence, batch size, machine learning, (16 more...)

arXiv: 2405.01067

Country: Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Harnessing Orthogonality to Train Low-Rank Neural Networks

Coquelin, Daniel, Flügel, Katharina, Weiel, Marie, Kiefer, Nicholas, Debus, Charlotte, Streit, Achim, Götz, Markus

arXiv.org Artificial Intelligence, Jan-16-2024

This study explores the learning dynamics of neural networks by analyzing the singular value decomposition (SVD) of their weights throughout training. Our investigation reveals that an orthogonal basis within each multidimensional weight's SVD representation stabilizes during training. Building upon this, we introduce Orthogonality-Informed Adaptive Low-Rank (OIALR) training, a novel training method exploiting the intrinsic orthogonality of neural networks. OIALR seamlessly integrates into existing training workflows with minimal accuracy loss, as demonstrated by benchmarking on various datasets and well-established network architectures. With appropriate hyperparameter tuning, OIALR can surpass conventional training setups, including those of state-of-the-art models.
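For reference, the decomposition the abstract refers to can be written out as follows. The symbols m, n, and r are notation introduced here for illustration. The abstract does not state how OIALR chooses the rank or which factors remain trainable, so the block only restates the standard (thin) SVD and its rank-r truncation.

```latex
% Thin SVD of a weight matrix W with m rows and n columns, k = min(m, n)
W = U \Sigma V^{\top}, \qquad
U \in \mathbb{R}^{m \times k},\ V \in \mathbb{R}^{n \times k}, \qquad
U^{\top} U = V^{\top} V = I_k, \qquad
\Sigma = \operatorname{diag}(\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k \ge 0)

% Rank-r truncation (r \le k): keep the r largest singular values and their vectors
W_r = U_r \Sigma_r V_r^{\top}, \qquad
U_r \in \mathbb{R}^{m \times r},\ \Sigma_r \in \mathbb{R}^{r \times r},\ V_r \in \mathbb{R}^{n \times r}

% Parameter count: r(m + n + 1) for the truncated factors versus mn for the dense matrix
```

The columns of U and V are the orthogonal bases whose stabilization during training the abstract describes.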

artificial intelligence, machine learning, oialr, (18 more...)

arXiv: 2401.08505

Country: Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

A dynamic risk score for early prediction of cardiogenic shock using machine learning

Hu, Yuxuan, Lui, Albert, Goldstein, Mark, Sudarshan, Mukund, Tinsay, Andrea, Tsui, Cindy, Maidman, Samuel, Medamana, John, Jethani, Neil, Puli, Aahlad, Nguy, Vuthy, Aphinyanaphongs, Yindalon, Kiefer, Nicholas, Smilowitz, Nathaniel, Horowitz, James, Ahuja, Tania, Fishman, Glenn I, Hochman, Judith, Katz, Stuart, Bernard, Samuel, Ranganath, Rajesh

arXiv.org Artificial Intelligence, Mar-28-2023

Myocardial infarction and heart failure are major cardiovascular diseases that affect millions of people in the US. The morbidity and mortality are highest among patients who develop cardiogenic shock. Early recognition of cardiogenic shock is critical. Prompt implementation of treatment measures can prevent the deleterious spiral of ischemia, low blood pressure, and reduced cardiac output due to cardiogenic shock. However, early identification of cardiogenic shock has been challenging due to human providers' inability to process the enormous amount of data in the cardiac intensive care unit (ICU) and the lack of an effective risk stratification tool. We developed a deep learning-based risk stratification tool, called CShock, for patients admitted into the cardiac ICU with acute decompensated heart failure and/or myocardial infarction to predict onset of cardiogenic shock. To develop and validate CShock, we annotated cardiac ICU datasets with physician-adjudicated outcomes. CShock achieved an area under the receiver operating characteristic curve (AUROC) of 0.820, which substantially outperformed CardShock (AUROC 0.519), a well-established risk score for cardiogenic shock prognosis. CShock was externally validated in an independent patient cohort and achieved an AUROC of 0.800, demonstrating its generalizability in other cardiac ICUs.
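The AUROC figures quoted above can be read as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows. The labels and scores are hypothetical toy values, not data from the study, and the function name `auroc` is chosen here for illustration.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Hypothetical toy data: 1 = event occurred, 0 = no event.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
value = auroc(labels, scores)
print(round(value, 3))  # 8 of 9 pairs are ranked correctly
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.519 for CardShock close to chance.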

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv: 2303.12888

Country: North America > United States (0.34)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
