Rescaling-Aware Training for Efficient Deployment of Deep Learning Models on Full-Integer Hardware
Mueller, Lion, Garcia-Ortiz, Alberto, Najafi, Ardalan, Fuks, Adam, Bamberg, Lennart
Integer AI inference significantly reduces computational complexity in embedded systems. Quantization-aware training (QAT) helps mitigate the accuracy degradation associated with post-training quantization but still overlooks the impact of integer rescaling during inference, a hardware-costly operation in integer-only AI inference. This work shows that the rescaling cost can be reduced dramatically post-training by applying stronger quantization to the rescale multiplicands at no loss in model quality. Furthermore, we introduce Rescale-Aware Training, a fine-tuning method for ultra-low-bit-width rescaling multiplicands. Experiments show that even with 8x-reduced rescaler widths, full accuracy is preserved through minimal incremental retraining. This enables more energy-efficient and cost-efficient AI inference for resource-constrained embedded systems.
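The rescaling the abstract refers to is the integer-only requantization step that maps a wide accumulator back to a narrow activation, conventionally computed as y = (x * M) >> s, where a real-valued scale is approximated by an integer multiplier M and a right shift s. A minimal sketch, assuming this standard fixed-point form (the function names and the 0.0037 example scale are illustrative, not taken from the paper), shows how the multiplier bit-width can be shrunk:

```python
import math

def quantize_multiplier(real_scale: float, mult_bits: int = 32):
    """Approximate real_scale in (0, 1) as M * 2**(-shift), with M an
    integer of at most `mult_bits` bits. `mult_bits` is the rescaler
    width that rescale-aware approaches try to minimize."""
    mant, exp = math.frexp(real_scale)       # real_scale = mant * 2**exp, mant in [0.5, 1)
    M = round(mant * (1 << mult_bits))       # mult_bits-bit fixed-point mantissa
    shift = mult_bits - exp                  # total right shift
    if M == (1 << mult_bits):                # rounding pushed M out of range
        M >>= 1
        shift -= 1
    return M, shift

def rescale(acc: int, M: int, shift: int) -> int:
    """Integer-only rescale of an accumulator, rounding to nearest
    (assumes shift >= 1, which holds for scales well below 1)."""
    return (acc * M + (1 << (shift - 1))) >> shift
```

For a typical accumulator and scale, an 8-bit multiplier can reproduce the 32-bit rescaling result exactly, which is the effect the paper exploits: narrower rescale multiplicands cost less hardware while leaving the integer outputs, and hence the accuracy, essentially unchanged.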
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment
Although large language models (LLMs) have demonstrated strong intelligence, their high demand for computation and storage hinders practical application. To this end, many model compression techniques have been proposed to increase the efficiency of LLMs. However, current research validates these methods only on limited models, datasets, and metrics, and still lacks a comprehensive evaluation under more general scenarios, so which compression approach to use in a specific case remains an open question. To mitigate this gap, we present the Large Language Model Compression Benchmark (LLMCBench), a rigorously designed benchmark with an in-depth analysis of LLM compression algorithms.
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Lin, Guanyu, Feng, Tao, Han, Pengrui, Liu, Ge, You, Jiaxuan
As scientific research proliferates, researchers face the daunting task of navigating and reading vast amounts of literature. Existing solutions, such as document QA, fail to provide personalized and up-to-date information efficiently. We present Paper Copilot, a self-evolving, efficient LLM system designed to assist researchers, based on thought retrieval, user profiles, and high-performance optimization. Specifically, Paper Copilot can offer personalized research services while maintaining a real-time updated database. Quantitative evaluation demonstrates that Paper Copilot saves 69.92% of time after efficient deployment. This paper details the design and implementation of Paper Copilot, highlighting its contributions to personalized academic support and its potential to streamline the research process.
EDAC: Efficient Deployment of Audio Classification Models For COVID-19 Detection
Jovanović, Andrej, Mihaly, Mario, Donaldson, Lennon
The global spread of COVID-19 had severe consequences for public health and the world economy. The quick onset of the pandemic highlighted the potential benefits of cheap, deployable pre-screening methods to monitor the prevalence of the disease in a population. Various researchers made use of machine learning methods in an attempt to detect COVID-19. The solutions leverage various input features, such as CT scans or cough audio signals, with state-of-the-art results arising from deep neural network architectures. However, larger models require more compute, a pertinent consideration when deploying to the edge. To address this, we first recreated two models that use cough audio recordings to detect COVID-19. By applying network pruning and quantisation, we were able to compress these two architectures without reducing their predictive performance. Specifically, we achieved a 105.76x and a 19.34x reduction in the compressed model file sizes, with corresponding 1.37x and 1.71x reductions in the inference times of the two models.
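The two compression steps the abstract names, network pruning and quantisation, can be illustrated with a minimal pure-Python sketch. This is an assumed, generic magnitude-pruning plus symmetric int8 scheme, not the authors' exact pipeline:

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights
    (global magnitude pruning over a flat weight list)."""
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    thresh = sorted(abs(w) for w in weights)[k - 1]
    pruned, zeroed = [], 0
    for w in weights:
        if abs(w) <= thresh and zeroed < k:
            pruned.append(0.0)   # pruned connections compress well (sparse storage)
            zeroed += 1
        else:
            pruned.append(w)
    return pruned

def quantize_int8(weights):
    """Symmetric uniform int8 quantisation; returns (int values, scale).
    Storing int8 instead of float32 alone gives a ~4x size reduction."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale
```

In practice the two steps compose: pruning makes most weights zero (cheap to store sparsely), and quantisation shrinks the remaining ones, which is how reductions far beyond 4x in file size become possible.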
tinyML Talks webcast: 1) Qeexo's Runtime-Free Architecture for Efficient Deployment; 2) Democratization of Artificial Intelligence (AI) to Small Scale Farmers.
"Qeexo's Runtime-Free Architecture for Efficient Deployment of Neural Networks on Embedded Targets"
Rajen Bhatt, Director of Engineering, Machine Learning, Qeexo Co
Neural networks, including convolutional, feed-forward, recurrent, and convolutional-recurrent, are increasingly popular due to their recent successes in AI applications. Developing neural network models for tinyML applications can be very cumbersome due to the constraints of embedded targets with low-power MCUs. Qeexo has developed a runtime-free architecture for efficiently converting TensorFlow- and PyTorch-generated models to target libraries. This approach builds models that are orders of magnitude smaller than TensorFlow Lite Micro without compromising on latency or inference performance.

"Democratization of Artificial Intelligence (AI) to Small Scale Farmers - a framework to deploy AI Models to Tiny IoT Edges that operate in constrained environments"
Chandrasekar Vuppalapati, Senior Vice President - Products & Programs, Hanumayamma Innovations and Technologies Inc.
Big Data surrounds us. Every minute, our smartphones collect huge amounts of data, from geolocations to the next clickable item on an ecommerce site. Data has become one of the most important commodities for individuals and companies. Nevertheless, this data revolution has not touched every economic sector: small farmers, especially in developing countries, have been largely passed over by the data revolution due to infrastructure and compute-constrained environments. Not only is this a huge missed opportunity for big data companies, it is one of the significant obstacles on the path toward sustainable food and a huge inhibitor to closing economic disparities. The purpose of the talk is to present a TinyML framework for deploying artificial intelligence models in constrained compute environments, enabling remote rural areas and small farmers to join the data revolution.
Big Data In Healthcare: Paris Hospitals Predict Admission Rates Using Machine Learning
Hospitals in Paris are trialling Big Data and machine learning systems designed to forecast admission rates, leading to more efficient deployment of resources and better patient outcomes. The result was the first contribution to an open-source framework of code designed to carry out the analysis over a scalable, distributed infrastructure. Machine learning is employed to determine which algorithms, when fed data from the past, provide the best indicator of future trends. The core of the analytics work involves time-series analysis techniques, looking for ways in which patterns in the data can be used to predict admission rates at different times. This code is already being put to use in several other projects involving healthcare and finance.
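A hedged illustration of the kind of time-series baseline the article describes, a seasonal average that predicts each weekday's admissions from the same weekday in past weeks (the function and data below are invented for illustration; the hospitals' actual pipeline is not shown in the article):

```python
def seasonal_forecast(history, season=7, horizon=7):
    """Forecast `horizon` future points by averaging each seasonal
    position (e.g. weekday, with season=7) over the history.
    `history` is a chronological list of daily counts."""
    forecasts = []
    for h in range(horizon):
        pos = (len(history) + h) % season          # weekday of the future point
        same_pos = history[pos::season]            # all past values on that weekday
        forecasts.append(sum(same_pos) / len(same_pos))
    return forecasts
```

Baselines like this are typically the yardstick against which more elaborate admission-rate models are judged; if a learned model cannot beat the weekly seasonal average, the extra complexity is not paying off.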