openvino
Leveraging Neural Graph Compilers in Machine Learning Research for Edge-Cloud Systems
Furutanpey, Alireza, Walser, Carmen, Raith, Philipp, Frangoudis, Pantelis A., Dustdar, Schahram
This work presents a comprehensive evaluation of neural network graph compilers across heterogeneous hardware platforms, addressing the critical gap between theoretical optimization techniques and practical deployment scenarios. We demonstrate how vendor-specific optimizations can invalidate relative performance comparisons between architectural archetypes, with performance advantages sometimes completely reversing after compilation. Our systematic analysis reveals that graph compilers exhibit performance patterns highly dependent on both neural architecture and batch size. Through fine-grained block-level experimentation, we establish that vendor-specific compilers can leverage repeated patterns in simple architectures, yielding disproportionate throughput gains as model depth increases. We introduce novel metrics to quantify a compiler's ability to mitigate performance friction as batch size increases.

The pervasiveness of neural networks (NNs) in modern computing systems has generated significant demand for methods to improve the efficiency of available hardware. As computational complexity increases and deployment scenarios diversify, optimizing neural network execution becomes indispensable for practical applications across various computational platforms. Among the most promising optimization approaches are graph compilers, which optimize the computational graphs of neural networks to enhance scheduling, improve data flow, and exploit dedicated hardware modules. Graph compilers can enhance throughput by orders of magnitude with no loss in accuracy. While these compilers can be used independently, they may also be combined with model compression or acceleration methods, such as quantization, that trade accuracy for efficiency. The potential performance improvements are substantial.
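To make the idea of "performance friction as batch size increases" concrete, here is a minimal sketch of one plausible batch-scaling metric: measured throughput at each batch size divided by ideal linear scaling from batch size 1. This is an illustrative assumption for exposition, not necessarily the metric the paper actually introduces.

```python
def scaling_efficiency(latencies: dict) -> dict:
    """Measured throughput at each batch size over ideal linear scaling.

    `latencies` maps batch size -> seconds per batched inference call.
    A value of 1.0 means batching is "free" (per-sample cost is constant);
    lower values indicate friction that the compiler failed to remove.
    """
    base_throughput = 1.0 / latencies[1]  # samples/sec at batch size 1
    return {
        b: (b / lat) / (b * base_throughput)  # measured / ideal throughput
        for b, lat in latencies.items()
    }
```

For example, if a batch of 4 takes only twice as long as a batch of 1, efficiency is 0.5: half of the ideal linear speedup was realized.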
NITRO: LLM Inference on Intel Laptop NPUs
Fei, Anthony, Abdelfattah, Mohamed S.
Large Language Models (LLMs) have become essential tools in natural language processing, widely used in chatbots such as ChatGPT and Gemini, and are a central area of research. A particular area of interest is designing hardware specialized for these AI applications, one such example being the neural processing unit (NPU). In 2023, Intel released the Intel Core Ultra processor, codenamed Meteor Lake, featuring a CPU, GPU, and NPU system-on-chip. However, official software support for the NPU through Intel's OpenVINO framework is limited to static model inference. The dynamic nature of autoregressive token generation in LLMs is therefore not supported out of the box. To address this shortcoming, we present NITRO (NPU Inference for Transformers Optimization), a Python-based framework built on top of OpenVINO to support text and chat generation on NPUs. In this paper, we discuss in detail the key modifications made to the transformer architecture to enable inference, some performance benchmarks, and future steps towards improving the package. The code repository for NITRO can be found here: https://github.com/abdelfattah-lab/nitro.
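The core tension the abstract describes — a static-shape compiled graph versus dynamic-length autoregressive generation — is commonly resolved by padding inputs to a fixed maximum length and masking unused positions. The sketch below illustrates that general idea in plain Python; the names and details are assumptions for illustration, not NITRO's actual implementation.

```python
# A static-shape compiled model expects a fixed sequence length, so a
# dynamic generation loop pads the token buffer to MAX_LEN and builds a
# 0/1 attention mask marking which positions hold real tokens.

MAX_LEN = 8  # the sequence length the graph was compiled for (assumed)
PAD_ID = 0

def pad_inputs(tokens):
    """Pad token ids to the static length and build an attention mask."""
    if len(tokens) > MAX_LEN:
        raise ValueError("sequence exceeds the compiled static shape")
    pad = MAX_LEN - len(tokens)
    return tokens + [PAD_ID] * pad, [1] * len(tokens) + [0] * pad

def generate(step_fn, prompt, max_new_tokens):
    """Greedy loop: re-run the fixed-shape model each step on padded inputs."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        padded, mask = pad_inputs(tokens)
        next_id = step_fn(padded, mask)  # one static-shape inference call
        tokens.append(next_id)
    return tokens
```

Here `step_fn` stands in for the compiled model: it must ignore masked positions and return the next token id given the padded context.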
Intel pushes harder to make AI apps run best on Core Ultra
Intel said Tuesday that it is expanding what it calls its AI Acceleration program into midrange software vendors, launching an AI developer NUC to speed the process. It's all a bid to lasso software developers and bring them under the Core Ultra banner. For consumers, the program is an ongoing acknowledgement that Intel continues to work to integrate the NPU inside its Core Ultra processor with software vendors, in order to extract actual value from the logic, and not just capitalize on the latest buzzword, AI. There's a more subtle message, too: if Intel is able to convince software developers to use its OpenVINO toolkit to help them code AI applications, it will help ensure that Intel's Core Ultra chips are the preferred or "better" AI chips. That might not actually be the case, of course. But the push to sign up software developers seems similar to the way in which graphics vendors work with game developers to convince them to add GPU-specific features to their games and thus deliver improved performance.
Audacity's cool audio AI tools are now free for you to try
As AI PCs debut, one question you'll be asking yourself is: What can I do with them? Audacity has an early answer, with the release of its on-chip audio AI tools for music generation, transcription, and more. Intel used Audacity as a demo partner while describing the Meteor Lake (now rebranded as Core Ultra) architecture in Malaysia, showing off some of the tools that it formally released on Monday. The tools use OpenVINO, an open-source toolkit developed by Intel that the company has separately optimized. Audacity's new AI tools include: The catch is that these new AI tools, in addition to the CPU limitations placed upon them, require a specific older version of Audacity to be installed: Audacity 3.4.2.
Leveraging Speculative Sampling and KV-Cache Optimizations Together for Generative AI using OpenVINO
Barad, Haim, Aidova, Ekaterina, Gorbachev, Yury
Inference optimizations are critical for improving user experience and reducing infrastructure costs and power consumption. In this article, we illustrate a form of dynamic execution known as speculative sampling to reduce the overall latency of text generation and compare it with standard autoregressive sampling. This can be used together with model-based optimizations (e.g. quantization) to provide an optimized solution. Both sampling methods make use of KV caching. A Jupyter notebook and some sample executions are provided.
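The greedy variant of the dynamic-execution idea described above can be sketched in a few lines: a cheap draft model proposes several tokens, and the target model verifies them, keeping the longest agreeing prefix plus one token of its own. This is a simplified illustration — the article's speculative *sampling* also involves a probabilistic accept/reject step over the two models' distributions, and the model functions here are toy placeholders, not the OpenVINO API.

```python
def speculative_step(target, draft, prefix, k):
    """One round of greedy speculative decoding.

    `target` and `draft` are callables mapping a token list to the next
    token id. Returns the tokens accepted this round (at least one).
    """
    # 1. Draft model proposes k tokens autoregressively (cheap).
    proposed, ctx = [], list(prefix)
    for _ in range(k):
        t = draft(ctx)
        proposed.append(t)
        ctx.append(t)
    # 2. Target model verifies; keep the longest agreeing prefix.
    accepted, ctx = [], list(prefix)
    for t in proposed:
        if target(ctx) != t:
            break
        accepted.append(t)
        ctx.append(t)
    # 3. Target contributes its own token at the first disagreement
    #    (or a bonus token when every draft token was accepted).
    accepted.append(target(ctx))
    return accepted
```

When the draft model agrees with the target, each round yields up to k+1 tokens for roughly the cost of one target evaluation over the drafted span, which is where the latency reduction comes from; both models' KV caches are reused across rounds in the real implementation.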
AI at the Edge Spurs New Industrial Opportunities
The world is moving fast, and manufacturers must be able to keep up with the pace of change. Luckily, with technologies like AI, machine learning, computer vision, and edge computing, solution developers have the tools to help them do so. And we are already seeing major results--both inside and outside the factory. For instance, smart manufacturers have started to deploy AI at the edge on the shop floor to reduce the risk of unplanned shutdowns and production issues. By automating the process with AI platforms like the Intel OpenVINO Toolkit, image analysis can be performed directly on smart factory equipment, and workers can be quickly notified of any issues happening. This reduces manual work, which is prone to errors, and stops problems before they snowball.
AI Inference Software Fundamentals: Getting Started with Optical Character Recognition
You can find the full source code for today's demo in a Kaggle notebook, where it is formatted as a series of very short, numbered blocks. For the sake of brevity, this post will walk through only the most significant snippets of the notebook's code. But, of course, you can study the full notebook at your leisure, block by block, and learn how we trained a neural network from scratch to achieve a level of accuracy not possible a decade ago. In blocks 1 to 3, the notebook sets up the Python environment for TensorFlow. In blocks 4 to 14, the notebook loads the MNIST database, which we use to build and train a model that recognizes handwritten digits. The new and exciting part Intel offers today is how these models can be optimized to run more efficiently and quickly on Intel hardware.
The AI Journey: Why You Should Pack OpenShift and OpenVINO
AI can be an intimidating field to get into, and there is a lot that goes into deploying an AI application. But if you don't choose the right tools, it can be even more difficult than it needs to be. Luckily, the work that Intel and Red Hat are doing is easing the burden for businesses and developers. They'll discuss machine learning and natural language processing; using the OpenVINO AI toolkit with Red Hat OpenShift; and the life cycle of an AI intelligent application. Ryan Loney: Everything today has some intelligence embedded into it.
Does Form Follow Function? An Empirical Exploration of the Impact of Deep Neural Network Architecture Design on Hardware-Specific Acceleration
Abbasi, Saad, Shafiee, Mohammad Javad, Chan, Ellick, Wong, Alexander
The fine-grained relationship between form and function with respect to deep neural network architecture design and hardware-specific acceleration is one area that is not well studied in the research literature, with form often dictated by accuracy as opposed to hardware function. In this study, a comprehensive empirical exploration is conducted to investigate the impact of deep neural network architecture design on the degree of inference speedup that can be achieved via hardware-specific acceleration. More specifically, we empirically study the impact of a variety of commonly used macro-architecture design patterns across different architectural depths through the lens of OpenVINO microprocessor-specific and GPU-specific acceleration. Experimental results showed that while leveraging hardware-specific acceleration achieved an average inference speedup of 380%, the degree of inference speedup varied drastically depending on the macro-architecture design pattern, with the greatest speedup achieved on the depthwise bottleneck convolution design pattern at 550%. Furthermore, we conduct an in-depth exploration of the correlation between FLOPs requirement, level 3 cache efficacy, and network latency with increasing architectural depth and width. Finally, we analyze the inference time reductions using hardware-specific acceleration when compared to native deep learning frameworks across a wide variety of hand-crafted deep convolutional neural network architecture designs as well as ones found via neural architecture search strategies. We found that the DARTS-derived architecture benefited the most from hardware-specific software acceleration (1200%), while the depthwise bottleneck convolution-based MobileNet-V2 had the lowest overall inference time, at around 2.4 ms.
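For readers comparing the percentages above: one common convention, assumed here (the abstract does not state its exact definition), expresses speedup as the ratio of baseline latency to accelerated latency, as a percentage — so "380%" means the accelerated path is 3.8x faster.

```python
def speedup_percent(baseline_latency_ms: float, accelerated_latency_ms: float) -> float:
    """Speedup as a percentage: 380.0 means 3.8x faster than baseline.

    Assumes the ratio convention speedup = baseline / accelerated; some
    authors instead report the *improvement* (ratio - 1) as a percentage.
    """
    return 100.0 * baseline_latency_ms / accelerated_latency_ms
```

Under this convention, a model whose latency drops from 38 ms to 10 ms after hardware-specific acceleration shows a 380% speedup.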