AITopics | real-time inference

Real-Time Inference for Distributed Multimodal Systems under Communication Delay Uncertainty

Croisfelt, Victor, de Souza, João Henrique Inacio, Pandey, Shashi Raj, Soret, Beatriz, Popovski, Petar

arXiv.org Artificial Intelligence | Nov-21-2025

Connected cyber-physical systems perform inference based on real-time inputs from multiple data streams. Uncertain communication delays across data streams challenge the temporal flow of the inference process. State-of-the-art (SotA) non-blocking inference methods rely on a reference-modality paradigm, requiring one modality input to be fully received before processing, while depending on costly offline profiling. We propose a novel, neuro-inspired non-blocking inference paradigm that primarily employs adaptive temporal windows of integration (TWIs) to dynamically adjust to stochastic delay patterns across heterogeneous streams while relaxing the reference-modality requirement. Our communication-delay-aware framework achieves robust real-time inference with finer-grained control over the accuracy-latency tradeoff. Experiments on the audio-visual event localization (AVEL) task demonstrate superior adaptability to network dynamics compared to SotA approaches.

artificial intelligence, machine learning, real time system, (15 more...)

arXiv.org Artificial Intelligence

2511.16225

Country: Europe (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.94)
Information Technology > Architecture > Real Time Systems (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)

Real-Time Inference for a Gamma Process Model of Neural Spiking

Neural Information Processing Systems | Sep-30-2025, 12:18:24 GMT

With simultaneous measurements from ever increasing populations of neurons, there is a growing need for sophisticated tools to recover signals from individual neurons. In electrophysiology experiments, this classically proceeds in a two-step process: (i) threshold the waveforms to detect putative spikes and (ii) cluster the waveforms into single units (neurons). We extend previous Bayesian nonparametric models of neural spiking to jointly detect and cluster neurons using a Gamma process model. Importantly, we develop an online approximate inference scheme enabling real-time analysis, with performance exceeding the previous state-of-the-art. Via exploratory data analysis (using data with partial ground truth as well as two novel data sets), we find several features of our model collectively contribute to our improved performance, including: (i) accounting for colored noise, (ii) detecting overlapping spikes, (iii) tracking waveform dynamics, and (iv) using multiple channels.
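As a rough illustration of step (i) of the classical pipeline described above (thresholding to detect putative spikes), here is a minimal sketch. It is not the paper's Gamma process model. The function name `detect_spikes`, the threshold multiplier `k`, the `refractory` gap, and the toy trace are all assumptions made for this example; the noise scale is estimated with the median absolute deviation, a common robust choice.

```python
from statistics import median


def detect_spikes(signal, k=4.0, refractory=3):
    """Return sample indices where the signal crosses a noise-scaled threshold.

    Hypothetical helper for illustration only: `k` and `refractory`
    are assumed tuning parameters, not values from the paper.
    """
    med = median(signal)
    # Median absolute deviation, rescaled to approximate a Gaussian sigma.
    mad = median(abs(x - med) for x in signal)
    threshold = k * mad / 0.6745

    spikes = []
    last = -refractory - 1  # index of the most recent accepted spike
    for i, x in enumerate(signal):
        # Accept a crossing only if it is far enough from the previous one.
        if abs(x - med) > threshold and i - last > refractory:
            spikes.append(i)
            last = i
    return spikes


# Toy trace: low-amplitude noise with two large deflections.
trace = [0.1, -0.2, 0.0, 0.15, 5.0, 4.0, -0.1, 0.05, -0.12, 0.2, -6.0, 0.1]
print(detect_spikes(trace))  # [4, 10]
```

The sample at index 5 also exceeds the threshold but falls inside the refractory gap, so it is treated as part of the spike at index 4. Clustering the detected waveforms into units (step ii) is a separate problem and is not shown.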

gamma process model, neural spiking, real-time inference, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Data Science (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.44)

QUART-Online: Latency-Free Large Multimodal Language Model for Quadruped Robot Learning

Tong, Xinyang, Ding, Pengxiang, Wang, Donglin, Zhang, Wenjie, Cui, Can, Sun, Mingyang, Fan, Yiguo, Zhao, Han, Zhang, Hongyin, Dang, Yonghao, Huang, Siteng, Lyu, Shangke

arXiv.org Artificial Intelligence | Dec-23-2024

This paper addresses the inherent inference latency challenges associated with deploying multimodal large language models (MLLM) in quadruped vision-language-action (QUAR-VLA) tasks. Our investigation reveals that conventional parameter reduction techniques ultimately impair the performance of the language foundation model during the action instruction tuning phase, making them unsuitable for this purpose. We introduce a novel latency-free quadruped MLLM model, dubbed QUART-Online, designed to enhance inference efficiency without degrading the performance of the language foundation model. By incorporating Action Chunk Discretization (ACD), we compress the original action representation space, mapping continuous action values onto a smaller set of discrete representative vectors while preserving critical information. Subsequently, we fine-tune the MLLM to integrate vision, language, and compressed actions into a unified semantic space. Experimental results demonstrate that QUART-Online operates in tandem with the existing MLLM system, achieving real-time inference in sync with the underlying controller frequency, significantly boosting the success rate across various tasks by 65%. Our project page is https://quart-online.github.io.
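The abstract describes mapping continuous action values onto a smaller set of discrete representative vectors. The sketch below shows one generic way such a mapping can work, nearest-neighbor lookup in a fixed codebook. It is not the authors' Action Chunk Discretization: the function names, the three-entry codebook, and the two-dimensional actions are assumptions chosen only to keep the example small.

```python
def nearest_code(action, codebook):
    """Index of the codebook vector closest to `action` (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(action, codebook[i]))


def discretize_chunk(chunk, codebook):
    """Replace every continuous action in a chunk with a discrete token."""
    return [nearest_code(action, codebook) for action in chunk]


# Hypothetical codebook of representative action vectors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# Hypothetical chunk of continuous two-dimensional actions.
chunk = [(0.1, -0.05), (0.9, 0.2), (0.2, 0.8)]

tokens = discretize_chunk(chunk, codebook)
reconstructed = [codebook[t] for t in tokens]
print(tokens)         # [0, 1, 2]
print(reconstructed)  # [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
```

In practice a codebook would typically be learned from data (for example by clustering), and the quantization error between `chunk` and `reconstructed` is the information lost by the compression.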

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2412.15576

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.43)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

Integrating Edge-AI in Structural Health Monitoring domain

Mishra, Anoop, Gangisetti, Gopinath, Khazanchi, Deepak

arXiv.org Artificial Intelligence | Apr-7-2023

Structural health monitoring (SHM) tasks like damage detection are crucial for decision-making regarding maintenance and deterioration. For example, crack detection in SHM is crucial for bridge maintenance, as crack progression can lead to structural instability. However, most AI/ML models in the literature suffer from high latency and slow inference when running in real-time environments. This study aims to explore the integration of edge-AI in the SHM domain for real-time bridge inspections. Based on the edge-AI literature, its capabilities would be a valuable addition to a real-time decision support system for SHM tasks, allowing real-time inference to be performed at physical sites. This study will utilize commercial edge-AI platforms, such as Google Coral Dev Board or Kneron KL520, to develop and analyze the effectiveness of edge-AI devices. Thus, this study proposes an edge-AI framework for the structural health monitoring domain. An edge-AI-compatible deep learning model is developed to validate the framework by performing real-time crack classification. The effectiveness of this model will be evaluated based on its accuracy, the confusion matrix generated, and the inference time observed in a real-time setting.

artificial intelligence, machine learning, real time system, (16 more...)

arXiv.org Artificial Intelligence

2304.03718

Country: North America > United States > Nebraska > Douglas County > Omaha (0.15)

Genre: Research Report (0.83)

Industry: Health & Medicine > Consumer Health (0.98)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Real-time Health Monitoring of Heat Exchangers using Hypernetworks and PINNs

Majumdar, Ritam, Jadhav, Vishal, Deodhar, Anirudh, Karande, Shirish, Vig, Lovekesh, Runkana, Venkataramana

arXiv.org Artificial Intelligence | Dec-20-2022

We demonstrate a Physics-informed Neural Network (PINN) based model for real-time health monitoring of a heat exchanger, which plays a critical role in improving the energy efficiency of thermal power plants. A hypernetwork-based approach is used to enable the domain-decomposed PINN to learn the thermal behavior of the heat exchanger in response to dynamic boundary conditions, eliminating the need to re-train. As a result, we achieve an orders-of-magnitude reduction in inference time in comparison to existing PINNs, while maintaining accuracy on par with physics-based simulations. This makes the approach very attractive for predictive maintenance of the heat exchanger in digital twin environments.

artificial intelligence, machine learning, pinn, (14 more...)

arXiv.org Artificial Intelligence

2212.10032

Genre: Research Report (0.82)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Binary Cycle Geothermal Power Plant (1.00)
Energy > Oil & Gas > Midstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

CPU Real-time Face Detection With Python

#artificialintelligence | Oct-4-2022, 05:05:09 GMT

Originally published on Towards AI. Is it possible to implement real-time performance object detection models without a GPU? MediaPipe face detection is a proof of concept that makes it possible to run single-class face detection in real-time on almost any CPU.

detection, face detection, tutorial, (13 more...)

#artificialintelligence

Genre: Instructional Material (0.49)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)

BRIGHT -- Graph Neural Networks in Real-Time Fraud Detection

Lu, Mingxuan, Han, Zhichao, Rao, Susie Xi, Zhang, Zitao, Zhao, Yang, Shan, Yinan, Raghunathan, Ramesh, Zhang, Ce, Jiang, Jiawei

arXiv.org Artificial Intelligence | Aug-24-2022

Detecting fraudulent transactions is an essential component to control risk in e-commerce marketplaces. Apart from rule-based and machine learning filters that are already deployed in production, we want to enable efficient real-time inference with graph neural networks (GNNs), which is useful to catch multihop risk propagation in a transaction graph. However, two challenges arise in the implementation of GNNs in production. First, future information in a dynamic graph should not be considered in message passing to predict the past. Second, the latency of graph query and GNN model inference is usually up to hundreds of milliseconds, which is costly for some critical online services. To tackle these challenges, we propose a Batch and Real-time Inception GrapH Topology (BRIGHT) framework to conduct an end-to-end GNN learning that allows efficient online real-time inference. BRIGHT framework consists of a graph transformation module (Two-Stage Directed Graph) and a corresponding GNN architecture (Lambda Neural Network). The Two-Stage Directed Graph guarantees that the information passed through neighbors is only from the historical payment transactions. It consists of two subgraphs representing historical relationships and real-time links, respectively. The Lambda Neural Network decouples inference into two stages: batch inference of entity embeddings and real-time inference of transaction prediction. Our experiments show that BRIGHT outperforms the baseline models by >2% on average with respect to precision. Furthermore, BRIGHT is computationally efficient for real-time fraud detection. Regarding end-to-end performance (including neighbor query and inference), BRIGHT can reduce the P99 latency by >75%. For the inference stage, our speedup is on average 7.8× compared to the traditional GNN.
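Two ideas from this abstract lend themselves to a small sketch: excluding future edges from neighbor aggregation, and splitting inference into an offline batch stage and a cheap online stage. The code below is a toy illustration under strong assumptions, not the BRIGHT framework. The `Edge` type, the function names, and the "embedding" (a simple count of historical in-neighbors) are hypothetical stand-ins for a real GNN.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    src: str
    dst: str
    timestamp: int


def historical_neighbors(edges, node, as_of):
    """In-neighbors of `node` via edges strictly older than `as_of`.

    Filtering on the timestamp keeps future information out of the
    aggregation, which is the constraint the abstract describes.
    """
    return [e.src for e in edges if e.dst == node and e.timestamp < as_of]


def batch_embeddings(edges, nodes, as_of):
    """Stage 1 (offline): precompute one value per entity.

    Toy stand-in for a learned embedding: the historical in-degree.
    """
    return {n: float(len(historical_neighbors(edges, n, as_of))) for n in nodes}


def score_transaction(buyer, seller, embeddings):
    """Stage 2 (online): a dictionary lookup plus a trivial combination."""
    return embeddings.get(buyer, 0.0) + embeddings.get(seller, 0.0)


edges = [Edge("a", "b", 1), Edge("c", "b", 5), Edge("d", "b", 9)]
embeddings = batch_embeddings(edges, ["a", "b"], as_of=6)
print(embeddings)                               # {'a': 0.0, 'b': 2.0}
print(score_transaction("a", "b", embeddings))  # 2.0
```

The edge with timestamp 9 is ignored because it lies after the `as_of` cutoff. The online path touches only precomputed values, which is the general reason a two-stage split can lower request latency.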

graph, inference, transaction, (16 more...)

arXiv.org Artificial Intelligence

2205.13084

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.05)
Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

How to Start a Career in AI

#artificialintelligence | Aug-8-2022, 15:01:52 GMT

How do I start a career as a deep learning engineer? What are some of the key tools and frameworks used in AI? How do I learn more about ethics in AI? Everyone has questions, but the most common questions in AI always return to this: how do I get involved? Cutting through the hype to share fundamental principles for building a career in AI, a group of AI professionals gathered at NVIDIA's GTC conference in the spring offered what may be the best place to start. Each panelist, in a conversation with NVIDIA's Louis Stewart, head of strategic initiatives for the developer ecosystem, came to the industry from very different places. But the speakers -- Katie Kallot, NVIDIA's former head of global developer relations and emerging areas; David Ajoku, founder of startup aware.ai;

find people, panelist, programming language, (14 more...)

#artificialintelligence

Country:

North America > Canada (0.05)
Europe > Finland > Uusimaa > Helsinki (0.05)

Genre:

Instructional Material (0.49)
Summary/Review (0.32)

Industry:

Information Technology > Hardware (1.00)
Education > Educational Setting > Online (0.72)
Education > Educational Technology > Educational Software > Computer Based Training (0.49)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.39)

Design Patterns in Machine Learning Code and Systems

#artificialintelligence | Jun-19-2022, 19:10:46 GMT

Design patterns are not just a way to structure code. They also communicate the problem addressed and how the code or component is intended to be used. Here are some patterns I've observed in machine learning code and systems, mostly from the Gang of Four design patterns book. Most developers have some familiarity with these patterns and having a basic understanding provides a shared vocabulary to discuss ideas on design and implementation. The factory pattern decouples objects, such as training data, from how they are created.
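To make the last point concrete, here is a minimal sketch of a factory that hides how training data is created from the code that consumes it. The names `make_dataset`, `load_csv`, `load_jsonl`, and the `_LOADERS` registry are assumptions invented for this example and do not come from the linked article.

```python
import csv
import io
import json


def load_csv(text):
    """Parse CSV text into a list of row dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def load_jsonl(text):
    """Parse JSON Lines text into a list of dictionaries."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Registry mapping a format name to the function that builds the dataset.
_LOADERS = {"csv": load_csv, "jsonl": load_jsonl}


def make_dataset(fmt, text):
    """Factory: callers name a format and never touch the parsing details."""
    try:
        loader = _LOADERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    return loader(text)


print(make_dataset("csv", "x,y\n1,2\n"))            # [{'x': '1', 'y': '2'}]
print(make_dataset("jsonl", '{"x": 1, "y": 2}\n'))  # [{'x': 1, 'y': 2}]
```

Training code that calls `make_dataset` does not change when a new format is added; only the registry does. Note that the CSV loader returns string values, so any type conversion would be a separate step.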

algorithm, decorator, widget, (11 more...)

#artificialintelligence

Industry: Education > Curriculum > Subject-Specific Education (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
