Real-Time Systems
realSEUDO for real-time calcium imaging analysis
Closed-loop neuroscience experimentation, where recorded neural activity is used to modify the experiment on-the-fly, is essential for deducing causal connections and optimizing experimental time. A critical step in creating a closed-loop experiment is real-time inference of neural activity from streaming recordings. One challenging modality for real-time processing is multi-photon calcium imaging (CI). CI enables the recording of activity in large populations of neurons; however, it typically requires batch processing of the video data to extract single-neuron activity from the fluorescence videos. We use the recently proposed robust time-trace estimator, the Sparse Emulation of Unused Dictionary Objects (SEUDO) algorithm, as the basis for a new online processing algorithm that simultaneously identifies neurons in the fluorescence video and infers their time traces in a way that is robust to as-yet unidentified neurons. To achieve real-time SEUDO (realSEUDO), we optimize the core estimator via both algorithmic improvements and a fast C-based implementation, and create a new cell-finding loop to enable realSEUDO to also identify new cells. We demonstrate comparable performance to offline algorithms (e.g., CNMF), and improved performance over the current online approach (OnACID), at speeds of 120 Hz on average.
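At its core, SEUDO estimates each frame's per-cell activity while absorbing fluorescence from not-yet-identified sources into a sparse dictionary of small blobs, so that unexplained signal does not contaminate the known cells' traces. Below is a minimal NumPy sketch of that robust estimation step, assuming flattened frames and precomputed spatial profiles; it uses a generic ISTA-style projected-gradient solver, not the authors' optimized C implementation, and all names are illustrative.

```python
import numpy as np

def seudo_like_trace(frame, profiles, blob_dict, lam=0.1, n_iter=200):
    """Robust per-frame activity estimate in the spirit of SEUDO (a sketch,
    not the realSEUDO implementation). Solves
        min_{x, w >= 0} ||y - A x - B w||^2 + lam * ||w||_1
    where A holds known cell profiles and B a dictionary of small blobs
    that soak up fluorescence from unidentified cells.

    frame:     (P,) flattened fluorescence frame y
    profiles:  (P, N) known cell spatial profiles A
    blob_dict: (P, M) small Gaussian blobs B
    """
    D = np.hstack([profiles, blob_dict])
    n = profiles.shape[1]
    x = np.zeros(D.shape[1])
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1/L for the smooth term
    for _ in range(n_iter):
        x -= step * (D.T @ (D @ x - frame))   # gradient step
        x[n:] -= step * lam                   # soft-threshold blob weights only
        np.maximum(x, 0.0, out=x)             # nonnegativity projection
    return x[:n]                              # robust per-cell activities
```

Running this once per incoming frame yields the time traces; realSEUDO's contribution is making this inner loop fast enough, and adding a cell-finding loop, for real-time use.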
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis Pengyi Jiao, Xingchao Yang
Volumetric rendering-based methods, like NeRF, excel in HDR view synthesis from RAW images, especially for nighttime scenes. However, they suffer from long training times and cannot perform real-time rendering due to dense sampling requirements. The advent of 3D Gaussian Splatting (3DGS) enables real-time rendering and faster training. However, implementing RAW image-based view synthesis directly using 3DGS is challenging due to its inherent drawbacks: 1) in nighttime scenes, extremely low SNR leads to poor structure-from-motion (SfM) estimation in distant views; 2) the limited representation capacity of the spherical harmonics (SH) function is unsuitable for the RAW linear color space; and 3) inaccurate scene structure hampers downstream tasks such as refocusing. To address these issues, we propose LE3D (Lighting Every darkness with 3DGS). Our method introduces Cone Scatter Initialization to enrich the SfM estimation and replaces SH with a Color MLP to represent the RAW linear color space. Additionally, we introduce depth distortion and near-far regularizations to improve the accuracy of scene structure for downstream tasks. These designs enable LE3D to perform real-time novel view synthesis, HDR rendering, refocusing, and tone-mapping changes. Compared to previous volumetric rendering-based methods, LE3D reduces training time to 1% of theirs and improves rendering speed (FPS) by up to 4,000 times for 2K-resolution images.
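To make the SH-replacement concrete, here is a hedged PyTorch sketch of a per-Gaussian color head that maps a learned feature vector and the viewing direction to nonnegative linear-space RGB; the layer sizes and softplus output are illustrative assumptions, not LE3D's released architecture.

```python
import torch
import torch.nn as nn

class ColorMLP(nn.Module):
    """Tiny per-Gaussian color head replacing spherical harmonics: a learned
    feature plus the view direction maps to linear-space RGB. Sizes are
    illustrative, not LE3D's."""
    def __init__(self, feat_dim=24, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, feat, view_dir):
        # feat: (N, feat_dim) per-Gaussian features;
        # view_dir: (N, 3) unit vectors from camera to each Gaussian
        x = torch.cat([feat, view_dir], dim=-1)
        # softplus keeps radiance nonnegative in the RAW linear color space
        return nn.functional.softplus(self.net(x))
```

Unlike low-order SH, the MLP is not band-limited, which matters when colors live in a linear space with a very large dynamic range.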
CALANet: Cheap All-Layer Aggregation for Human Activity Recognition Jaegyun Park, Dae-Won Kim
With the steady growth of sensing technology and wearable devices, sensor-based human activity recognition (HAR) has become essential in widespread applications, such as healthcare monitoring and fitness tracking, where accurate and real-time systems are required. To achieve real-time response, recent studies have focused on lightweight neural network models, designing architectures that keep the number of layers small or restrict the connections of each layer. However, these approaches suffer from limited accuracy because the classifier uses only the features of the last layer. In this study, we propose a cheap all-layer aggregation network, CALANet, which improves accuracy while maintaining the efficiency of existing real-time HAR models. Specifically, CALANet allows the classifier to aggregate the features of all layers, resulting in a performance gain. In addition, this work proves that the theoretical computation cost of CALANet is equivalent to that of conventional networks. Evaluated on seven publicly available datasets, CALANet outperformed existing methods, achieving state-of-the-art performance.
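The aggregation idea itself is simple: instead of classifying from the last block's features alone, pool the output of every block and feed the concatenation to the classifier. The PyTorch sketch below shows that pattern for 1D sensor streams; the depths, widths, and pooling choice are illustrative assumptions, not CALANet's exact design (which additionally keeps the aggregation cost cheap).

```python
import torch
import torch.nn as nn

class AllLayerAggregationNet(nn.Module):
    """Sketch of all-layer aggregation for sensor-based HAR: the classifier
    sees pooled features from every conv block, not just the last one.
    Channel counts and depth are illustrative, not CALANet's."""
    def __init__(self, in_ch=6, n_classes=12, widths=(32, 64, 128)):
        super().__init__()
        blocks, prev = [], in_ch
        for w in widths:
            blocks.append(nn.Sequential(
                nn.Conv1d(prev, w, kernel_size=5, padding=2),
                nn.BatchNorm1d(w), nn.ReLU()))
            prev = w
        self.blocks = nn.ModuleList(blocks)
        self.head = nn.Linear(sum(widths), n_classes)

    def forward(self, x):                       # x: (B, channels, time)
        pooled = []
        for block in self.blocks:
            x = block(x)
            pooled.append(x.mean(dim=-1))       # global average pool per layer
        return self.head(torch.cat(pooled, dim=-1))
```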
AIhub monthly digest: May 2025 – materials design, object state classification, and real-time monitoring for healthcare data
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we learn about drug and material design using generative models and Bayesian optimization, find out about a system for real-time monitoring of healthcare data, and explore domain-specific distribution shifts in volunteer-collected biodiversity datasets. Ananya Joshi recently completed her PhD, where she developed a system that experts have used for the past two years to identify respiratory outbreaks (like COVID-19) in large-scale healthcare streams across the United States. In this interview, she tells us more about this project, how healthcare applications inspire basic AI research, and her future plans. Onur Boyar is a PhD student at Nagoya University, working on generative models and Bayesian methods for materials and drug design.
Real-Time Selection Under General Constraints via Predictive Inference Lin Lu
Real-time decision-making has attracted increasing attention in the big data era. Here, we consider the problem of sample selection in the online setting, where one encounters a possibly infinite sequence of individuals collected over time, with covariate information available. The goal is to select samples of interest, as characterized by their unobserved responses, up until a user-specified stopping time. We derive a new decision rule that finds more preferable samples meeting practical requirements by simultaneously controlling two types of general constraints: individual and interactive constraints, which include the widely used False Selection Rate (FSR), cost limitations, diversity of selected samples, etc. The key elements of our approach are quantifying the uncertainty of response predictions via predictive inference and addressing individual and interactive constraints in a sequential manner. Theoretical and numerical results demonstrate the effectiveness of the proposed method in controlling both individual and interactive constraints.
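A common way to quantify the uncertainty of response predictions is via conformal p-values computed against a held-out calibration set. The sketch below combines that with a simple running false-selection estimate to decide each arrival online; it is an illustrative rule under these assumptions, not the paper's exact procedure.

```python
import numpy as np

def online_select(scores, calib_scores, alpha=0.1):
    """Illustrative online selection via predictive inference (a sketch, not
    the paper's decision rule): each arriving individual gets a conformal
    p-value from the calibration scores, and is selected only if the running
    average p-value of the selected set stays below alpha."""
    calib = np.sort(np.asarray(calib_scores))
    n = len(calib)
    selected, pvals = [], []
    for t, s in enumerate(scores):
        # conformal p-value: fraction of calibration scores at least as large
        p = (np.sum(calib >= s) + 1) / (n + 1)
        pvals.append(p)
        # select only if the estimated false-selection proportion stays small
        k = len(selected) + 1
        if (sum(pvals[i] for i in selected) + p) / k <= alpha:
            selected.append(t)
    return selected
```

Interactive constraints such as cost or diversity would enter this loop as extra conditions on the candidate set before a selection is accepted.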
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates Hexuan Deng, Min Zhang
Existing benchmarks for large language models (LLMs) focus on outdated content and limited fields; they are difficult to update in real time and leave new terms unexplored. To address this problem, we propose an adaptive benchmark, NewTerm, for real-time evaluation of new terms. We design a highly automated construction method that ensures high-quality benchmark construction with minimal human effort, allowing flexible updates for real-time information. Empirical results on various LLMs demonstrate a performance reduction of over 20% caused by new terms. Additionally, while updates to the knowledge cutoff of LLMs can cover some of the new terms, they are unable to generalize to more distant new terms. We also analyze which types of terms are more challenging and why LLMs struggle with new terms, paving the way for future research. Finally, we construct NewTerm 2022 and 2023 to evaluate the terms newly coined in each year, and will continue updating the benchmark annually.
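As a rough illustration of what evaluating LLMs on new terms looks like, here is a hypothetical multiple-choice harness; `model_fn`, the dataclass fields, and the prompt format are all assumptions for the sketch, not NewTerm's released evaluation code.

```python
from dataclasses import dataclass

@dataclass
class TermQuestion:
    term: str            # the newly coined term under test
    question: str        # a question probing understanding of the term
    choices: list[str]   # answer options
    answer: int          # index of the correct option

def evaluate(model_fn, questions):
    """Accuracy of a model over new-term questions. `model_fn(prompt)` is a
    hypothetical client returning the index of the chosen option."""
    correct = 0
    for q in questions:
        prompt = (f"Term: {q.term}\n{q.question}\n" +
                  "\n".join(f"{i}. {c}" for i, c in enumerate(q.choices)) +
                  "\nAnswer with the option number.")
        if model_fn(prompt) == q.answer:
            correct += 1
    return correct / len(questions)
```

Comparing this accuracy between pre-cutoff and post-cutoff terms is what surfaces the reported performance gap.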
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer Jian Wang, Qiman Wu
Recently, transformer-based networks have shown impressive results in semantic segmentation. Yet for real-time semantic segmentation, pure CNN-based approaches still dominate the field, due to the time-consuming computation mechanism of transformers. We propose RTFormer, an efficient dual-resolution transformer for real-time semantic segmentation, which achieves a better trade-off between performance and efficiency than CNN-based models. To achieve high inference efficiency on GPU-like devices, our RTFormer leverages GPU-Friendly Attention with linear complexity and discards the multi-head mechanism. Besides, we find that cross-resolution attention is more efficient at gathering global context information for the high-resolution branch by spreading the high-level knowledge learned in the low-resolution branch. Extensive experiments on mainstream benchmarks demonstrate the effectiveness of our proposed RTFormer: it achieves state-of-the-art performance on Cityscapes, CamVid and COCO-Stuff, and shows promising results on ADE20K.
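To see why single-head attention against a small learned memory is linear in the number of pixels, consider the PyTorch sketch below, written in the spirit of GPU-Friendly Attention; the external key/value parameters, dimensions, and double normalization (borrowed from external attention) are assumptions for illustration, not RTFormer's exact module.

```python
import torch
import torch.nn as nn

class GPUFriendlyAttentionSketch(nn.Module):
    """Single-head, linear-complexity attention sketch: N queries attend to
    M learned external keys/values with M << N, so cost is O(N * M) rather
    than O(N^2). Illustrative, not RTFormer's released implementation."""
    def __init__(self, dim=64, n_external=128):
        super().__init__()
        self.q = nn.Linear(dim, dim, bias=False)
        self.k = nn.Parameter(torch.randn(n_external, dim) * dim ** -0.5)
        self.v = nn.Parameter(torch.randn(n_external, dim) * dim ** -0.5)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (B, N, dim), N = H*W
        attn = self.q(x) @ self.k.t()          # (B, N, M): linear in N
        attn = torch.softmax(attn, dim=-1)
        # second normalization across positions stabilizes the attention map
        attn = attn / (attn.sum(dim=1, keepdim=True) + 1e-6)
        return self.proj(attn @ self.v)        # (B, N, dim)
```

Dropping multi-head splits keeps the matrix multiplications large and contiguous, which is what makes this formulation friendly to GPU execution.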
FOF: Learning Fourier Occupancy Field for Monocular Real-time Human Reconstruction
The advent of deep learning has led to significant progress in monocular human reconstruction. However, existing representations, such as parametric models, voxel grids, meshes and implicit neural representations, have difficulty achieving high-quality results and real-time speed at the same time. In this paper, we propose the Fourier Occupancy Field (FOF), a novel, powerful, efficient and flexible 3D geometry representation for monocular real-time and accurate human reconstruction. A FOF represents a 3D object with a 2D field orthogonal to the view direction: at each 2D position, the occupancy of the object along the viewing direction is compactly represented by the first few terms of its Fourier series, which retains the topology and neighborhood relations in the 2D domain. A FOF can be stored as a multi-channel image, which is compatible with 2D convolutional neural networks and can bridge the gap between 3D geometries and 2D images. A FOF is very flexible and extensible; e.g., parametric models can be easily integrated into a FOF as a prior to generate more robust results. Meshes and our FOF can be easily inter-converted. Based on FOF, we design the first high-fidelity real-time monocular human reconstruction framework, running at over 30 FPS. We demonstrate the potential of FOF on both public datasets and real captured data.
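The per-pixel encoding is just a truncated Fourier series of the 1D occupancy profile along each viewing ray. The NumPy sketch below encodes and decodes one such profile; the sampling density, number of terms, and 0.5 decision threshold are illustrative assumptions rather than the paper's exact pipeline.

```python
import numpy as np

def fof_encode(occupancy, n_terms=16):
    """Compress a 1D occupancy profile along a viewing ray into the first
    few Fourier coefficients (one pixel of a FOF-style multi-channel image).
    occupancy: (Z,) values in {0, 1} sampled along the ray."""
    z = occupancy.shape[0]
    t = np.linspace(0.0, 1.0, z, endpoint=False)
    coeffs = [np.mean(occupancy)]  # constant (DC) term
    for k in range(1, n_terms):
        coeffs.append(2 * np.mean(occupancy * np.cos(2 * np.pi * k * t)))
        coeffs.append(2 * np.mean(occupancy * np.sin(2 * np.pi * k * t)))
    return np.array(coeffs)

def fof_decode(coeffs, z=256):
    """Reconstruct the occupancy profile from its truncated Fourier series;
    thresholding at 0.5 recovers the inside/outside decision."""
    t = np.linspace(0.0, 1.0, z, endpoint=False)
    out = np.full(z, coeffs[0])
    n_terms = (len(coeffs) - 1) // 2 + 1
    for k in range(1, n_terms):
        out += coeffs[2 * k - 1] * np.cos(2 * np.pi * k * t)
        out += coeffs[2 * k] * np.sin(2 * np.pi * k * t)
    return out
```

Stacking these coefficient vectors over all pixels yields the multi-channel image that a 2D CNN can predict directly from a photograph.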