AITopics | mobile device

LightSpeed: Light and Fast Neural Light Fields on Mobile Devices

Real-time novel-view image synthesis on mobile devices is prohibitive due to limited computational power and storage. Volumetric rendering methods, such as NeRF and its derivatives, are unsuitable for mobile devices because of their high computational cost. On the other hand, recent advances in neural light field representations have shown promising real-time view synthesis results on mobile devices. Neural light field methods learn a direct mapping from a ray representation to the pixel color. The current choice of ray representation is either stratified ray sampling or Plücker coordinates, overlooking the classic light slab (two-plane) representation, the preferred representation for interpolating between light field views. In this work, we find that the light slab representation is an efficient representation for learning a neural light field. More importantly, it is a lower-dimensional ray representation, enabling us to learn the 4D ray space using feature grids, which are significantly faster to train and render. Although mostly designed for frontal views, we show that the light slab representation can be further extended to non-frontal scenes using a divide-and-conquer strategy. Our method provides better rendering quality than prior light field methods and a significantly better trade-off between rendering quality and speed.
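As a rough illustration of the light slab (two-plane) representation named in the abstract above, the sketch below maps a ray to four coordinates by intersecting it with two parallel planes. The function name `light_slab_coords`, the plane positions `z_near` and `z_far`, and the choice of planes perpendicular to the z-axis are all assumptions made for this example and are not taken from the paper.

```python
def light_slab_coords(origin, direction, z_near=0.0, z_far=1.0):
    """Map a ray to (u, v, s, t): its hit points on the planes z=z_near and z=z_far.

    Hypothetical helper for illustration; plane placement is an assumption.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < 1e-9:
        # A ray parallel to the planes never crosses them, so it has no
        # two-plane coordinates in this slab.
        raise ValueError("ray is parallel to the slab planes")
    # Ray parameter at which the ray reaches each plane.
    t_near = (z_near - oz) / dz
    t_far = (z_far - oz) / dz
    # (u, v) is the hit on the first plane, (s, t) the hit on the second.
    u, v = ox + t_near * dx, oy + t_near * dy
    s, t = ox + t_far * dx, oy + t_far * dy
    return (u, v, s, t)


coords = light_slab_coords(origin=(0.0, 0.0, -1.0), direction=(0.5, 0.0, 1.0))
print(coords)
```

The four numbers replace the six needed for an origin and a direction, which is the sense in which the representation is lower-dimensional. The parallel-ray failure case also hints at why a single slab suits frontal views best.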

fast neural light field, name change, representation, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.76)

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Neural Information Processing Systems | Dec-24-2025, 21:13:57 GMT

Text-to-image diffusion models can create stunning images from natural language descriptions that rival the work of professional artists and photographers. However, these models are large, with complex network architectures and tens of denoising iterations, making them computationally expensive and slow to run. As a result, high-end GPUs and cloud-based inference are required to run diffusion models at scale. This is costly and has privacy implications, especially when user data is sent to a third party. To overcome these challenges, we present a generic approach that, for the first time, unlocks running text-to-image diffusion models on mobile devices in less than 2 seconds. We achieve this by introducing an efficient network architecture and improving step distillation. Specifically, we propose an efficient UNet by identifying the redundancy of the original model and reducing the computation of the image decoder via data distillation. Further, we enhance the step distillation by exploring training strategies and introducing regularization from classifier-free guidance. Our extensive experiments on MS-COCO show that our model with 8 denoising steps achieves better FID and CLIP scores than Stable Diffusion v1.5 with 50 steps.
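For readers unfamiliar with classifier-free guidance, which the abstract above mentions, the sketch below shows the standard way the conditional and unconditional noise predictions are combined at sampling time. It is not the SnapFusion distillation loss, which the abstract does not spell out. The function name `cfg_combine` and the use of plain Python lists in place of tensors are assumptions made for this example.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Standard classifier-free guidance: move from the unconditional
    prediction toward the conditional one, scaled by guidance_scale.

    Hypothetical helper; inputs are flat lists standing in for tensors.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# A scale of 1.0 reproduces the conditional prediction; larger values
# extrapolate past it.
guided = cfg_combine([0.0, 1.0], [1.0, 3.0], guidance_scale=2.0)
print(guided)
```

Each guided step needs both a conditional and an unconditional prediction, which is one reason guidance is costly on a mobile device.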

mobile device, name change, text-to-image diffusion model, (7 more...)

Neural Information Processing Systems

Industry: Information Technology (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

EfficientFormer: Vision Transformers at MobileNet Speed

Neural Information Processing Systems | Dec-24-2025, 05:32:18 GMT

Vision Transformers (ViT) have shown rapid progress in computer vision tasks, achieving promising results on various benchmarks. However, due to the massive number of parameters and the model design (e.g., the attention mechanism), ViT-based models are generally several times slower than lightweight convolutional networks. Therefore, the deployment of ViT for real-time applications is particularly challenging, especially on resource-constrained hardware such as mobile devices. Recent efforts try to reduce the computational complexity of ViT through network architecture search or hybrid designs with MobileNet blocks, yet the inference speed is still unsatisfactory. This leads to an important question: can transformers run as fast as MobileNet while obtaining high performance?

efficientformer, name change, vision transformer, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Unified Class and Domain Incremental Learning with Mixture of Experts for Indoor Localization

Singampalli, Akhil; Pasricha, Sudeep

arXiv.org Artificial Intelligence | Nov-25-2025

Indoor localization using machine learning has gained traction due to the growing demand for location-based services. However, its long-term reliability is hindered by hardware/software variations across mobile devices, which alter the model's input distribution and create domain shifts. Further, evolving indoor environments can introduce new locations over time, expanding the output space and creating class shifts, which makes static machine learning models ineffective over time. To address these challenges, we propose a novel unified continual learning framework for indoor localization called MOELO that, for the first time, jointly addresses domain-incremental and class-incremental learning scenarios. MOELO enables a lightweight, robust, and adaptive localization solution that can be deployed on resource-limited mobile devices and is capable of continual learning in dynamic, heterogeneous real-world settings. This is made possible by a mixture-of-experts architecture in which experts are incrementally trained per region and selected through a gating mechanism based on an equiangular tight frame, ensuring efficient routing and low-latency inference, all within a compact model footprint. Experimental evaluations show that MOELO achieves improvements of up to 25.6x in mean localization error, 44.5x in worst-case localization error, and 21.5x less forgetting compared to state-of-the-art frameworks across diverse buildings, mobile devices, and learning scenarios.
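To give a feel for what an equiangular tight frame is, the sketch below builds a simplex ETF (k unit vectors with identical pairwise inner products of -1/(k-1)) and routes a feature to the expert whose frame vector scores highest. How MOELO actually builds and uses its gate is not stated in the abstract, so the routing rule and the names `simplex_etf` and `route` are assumptions made purely for illustration.

```python
import math


def simplex_etf(k):
    """Return k unit vectors in R^k with pairwise inner product -1/(k-1)."""
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def route(feature, etf):
    """Hypothetical gate: pick the expert whose frame vector best matches the feature."""
    scores = [dot(feature, v) for v in etf]
    return max(range(len(etf)), key=scores.__getitem__)


etf = simplex_etf(4)
print(dot(etf[0], etf[0]), dot(etf[0], etf[1]))  # approximately 1 and -1/3
print(route(etf[2], etf))  # a feature aligned with expert 2 is sent to expert 2
```

Because the frame vectors are fixed and maximally separated, a gate of this kind needs no learned routing weights per expert, which fits the compact-footprint goal described above.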

artificial intelligence, localization, machine learning, (14 more...)

arXiv.org Artificial Intelligence

arXiv:2511.17829

Country: North America > United States > Colorado (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Pelee: A Real-Time Object Detection System on Mobile Devices

Neural Information Processing Systems | Nov-20-2025, 22:42:33 GMT

The increasing need to run Convolutional Neural Network (CNN) models on mobile devices with limited computing power and memory resources encourages studies on efficient model design. A number of efficient architectures have been proposed in recent years, for example, MobileNet, ShuffleNet, and MobileNetV2. However, all these models depend heavily on depthwise separable convolution, which lacks an efficient implementation in most deep learning frameworks. In this study, we propose an efficient architecture named PeleeNet, which is built with conventional convolution instead.
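The contrast the abstract draws between depthwise separable and conventional convolution can be made concrete with a weight count. The sketch below uses the standard formulas for a single layer without bias; the function names and the example layer sizes (3x3 kernel, 32 input channels, 64 output channels) are assumptions chosen for illustration and do not describe any PeleeNet layer.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a conventional convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (kernel x kernel per input channel) plus pointwise (1x1) pair."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable)  # 18432 2336
```

The separable form is far smaller on paper, which explains its popularity in MobileNet-style designs. The abstract's point is that this advantage only turns into speed when the framework implements the depthwise step efficiently.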

name change, pelee, real-time object detection system, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)


GhostNetV2: Enhance Cheap Operation with Long-Range Attention

LightSpeed: Light and Fast Neural Light Fields on Mobile Devices
