vram


DiskChunGS: Large-Scale 3D Gaussian SLAM Through Chunk-Based Memory Management

Feldmann, Casimir, Wilder-Smith, Maximum, Patil, Vaishakh, Oechsle, Michael, Niemeyer, Michael, Tateno, Keisuke, Hutter, Marco

arXiv.org Artificial Intelligence

Abstract--Recent advances in 3D Gaussian Splatting (3DGS) have demonstrated impressive results for novel view synthesis with real-time rendering capabilities. However, integrating 3DGS with SLAM systems faces a fundamental scalability limitation: methods are constrained by GPU memory capacity, restricting reconstruction to small-scale environments. We present DiskChunGS, a scalable 3DGS SLAM system that overcomes this bottleneck through an out-of-core approach that partitions scenes into spatial chunks and maintains only active regions in GPU memory while storing inactive areas on disk. Our architecture integrates seamlessly with existing SLAM frameworks for pose estimation and loop closure, enabling globally consistent reconstruction at scale. Our method uniquely completes all 11 KITTI sequences without memory failures while achieving superior visual quality, demonstrating that algorithmic innovation can overcome the memory constraints that have limited previous 3DGS SLAM methods.

Recent advances in neural representations for 3D scene reconstruction have revolutionized novel view synthesis, with 3D Gaussian Splatting (3DGS) [1] emerging as an exceptionally efficient and high-quality approach. Unlike volume-based methods [2]-[4] that struggle with rendering speed due to expensive ray marching, 3DGS provides real-time rendering capabilities while maintaining impressive visual fidelity.
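The out-of-core idea in the abstract -- active chunks in GPU memory, inactive chunks spilled to disk -- can be sketched as an LRU chunk cache. This is a minimal illustration of the general pattern, not DiskChunGS's actual implementation; the `ChunkCache` class, its capacity parameter, and pickle-based spilling are all assumptions for the sketch.

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class ChunkCache:
    """Illustrative out-of-core chunk manager: keeps at most `capacity`
    spatial chunks "resident" (here: a plain dict standing in for GPU
    memory) and spills the least recently used chunk to disk,
    reloading it transparently on the next access."""

    def __init__(self, capacity, spill_dir=None):
        self.capacity = capacity
        self.active = OrderedDict()  # chunk_id -> Gaussian parameters
        self.spill_dir = spill_dir or tempfile.mkdtemp()

    def _path(self, chunk_id):
        return os.path.join(self.spill_dir, f"chunk_{chunk_id}.pkl")

    def get(self, chunk_id):
        if chunk_id in self.active:           # hit: mark most recently used
            self.active.move_to_end(chunk_id)
            return self.active[chunk_id]
        with open(self._path(chunk_id), "rb") as f:  # miss: load from disk
            data = pickle.load(f)
        self.put(chunk_id, data)
        return data

    def put(self, chunk_id, data):
        self.active[chunk_id] = data
        self.active.move_to_end(chunk_id)
        while len(self.active) > self.capacity:      # evict LRU chunk to disk
            old_id, old_data = self.active.popitem(last=False)
            with open(self._path(old_id), "wb") as f:
                pickle.dump(old_data, f)
```

In a real system the eviction policy would follow the camera trajectory (spatial proximity) rather than pure recency, and the payload would be GPU tensors rather than pickled dicts.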


A LoD of Gaussians: Unified Training and Rendering for Ultra-Large Scale Reconstruction with External Memory

Windisch, Felix, Köhler, Thomas, Radl, Lukas, Steiner, Michael, Schmalstieg, Dieter, Steinberger, Markus

arXiv.org Artificial Intelligence

Gaussian Splatting has emerged as a high-performance technique for novel view synthesis, enabling real-time rendering and high-quality reconstruction of small scenes. However, scaling to larger environments has so far relied on partitioning the scene into chunks -- a strategy that introduces artifacts at chunk boundaries, complicates training across varying scales, and is poorly suited to unstructured scenarios such as city-scale flyovers combined with street-level views. Moreover, rendering remains fundamentally limited by GPU memory, as all visible chunks must reside in VRAM simultaneously. We introduce A LoD of Gaussians, a framework for training and rendering ultra-large-scale Gaussian scenes on a single consumer-grade GPU -- without partitioning. Our method stores the full scene out-of-core (e.g., in CPU memory) and trains a Level-of-Detail (LoD) representation directly, dynamically streaming only the relevant Gaussians. A hybrid data structure combining Gaussian hierarchies with Sequential Point Trees enables efficient, view-dependent LoD selection, while a lightweight caching and view scheduling system exploits temporal coherence to support real-time streaming and rendering. Together, these innovations enable seamless multi-scale reconstruction and interactive visualization of complex scenes -- from broad aerial views to fine-grained ground-level details.
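The view-dependent LoD selection described above can be illustrated with a toy hierarchy traversal: descend into finer Gaussians only while a node's projected screen-space extent exceeds a pixel budget. This is a sketch of the general refine-or-stop criterion, not the paper's hybrid hierarchy/Sequential-Point-Tree structure; the node dictionary layout and pinhole projection are assumptions.

```python
import math

def select_lod(nodes, cam_pos, pixel_threshold, focal):
    """Toy view-dependent LoD selection: recurse into a Gaussian
    hierarchy only while a node's projected extent exceeds the pixel
    threshold; otherwise emit the coarse node as-is."""
    selected = []
    stack = list(nodes)                      # roots of the hierarchy
    while stack:
        node = stack.pop()
        dx = [a - b for a, b in zip(node["center"], cam_pos)]
        dist = max(math.sqrt(sum(d * d for d in dx)), 1e-6)
        projected = focal * node["extent"] / dist  # pinhole size on screen
        if projected > pixel_threshold and node.get("children"):
            stack.extend(node["children"])   # refine: descend to finer level
        else:
            selected.append(node["id"])      # coarse enough: render this node
    return selected
```

Only the selected nodes need to be streamed into VRAM for the current view, which is what keeps the resident working set small for both aerial and street-level cameras.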


Prima.cpp: Fast 30-70B LLM Inference on Heterogeneous and Low-Resource Home Clusters

Li, Zonghang, Li, Tao, Feng, Wenjiao, Xiao, Rongxing, She, Jianshu, Huang, Hong, Guizani, Mohsen, Yu, Hongfang, Ho, Qirong, Xiang, Wei, Liu, Steve

arXiv.org Artificial Intelligence

On-device inference offers privacy, offline use, and instant response, but consumer hardware restricts large language models (LLMs) to low throughput and capability. To overcome this challenge, we present prima.cpp, a distributed on-device inference system that runs 30-70B LLMs on consumer home clusters with mixed CPUs/GPUs, insufficient RAM/VRAM, slow disks, Wi-Fi links, and heterogeneous OSs. We introduce pipelined-ring parallelism (PRP) to overlap disk I/O with compute and communication, and address the prefetch-release conflict in mmap-based offloading. We further propose Halda, a heterogeneity-aware scheduler that co-optimizes per-device CPU/GPU workloads and device selection under RAM/VRAM constraints. On four consumer home devices, a 70B model reaches 674 ms/token TPOT with <6% memory pressure, and a 32B model with speculative decoding achieves 26 tokens/s. Compared with llama.cpp, exo, and dllama, our proposed prima.cpp achieves 5-17x lower TPOT, supports fine-grained model sizes from 8B to 70B, ensures broader cross-OS and quantization compatibility, and remains OOM-free, while also being Wi-Fi tolerant, privacy-preserving, and hardware-independent. The code is available at https://gitee.com/zonghang-li/prima.cpp.
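The core trick of overlapping disk I/O with compute can be sketched with a prefetch thread: while the main thread runs layer i, a loader thread is already reading layer i+1 from disk. This is only an illustration of the overlap principle under assumed `load_layer`/`compute_layer` callables, not prima.cpp's actual pipelined-ring parallelism or its mmap handling.

```python
import threading
import queue

def pipelined_layers(load_layer, compute_layer, n_layers, x):
    """Toy pipelined execution: a loader thread prefetches upcoming
    layer weights while the main thread computes the current layer,
    hiding disk-load latency behind computation."""
    prefetched = queue.Queue(maxsize=2)      # small prefetch window

    def loader():
        for i in range(n_layers):
            prefetched.put((i, load_layer(i)))  # may block on slow disk

    threading.Thread(target=loader, daemon=True).start()
    for _ in range(n_layers):
        i, weights = prefetched.get()        # next layer is (often) ready
        x = compute_layer(weights, x)        # loader keeps running meanwhile
    return x
```

The bounded queue caps memory pressure: at most two layers' weights are in flight beyond the one being computed.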


MEGS$^{2}$: Memory-Efficient Gaussian Splatting via Spherical Gaussians and Unified Pruning

Chen, Jiarui, Chen, Yikeng, Zou, Yingshuang, Huang, Ye, Wang, Peng, Liu, Yuan, Sun, Yujing, Wang, Wenping

arXiv.org Artificial Intelligence

3D Gaussian Splatting (3DGS) has emerged as a dominant novel-view synthesis technique, but its high memory consumption severely limits its applicability on edge devices. A growing number of 3DGS compression methods have been proposed to make 3DGS more efficient, yet most only focus on storage compression and fail to address the critical bottleneck of rendering memory. To address this problem, we introduce MEGS$^{2}$, a novel memory-efficient framework that tackles this challenge by jointly optimizing two key factors: the total primitive number and the parameters per primitive, achieving unprecedented memory compression. Specifically, we replace the memory-intensive spherical harmonics with lightweight, arbitrarily oriented spherical Gaussian lobes as our color representations. More importantly, we propose a unified soft pruning framework that models primitive-number and lobe-number pruning as a single constrained optimization problem. Experiments show that MEGS$^{2}$ achieves a 50% static VRAM reduction and a 40% rendering VRAM reduction compared to existing methods, while maintaining comparable rendering quality. Project page: https://megs-2.github.io/
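Replacing spherical harmonics with spherical Gaussian lobes can be illustrated with the standard SG parameterization (amplitude a, axis mu, sharpness lambda): c(d) = sum_k a_k * exp(lambda_k * (dot(d, mu_k) - 1)). This is an illustrative stand-in for the paper's arbitrarily oriented lobes, not its exact formulation; lobe count and parameter layout here are assumptions.

```python
import math

def sg_color(lobes, view_dir):
    """Evaluate a view-dependent color from spherical Gaussian lobes:
        c(d) = sum_k a_k * exp(lam_k * (dot(d, mu_k) - 1))
    Each lobe is (rgb_amplitude, axis, sharpness); the weight peaks
    at 1.0 when the view direction aligns with the lobe axis."""
    color = [0.0, 0.0, 0.0]
    for a, mu, lam in lobes:
        cos = sum(d * m for d, m in zip(view_dir, mu))
        w = math.exp(lam * (cos - 1.0))      # in (0, 1], max when d == mu
        for c in range(3):
            color[c] += a[c] * w
    return color
```

The memory saving comes from the parameter count: a few SG lobes need far fewer coefficients per primitive than a degree-3 SH expansion (48 floats for RGB).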


Intel's new configurable VRAM option gives Core laptops an AI boost

PCWorld

For many months, AMD offered a special treat to enthusiasts wishing to run AI chatbot LLMs on their PCs: configurable VRAM that significantly improved performance. Now Intel can say the same. Bob Duffy, who oversees Intel's AI Playground application for running AI art and local chatbots on your PC, tweeted that the company's latest Arc driver for its integrated GPUs now offers a "shared GPU memory override" that provides the ability to adjust your PC's VRAM, provided that you have a supported processor. This is a big deal for AI and even some games, though not an obvious one. If you owned an Intel Core laptop with 32GB of memory, for example, 16GB of it could be assigned to AI and games.


Framework Desktop review: A powerful AI PC, made with love

PCWorld

The Framework Desktop DIY Edition is a thoughtfully engineered small-form-factor desktop PC that is both an entry point into enthusiast computing and a powerful AI desktop in its own right. The Framework Desktop DIY Edition is unique: a do-it-yourself desktop without the complexity of building from scratch, forming a compact, personalized "AI workstation." If you're nervous about a less-familiar brand, don't be. Multiple photos show how to tighten a thumbscrew -- that's how comfortable they want you to be. I can point to a few things that I thought needed improvement: soldered memory, a beta driver bundle that should be finalized by the time you buy it, and a top panel which didn't clip in as easily as I would have liked. Inserting the SSD stressed me out a bit, too. But Framework's eye for customization (colored tiles you can design and install yourself, plus your choice of I/O) lends itself to fun and productivity. The AMD Ryzen AI Max (Strix Halo) chip inside is slightly out of the ordinary, with its do-everything design. I have high praise for the Framework Desktop, and think you will too.


Scaling Recurrent Neural Networks to a Billion Parameters with Zero-Order Optimization

Chaubard, Francois, Kochenderfer, Mykel

arXiv.org Artificial Intelligence

During inference, Recurrent Neural Networks (RNNs) scale at constant cost in both FLOPs and GPU memory with increasing context length, as they compress all prior tokens into a fixed-size memory. In contrast, transformers scale linearly in FLOPs and, at best, linearly in memory during generation, since they must attend to all previous tokens explicitly. Despite this inference-time advantage, training large RNNs on long contexts remains impractical because standard optimization methods depend on Backpropagation Through Time (BPTT). BPTT requires retention of all intermediate activations during the forward pass, causing memory usage to scale linearly with both context length and model size. In this paper, we show that Zero-Order Optimization (ZOO) methods such as Random-vector Gradient Estimation (RGE) can successfully replace BPTT to train RNNs, with convergence rates that match or exceed those of BPTT by up to 19-fold, while using orders of magnitude less memory and cost, as the model remains in inference mode throughout training. We further demonstrate that Central-Difference RGE (CD-RGE) corresponds to optimizing a smoothed surrogate loss, inherently regularizing training and improving generalization. Our method matches or outperforms BPTT across three settings: (1) overfitting, (2) transduction, and (3) language modeling. Across all tasks, with sufficient perturbations, our models generalize as well as or better than those trained with BPTT, often in fewer steps. Despite the need for more forward passes per step, we can surpass BPTT wall-clock time per step using recent advancements such as FlashRNN and distributed inference.
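The CD-RGE estimator described above can be sketched in a few lines: estimate the gradient as the average over random directions u of u * (f(theta + sigma*u) - f(theta - sigma*u)) / (2*sigma), using only forward evaluations. This is a generic sketch of central-difference random-vector gradient estimation, not the paper's code; the step size, perturbation count, and parameter layout are assumptions.

```python
import random

def cd_rge_step(loss, params, sigma=1e-3, lr=1e-2, n_perturb=8):
    """One CD-RGE update: average central-difference directional
    derivatives over n_perturb Gaussian directions, then take a
    gradient-descent step. Only forward (inference-mode) evaluations
    of `loss` are used, so no activations need to be stored."""
    grad = [0.0] * len(params)
    for _ in range(n_perturb):
        u = [random.gauss(0.0, 1.0) for _ in params]
        plus = loss([p + sigma * ui for p, ui in zip(params, u)])
        minus = loss([p - sigma * ui for p, ui in zip(params, u)])
        scale = (plus - minus) / (2.0 * sigma * n_perturb)
        for j, uj in enumerate(u):
            grad[j] += scale * uj            # accumulate u-weighted estimate
    return [p - lr * g for p, g in zip(params, grad)]
```

Because only forward passes are needed, memory stays at inference-mode levels regardless of sequence length, which is exactly the property that makes this attractive for long-context RNN training.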


Mixture of Lookup Experts

Jie, Shibo, Tang, Yehui, Han, Kai, Li, Yitong, Tang, Duyu, Deng, Zhi-Hong, Wang, Yunhe

arXiv.org Artificial Intelligence

Mixture-of-Experts (MoE) activates only a subset of experts during inference, allowing the model to maintain low inference FLOPs and latency even as the parameter count scales up. However, since MoE dynamically selects the experts, all the experts need to be loaded into VRAM. Their large parameter size still limits deployment, and offloading, which loads experts into VRAM only when needed, significantly increases inference latency. To address this, we propose Mixture of Lookup Experts (MoLE), a new MoE architecture that is efficient in both communication and VRAM usage. In MoLE, the experts are Feed-Forward Networks (FFNs) during training, taking the output of the embedding layer as input. Before inference, these experts can be re-parameterized as lookup tables (LUTs) that retrieve expert outputs based on input ids, and offloaded to storage devices. Therefore, we do not need to perform expert computations during inference. Instead, we directly retrieve the expert's computation results based on input ids and load them into VRAM, and thus the resulting communication overhead is negligible. Experiments show that, with the same FLOPs and VRAM usage, MoLE achieves inference speeds comparable to dense models and significantly faster than MoE with experts offloading, while maintaining performance on par with MoE.
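The re-parameterization step works because each expert's input is just the embedding of a token id, so its output for every id in the vocabulary can be precomputed once. A minimal sketch of that idea, with hypothetical `build_lut`/`mole_expert_output` helpers (not MoLE's actual API):

```python
def build_lut(expert_fn, embedding, vocab_size):
    """Re-parameterize a trained expert FFN as a lookup table: since
    the expert only ever sees embedding-layer outputs, its result for
    each token id can be precomputed, stored off-GPU, and fetched by
    id at inference time."""
    return [expert_fn(embedding[t]) for t in range(vocab_size)]

def mole_expert_output(lut, token_id):
    # Inference: no expert FLOPs at all, just an indexed fetch.
    return lut[token_id]
```

At inference, only the rows for the current tokens are transferred into VRAM, which is why the communication cost is negligible compared with offloading whole expert weight matrices.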


Boost AMD's Ryzen AI Max performance up to 60% with this memory trick

PCWorld

If you've purchased a laptop or tablet with an AMD Ryzen chip inside, there's a performance tweak you absolutely need to know about. Savvy gamers know instinctively that you can boost your game's frame rate by lowering the resolution or the visual quality, or by making an adjustment to the Windows power-performance slider. But the Ryzen AI Max is a new kind of device: a killer mobile processor that can run modern games at elevated frame rates, and serve as an AI powerhouse. The tweak is a simple adjustment of the Ryzen AI Max's unified frame buffer, or available graphics memory. While it's a simple fix, in my tests, it made an enormous difference: up to a 60 percent performance boost in some cases.


Adobe Firefly muscles into AI video–here's what it looks like

PCWorld

Adobe said today that it's bringing AI-generated video, aka the Firefly Video Model, to Adobe Premiere Pro plus its Firefly generative art service. Unlike its generative AI image capabilities, however, it won't be free. AI-generated video has been available for months. In December, OpenAI released Sora, which can craft AI video clips of several seconds from a text prompt. What Adobe is offering is authenticity.