 oneplus 12


PowerInfer-2: Fast Large Language Model Inference on a Smartphone

Xue, Zhenliang, Song, Yixin, Mi, Zeyu, Chen, Le, Xia, Yubin, Chen, Haibo

arXiv.org Artificial Intelligence

This paper introduces PowerInfer-2, a framework designed for high-speed inference of Large Language Models (LLMs) on smartphones, particularly effective for models whose sizes exceed the device's memory capacity. The key insight of PowerInfer-2 is to utilize the heterogeneous computation, memory, and I/O resources in smartphones by decomposing traditional matrix computations into fine-grained neuron cluster computations. Specifically, PowerInfer-2 features a polymorphic neuron engine that adapts computational strategies for various stages of LLM inference. Additionally, it introduces segmented neuron caching and fine-grained neuron-cluster-level pipelining, which effectively minimize and conceal the overhead caused by I/O operations. The implementation and evaluation of PowerInfer-2 demonstrate its capability to support a wide array of LLMs on two smartphones, achieving up to a 29.2x speedup compared with state-of-the-art frameworks. Notably, PowerInfer-2 is the first system to serve the TurboSparse-Mixtral-47B model with a generation rate of 11.68 tokens per second on a smartphone. For models that fit entirely within the memory, PowerInfer-2 reduces memory usage by approximately 40% while maintaining inference speeds comparable to llama.cpp and MLC-LLM. For more details, including a demonstration video, please visit the project site at www.powerinfer.ai/v2.
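The idea of computing per neuron rather than per matrix, with a cache in front of slow storage, can be illustrated with a toy sketch. This is not PowerInfer-2's implementation: the `NeuronCache` class, the `sparse_matvec` helper, the dictionary standing in for flash storage, and the plain LRU eviction policy are all assumptions made for illustration (the abstract names segmented neuron caching but does not specify its policy here), and the list of active neurons is simply given rather than predicted.

```python
from collections import OrderedDict


class NeuronCache:
    """Toy LRU cache of neuron weight rows (hypothetical, for illustration).

    A cache miss stands in for a read from flash storage.
    """

    def __init__(self, capacity, storage):
        self.capacity = capacity
        self.storage = storage      # assumed "flash": neuron id -> weight row
        self.rows = OrderedDict()   # in-memory rows, least recently used first
        self.misses = 0

    def get(self, neuron_id):
        if neuron_id in self.rows:
            # Cache hit: mark the row as most recently used.
            self.rows.move_to_end(neuron_id)
            return self.rows[neuron_id]
        # Cache miss: simulate an I/O read, then evict the oldest row if full.
        self.misses += 1
        row = self.storage[neuron_id]
        self.rows[neuron_id] = row
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)
        return row


def sparse_matvec(cache, active_ids, x, n_neurons):
    """Compute y = W x only for the neurons listed in active_ids.

    Outputs of all other neurons are left at zero.
    """
    y = [0.0] * n_neurons
    for i in active_ids:
        row = cache.get(i)
        y[i] = sum(w * v for w, v in zip(row, x))
    return y


# Six neurons with three weights each, but room for only two rows in memory.
storage = {i: [float(i + 1)] * 3 for i in range(6)}
cache = NeuronCache(capacity=2, storage=storage)
x = [1.0, 2.0, 3.0]

y1 = sparse_matvec(cache, [0, 4], x, 6)  # two misses: neurons 0 and 4
y2 = sparse_matvec(cache, [4, 5], x, 6)  # neuron 4 hits, neuron 5 misses
```

In this sketch only the rows for active neurons are ever touched, and a neuron that stays active across tokens (neuron 4 above) is served from memory instead of being reloaded.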


OnePlus 12 review: A no-nonsense flagship for a great price

Engadget

It might be weird to see a new device call back to a time less than a decade ago. But tech moves fast and with the OnePlus 12, it feels like someone made a phone for the pre-AI era. Instead of magic editors and a bunch of machine learning, OnePlus' latest flagship is incredibly simple. It has a nice screen, a solid build, reliable cameras, great performance and even better battery life. So while it won't help you summarize a meeting or remaster a photo, the OP12 covers all the basics with aplomb.


The Morning After: Apple's car project still exists

Engadget

Remember the Apple car rumors? Project Titan, as it's apparently called, is still progressing, with, perhaps, a dose of reality. Bloomberg's Mark Gurman says the company's decade-old project has shifted from creating a fully self-driving car to an EV more like Tesla's. The car's autonomous features have reportedly been downgraded from a Level 5 system (full automation) to a Level 4 system (full automation in some circumstances), and now to Level 2+ (partial automation). For context, Tesla's Autopilot is Level 2. Level 2+ doesn't have a formal description yet.