unified memory


Apple Mac Studio M4 Max review: A creative powerhouse

Engadget

The Mac Studio is Apple's ultimate performance computer, but this year's model came with a twist: It's equipped with either an M4 Max or an M3 Ultra processor. The latter might seem like a step backward, since nearly all Macs (except the Mac Pro) now ship with M4 chips. However, the M3 Ultra is indeed Apple's best-performing processor, which makes the new Mac Studio its fastest computer ever. While the M3 Ultra model appears highly capable for creative pros and engineers, it starts at $4,000 and goes way up from there. Of course, I'm intrigued by that model, based on benchmarks I saw elsewhere.


Inside Pascal: NVIDIA's Newest Computing Platform

#artificialintelligence

Unlike other technical computing applications that require high-precision floating-point computation, deep neural network architectures have a natural resilience to errors due to the backpropagation algorithm used in their training. Storing FP16 data compared to higher precision FP32 or FP64 reduces memory usage of the neural network, allowing training and deployment of larger networks. Using FP16 computation improves performance up to 2x compared to FP32 arithmetic, and similarly FP16 data transfers take less time than FP32 or FP64 transfers. The GP100 SM ISA provides new arithmetic operations that can perform two FP16 operations at once on a single-precision CUDA Core, and 32-bit GP100 registers can store two FP16 values. Atomic memory operations are important in parallel programming, allowing concurrent threads to correctly perform read-modify-write operations on shared data.
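The storage claims in the excerpt above (FP16 takes half the space of FP32, and a 32-bit register can hold two FP16 values) can be illustrated on the host with Python's standard `struct` module, whose `e` format code is IEEE 754 half precision. This is only a sketch of the bit layout, not GP100 device code: the function names `pack_half2`, `unpack_half2`, and `weights_size_bytes` are hypothetical, and placing the first value in the low 16 bits is an assumption made for this example.

```python
import struct

# Bytes per value for each IEEE 754 format, taken from struct's format codes:
# 'e' = half (FP16), 'f' = single (FP32), 'd' = double (FP64).
BYTES_PER_VALUE = {
    name: struct.calcsize("<" + code)
    for name, code in (("FP16", "e"), ("FP32", "f"), ("FP64", "d"))
}


def pack_half2(lo: float, hi: float) -> int:
    """Pack two FP16 values into one 32-bit word.

    Assumption for this sketch: `lo` occupies the low 16 bits.
    """
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_half2(word: int) -> tuple:
    """Recover the two FP16 values stored in a 32-bit word."""
    return struct.unpack("<2e", struct.pack("<I", word))


def weights_size_bytes(n_params: int, fmt: str) -> int:
    """Storage needed for n_params values in the given format."""
    return n_params * BYTES_PER_VALUE[fmt]


if __name__ == "__main__":
    word = pack_half2(1.5, -2.0)
    print(hex(word))              # one 32-bit word holding both values
    print(unpack_half2(word))     # the two values, read back
    for fmt in ("FP16", "FP32", "FP64"):
        print(fmt, weights_size_bytes(1_000_000, fmt), "bytes")
```

The values 1.5 and -2.0 are exactly representable in FP16, so they round-trip unchanged; values outside FP16's range or precision would be rounded or rejected by `struct.pack`.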