avx-512
Optimization of Oblivious Decision Tree Ensembles Evaluation for CPU
Mironov, Alexey, Khuziev, Ilnur
CatBoost is a popular machine learning library. CatBoost models are based on oblivious decision trees, making training and evaluation rapid. CatBoost has many applications, and some require low latency and high throughput evaluation. This paper investigates the possibilities for improving CatBoost's performance in single-core CPU computations. We explore the new features provided by the AVX instruction sets to optimize evaluation. We increase performance by 20-40% using AVX2 instructions without quality impact. We also introduce a new trade-off between speed and quality. Using float16 for leaf values and AVX-512 instructions, we achieve 50-70% speed-up.
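The abstract above names two ideas that a small sketch may make more concrete: oblivious-tree evaluation and float16 leaf values. The following is a minimal scalar illustration in pure Python, not CatBoost's implementation and not the paper's AVX code. The function names, the tree layout (split feature indices, thresholds, leaf values), the `>` comparison, and the toy ensemble are all assumptions made for illustration. Because every level of an oblivious tree applies the same test, the leaf index is just the bit pattern of the comparison results, which is what makes the structure friendly to vectorization.

```python
import struct


def leaf_index(features, split_features, thresholds):
    """Return the leaf index of one oblivious tree for one sample.

    Level d applies the same test everywhere in the tree, so the
    comparison result at level d becomes bit d of the leaf index.
    """
    index = 0
    for depth, (f, t) in enumerate(zip(split_features, thresholds)):
        index |= int(features[f] > t) << depth
    return index


def to_float16(x):
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def evaluate(features, trees, quantize=False):
    """Sum the selected leaf value of every tree in the ensemble.

    With quantize=True, leaf values are rounded to float16 first,
    mimicking (in spirit only) the speed/quality trade-off of
    storing leaves in half precision.
    """
    total = 0.0
    for split_features, thresholds, leaves in trees:
        value = leaves[leaf_index(features, split_features, thresholds)]
        total += to_float16(value) if quantize else value
    return total


# Hypothetical two-tree ensemble of depth 2 (four leaves per tree).
trees = [
    ([0, 1], [0.5, 2.0], [0.1, 0.2, 0.3, 0.4]),
    ([2, 0], [1.0, 0.0], [-1.0, 0.5, 0.25, 2.0]),
]
sample = [0.7, 1.0, 3.0]

exact = evaluate(sample, trees)
approx = evaluate(sample, trees, quantize=True)
print(exact, approx, abs(exact - approx))
```

In this toy case the float16 result differs from the full-precision one only by a small rounding error in the leaf values, which is the kind of quality cost the abstract describes trading for speed.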
Intel AVX-512 A Big Win For... JSON Parsing Performance
In addition to the many HPC workloads and other scientific computing tasks where AVX-512 performance on Intel's latest processors proves very beneficial, it turns out AVX-512 can also provide significant benefit to a much more mundane web server task: JSON parsing. The simdjson project, which is focused on "parsing gigabytes of JSON per second," this week released simdjson 2.0, headlined by an Intel-led contribution of AVX-512 support. The JavaScript Object Notation (JSON) data interchange format is used in some capacity by practically all major websites and web applications and can be handled by pretty much all programming languages. JSON really needs no introduction. For the past few years, simdjson has been an open-source (Apache 2.0 licensed) project aimed at delivering the highest-performance JSON parser, one that can parse "gigabytes of JSON per second" and that claims to be 4-25x faster than alternatives.
Intel defends AVX-512 against critics who wish it to die a 'painful death'
Intel has finally defended its AVX-512 instruction set against critics who have gone so far as to wish it to die "a painful death." Intel Chief Architect Raja Koduri said the community loves it because it yields huge performance boosts, and Intel has an obligation to offer it across its portfolio. "AVX-512 is a great feature. Our HPC community, AI community, love it," Koduri said, responding to a question from PCWorld about the AVX-512 kerfuffle during Intel's Architecture Day on August 11. "Our customers on the data center side really, really, really love it."
Intel's latest Xeon chips based on Skylake due next year
Intel has moved to a new architecture called Kaby Lake for its PC chips, but it isn't done with the previous generation Skylake yet. The company will release new Xeon server chips based on Skylake in mid-2017, and they will boast big performance increases, said Barry Davis, general manager for the accelerated workload group at Intel. The Skylake Xeon chips will go into mainstream servers and could spark a big round of hardware upgrades, Davis said. Xeon chips aren't as visible as Intel's PC chips but remain extremely popular. Companies like Google, Facebook, and Amazon buy thousands of servers loaded with Xeon chips to power their search, social networking, and artificial intelligence tasks.