compiler auto-vectorization
Compiler Auto-Vectorization with Imitation Learning
Modern microprocessors are equipped with single instruction multiple data (SIMD) or vector instruction sets which allow compilers to exploit fine-grained data level parallelism. To exploit this parallelism, compilers employ auto-vectorization techniques to automatically convert scalar code into vector code. Larsen & Amarasinghe (2000) first introduced superword level parallelism (SLP) based vectorization, which is one form of vectorization popularly used by compilers. Current compilers employ hand-crafted heuristics and typically only follow one SLP vectorization strategy which can be suboptimal. Recently, Mendis & Amarasinghe (2018) formulated the instruction packing problem of SLP vectorization by leveraging an integer linear programming (ILP) solver, achieving superior runtime performance. In this work, we explore whether it is feasible to imitate optimal decisions made by their ILP solution by fitting a graph neural network policy. We show that the learnt policy produces a vectorization scheme which is better than industry standard compiler heuristics both in terms of static measures and runtime performance. More specifically, the learnt agent produces a vectorization scheme which has a 22.6% higher average reduction in cost compared to LLVM compiler when measured using its own cost model and achieves a geometric mean runtime speedup of 1.015 on the NAS benchmark suite when compared to LLVM's SLP vectorizer.
Common Concerns 2 Novelty
We thank all reviewers for their valuable comments. We address the concerns raised by them below. The idea of using imitation learning to make approximate decisions is not new. The author needs to provide a wall-clock time cost comparison of different methods. We will include them in the final verision.
Reviews: Compiler Auto-Vectorization with Imitation Learning
This paper uses imitation learning to solve the compiler auto-vectorization problem. It trains an agent to mimic the optimal solution generated by an integer linear programming solver. It outperforms production-level compiler LLVM in the experiments. Originality: The novelty is incremental. This paper directly combines well-known techniques and does not make any new contribution from the machine learning perspective.
Reviews: Compiler Auto-Vectorization with Imitation Learning
The paper is interesting and promising but the three reviewers do agree that this is a borderline paper around the acceptance threshold. The main concerns they have expressed are the following: (i) "incremental" application of reinforcement/imitation learning, (ii) weak experimental work and unsufficient benchmarking to other methods. I personally find it has its merits as an end-to-end application of imitation learning in quite an original context and I think it should be confronted to other application papers in the same domain (if any). Now to answer the points made by the reviewers, I think that (a) modeling the learning agent in this context is far from incremental as it combines intimate knowledge of compilers and solid mastering of graph theory and machine learning methodology; as a matter of fact, it certainly brings a paradigm shift in that community, (b) their experiments cover the case of instruction packing which was very recently related to ILP and they discuss the performance of their method wrt to optimal methods and to existing compilers, which is the best they can do at this stage. Given these two points, I tend to bump up slightly the assessment made by Reviewers #1 an #2 and push for acceptance.
Compiler Auto-Vectorization with Imitation Learning
Modern microprocessors are equipped with single instruction multiple data (SIMD) or vector instruction sets which allow compilers to exploit fine-grained data level parallelism. To exploit this parallelism, compilers employ auto-vectorization techniques to automatically convert scalar code into vector code. Larsen & Amarasinghe (2000) first introduced superword level parallelism (SLP) based vectorization, which is one form of vectorization popularly used by compilers. Current compilers employ hand-crafted heuristics and typically only follow one SLP vectorization strategy which can be suboptimal. Recently, Mendis & Amarasinghe (2018) formulated the instruction packing problem of SLP vectorization by leveraging an integer linear programming (ILP) solver, achieving superior runtime performance.
Compiler Auto-Vectorization with Imitation Learning
Mendis, Charith, Yang, Cambridge, Pu, Yewen, Amarasinghe, Dr.Saman, Carbin, Michael
Modern microprocessors are equipped with single instruction multiple data (SIMD) or vector instruction sets which allow compilers to exploit fine-grained data level parallelism. To exploit this parallelism, compilers employ auto-vectorization techniques to automatically convert scalar code into vector code. Larsen & Amarasinghe (2000) first introduced superword level parallelism (SLP) based vectorization, which is one form of vectorization popularly used by compilers. Current compilers employ hand-crafted heuristics and typically only follow one SLP vectorization strategy which can be suboptimal. Recently, Mendis & Amarasinghe (2018) formulated the instruction packing problem of SLP vectorization by leveraging an integer linear programming (ILP) solver, achieving superior runtime performance.