Reviews: On-the-fly Operation Batching in Dynamic Computation Graphs
–Neural Information Processing Systems
Summary: The authors of this paper extend the neural network toolkit DyNet with automatic operation batching. Batching enables efficient utilization of CPUs and GPUs by turning matrix-vector products into matrix-matrix products and by reducing kernel launch overhead (on GPUs), but it is commonly done manually. Manual batching is manageable for simple feed-forward networks, but it becomes an increasing headache as we explore more flexible models that take variable-length input or tree-structured input, or that perform dynamic control decisions. Chainer, DyNet, and PyTorch are recently proposed neural network toolkits that allow the user to define the computation graph dynamically using the syntax of the host language (if, while, etc. in Python). This is desirable because it avoids toolkit-specific constructs (e.g., cond in TensorFlow) and makes the network definition intuitive, but it tends to limit performance because network construction and computation happen at the same time.
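To make the batching claim concrete, here is a minimal sketch in plain Python of how several matrix-vector products with a shared weight matrix can be replaced by one matrix-matrix product. It is not DyNet's implementation or API; the helper names (`matvec`, `matmat`, `unbatched`, `batched`) and the toy values are assumptions chosen for illustration only.

```python
# Illustrative sketch only: hypothetical helpers, not DyNet's API.

def matvec(W, x):
    """One matrix-vector product W @ x, returned as a list."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def matmat(W, X):
    """One matrix-matrix product W @ X, with X given as a list of rows."""
    cols = list(zip(*X))
    return [[sum(w * v for w, v in zip(row, col)) for col in cols] for row in W]

def unbatched(W, inputs):
    # One product per input vector (on a GPU, one kernel launch each).
    return [matvec(W, x) for x in inputs]

def batched(W, inputs):
    # Stack the input vectors as columns of X, do a single product,
    # then split the result back into one output vector per input.
    X = [list(row) for row in zip(*inputs)]  # shape: input_dim x batch
    Y = matmat(W, X)                         # shape: output_dim x batch
    return [list(col) for col in zip(*Y)]

W = [[1, 2], [3, 4], [5, 6]]        # shared 3 x 2 weight matrix
inputs = [[1, 0], [0, 1], [1, 1]]   # three input vectors of dimension 2

# Both strategies compute the same outputs.
assert unbatched(W, inputs) == batched(W, inputs)
```

The two functions return identical results; the batched form simply issues one large product instead of many small ones, which is where the efficiency gain described above comes from. Doing this grouping by hand is easy here because every input has the same shape, and it is exactly this step that gets hard for variable-length or tree-structured inputs.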
Oct-8-2024, 09:10:58 GMT