Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
