Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models
