Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search