Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming