From Queries to Criteria: Understanding How Astronomers Evaluate LLMs