StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs
