Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets