Language Models Do Not Follow Occam's Razor: A Benchmark for Inductive and Abductive Reasoning