Benchmarking the rationality of AI decision making using the transitivity axiom