Case Study: Testing Model Capabilities in Some Reasoning Tasks