Parameterized Argumentation-based Reasoning Tasks for Benchmarking Generative Language Models
