A Theoretically Grounded Benchmark for Evaluating Machine Commonsense
