Evaluating Large Language Models on the Frame and Symbol Grounding Problems: A Zero-shot Benchmark
