Researchers say we need better benchmarks to build more useful AI assistants
