CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
