This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models