On the Reliability of Watermarks for Large Language Models