Evaluating Deep Unlearning in Large Language Models