Behavioral Testing: Can Large Language Models Implicitly Resolve Ambiguous Entities?