Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics