ZERO: Industry-ready Vision Foundation Model with Multi-modal Prompts