JOOCI: a Framework for Learning Comprehensive Speech Representations