Unlocking the Potential of Similarity Matching: Scalability, Supervision and Pre-training