Transition Transfer $Q$-Learning for Composite Markov Decision Processes