Bi-directional personalization reinforcement learning-based architecture with active learning using a multi-model data service for the travel nursing industry