Decentralised Q-Learning for Multi-Agent Markov Decision Processes with a Satisfiability Criterion