TRANSOM: An Efficient Fault-Tolerant System for Training LLMs