COMO: Cross-Mamba Interaction and Offset-Guided Fusion for Multimodal Object Detection