MIS-AVoiDD: Modality Invariant and Specific Representation for Audio-Visual Deepfake Detection