Transformer Redesign for Late Fusion of Audio-Text Features on Ultra-Low-Power Edge Hardware