Synthesizing Sentiment-Controlled Feedback For Multimodal Text and Image Data