Controlling Multimodal LLMs via Reward-guided Decoding
