Evaluating Multimodal Large Language Models with Daily Composite Tasks in Home Environments
