Can Language Models Laugh at YouTube Short-form Videos?