Deep Learning-Driven Multimodal Detection and Movement Analysis of Objects in Culinary