 Bagler, Ganesh


Machine learning and natural language processing models to predict the extent of food processing

arXiv.org Artificial Intelligence

The dramatic increase in consumption of ultra-processed food has been associated with numerous adverse health effects. Given the public health consequences linked to ultra-processed food consumption, it is highly relevant to build computational models to predict the processing of food products. We created a range of machine learning, deep learning, and NLP models to predict the extent of food processing by integrating the FNDDS dataset of food products and their nutrient profiles with their reported NOVA processing level. Starting with the full nutritional panel of 102 features, we further implemented coarse-graining of features to 65 and 13 nutrients by dropping flavonoids and then by considering the FDA's 13-nutrient panel, respectively. LGBM Classifier and Random Forest emerged as the best models for the 102- and 65-nutrient panels, respectively, with F1-scores of 0.9411 and 0.9345 and MCC values of 0.8691 and 0.8543. For the 13-nutrient panel, Gradient Boost achieved the best F1-score of 0.9284 and MCC of 0.8425. We also implemented NLP-based models, which exhibited state-of-the-art performance.
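
For readers unfamiliar with the MCC figures quoted above, here is a minimal sketch of the multi-class Matthews correlation coefficient in plain Python. It is not the authors' evaluation code: the function name `multiclass_mcc` and the toy NOVA labels (1 to 4) are assumptions made for illustration, and the abstract does not say how the reported scores were computed.

```python
import math
from collections import Counter


def multiclass_mcc(y_true, y_pred):
    """Matthews correlation coefficient generalized to K classes.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correct predictions
    t_counts = Counter(y_true)                        # true occurrences per class
    p_counts = Counter(y_pred)                        # predicted occurrences per class
    classes = set(t_counts) | set(p_counts)

    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    cov_tt = s * s - sum(t_counts[k] ** 2 for k in classes)
    cov_pp = s * s - sum(p_counts[k] ** 2 for k in classes)

    denom = math.sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the coefficient is undefined
    # (for example, when only one class appears).
    return cov_tp / denom if denom else 0.0


# Toy example with made-up NOVA levels (1 = unprocessed ... 4 = ultra-processed).
y_true = [1, 1, 2, 3, 4, 4, 4, 3]
y_pred = [1, 1, 2, 3, 4, 4, 3, 3]
print(round(multiclass_mcc(y_true, y_pred), 4))
```

MCC ranges from -1 to 1 and uses every cell of the confusion matrix, so it is less easily inflated by class imbalance than accuracy. That may be one reason to report it alongside F1.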


Deep Learning Based Named Entity Recognition Models for Recipes

arXiv.org Artificial Intelligence

Recipes are cultural capsules transmitted across generations via unstructured text. Automated protocols for recognizing named entities, the building blocks of recipe text, are of immense value for various applications ranging from information extraction to novel recipe generation. Named entity recognition is a technique for extracting information from unstructured or semi-structured data with known labels. Starting with manually-annotated data of 6,611 ingredient phrases, we created an augmented dataset of 26,445 phrases cumulatively. Simultaneously, we systematically cleaned and analyzed ingredient phrases from RecipeDB, the gold-standard recipe data repository, and annotated them using the Stanford NER. Based on the analysis, we sampled a subset of 88,526 phrases using a clustering-based approach while preserving the diversity to create the machine-annotated dataset. A thorough investigation of NER approaches on these three datasets, involving statistical methods, fine-tuning of deep-learning-based language models, and few-shot prompting on large language models (LLMs), provides deep insights. We conclude that few-shot prompting on LLMs has abysmal performance, whereas the fine-tuned spaCy-transformer emerges as the best model with macro-F1 scores of 95.9%, 96.04%, and 95.71% for the manually-annotated, augmented, and machine-annotated datasets, respectively.
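
The macro-F1 scores above average a per-label F1 with equal weight for every label. The sketch below illustrates that computation at the token level. It is an assumption-laden illustration rather than the paper's evaluation: the tag names (`QUANTITY`, `UNIT`, `STATE`, `NAME`) are hypothetical, and the abstract does not state whether scoring was done per token or per entity span.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (token-level).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("no labels to score")

    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a label scores 0.0 if denom is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy ingredient phrase "2 cups chopped onion" with made-up tags.
gold = ["QUANTITY", "UNIT", "STATE", "NAME"]
pred = ["QUANTITY", "UNIT", "NAME", "NAME"]
print(round(macro_f1(gold, pred), 4))
```

Because every label counts equally, a model that does well on frequent tags but poorly on rare ones is penalized more under macro-F1 than under a frequency-weighted average.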


Dish detection in food platters: A framework for automated diet logging and nutrition management

arXiv.org Artificial Intelligence

Diet is central to the epidemic of lifestyle disorders. Accurate and effortless diet logging is one of the significant bottlenecks for effective diet management and calorie restriction. Dish detection from food platters is a challenging problem due to a visually complex food layout. We present an end-to-end computational framework for diet management, from data compilation, annotation, and state-of-the-art model identification to its mobile app implementation. As a case study, we implement the framework in the context of Indian food platters known for their complex presentation that poses a challenge for the automated detection of dishes. Starting with the 61 most popular Indian dishes, we identify the state-of-the-art model through a comparative analysis of deep-learning-based object detection architectures. Rooted in a meticulous compilation of 68,005 platter images with 134,814 manual dish annotations, we first compare ten architectures for multi-label classification to identify ResNet152 (mAP=84.51%) as the best model. YOLOv8x (mAP=87.70%) emerged as the best model architecture for dish detection among the eight deep-learning models implemented after a thorough performance evaluation. By comparing with the state-of-the-art model for the IndianFood10 dataset, we demonstrate the superior object detection performance of YOLOv8x for this subset and establish ResNet152 as the best architecture for multi-label classification. The models thus trained on richly annotated data can be extended to include dishes from across global cuisines. The proposed framework is demonstrated through a proof-of-concept mobile application with diverse applications for diet logging, food recommendation systems, nutritional interventions, and mitigation of lifestyle disorders.
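
Detection mAP, as quoted above for YOLOv8x, is conventionally built on intersection over union (IoU): a predicted box counts as a hit only if its IoU with a ground-truth box exceeds a threshold. The sketch below shows that building block only. The `(x_min, y_min, x_max, y_max)` box format and the function name `iou` are assumptions, and the abstract does not state which IoU threshold the reported mAP uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption
    made for illustration, not the paper's annotation format.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two made-up dish boxes on a platter image, in pixel coordinates.
predicted = (40, 30, 200, 180)
annotated = (50, 40, 210, 190)
print(round(iou(predicted, annotated), 4))
```

On crowded platters, neighboring dishes often produce overlapping boxes, so the choice of IoU threshold can noticeably change which detections are counted as correct.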


A generative grammar of cooking

arXiv.org Artificial Intelligence

Cooking is a uniquely human endeavor for transforming raw ingredients into delicious dishes. Over centuries, cultures worldwide have evolved diverse cooking practices ingrained in their culinary traditions. Recipes, thus, are cultural capsules that capture culinary knowledge in elaborate cooking protocols. While simple quantitative models have probed the patterns in recipe composition and the process of cuisine evolution, unlike other cultural quirks such as language, the principles of cooking remain hitherto unexplored. The fundamental rules that drive the act of cooking, shaping recipe composition and cuisine architecture, are unclear. Here we present a generative grammar of cooking that captures the underlying culinary logic. By studying an extensive repository of structured recipes, we identify core concepts and rules that together forge a combinatorial system for culinary synthesis. Building on the body of work done in the context of language, the demonstration of a logically consistent generative framework offers profound insights into the act of cooking. Given the central role of food in nutrition and lifestyle disorders, culinary grammar provides leverage to improve public health through dietary interventions beyond applications for creative pursuits such as novel recipe generation.


Ratatouille: A tool for Novel Recipe Generation

arXiv.org Artificial Intelligence

Owing to the availability of a large number of cooking recipes online, there is growing interest in using them as data to create novel recipes. Novel Recipe Generation is a problem in the field of Natural Language Processing in which the main interest is to generate realistic, novel cooking recipes. To come up with such novel recipes, we trained various Deep Learning models such as LSTMs and GPT-2 on a large amount of recipe data. We present Ratatouille (https://cosylab.iiitd.edu.in/ratatouille2/), a web-based application to generate novel recipes.