Kumar, Vivek
Predator Prey Scavenger Model using Holling's Functional Response of Type III and Physics-Informed Deep Neural Networks
Panchal, Aneesh, Beniwal, Kirti, Kumar, Vivek
Nonlinear mathematical models describe the relations among various physical and biological interactions present in nature. One of the most famous models is the Lotka-Volterra model, which describes the interaction between predator and prey species. However, predator, scavenger, and prey populations coexist in a natural system where scavengers can additionally rely on the dead bodies of predators present in the system. Keeping this in mind, the formulation and simulation of the predator prey scavenger model is introduced in this paper. For the predation response, the respective prey species are assumed to have Holling's functional response of type III. The proposed model is tested in various simulations and is found to give satisfactory results in different scenarios. After the simulations, the American forest dataset is used for parameter estimation, which imitates the real-world case. For parameter estimation, a physics-informed deep neural network is used with the Adam backpropagation method, which prevents the avalanche effect in updates of the trainable parameters. For the neural network, mean square error and physics-informed error are considered. After the neural network, the estimated parameters are fine-tuned using the Broyden-Fletcher-Goldfarb-Shanno algorithm. Finally, the parameters estimated from the natural dataset are tested for stability using Jacobian stability analysis. Future research work includes minimization of the error induced by the parameters, bifurcation analysis, and sensitivity analysis of the parameters.
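As a rough illustration of the kind of model the abstract above describes, the sketch below implements the standard Holling type III functional response and uses it in a small prey/predator/scavenger system integrated with explicit Euler steps. The abstract does not give the paper's equations, so the system, the parameter names, and the values here are all assumptions chosen for illustration, not the authors' formulation.

```python
def holling_type_iii(prey, max_rate, half_saturation):
    """Holling type III response: sigmoidal in prey density.

    Returns max_rate * prey^2 / (half_saturation^2 + prey^2).
    """
    return max_rate * prey**2 / (half_saturation**2 + prey**2)


def euler_step(state, dt, p):
    """One explicit Euler step of a HYPOTHETICAL three-species system.

    state = (prey x, predator y, scavenger z). The equations are an
    illustrative assumption, not the model from the paper.
    """
    x, y, z = state
    eaten_by_predators = holling_type_iii(x, p["a"], p["h"]) * y
    eaten_by_scavengers = holling_type_iii(x, p["b"], p["h"]) * z
    # Logistic prey growth minus consumption by both consumers.
    dx = p["r"] * x * (1 - x / p["K"]) - eaten_by_predators - eaten_by_scavengers
    # Predators convert eaten prey into growth and die at rate m.
    dy = p["e"] * eaten_by_predators - p["m"] * y
    # Scavengers gain from prey and from predator carcasses (c * y * z).
    dz = p["f"] * eaten_by_scavengers + p["c"] * y * z - p["n"] * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)


def simulate(initial_state, dt, steps, p):
    """Return the trajectory as a list of (x, y, z) tuples."""
    trajectory = [initial_state]
    for _ in range(steps):
        trajectory.append(euler_step(trajectory[-1], dt, p))
    return trajectory


# Illustrative parameter values only (assumptions, not fitted values).
params = {"r": 1.0, "K": 10.0, "a": 0.8, "b": 0.3, "h": 2.0,
          "e": 0.5, "f": 0.4, "m": 0.3, "c": 0.01, "n": 0.2}
trajectory = simulate((5.0, 1.0, 1.0), dt=0.01, steps=1000, p=params)
```

The type III response is zero at zero prey density, rises slowly at low density, and saturates at `max_rate`, reaching half of it when the prey density equals `half_saturation`. Explicit Euler is used only to keep the sketch dependency-free; a higher-order integrator would be the more usual choice for real experiments.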
Unlocking LLMs: Addressing Scarce Data and Bias Challenges in Mental Health
Kumar, Vivek, Ntoutsi, Eirini, Rajawat, Pushpraj Singh, Medda, Giacomo, Recupero, Diego Reforgiato
Large language models (LLMs) have shown promising capabilities in healthcare analysis but face several challenges like hallucinations, parroting, and bias manifestation. These challenges are exacerbated in complex, sensitive, and low-resource domains. Therefore, in this work we introduce IC-AnnoMI, an expert-annotated motivational interviewing (MI) dataset built upon AnnoMI by generating in-context conversational dialogues leveraging LLMs, particularly ChatGPT. IC-AnnoMI employs targeted prompts accurately engineered through cues and tailored information, taking into account therapy style (empathy, reflection), contextual relevance, and false semantic change. Subsequently, the dialogues are annotated by experts, strictly adhering to the Motivational Interviewing Skills Code (MISC), focusing on both the psychological and linguistic dimensions of MI dialogues. We comprehensively evaluate the IC-AnnoMI dataset and ChatGPT's emotional reasoning ability and understanding of domain intricacies by modeling novel classification tasks employing several classical machine learning and current state-of-the-art transformer approaches. Finally, we discuss the effects of progressive prompting strategies and the impact of augmented data in mitigating the biases manifested in IC-AnnoMI. Our contributions provide the MI community with not only a comprehensive dataset but also valuable insights for using LLMs in empathetic text generation for conversational therapy in supervised settings.
PhilHumans: Benchmarking Machine Learning for Personal Health
Liventsev, Vadim, Kumar, Vivek, Susaiyah, Allmin Pradhap Singh, Wu, Zixiu, Rodin, Ivan, Yaar, Asfand, Balloccu, Simone, Beraziuk, Marharyta, Battiato, Sebastiano, Farinella, Giovanni Maria, Härmä, Aki, Helaoui, Rim, Petkovic, Milan, Recupero, Diego Reforgiato, Reiter, Ehud, Riboni, Daniele, Sterling, Raymond
Understaffing has been consistently identified as the major challenge facing Healthcare today [7, 1, 2, 21, 55, 82, 97, 87, 124]. Automation tools that make use of Machine Learning (also known as Healthcare 4.0 [126]) have been consistently identified as crucial for reducing the workload of Healthcare professionals and improving the quality of care [5, 34, 44, 46, 78, 86, 94, 136]. In turn, the shortage of standard benchmarks has been consistently identified as a central roadblock for machine learning in Healthcare [27, 31, 49, 52, 59, 76, 81, 95, 110]. Whether it's ImageNet [32] in Computer Vision or GLUE [128] in natural language processing, benchmarks are a core research tool in mature applications of machine learning, enabling quantitative analysis of learning methodologies to guide and orient their development.
Ask the experts: sourcing high-quality datasets for nutritional counselling through Human-AI collaboration
Balloccu, Simone, Reiter, Ehud, Kumar, Vivek, Recupero, Diego Reforgiato, Riboni, Daniele
Large Language Models (LLMs), with their flexible generation abilities, can be powerful data sources in domains with few or no available corpora. However, problems like hallucinations and biases limit such applications. In this case study, we pick nutrition counselling, a domain lacking any public resource, and show that high-quality datasets can be gathered by combining LLMs, crowd-workers and nutrition experts. We first crowd-source and cluster a novel dataset of diet-related issues, then work with experts to prompt ChatGPT into producing related supportive text. Finally, we let the experts evaluate the safety of the generated text. We release HAI-coaching, the first expert-annotated nutrition counselling dataset containing ~2.4K dietary struggles from crowd workers, and ~97K related supportive texts generated by ChatGPT. Extensive analysis shows that ChatGPT, while producing highly fluent and human-like text, also manifests harmful behaviours, especially in sensitive topics like mental health, making it unsuitable for unsupervised use.
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
Yu, Lijun, Cheng, Yong, Wang, Zhiruo, Kumar, Vivek, Macherey, Wolfgang, Huang, Yanping, Ross, David A., Essa, Irfan, Bisk, Yonatan, Yang, Ming-Hsuan, Murphy, Kevin, Hauptmann, Alexander G., Jiang, Lu
In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling frozen LLMs to perform both understanding and generation tasks involving non-linguistic modalities such as images or videos. SPAE converts between raw pixels and interpretable lexical tokens (or words) extracted from the LLM's vocabulary. The resulting tokens capture both the semantic meaning and the fine-grained details needed for visual reconstruction, effectively translating the visual content into a language comprehensible to the LLM, and empowering it to perform a wide array of multimodal tasks. Our approach is validated through in-context learning experiments with frozen PaLM 2 and GPT 3.5 on a diverse set of image understanding and generation tasks. Our method marks the first successful attempt to enable a frozen LLM to generate image content while surpassing state-of-the-art performance in image understanding tasks, under the same setting, by over 25%.
VISU at WASSA 2023 Shared Task: Detecting Emotions in Reaction to News Stories Leveraging BERT and Stacked Embeddings
Kumar, Vivek, Singh, Sushmita, Tiwari, Prayag
Our system, VISU, participated in the WASSA 2023 Shared Task (3) of Emotion Classification from essays written in reaction to news articles. Emotion detection from complex dialogues is challenging and often requires context/domain understanding. Therefore, in this research, we have focused on developing deep learning (DL) models using the combination of word embedding representations with tailored preprocessing strategies to capture the nuances of the emotions expressed. Our experiments used static and contextual embeddings (individual and stacked) with Bidirectional Long Short-Term Memory (BiLSTM) and Transformer-based models. We ranked tenth in the emotion detection task with a Macro F1-Score of 0.2717, validating the efficacy of our implemented approaches for small and imbalanced datasets with mixed categories of target emotions.
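For readers unfamiliar with the metric reported above, the following minimal sketch shows how a macro F1 score is computed: the F1 score is calculated separately for each class and then averaged without weighting, so rare classes count as much as frequent ones. The emotion labels and predictions are made up for illustration and are not taken from the shared-task data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Classes are taken from the union of true and predicted labels;
    a class with no true positives contributes an F1 of 0.
    """
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0 when the denominator is 0.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels (not from the WASSA 2023 dataset).
y_true = ["joy", "joy", "sadness", "anger"]
y_pred = ["joy", "sadness", "sadness", "joy"]
score = macro_f1(y_true, y_pred)  # per-class F1: anger 0, joy 1/2, sadness 2/3
```

Because every class carries equal weight, a model that ignores minority emotions is penalized heavily, which is one reason macro F1 values on small, imbalanced, multi-class datasets tend to be low.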
MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior
Sun, Jennifer J., Marks, Markus, Ulmer, Andrew, Chakraborty, Dipam, Geuther, Brian, Hayes, Edward, Jia, Heng, Kumar, Vivek, Oleszko, Sebastian, Partridge, Zachary, Peelman, Milan, Robie, Alice, Schretter, Catherine E., Sheppard, Keith, Sun, Chao, Uttarwar, Param, Wagner, Julian M., Werner, Eric, Parker, Joseph, Perona, Pietro, Yue, Yisong, Branson, Kristin, Kennedy, Ann
We introduce MABe22, a large-scale, multi-agent video and trajectory benchmark to assess the quality of learned behavior representations. This dataset is collected from a variety of biology experiments, and includes triplets of interacting mice (4.7 million frames video+pose tracking data, 10 million frames pose only), symbiotic beetle-ant interactions (10 million frames video data), and groups of interacting flies (4.4 million frames of pose tracking data). Accompanying these data, we introduce a panel of real-life downstream analysis tasks to assess the quality of learned representations by evaluating how well they preserve information about the experimental conditions (e.g. strain, time of day, optogenetic stimulation) and animal behavior. We test multiple state-of-the-art self-supervised video and trajectory representation learning methods to demonstrate the use of our benchmark, revealing that methods developed using human action datasets do not fully translate to animal datasets. We hope that our benchmark and dataset encourage a broader exploration of behavior representation learning methods across species and settings.
Prediction of Malignant & Benign Breast Cancer: A Data Mining Approach in Healthcare Applications
Kumar, Vivek, Mishra, Brojo Kumar, Mazzara, Manuel, Thanh, Dang N. H., Verma, Abhishek
As data science plays a pivotal role everywhere, it also finds prominent application in healthcare. Breast cancer is the leading type of cancer among women and alone took away 627,000 lives. This high mortality rate calls for attention to early detection so that prevention can be done in time. As a potential contributor to state-of-the-art technology development, data mining finds multi-fold application in predicting breast cancer. This work focuses on the implementation of different data mining classification techniques for predicting malignant and benign breast cancer. The Breast Cancer Wisconsin data set from the UCI repository has been used as the experimental dataset, with the clump thickness attribute used as the evaluation class. The performances of these twelve algorithms: Ada Boost M 1, Decision Table, J Rip, Lazy IBK, Logistics Regression, Multiclass Classifier, Multilayer Perceptron, Naive Bayes, Random forest and Random Tree are analyzed on this data set. Keywords: Data Mining, Classification Techniques, UCI repository, Breast Cancer, Classification Algorithms
A Conjoint Application of Data Mining Techniques for Analysis of Global Terrorist Attacks -- Prevention and Prediction for Combating Terrorism
Kumar, Vivek, Mazzara, Manuel, Messina, Maj. Gen. Angelo, Lee, JooYoung
Terrorism has become one of the most tedious problems to deal with and a prominent threat to mankind. To enhance counter-terrorism, several research works are developing efficient and precise systems, and data mining is no exception. Although immense amounts of data surround us, the scarce availability of authentic terrorist attack data in the public domain complicates the fight against terrorism. This manuscript focuses on data mining classification techniques and discusses the role of the United Nations in counter-terrorism. It analyzes the performance of classifiers such as Lazy Tree, Multilayer Perceptron, Multiclass and Naïve Bayes classifiers for observing the trends in terrorist attacks around the world. The experimental database is built from different public and open-access sources for the years 1970-2015 and comprises 156,772 reported attacks that caused massive losses of life and property. This work enumerates the losses incurred, the trends in attack frequency, and the places more prone to attacks, using the claimed responsibility for each attack as the evaluation class.