Mishra, Shreyash
Designing Multi-Step Action Models for Enterprise AI Adoption
Mishra, Shreyash, Shah, Shrey, Pereira, Rex
This paper introduces the Multi-Step Action Model (MSAM), a closed-source AI model designed by Empsing to address challenges hindering AI adoption in enterprises. Through a holistic examination, it explores MSAM's foundational principles, design architecture, and future trajectory. It evaluates MSAM's performance via rigorous testing methodologies and envisions its potential impact on advancing AI adoption within organizations.
Factify 2: A Multimodal Fake News and Satire News Dataset
Suryavardan, S, Mishra, Shreyash, Patwa, Parth, Chakraborty, Megha, Rani, Anku, Reganti, Aishwarya, Chadha, Aman, Das, Amitava, Sheth, Amit, Chinnakotla, Manoj, Ekbal, Asif, Kumar, Srijan
The internet gives people an open platform to express their views and share their stories. While this is very valuable, it also makes fake news one of our society's most pressing problems. Manual fact checking is time-consuming, which makes it challenging to disprove misleading assertions before they cause significant harm; this is the driving interest behind automatic fact or claim verification. Some existing datasets aim to support the development of automated fact-checking techniques; however, most of them are text-based, and multi-modal fact verification has received relatively scant attention. In this paper, we provide a multi-modal fact-checking dataset called FACTIFY 2, improving on FACTIFY 1 by using new data sources and adding satire articles. FACTIFY 2 has 50,000 new data instances. Similar to FACTIFY 1.0, we have three broad categories - support, no-evidence, and refute - with sub-categories based on the entailment of visual and textual data. We also provide a BERT and Vision Transformer based baseline, which achieves a 65% F1 score on the test set. The baseline code and the dataset will be made available at https://github.com/surya1701/Factify-2.0.
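As a rough illustration of the kind of baseline the abstract mentions, the sketch below pairs a BERT text encoder with a ViT image encoder and fuses their pooled features for entailment classification. The checkpoint names, the simple concatenation fusion, and the five-class label set are assumptions made for illustration, not the authors' exact implementation (see the linked repository for that).

# Minimal sketch of a BERT + ViT late-fusion baseline for multimodal
# entailment, in the spirit of the Factify 2 baseline described above.
# Checkpoints, fusion strategy, and class count are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import BertModel, ViTModel

class FactifyBaselineSketch(nn.Module):
    def __init__(self, num_classes: int = 5):  # assumed 5-way label set
        super().__init__()
        self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
        self.image_encoder = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
        fused_dim = (self.text_encoder.config.hidden_size
                     + self.image_encoder.config.hidden_size)
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(512, num_classes),
        )

    def forward(self, input_ids, attention_mask, pixel_values):
        # Pooled sentence representation from BERT and pooled image
        # representation from ViT, concatenated as a simple late fusion.
        text_feat = self.text_encoder(input_ids=input_ids,
                                      attention_mask=attention_mask).pooler_output
        image_feat = self.image_encoder(pixel_values=pixel_values).pooler_output
        fused = torch.cat([text_feat, image_feat], dim=-1)
        return self.classifier(fused)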
Memotion 3: Dataset on Sentiment and Emotion Analysis of Codemixed Hindi-English Memes
Mishra, Shreyash, Suryavardan, S, Patwa, Parth, Chakraborty, Megha, Rani, Anku, Reganti, Aishwarya, Chadha, Aman, Das, Amitava, Sheth, Amit, Chinnakotla, Manoj, Ekbal, Asif, Kumar, Srijan
Memes are the new-age conveyance mechanism for humor on social media sites. Memes often include an image and some text. Memes can also be used to promote disinformation or hatred, so it is crucial to investigate them in detail. We introduce Memotion 3, a new dataset with 10,000 annotated memes. Unlike other prevalent datasets in the domain, including prior iterations of Memotion, Memotion 3 introduces Hindi-English code-mixed memes, whereas prior work in the area was limited to English memes only. We describe the Memotion task, the data collection, and the dataset creation methodologies. We also provide a baseline for the task.
Overview of Memotion 3: Sentiment and Emotion Analysis of Codemixed Hinglish Memes
Mishra, Shreyash, Suryavardan, S, Chakraborty, Megha, Patwa, Parth, Rani, Anku, Chadha, Aman, Reganti, Aishwarya, Das, Amitava, Sheth, Amit, Chinnakotla, Manoj, Ekbal, Asif, Kumar, Srijan
Analyzing memes on the internet has emerged as a crucial endeavor due to the impact this multi-modal form of content wields in shaping online discourse. Memes have become a powerful tool for expressing emotions and sentiments, possibly even spreading hate and misinformation, through humor and sarcasm. In this paper, we present an overview of the Memotion 3 shared task, held as part of the DeFactify 2 workshop at AAAI-23. The task released a dataset of Hindi-English code-mixed memes annotated for Sentiment (Task A), Emotion (Task B), and Emotion intensity (Task C). Each of these is defined as an individual task, and participants are ranked separately for each. Over 50 teams registered for the shared task and 5 made final submissions to the test set of the Memotion 3 dataset. CLIP, BERT variants, and ViT were the most popular models among the participants, along with approaches such as student-teacher models, fusion, and ensembling. The best final F1 scores are 34.41 for Task A, 79.77 for Task B, and 59.82 for Task C.
IMAGINATOR: Pre-Trained Image+Text Joint Embeddings using Word-Level Grounding of Images
Krishna, Varuna, Suryavardan, S, Mishra, Shreyash, Ramamoorthy, Sathyanarayanan, Patwa, Parth, Chakraborty, Megha, Chadha, Aman, Das, Amitava, Sheth, Amit
Word embeddings, i.e., semantically meaningful vector representations of words, are largely influenced by the distributional hypothesis "You shall know a word by the company it keeps" (Harris, 1954), whereas modern prediction-based neural network embeddings rely on design choices and hyperparameter optimization. Word embeddings such as Word2Vec and GloVe capture contextuality and real-world analogies well, but contemporary convolution-based image embeddings such as VGGNet and AlexNet do not capture contextual knowledge. The popular king-queen analogy does not hold true for most commonly used vision embeddings. In this paper, we introduce a pre-trained joint embedding (JE), named IMAGINATOR, trained at the object level on 21K distinct image objects from 1M image+text pairs. A JE encodes multimodal data into a vector space where the text modality serves as the grounding key to which the complementary modality (in this case, the image) is anchored. IMAGINATOR encapsulates three individual representations: (i) object-object co-location, (ii) word-object co-location, and (iii) word-object correlation. These three representations capture complementary aspects of the two modalities and are combined to obtain the final JEs. The generated JEs are intrinsically evaluated to assess how well they capture contextuality and real-world analogies. We also evaluate pre-trained IMAGINATOR JEs on three downstream tasks: (i) image captioning, (ii) Image2Tweet, and (iii) text-based image retrieval. IMAGINATOR establishes a new standard on the aforementioned downstream tasks by outperforming the current SoTA on all selected tasks. IMAGINATOR will be made publicly available. The code is available at https://github.com/varunakk/IMAGINATOR.
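To make the idea of count-based co-location views concrete, here is a minimal, hypothetical sketch that turns co-occurrence matrices (such as word-object and word-object-correlation counts) into dense embeddings and concatenates the per-word views into a joint embedding. The PPMI-plus-SVD pipeline, the toy matrix shapes, and the concatenation step are illustrative assumptions only; the paper's actual construction may differ (see the linked repository).

# Hypothetical sketch: dense embeddings from co-location counts, then a
# simple concatenation of per-word views into a joint embedding.
import numpy as np

def ppmi_svd_embeddings(cooc: np.ndarray, dim: int = 64) -> np.ndarray:
    """Turn a co-occurrence count matrix into dense row embeddings."""
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    # Positive pointwise mutual information (PPMI)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((cooc * total) / (row * col))
    ppmi = np.nan_to_num(np.maximum(pmi, 0.0), neginf=0.0)
    # Truncated SVD to obtain dense row vectors
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    return u[:, :dim] * s[:dim]

# Toy count matrices standing in for the three views counted over image+text pairs.
obj_obj = np.random.randint(0, 10, size=(100, 100)).astype(float)    # object-object co-location
word_obj = np.random.randint(0, 10, size=(500, 100)).astype(float)   # word-object co-location
word_corr = np.random.randint(0, 10, size=(500, 100)).astype(float)  # word-object correlation

object_embeddings = ppmi_svd_embeddings(obj_obj, dim=64)
word_view_1 = ppmi_svd_embeddings(word_obj, dim=64)
word_view_2 = ppmi_svd_embeddings(word_corr, dim=64)
# One embedding per view, concatenated per word to form a joint embedding.
joint_word_embeddings = np.concatenate([word_view_1, word_view_2], axis=1)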