Mohanty, Shrestha
Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts
Mohanty, Shrestha, Xuan, Sarah, Jobraeel, Jacob, Kumar, Anurag, Roy, Deb, Kabbara, Jad
We focus on enhancing comprehension in small-group recorded conversations, which serve as a medium to bring people together and provide a space for sharing personal stories and experiences on crucial social matters. One way to parse and convey information from these conversations is by sharing highlighted excerpts in subsequent conversations. This can help promote a collective understanding of relevant issues by surfacing perspectives and experiences for other groups of people who might otherwise be unfamiliar with them and thus unable to relate to them. The primary challenge that arises is that excerpts taken from one conversation and shared in another setting may be missing crucial context or key elements that were introduced earlier in the original conversation. This problem is exacerbated as conversations become lengthier and richer in themes and shared experiences. To address this, we explore how Large Language Models (LLMs) can enrich these excerpts by providing socially relevant context. We present approaches for effective contextualization to improve comprehension, readability, and empathy. We show significant improvements in understanding, as assessed through subjective and objective evaluations. While LLMs can offer valuable context, they struggle to capture key social aspects. We release the Human-annotated Salient Excerpts (HSE) dataset to support future work. Additionally, we show how context-enriched excerpts can provide more focused and comprehensive conversation summaries.
Bridging the Data Provenance Gap Across Text, Speech and Video
Longpre, Shayne, Singh, Nikhil, Cherep, Manuel, Tiwary, Kushagra, Materzynska, Joanna, Brannon, William, Mahari, Robert, Dey, Manan, Hamdy, Mohammed, Saxena, Nayan, Anis, Ahmad Mustafa, Alghamdi, Emad A., Chien, Vu Minh, Obeng-Marnu, Naana, Yin, Da, Qian, Kun, Li, Yizhi, Liang, Minnie, Dinh, An, Mohanty, Shrestha, Mataciunas, Deividas, South, Tobin, Zhang, Jianguo, Lee, Ariel N., Lund, Campbell S., Klamm, Christopher, Sileo, Damien, Misra, Diganta, Shippole, Enrico, Klyman, Kevin, Miranda, Lester JV, Muennighoff, Niklas, Ye, Seonghyeon, Kim, Seungone, Gupta, Vipul, Sharma, Vivek, Zhou, Xuhui, Xiong, Caiming, Villa, Luis, Biderman, Stella, Pentland, Alex, Hooker, Sara, Kabbara, Jad
Progress in AI is driven largely by the scale and quality of training data. Despite this, there is a deficit of empirical analysis examining the attributes of well-established datasets beyond text. In this work we conduct the largest and first-of-its-kind longitudinal audit across modalities (popular text, speech, and video datasets) from their detailed sourcing trends and use restrictions to their geographical and linguistic representation. Our manual analysis covers nearly 4000 public datasets between 1990 and 2024, spanning 608 languages, 798 sources, 659 organizations, and 67 countries. We find, first, that multimodal machine learning applications have overwhelmingly turned to web-crawled, synthetic, and social media platforms, such as YouTube, for their training sets, eclipsing all other sources since 2019. Second, tracing the chain of dataset derivations, we find that while less than 33% of datasets are restrictively licensed, over 80% of the source content in widely used text, speech, and video datasets carries non-commercial restrictions. Finally, counter to the rising number of languages and geographies represented in public AI training datasets, our audit demonstrates that measures of relative geographical and multilingual representation have failed to improve significantly since 2013. We believe the breadth of our audit enables us to empirically examine trends in data sourcing, restrictions, and Western-centricity at an ecosystem level, and that visibility into these questions is essential to progress in responsible AI. As a contribution to ongoing improvements in dataset transparency and responsible use, we release our entire multimodal audit, allowing practitioners to trace data provenance across text, speech, and video.
IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents
Mohanty, Shrestha, Arabzadeh, Negar, Tupini, Andrea, Sun, Yuxuan, Skrynnik, Alexey, Zholus, Artem, Côté, Marc-Alexandre, Kiseleva, Julia
Seamless interaction between AI agents and humans using natural language remains a key goal in AI research. This paper addresses the challenges of developing interactive agents capable of understanding and executing grounded natural language instructions through the IGLU competition at NeurIPS. Despite advancements, challenges such as a scarcity of appropriate datasets and the need for effective evaluation platforms persist. We introduce a scalable data collection tool for gathering interactive grounded language instructions within a Minecraft-like environment, resulting in a multi-modal dataset with around 9,000 utterances and over 1,000 clarification questions. Additionally, we present a human-in-the-loop interactive evaluation platform for qualitative analysis and comparison of agent performance through multi-turn communication with human annotators. We offer these assets, referred to as IDAT (IGLU Dataset And Toolkit), to the community to advance the development of intelligent, interactive AI agents and to provide essential resources for further research.
Transforming Human-Centered AI Collaboration: Redefining Embodied Agents Capabilities through Interactive Grounded Language Instructions
Mohanty, Shrestha, Arabzadeh, Negar, Kiseleva, Julia, Zholus, Artem, Teruel, Milagro, Awadallah, Ahmed, Sun, Yuxuan, Srinet, Kavya, Szlam, Arthur
Human intelligence's adaptability is remarkable, allowing us to adjust swiftly to new tasks and multi-modal environments. This skill is evident from a young age as we acquire new abilities and solve problems by imitating others or following natural language instructions. The research community is actively pursuing the development of interactive "embodied agents" that can engage in natural conversations with humans and assist them with real-world tasks. These agents must possess the ability to promptly request feedback in case communication breaks down or instructions are unclear. Additionally, they must demonstrate proficiency in learning new vocabulary specific to a given domain. In this paper, we make the following contributions: (1) a crowd-sourcing tool for collecting grounded language instructions; (2) the largest dataset of grounded language instructions; and (3) several state-of-the-art baselines. These contributions provide a suitable foundation for further research.
Collecting Interactive Multi-modal Datasets for Grounded Language Understanding
Mohanty, Shrestha, Arabzadeh, Negar, Teruel, Milagro, Sun, Yuxuan, Zholus, Artem, Skrynnik, Alexey, Burtsev, Mikhail, Srinet, Kavya, Panov, Aleksandr, Szlam, Arthur, Côté, Marc-Alexandre, Kiseleva, Julia
Human intelligence adapts remarkably quickly to new tasks and environments. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions. To facilitate research that can enable similar capabilities in machines, we make the following contributions: (1) we formalize the task of a collaborative embodied agent that uses natural language; (2) we develop a tool for extensive and scalable data collection; and (3) we collect the first dataset for interactive grounded language understanding.
NeurIPS 2021 Competition IGLU: Interactive Grounded Language Understanding in a Collaborative Environment
Kiseleva, Julia, Li, Ziming, Aliannejadi, Mohammad, Mohanty, Shrestha, ter Hoeve, Maartje, Burtsev, Mikhail, Skrynnik, Alexey, Zholus, Artem, Panov, Aleksandr, Srinet, Kavya, Szlam, Arthur, Sun, Yuxuan, Hofmann, Katja, Galley, Michel, Awadallah, Ahmed
Human intelligence has the remarkable ability to adapt to new tasks and environments quickly. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions. To facilitate research in this direction, we propose IGLU: Interactive Grounded Language Understanding in a Collaborative Environment. The primary goal of the competition is to approach the problem of how to build interactive agents that learn to solve a task while provided with grounded natural language instructions in a collaborative environment. Recognizing the complexity of the challenge, we split it into sub-tasks to make it feasible for participants. This research challenge is naturally related to, but not limited to, two fields of study that are highly relevant to the NeurIPS community: Natural Language Understanding and Generation (NLU/G) and Reinforcement Learning (RL). The proposed challenge can therefore bring these two communities together to approach one of the important challenges in AI. Another important aspect of the challenge is its commitment to human-in-the-loop evaluation as the final evaluation for the agents developed by contestants.