AITopics | Jangra, Anubhav

Collaborating Authors

Jangra, Anubhav

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multi-hop Question Answering

Mavi, Vaibhav, Jangra, Anubhav, Jatowt, Adam

arXiv.org Artificial IntelligenceMay-31-2024

The task of Question Answering (QA) has attracted significant research interest for long. Its relevance to language understanding and knowledge retrieval tasks, along with the simple setting makes the task of QA crucial for strong AI systems. Recent success on simple QA tasks has shifted the focus to more complex settings. Among these, Multi-Hop QA (MHQA) is one of the most researched tasks over the recent years. In broad terms, MHQA is the task of answering natural language questions that involve extracting and combining multiple pieces of information and doing multiple steps of reasoning. An example of a multi-hop question would be "The Argentine PGA Championship record holder has won how many tournaments worldwide?". Answering the question would need two pieces of information: "Who is the record holder for Argentine PGA Championship tournaments?" and "How many tournaments did [Answer of Sub Q1] win?". The ability to answer multi-hop questions and perform multi step reasoning can significantly improve the utility of NLP systems. Consequently, the field has seen a surge with high quality datasets, models and evaluation strategies. The notion of 'multiple hops' is somewhat abstract which results in a large variety of tasks that require multi-hop reasoning. This leads to different datasets and models that differ significantly from each other and makes the field challenging to generalize and survey. We aim to provide a general and formal definition of the MHQA task, and organize and summarize existing MHQA frameworks. We also outline some best practices for building MHQA datasets. This book provides a systematic and thorough introduction as well as the structuring of the existing attempts to this highly interesting, yet quite challenging task.

large language model, machine learning, question answering, (22 more...)

arXiv.org Artificial Intelligence

2204.0914

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Leisure & Entertainment > Sports (1.00)
Health & Medicine (1.00)
Education > Curriculum > Subject-Specific Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(7 more...)

Add feedback

TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions

Mozafari, Jamshid, Jangra, Anubhav, Jatowt, Adam

arXiv.org Artificial IntelligenceMay-10-2024

Nowadays, individuals tend to engage in dialogues with Large Language Models, seeking answers to their questions. In times when such answers are readily accessible to anyone, the stimulation and preservation of human's cognitive abilities, as well as the assurance of maintaining good reasoning skills by humans becomes crucial. This study addresses such needs by proposing hints (instead of final answers or before giving answers) as a viable solution. We introduce a framework for the automatic hint generation for factoid questions, employing it to construct TriviaHG, a novel large-scale dataset featuring 160,230 hints corresponding to 16,645 questions from the TriviaQA dataset. Additionally, we present an automatic evaluation method that measures the Convergence and Familiarity quality attributes of hints. To evaluate the TriviaHG dataset and the proposed evaluation method, we enlisted 10 individuals to annotate 2,791 hints and tasked 6 humans with answering questions using the provided hints. The effectiveness of hints varied, with success rates of 96%, 78%, and 36% for questions with easy, medium, and hard answers, respectively. Moreover, the proposed automatic evaluation methods showed a robust correlation with annotators' results. Conclusively, the findings highlight three key insights: the facilitative role of hints in resolving unknown questions, the dependence of hint quality on answer difficulty, and the feasibility of employing automatic evaluation methods for hint assessment.

information retrieval, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3626772.3657855

2403.18426

Country:

Europe (1.00)
Asia (1.00)
North America > United States > District of Columbia > Washington (0.15)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.48)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)

Add feedback

Navigating the Landscape of Hint Generation Research: From the Past to the Future

Jangra, Anubhav, Mozafari, Jamshid, Jatowt, Adam, Muresan, Smaranda

arXiv.org Artificial IntelligenceApr-6-2024

Digital education has gained popularity in the last decade, especially after the COVID-19 pandemic. With the improving capabilities of large language models to reason and communicate with users, envisioning intelligent tutoring systems (ITSs) that can facilitate self-learning is not very far-fetched. One integral component to fulfill this vision is the ability to give accurate and effective feedback via hints to scaffold the learning process. In this survey article, we present a comprehensive review of prior research on hint generation, aiming to bridge the gap between research in education and cognitive science, and research in AI and Natural Language Processing. Informed by our findings, we propose a formal definition of the hint generation task, and discuss the roadmap of building an effective hint generation system aligned with the formal definition, including open challenges, future directions and ethical considerations.

large language model, learner, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2404.04728

Country:

Europe (0.93)
North America > United States (0.68)

Genre:

Overview (1.00)
Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Survey on Multi-modal Summarization

Jangra, Anubhav, Mukherjee, Sourajit, Jatowt, Adam, Saha, Sriparna, Hasanuzzaman, Mohammad

arXiv.org Artificial IntelligenceFeb-13-2023

The new era of technology has brought us to the point where it is convenient for people to share their opinions over an abundance of platforms. These platforms have a provision for the users to express themselves in multiple forms of representations, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making the task of automatic multi-modal summarization (MMS) essential. In this paper, we present a comprehensive survey of the existing research in the area of MMS, covering various modalities like text, image, audio, and video. Apart from highlighting the different evaluation metrics and datasets used for the MMS task, our work also discusses the current challenges and future directions in this field.

data mining, machine learning, natural language, (24 more...)

arXiv.org Artificial Intelligence

2109.05199

Country:

Europe (1.00)
Asia > India > Bihar > Patna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Overview (1.00)

Industry:

Leisure & Entertainment > Sports > Tennis (1.00)
Information Technology (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
(9 more...)

Add feedback

Large Scale Multi-Lingual Multi-Modal Summarization Dataset

Verma, Yash, Jangra, Anubhav, Kumar, Raghvendra, Saha, Sriparna

arXiv.org Artificial IntelligenceFeb-13-2023

Significant developments in techniques such as encoder-decoder models have enabled us to represent information comprising multiple modalities. This information can further enhance many downstream tasks in the field of information retrieval and natural language processing; however, improvements in multi-modal techniques and their performance evaluation require large-scale multi-modal data which offers sufficient diversity. Multi-lingual modeling for a variety of tasks like multi-modal summarization, text generation, and translation leverages information derived from high-quality multi-lingual annotated data. In this work, we present the current largest multi-lingual multi-modal summarization dataset (M3LS), and it consists of over a million instances of document-image pairs along with a professionally annotated multi-modal summary for each pair. It is derived from news articles published by British Broadcasting Corporation(BBC) over a decade and spans 20 languages, targeting diversity across five language roots, it is also the largest summarization dataset for 13 languages and consists of cross-lingual summarization data for 2 languages. We formally define the multi-lingual multi-modal summarization task utilizing our dataset and report baseline scores from various state-of-the-art summarization techniques in a multi-lingual setting. We also compare it with many similar datasets to analyze the uniqueness and difficulty of M3LS.

information retrieval, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2302.0656

Country:

Asia (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.84)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Add feedback

T-STAR: Truthful Style Transfer using AMR Graph as Intermediate Representation

Jangra, Anubhav, Nema, Preksha, Raghuveer, Aravindan

arXiv.org Artificial IntelligenceDec-3-2022

Unavailability of parallel corpora for training text style transfer (TST) models is a very challenging yet common scenario. Also, TST models implicitly need to preserve the content while transforming a source sentence into the target style. To tackle these problems, an intermediate representation is often constructed that is devoid of style while still preserving the meaning of the source sentence. In this work, we study the usefulness of Abstract Meaning Representation (AMR) graph as the intermediate style agnostic representation. We posit that semantic notations like AMR are a natural choice for an intermediate representation. Hence, we propose T-STAR: a model comprising of two components, text-to-AMR encoder and a AMR-to-text decoder. We propose several modeling improvements to enhance the style agnosticity of the generated AMR. To the best of our knowledge, T-STAR is the first work that uses AMR as an intermediate representation for TST. With thorough experimental evaluation we show T-STAR significantly outperforms state of the art techniques by achieving on an average 15.2% higher content preservation with negligible loss (3% approx.) in style accuracy. Through detailed human evaluation with 90,000 ratings, we also show that T-STAR has up to 50% lesser hallucinations compared to state of the art TST models.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2212.01667

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.94)
(2 more...)

Add feedback

A Survey on Medical Document Summarization

Jain, Raghav, Jangra, Anubhav, Saha, Sriparna, Jatowt, Adam

arXiv.org Artificial IntelligenceDec-3-2022

The rise of the internet and the corresponding digitization of many aspects of daily life has had a profound impact on society leading to information overload [18]. The sheer amount of information available today can be overwhelming. To combat this, individuals can use summarization techniques to distill the information down to its most essential points. The internet also had a profound impact on medical science. With the proliferation of online health tools, it is now easier than ever before to access medical information and resources [88]. For example, individuals can easily search for medical information, research medical conditions and treatments, and find healthcare providers. Additionally, social media platforms have provided a platform for medical professionals to collaborate, share information, and discuss current medical topics. This has allowed medical professionals to quickly access the latest research, treatments, and developments in the field.

bioinformatics, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2212.01669

Country:

North America > United States (1.00)
Europe (1.00)
Asia > India > Bihar (0.28)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(4 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
(5 more...)

Add feedback