Sagare, Shivprasad
Audio-visual training for improved grounding in video-text LLMs
Sagare, Shivprasad, S, Hemachandran, Sarabhai, Kinshuk, Ullegaddi, Prashant, SA, Rajeshkumar
Recent advances in multimodal LLMs have led to several video-text models being proposed for critical video-related tasks. However, most previous works support visual input only, essentially muting the audio signal in the video. The few models that support both audio and visual input are not explicitly trained on audio data. Hence, the effect of audio on video understanding is largely unexplored. To this end, we propose a model architecture that handles audio-visual inputs explicitly. We train our model with both audio and visual data from a video instruction-tuning dataset. Comparison with vision-only baselines and other audio-visual models shows that training on audio data indeed leads to improved grounding of responses. For better evaluation of audio-visual models, we also release a human-annotated benchmark dataset.
Figure 1: An example of improved grounding in the video-text LLM outputs, due to the additional audio signal as input.
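As a rough illustration of the kind of architecture described above, the sketch below projects precomputed audio and video features into an LLM's embedding space and prepends them to the text-token embeddings. The module name, feature dimensions, and prefix layout are assumptions made here for illustration, not the paper's released code.

```python
# Hypothetical sketch of audio-visual fusion for a video-text LLM.
# All names and dimensions are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

class AudioVisualAdapter(nn.Module):
    """Project precomputed audio and video features into the LLM embedding
    space and prepend them to the text-token embeddings."""
    def __init__(self, audio_dim=768, video_dim=1024, llm_dim=4096):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, llm_dim)  # from an assumed audio encoder
        self.video_proj = nn.Linear(video_dim, llm_dim)  # from an assumed frame-level visual encoder

    def forward(self, audio_feats, video_feats, text_embeds):
        # audio_feats: (B, Ta, audio_dim); video_feats: (B, Tv, video_dim)
        # text_embeds: (B, Tt, llm_dim) token embeddings from the (frozen) LLM
        audio_tokens = self.audio_proj(audio_feats)
        video_tokens = self.video_proj(video_feats)
        # Multimodal prefix [video][audio][text], fed to the LLM as inputs_embeds.
        return torch.cat([video_tokens, audio_tokens, text_embeds], dim=1)

# Usage with dummy tensors:
adapter = AudioVisualAdapter()
fused = adapter(torch.randn(1, 8, 768), torch.randn(1, 16, 1024), torch.randn(1, 32, 4096))
print(fused.shape)  # torch.Size([1, 56, 4096])
```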
XWikiGen: Cross-lingual Summarization for Encyclopedic Text Generation in Low Resource Languages
Taunk, Dhaval, Sagare, Shivprasad, Patil, Anupam, Subramanian, Shivansh, Gupta, Manish, Varma, Vasudeva
The lack of encyclopedic text contributors, especially on Wikipedia, makes automated text generation for low-resource (LR) languages a critical problem. Existing work on Wikipedia text generation has focused on English only, where English reference articles are summarized to generate English Wikipedia pages. For low-resource languages, however, the scarcity of reference articles makes monolingual summarization ineffective. Hence, in this work, we propose XWikiGen, the task of cross-lingual multi-document summarization of text from multiple reference articles, written in various languages, to generate Wikipedia-style text. Accordingly, we contribute a benchmark dataset, XWikiRef, spanning ~69K Wikipedia articles covering five domains and eight languages. We harness this dataset to train a two-stage system where the input is a set of citations and a section title and the output is a section-specific LR summary. The proposed system is based on the novel idea of neural unsupervised extractive summarization to coarsely identify salient information, followed by a neural abstractive model to generate the section-specific text. Extensive experiments show that multi-domain training is better than the multi-lingual setup on average.
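To make the two-stage "extract, then abstract" idea concrete, here is a minimal sketch pairing an unsupervised centroid-based extractive scorer with a generic multilingual seq2seq model from Hugging Face transformers. The `embed` function, the mBART checkpoint, and the prompt format are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical two-stage pipeline in the spirit of XWikiGen:
# (1) unsupervised extractive selection of salient sentences,
# (2) abstractive generation conditioned on the section title.
import numpy as np
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def extractive_stage(sentences, embed, k=10):
    """Rank sentences by cosine similarity to the centroid of all sentence
    embeddings (embed is any multilingual sentence encoder) and keep the top-k."""
    vecs = np.stack([embed(s) for s in sentences])            # (n, d)
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    centroid = vecs.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    scores = vecs @ centroid                                   # cosine to centroid
    top = np.argsort(-scores)[:k]
    return [sentences[i] for i in sorted(top)]                 # keep original order

def abstractive_stage(section_title, salient_sentences,
                      model_name="facebook/mbart-large-50"):   # assumed checkpoint
    """Condition a multilingual seq2seq model on the section title plus the
    extracted sentences and generate the section text."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    prompt = section_title + " </s> " + " ".join(salient_sentences)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_length=256, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```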