AITopics | Magdum, Vidula

Collaborating Authors

Magdum, Vidula

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

L3Cube-MahaNews: News-based Short Text and Long Document Classification Datasets in Marathi

Mittal, Saloni, Magdum, Vidula, Dhekane, Omkar, Hiwarkhedkar, Sharayu, Joshi, Raviraj

arXiv.org Artificial IntelligenceApr-28-2024

The availability of text or topic classification datasets in the low-resource Marathi language is limited, typically consisting of fewer than 4 target labels, with some achieving nearly perfect accuracy. In this work, we introduce L3Cube-MahaNews, a Marathi text classification corpus that focuses on News headlines and articles. This corpus stands out as the largest supervised Marathi Corpus, containing over 1.05L records classified into a diverse range of 12 categories. To accommodate different document lengths, MahaNews comprises three supervised datasets specifically designed for short text, long documents, and medium paragraphs. The consistent labeling across these datasets facilitates document length-based analysis. We provide detailed data statistics and baseline results on these datasets using state-of-the-art pre-trained BERT models. We conduct a comparative analysis between monolingual and multilingual BERT models, including MahaBERT, IndicBERT, and MuRIL. The monolingual MahaBERT model outperforms all others on every dataset. These resources also serve as Marathi topic classification datasets or models and are publicly available at https://github.com/l3cube-pune/MarathiNLP .

machine learning, natural language, text classification, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-58495-4_4

2404.18216

Country:

Asia > Middle East > UAE (0.14)
Asia > India > Tamil Nadu (0.14)
Asia > India > Maharashtra (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

mahaNLP: A Marathi Natural Language Processing Library

Magdum, Vidula, Dhekane, Omkar, Hiwarkhedkar, Sharayu, Mittal, Saloni, Joshi, Raviraj

arXiv.org Artificial IntelligenceNov-5-2023

We present mahaNLP, an open-source natural language processing (NLP) library specifically built for the Marathi language. It aims to enhance the support for the low-resource Indian language Marathi in the field of NLP. It is an easy-to-use, extensible, and modular toolkit for Marathi text analysis built on state-of-the-art MahaBERT-based transformer models. Our work holds significant importance as other existing Indic NLP libraries provide basic Marathi processing support and rely on older models with restricted performance. Our toolkit stands out by offering a comprehensive array of NLP tasks, encompassing both fundamental preprocessing tasks and advanced NLP tasks like sentiment analysis, NER, hate speech detection, and sentence completion. This paper focuses on an overview of the mahaNLP framework, its features, and its usage.

library, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2311.02579

Country:

Asia > India > Maharashtra (0.14)
Asia > India > Tamil Nadu (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback