AITopics | kannada

Collaborating Authors

kannada

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

M3DR: Towards Universal Multilingual Multimodal Document Retrieval

Kolavi, Adithya S, Jain, Vyoman

arXiv.org Artificial IntelligenceDec-4-2025

Multimodal document retrieval systems have shown strong progress in aligning visual and textual content for semantic search. However, most existing approaches remain heavily English-centric, limiting their effectiveness in multilingual contexts. In this work, we present M3DR (Multilingual Multimodal Document Retrieval), a framework designed to bridge this gap across languages, enabling applicability across diverse linguistic and cultural contexts. M3DR leverages synthetic multilingual document data and generalizes across different vision-language architectures and model sizes, enabling robust cross-lingual and cross-modal alignment. Using contrastive training, our models learn unified representations for text and document images that transfer effectively across languages. We validate this capability on 22 typologically diverse languages, demonstrating consistent performance and adaptability across linguistic and script variations. We further introduce a comprehensive benchmark that captures real-world multilingual scenarios, evaluating models under monolingual, multilingual, and mixed-language settings. M3DR generalizes across both single dense vector and ColBERT-style token-level multi-vector retrieval paradigms. Our models, NetraEmbed and ColNetraEmbed achieve state-of-the-art performance with ~150% relative improvements on cross-lingual retrieval.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.03514

Country:

Asia (0.67)
North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

The Riddle of Reflection: Evaluating Reasoning and Self-Awareness in Multilingual LLMs using Indian Riddles

M, Abhinav P, Saxena, Ojasva, C, Oswald, Krishnamurthy, Parameswari

arXiv.org Artificial IntelligenceNov-5-2025

The extent to which large language models (LLMs) can perform culturally grounded reasoning across non-English languages remains underexplored. This paper examines the reasoning and self-assessment abilities of LLMs across seven major Indian languages-Bengali, Gujarati, Hindi, Kannada, Malayalam, Tamil, and Telugu. We introduce a multilingual riddle dataset combining traditional riddles with context-reconstructed variants and evaluate five LLMs-Gemini 2.5 Pro, Gemini 2.5 Flash, Mistral-Saba, LLaMA 4 Scout, and LLaMA 4 Maverick-under seven prompting strategies. In the first stage, we assess riddle-solving performance and find that while Gemini 2.5 Pro performs best overall, few-shot methods yield only marginal gains, and accuracy varies notably across languages. In the second stage, we conduct a self-evaluation experiment to measure reasoning consistency. The results reveal a key finding: a model's initial accuracy is inversely correlated with its ability to identify its own mistakes. Top-performing models such as Gemini 2.5 Pro are overconfident (4.34% True Negative Rate), whereas lower-performing models like LLaMA 4 Scout are substantially more self-aware (42.09% True Negative Rate). These results point to clear gaps in multilingual reasoning and highlight the need for models that not only reason effectively but also recognize their own limitations.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.0096

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Hidden Costs of Translation Accuracy: Distillation, Quantization, and Environmental Impact

Vijay, Dhaathri, Vadapalli, Anandaswarup

arXiv.org Artificial IntelligenceOct-3-2025

The rapid expansion of large language models (LLMs) has heightened concerns about their computational and environmental costs. This study investigates the trade-offs between translation quality and efficiency by comparing full-scale, distilled, and quantized models using machine translation as a case study. We evaluated performance on the Flores+ benchmark and through human judgments of conversational translations in French, Hindi, and Kannada. Our analysis revealed that the full 3.3B FP32 model, while achieving the highest BLEU scores, incurred the largest environmental footprint (~ 0.007-0.008 kg CO2 per run). The distilled 600M FP32 model reduced inference time by 71-78% and carbon emissions by 63-65% compared with the full model, with only minimal reductions in BLEU scores. Human evaluations further showed that even aggressive quantization (INT4) preserved high levels of accuracy and fluency, with differences between models generally minor. These findings demonstrate that model compression strategies can substantially reduce computational demands and environmental impact while maintaining competitive translation quality, though trade-offs are more pronounced in low-resource settings. We argue for evaluation frameworks that integrate efficiency and sustainability alongside accuracy as central dimensions of progress in NLP.

artificial intelligence, natural language, translation quality, (14 more...)

arXiv.org Artificial Intelligence

2509.2399

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.89)

Industry: Law > Environmental Law (0.62)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

MorphNAS: Differentiable Architecture Search for Morphologically-Aware Multilingual NER

Devadiga, Prathamesh, Shetty, Omkaar Jayadev, Nachnani, Hiya, R, Prema

arXiv.org Artificial IntelligenceAug-25-2025

This work introduces MorphNAS, a novel differentiable neural architecture search framework designed to address these challenges. MorphNAS enhances Differentiable Architecture Search (DARTS) by incorporating linguistic meta-features--such as script type and morphological complexity--to optimize neural architectures for Named Entity Recognition (NER). It automatically identifies optimal micro-architectural elements tailored to language-specific morphology. By automating this search, MorphNAS aims to maximize the proficiency of multilingual NLP models, leading to improved comprehension and processing of these complex languages.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2508.15836

Country: Asia > India (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.56)

Add feedback

Towards Inclusive NLP: Assessing Compressed Multilingual Transformers across Diverse Language Benchmarks

Alshehhi, Maitha, Sharshar, Ahmed, Guizani, Mohsen

arXiv.org Artificial IntelligenceJul-29-2025

Although LLMs have attained significant success in high-resource languages, their capacity in low-resource linguistic environments like Kannada and Arabic is not yet fully understood. This work benchmarking the performance of multilingual and monolingual Large Language Models (LLMs) across Arabic, English, and Indic languages, with particular emphasis on the effects of model compression strategies such as pruning and quantization. Findings shows significant performance differences driven by linguistic diversity and resource availability on SOTA LLMS as BLOOMZ, AceGPT, Jais, LLaMA-2, XGLM, and AraGPT2. We find that multilingual versions of the model outperform their language-specific counterparts across the board, indicating substantial cross-lingual transfer benefits. Quantization (4-bit and 8-bit) is effective in maintaining model accuracy while promoting efficiency, but aggressive pruning significantly compromises performance, especially in bigger models. Our findings pinpoint key strategies to construct scalable and fair multilingual NLP solutions and underscore the need for interventions to address hallucination and generalization errors in the low-resource setting.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.19699

Country: Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.86)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Kinship in Speech: Leveraging Linguistic Relatedness for Zero-Shot TTS in Indian Languages

Pathak, Utkarsh, Gunda, Chandra Sai Krishna, Prakash, Anusha, Agarwal, Keshav, Murthy, Hema A.

arXiv.org Artificial IntelligenceJun-5-2025

Text-to-speech (TTS) systems typically require high-quality studio data and accurate transcriptions for training. India has 1369 languages, with 22 official using 13 scripts. Training a TTS system for all these languages, most of which have no digital resources, seems a Herculean task. Our work focuses on zero-shot synthesis, particularly for languages whose scripts and phonotactics come from different families. The novelty of our work is in the augmentation of a shared phone representation and modifying the text parsing rules to match the phonotac-tics of the target language, thus reducing the synthesiser overhead and enabling rapid adaptation. Intelligible and natural speech was generated for Sanskrit, Maharashtrian and Canara Konkani, Maithili and Kurukh by leveraging linguistic connections across languages with suitable synthesisers. Evaluations confirm the effectiveness of this approach, highlighting its potential to expand speech technology access for under-represented languages. Index T erms: zero-shot synthesis, unseen Indian languages, common label set (CLS), low resource, unified parser.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.03884

Country: Asia > India (0.51)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.52)

Add feedback

Semantically Cohesive Word Grouping in Indian Languages

Karthika, N J, Patra, Adyasha, Naidu, Nagasai Saketh, Bhattacharya, Arnab, Ramakrishnan, Ganesh, Dangarikar, Chaitali

arXiv.org Artificial IntelligenceJan-7-2025

Indian languages are inflectional and agglutinative and typically follow clause-free word order. The structure of sentences across most major Indian languages are similar when their dependency parse trees are considered. While some differences in the parsing structure occur due to peculiarities of a language or its preferred natural way of conveying meaning, several apparent differences are simply due to the granularity of representation of the smallest semantic unit of processing in a sentence. The semantic unit is typically a word, typographically separated by whitespaces. A single whitespace-separated word in one language may correspond to a group of words in another. Hence, grouping of words based on semantics helps unify the parsing structure of parallel sentences across languages and, in the process, morphology. In this work, we propose word grouping as a major preprocessing step for any computational or linguistic processing of sentences for Indian languages. Among Indian languages, since Hindi is one of the least agglutinative, we expect it to benefit the most from word-grouping. Hence, in this paper, we focus on Hindi to study the effects of grouping. We perform quantitative assessment of our proposal with an intrinsic method that perturbs sentences by shuffling words as well as an extrinsic evaluation that verifies the importance of word grouping for the task of Machine Translation (MT) using decomposed prompting. We also qualitatively analyze certain aspects of the syntactic structure of sentences. Our experiments and analyses show that the proposed grouping technique brings uniformity in the syntactic structures, as well as aids underlying NLP tasks.

artificial intelligence, indian language, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.03988

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

AMPS: ASR with Multimodal Paraphrase Supervision

Parulekar, Amruta, Gupta, Abhishek, Chattopadhyay, Sameep, Jyothi, Preethi

arXiv.org Artificial IntelligenceNov-27-2024

Spontaneous or conversational multilingual speech presents many challenges for state-of-the-art automatic speech recognition (ASR) systems. In this work, we present a new technique AMPS that augments a multilingual multimodal ASR system with paraphrase-based supervision for improved conversational ASR in multiple languages, including Hindi, Marathi, Malayalam, Kannada, and Nyanja. We use paraphrases of the reference transcriptions as additional supervision while training the multimodal ASR model and selectively invoke this paraphrase objective for utterances with poor ASR performance. Using AMPS with a state-of-the-art multimodal model SeamlessM4T, we obtain significant relative reductions in word error rates (WERs) of up to 5%. We present detailed analyses of our system using both objective and human evaluation metrics.

asr, speech, transcription, (16 more...)

arXiv.org Artificial Intelligence

2411.18368

Country:

Asia > India > Tripura (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages

Deroy, Aniket, Maity, Subhankar

arXiv.org Artificial IntelligenceNov-6-2024

Language Identification (LI) is crucial for various natural language processing tasks, serving as a foundational step in applications such as sentiment analysis, machine translation, and information retrieval. In multilingual societies like India, particularly among the youth engaging on social media, text often exhibits code-mixing, blending local languages with English at different linguistic levels. This phenomenon presents formidable challenges for LI systems, especially when languages intermingle within single words. Dravidian languages, prevalent in southern India, possess rich morphological structures yet suffer from under-representation in digital platforms, leading to the adoption of Roman or hybrid scripts for communication. This paper introduces a prompt based method for a shared task aimed at addressing word-level LI challenges in Dravidian languages. In this work, we leveraged GPT-3.5 Turbo to understand whether the large language models is able to correctly classify words into correct categories. Our findings show that the Kannada model consistently outperformed the Tamil model across most metrics, indicating a higher accuracy and reliability in identifying and categorizing Kannada language instances. In contrast, the Tamil model showed moderate performance, particularly needing improvement in precision and recall.

kannada, language identification, proceedings, (11 more...)

arXiv.org Artificial Intelligence

2411.04025

Country:

Asia > India > West Bengal > Kharagpur (0.05)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Law (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Everyday Speech in the Indian Subcontinent

Pathak, Utkarsh, Gunda, Chandra Sai Krishna, Sathiyamoorthy, Sujitha, Agarwal, Keshav, Murthy, Hema A.

arXiv.org Artificial IntelligenceOct-14-2024

India has 1369 languages of which 22 are official. About 13 different scripts are used to represent these languages. A Common Label Set (CLS) was developed based on phonetics to address the issue of large vocabulary of units required in the End to End (E2E) framework for multilingual synthesis. This reduced the footprint of the synthesizer and also enabled fast adaptation to new languages which had similar phonotactics, provided language scripts belonged to the same family. In this paper, we provide new insights into speech synthesis, where the script belongs to one family, while the phonotactics comes from another. Indian language text is first converted to CLS, and then a synthesizer that matches the phonotactics of the language is used. Quality akin to that of a native speaker is obtained for Sanskrit and Konkani with zero adaptation data, using Kannada and Marathi synthesizers respectively. Further, this approach also lends itself seamless code switching across 13 Indian languages and English in a given native speaker's voice.

artificial intelligence, machine learning, synthesis, (15 more...)

arXiv.org Artificial Intelligence

2410.10508

Country:

Asia > India > Tamil Nadu > Chennai (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.54)

Add feedback