AITopics | language information

language information

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

How Language Directions Align with Token Geometry in Multilingual LLMs

Kim, JaeSeong, Lee, Suan

arXiv.org Artificial Intelligence, Nov-24-2025

Multilingual LLMs demonstrate strong performance across diverse languages, yet there has been limited systematic analysis of how language information is structured within their internal representation space and how it emerges across layers. We conduct a comprehensive probing study on six multilingual LLMs, covering all 268 transformer layers, using linear and nonlinear probes together with a new Token-Language Alignment analysis to quantify the layer-wise dynamics and geometric structure of language encoding. Our results show that language information becomes sharply separated in the first transformer block (+76.4 ± 8.2 percentage points from Layer 0 to 1) and remains almost fully linearly separable throughout model depth. We further find that the alignment between language directions and vocabulary embeddings is strongly tied to the language composition of the training data. Notably, Chinese-inclusive models achieve a ZH Match@Peak of 16.43%, whereas English-centric models achieve only 3.90%, revealing a 4.21× structural imprinting effect. These findings indicate that multilingual LLMs distinguish languages not by surface script features but by latent representational structures shaped by the training corpus. Our analysis provides practical insights for data composition strategies and fairness in multilingual representation learning. All code and analysis scripts are publicly available at: https://github.com/thisiskorea/How-Language-Directions-Align-with-Token-Geometry-in-Multilingual-LLMs.
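As a quick check on the arithmetic in this abstract, the quoted imprinting factor appears to be the ratio of the two ZH Match@Peak values. The snippet below only reproduces that division; the variable names are illustrative and nothing here is taken from the authors' code.

```python
# ZH Match@Peak values quoted in the abstract, in percent.
zh_inclusive = 16.43      # Chinese-inclusive models
english_centric = 3.90    # English-centric models

# Assumption: the reported factor is the plain ratio of the two values.
ratio = zh_inclusive / english_centric
print(f"{ratio:.2f}x")  # prints "4.21x"
```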

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2511.16693

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

InkFM: A Foundational Model for Full-Page Online Handwritten Note Understanding

Fadeeva, Anastasiia, Coriou, Vincent, Antognini, Diego, Musat, Claudiu, Maksai, Andrii

arXiv.org Artificial Intelligence, Mar-29-2025

Tablets and styluses are increasingly popular for taking notes. To optimize this experience and ensure a smooth and efficient workflow, it's important to develop methods for accurately interpreting and understanding the content of handwritten digital notes. We introduce a foundational model called InkFM for analyzing full pages of handwritten content. Trained on a diverse mixture of tasks, this model offers a unique combination of capabilities: recognizing text in 28 different scripts, recognizing mathematical expressions, and segmenting pages into distinct elements like text and drawings. Our results demonstrate that these tasks can be effectively unified within a single model, achieving state-of-the-art out-of-the-box text line segmentation quality that surpasses public baselines like docTR. Fine- or LoRA-tuning our base model on public datasets further improves the quality of page segmentation and achieves state-of-the-art text recognition (DeepWriting, CASIA, SCUT, and Mathwriting datasets) and sketch classification (QuickDraw). This adaptability of InkFM provides a powerful starting point for developing applications with handwritten input.

large language model, machine learning, pattern recognition, (19 more...)

arXiv.org Artificial Intelligence

2503.23081

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)

Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper

Yang, Chih-Kai, Huang, Kuan-Po, Lee, Hung-yi

arXiv.org Artificial Intelligence, Jul-9-2024

This research explores how the information in prompts interacts with the high-performing speech recognition model Whisper. We compare its performance when given prompts with correct information and prompts corrupted with incorrect information. Our results unexpectedly show that Whisper may not understand textual prompts in a human-expected way. Additionally, we find that performance improvement is not guaranteed even with stronger adherence to the topic information in textual prompts. It is also noted that English prompts generally outperform Mandarin ones on datasets of both languages, likely due to differences in training data distributions for these languages, despite the mismatch with pre-training scenarios. Conversely, we discover that Whisper exhibits awareness of misleading information in language tokens by ignoring incorrect language tokens and focusing on the correct ones. In sum, we raise insightful questions about Whisper's prompt understanding and reveal its counter-intuitive behaviors. We encourage further studies.

information, language token, textual prompt, (17 more...)

arXiv.org Artificial Intelligence

2406.05806

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > Taiwan (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education > Educational Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.89)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model

Huang, Jiawen, Benetos, Emmanouil

arXiv.org Artificial Intelligence, Jun-25-2024

Multilingual automatic lyrics transcription (ALT) is a challenging task due to the limited availability of labelled data and the challenges introduced by singing, compared to multilingual automatic speech recognition. Although some multilingual singing datasets have been released recently, English continues to dominate these collections. Multilingual ALT remains underexplored due to the scale of data and annotation quality. In this paper, we aim to create a multilingual ALT system with available datasets. Inspired by architectures that have been proven effective for English ALT, we adapt these techniques to the multilingual scenario by expanding the target vocabulary set. We then evaluate the performance of the multilingual model in comparison to its monolingual counterparts. Additionally, we explore various conditioning methods to incorporate language information into the model. We apply analysis by language and combine it with the language classification performance. Our findings reveal that the multilingual model performs consistently better than the monolingual models trained on the language subsets. Furthermore, we demonstrate that incorporating language information significantly enhances performance.

lyric transcription, multilingual model, transformer, (12 more...)

arXiv.org Artificial Intelligence

2406.17618

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Ontario > Toronto (0.04)
(10 more...)

Genre: Research Report > New Finding (0.48)

Industry: Media (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting

Kashiwagi, Yosuke, Futami, Hayato, Tsunoo, Emiru, Arora, Siddhant, Watanabe, Shinji

arXiv.org Artificial Intelligence, Jun-18-2024

End-to-end multilingual speech recognition models handle multiple languages through a single model, often incorporating language identification to automatically detect the language of incoming speech. Since the language is already known in the common scenario, these models can behave as language-specific models by using language information as prompts, which is particularly beneficial for attention-based encoder-decoder architectures. However, the Connectionist Temporal Classification (CTC) approach, which enhances recognition via joint decoding and multi-task training, does not normally incorporate language prompts due to its conditionally independent output tokens. To overcome this, we introduce an encoder prompting technique within the self-conditioned CTC framework, enabling language-specific adaptation of the CTC model in a zero-shot manner. Our method has been shown to significantly reduce errors by 28% on average and by 41% on low-resource languages.

encoder, recognition, speech recognition, (14 more...)

arXiv.org Artificial Intelligence

2406.12611

Country:

North America > United States (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.92)

Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter

Li, Song, You, Yongbin, Wang, Xuezhi, Ding, Ke, Wan, Guanglu

arXiv.org Artificial Intelligence, Sep-19-2023

Multilingual intelligent assistants, such as ChatGPT, have recently gained popularity. To further expand the applications of multilingual artificial intelligence (AI) assistants and facilitate international communication, it is essential to enhance the performance of multilingual speech recognition, which is a crucial component of speech interaction. In this paper, we propose two simple and parameter-efficient methods: language prompt tuning and frame-level language adapter, to respectively enhance language-configurable and ...

Refs. [6, 7] introduced an additional language identification (LID) module to predict language information, while Ref. [2] treated language information as a special textual token and concatenated it to the input of the decoder of the autoregressive speech recognition model, achieving joint modeling of speech recognition and language identification. Ref. [3] provided language information directly as prior information to speech recognition models; this can be achieved by encoding language information as a one-hot vector or embedding and concatenating it with acoustic features.
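The last idea in this excerpt, encoding the language as a one-hot vector and concatenating it with acoustic features, can be sketched in a few lines. This is a minimal illustration on plain Python lists, not the cited authors' implementation; the function names, the language inventory, and the toy feature values are all assumptions made for the example.

```python
from typing import List, Sequence


def one_hot(lang: str, languages: Sequence[str]) -> List[float]:
    """Return a one-hot vector marking the position of `lang` in `languages`."""
    if lang not in languages:
        raise ValueError(f"unknown language: {lang}")
    return [1.0 if candidate == lang else 0.0 for candidate in languages]


def concat_language_info(
    frames: Sequence[Sequence[float]], lang: str, languages: Sequence[str]
) -> List[List[float]]:
    """Append the same one-hot language vector to every acoustic frame."""
    lang_vec = one_hot(lang, languages)
    return [list(frame) + lang_vec for frame in frames]


# Hypothetical inventory and two toy 2-dimensional "acoustic" frames.
languages = ["en", "zh", "ja"]
frames = [[0.1, 0.2], [0.3, 0.4]]
augmented = concat_language_info(frames, "zh", languages)
print(augmented)  # each frame grows from 2 to 2 + len(languages) dimensions
```

Each frame keeps its original features and gains one extra dimension per language, so the downstream model input size depends on the size of the language inventory.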

language information, recognition, speech recognition, (12 more...)

arXiv.org Artificial Intelligence

2309.09443

Country: Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Assisting Language Learners: Automated Trans-Lingual Definition Generation via Contrastive Prompt Learning

Zhang, Hengyuan, Li, Dawei, Li, Yanran, Shang, Chenming, Shi, Chufan, Jiang, Yong

arXiv.org Artificial Intelligence, Jun-9-2023

The standard definition generation task requires automatically producing mono-lingual definitions (e.g., English definitions for English words), but ignores that the generated definitions may themselves contain words unfamiliar to language learners. In this work, we propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language, i.e., the native speaker's language. Initially, we explore this task in an unsupervised manner and build a simple implementation by fine-tuning a multi-lingual machine translation model. Then, we develop two novel methods, Prompt Combination and Contrastive Prompt Learning, to further enhance the quality of the generation. Our methods are evaluated against the baseline Pipeline method in both rich- and low-resource settings, and we empirically establish their superiority in generating higher-quality trans-lingual definitions.

definition generation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.06058

Country:

Europe > France (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.49)

Leveraging Language Identification to Enhance Code-Mixed Text Classification

Takawane, Gauri, Phaltankar, Abhishek, Patwardhan, Varad, Patil, Aryan, Joshi, Raviraj, Takalikar, Mukta S.

arXiv.org Artificial Intelligence, Jun-8-2023

The usage of more than one language in the same text is referred to as code-mixing. It is evident that there is a growing adoption of code-mixed data, especially English with a regional language, on social media platforms. Existing deep-learning models do not take advantage of the implicit language information in the code-mixed text. Our study aims to improve the performance of BERT-based models on low-resource code-mixed Hindi-English datasets by experimenting with language augmentation approaches. We propose a pipeline to improve code-mixed systems that comprises data preprocessing, word-level language identification, language augmentation, and model training on downstream tasks like sentiment analysis. For language augmentation in BERT models, we explore word-level interleaving and post-sentence placement of language information. We have examined the performance of vanilla BERT-based models and their code-mixed HingBERT counterparts on respective benchmark datasets, comparing their results with and without using word-level language information. The models were evaluated using metrics such as accuracy, precision, recall, and F1 score. Our findings show that the proposed language augmentation approaches work well across different BERT models. We demonstrate the importance of augmenting code-mixed text with language information on five different code-mixed Hindi-English downstream datasets based on sentiment analysis, hate speech detection, and emotion detection.
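The two augmentation strategies named in this abstract, word-level interleaving and post-sentence placement of language information, can be illustrated with a small sketch. The tag strings (`EN`, `HI`), the function names, and the exact output layout are assumptions for illustration; the abstract does not specify the format the authors feed to BERT.

```python
from typing import List


def interleave_tags(words: List[str], tags: List[str]) -> str:
    """Word-level interleaving: place each word's language tag right after it."""
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return " ".join(f"{word} {tag}" for word, tag in zip(words, tags))


def append_tags(words: List[str], tags: List[str]) -> str:
    """Post-sentence placement: keep the sentence intact, then list all tags."""
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return " ".join(words) + " " + " ".join(tags)


# Toy Hindi-English sentence with hypothetical word-level language IDs.
words = ["movie", "bahut", "accha", "tha"]
tags = ["EN", "HI", "HI", "HI"]

print(interleave_tags(words, tags))  # prints "movie EN bahut HI accha HI tha HI"
print(append_tags(words, tags))      # prints "movie bahut accha tha EN HI HI HI"
```

In practice the word-level tags would come from a language identification step, which is omitted here.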

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2306.04964

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Indonesia > Bali (0.04)
Asia > India > Maharashtra > Pune (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR

Jayakumar, Kaousheik, Sukhadia, Vrunda N., Arunkumar, A, Umesh, S.

arXiv.org Artificial Intelligence, May-31-2023

Building a multilingual Automated Speech Recognition (ASR) system in a linguistically diverse country like India can be a challenging task due to the differences in scripts and the limited availability of speech data. This problem can be solved by exploiting the fact that many of these languages are phonetically similar. These languages can be converted into a Common Label Set (CLS) by mapping similar sounds to common labels. In this paper, new approaches are explored and compared to improve the performance of a CLS-based multilingual ASR model. Language-specific information is infused into the ASR model by providing a language ID or by using a CLS-to-native-script converter on top of the CLS multilingual model. These methods give a significant improvement in Word Error Rate (WER) compared to the CLS baseline. These methods are further tried on out-of-distribution data to check their robustness.
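The Common Label Set idea, mapping similar sounds written in different scripts to shared labels, can be sketched with a toy lookup table. The table below is a hypothetical four-entry example covering two Devanagari and two Tamil letters; it is not the actual CLS used in the paper, and real systems need a far larger mapping and proper handling of vowel signs.

```python
from typing import Dict, List

# Toy mapping (an assumption for illustration): letters from two scripts
# that represent similar sounds share one common label.
TOY_CLS: Dict[str, str] = {
    "क": "ka",  # Devanagari letter KA
    "म": "ma",  # Devanagari letter MA
    "க": "ka",  # Tamil letter KA
    "ம": "ma",  # Tamil letter MA
}


def to_common_labels(text: str) -> List[str]:
    """Map each character to its common label, or "<unk>" if it is not covered."""
    return [TOY_CLS.get(ch, "<unk>") for ch in text]


hindi_labels = to_common_labels("कम")
tamil_labels = to_common_labels("கம")
print(hindi_labels, tamil_labels)
print(hindi_labels == tamil_labels)  # both scripts yield the same label sequence
```

Because both inputs collapse to the same label sequence, a single model can share training data across the two scripts; recovering native-script output then requires a converter or language ID, as the abstract describes.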

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2305.19584

Country:

Asia > India (0.35)
Asia > Indonesia > Bali (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
