linguistic
Transformer-Enabled Diachronic Analysis of Vedic Sanskrit: Neural Methods for Quantifying Types of Language Change
Hariharan, Ananth; Mortensen, David
We challenge the naive assumption that linguistic change is simplification by quantitatively analyzing over 2,000 years of Sanskrit, demonstrating how weakly supervised, hybrid neural-symbolic methods can yield new insights into the evolution of morphologically rich, low-resource languages. Our approach addresses data scarcity through weak supervision, using 100+ high-precision regex patterns to generate pseudo-labels for fine-tuning a multilingual BERT. We then fuse symbolic and neural outputs via a novel confidence-weighted ensemble, creating a system that is both scalable and interpretable. Applied to a 1.47-million-word diachronic corpus, the ensemble achieves a 52.4% overall feature detection rate. Our findings reveal that Sanskrit's overall morphological complexity does not decrease but is instead dynamically redistributed: while earlier verbal features show cyclical patterns of decline, complexity shifts to other domains, as evidenced by a dramatic expansion in compounding and the emergence of new philosophical terminology. Critically, our system produces well-calibrated uncertainty estimates, with confidence strongly correlating with accuracy (Pearson r = 0.92) and low overall calibration error (ECE = 0.043), bolstering the reliability of these findings for computational philology.
- Europe > Austria > Vienna (0.14)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
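The Vedic Sanskrit abstract above reports a calibration error of ECE = 0.043 but does not say how it was computed. As a rough illustration only, the sketch below shows the common equal-width-bin definition of expected calibration error. The function name `expected_calibration_error`, the default of 10 bins, and the toy inputs are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count; the abstract does not state one
) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Group predictions into n_bins equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes its gap, weighted by its share of all predictions.
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Toy data (hypothetical): four predictions at 0.9 confidence, three correct.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A lower value means stated confidence tracks observed accuracy more closely. The result depends on the number of bins, so values are only comparable under the same binning scheme.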
MTQ-Eval: Multilingual Text Quality Evaluation for Language Models
Pokharel, Rhitabrat; Agrawal, Ameeta
The use of large language models (LLMs) for evaluating outputs is becoming an increasingly effective and scalable approach. However, it remains uncertain whether this capability extends beyond task-specific evaluations to more general assessments of text quality, particularly in multilingual contexts. In this study, we introduce MTQ-Eval, a novel framework for multilingual text quality evaluation that learns from examples of both high- and low-quality texts, adjusting its internal representations. To develop MTQ-Eval, we first automatically generate text quality preference data and then use it to train open-source base LLMs to align with ratings of high- and low-quality text. Our comprehensive evaluation across 115 languages demonstrates the improved performance of the proposed model. Upon further analysis, we find that this enhanced evaluation capability also leads to notable improvements in downstream tasks.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.05)
- Asia > Singapore (0.04)