A scale of conceptual orality and literacy: Automatic text categorization in the tradition of "Nähe und Distanz"
–arXiv.org Artificial Intelligence
Koch and Oesterreicher's model of "Nähe und Distanz" (Nähe = immediacy, conceptual orality; Distanz = distance, conceptual literacy) is widely used in German linguistics. However, it still lacks a statistical foundation for corpus-linguistic analyses, even though it is increasingly being applied in empirical corpus linguistics. Among other things, the theory stipulates that written texts can be rated on a scale of conceptual orality and literacy by means of linguistic features. This article establishes such a scale based on principal component analysis (PCA) and combines it with automatic analysis. Two corpora of New High German serve as examples. A central finding from evaluating established features is that features of conceptual orality and features of conceptual literacy must be distinguished in order to rank texts in a differentiated manner. The scale is also discussed with a view to its use in corpus compilation and as a guide for analyses in larger corpora. Because it starts from theory and is a "tailored" dimension, the approach is, compared to Biber's Dimension 1, particularly suitable for these supporting and controlling tasks.
Feb-5-2025
- Country:
  - North America > United States
    - New York (0.04)
    - Michigan > Washtenaw County
      - Ann Arbor (0.04)
  - Europe
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
      - Oxfordshire > Oxford (0.04)
    - Sweden > Östergötland County
      - Linköping (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.04)
    - Germany
      - Saxony > Leipzig (0.04)
      - North Rhine-Westphalia > Düsseldorf Region
        - Düsseldorf (0.04)
      - Hesse > Darmstadt Region
        - Frankfurt (0.04)
      - Bavaria
        - Upper Franconia > Bayreuth (0.04)
        - Middle Franconia > Nuremberg (0.04)
      - Baden-Württemberg
        - Tübingen Region > Tübingen (0.04)
        - Stuttgart Region > Stuttgart (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Finland > Southwest Finland
      - Turku (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.68)
    - New Finding (0.46)
- Technology: