Architectures of Meaning, A Systematic Corpus Analysis of NLP Systems

Oskar Wysocki, Malina Florea, Donal Landers, Andre Freitas

Jul-16-2021–arXiv.org Artificial Intelligence

Natural Language Processing (NLP) systems have undergone a Cambrian explosion of architectural paradigms in the past few years. The number of contributions and its exponential growth make it challenging to understand how NLP architectural patterns evolve and consolidate across sub-areas and tasks. This paper aims to provide methodological support for interpreting NLP architectural patterns at scale by applying statistical corpus analysis methods to large-scale NLP corpora. We analyse the use of corpus statistics to compute large-scale collocation patterns, jointly with graph visualisation methods, as a device for interpreting architectural patterns at scale. The proposed methods aim to address questions such as:
- What is the complete list of architectural patterns present in NLP?
- What are the prevailing architectural patterns (classifiers, layers, regularisation, linguistic resources) for each NLP task?
- How are these patterns evolving over time, and what are the emerging consolidated/canonical architectural motifs?
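As a rough illustration of what computing collocation patterns from corpus statistics can look like, the sketch below scores pairs of architectural terms that co-occur in the same sentence using pointwise mutual information (PMI) and returns them as weighted edges that a graph visualisation tool could consume. The abstract does not say which association measure or co-occurrence window the paper uses, so PMI, the sentence-level window, the function name `pmi_collocations`, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_collocations(sentences, min_count=2):
    """Score unordered term pairs co-occurring in a sentence by PMI.

    Assumption: PMI over sentence-level co-occurrence; the paper's
    actual measure is not given in the abstract.
    """
    n = len(sentences)
    term_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        # Count each term once per sentence; sort so pairs have a stable order.
        terms = sorted(set(tokens))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))

    scores = {}
    for (a, b), joint in pair_counts.items():
        if joint < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        # PMI = log2( P(a, b) / (P(a) * P(b)) ) with sentence-level probabilities
        scores[(a, b)] = math.log2(joint * n / (term_counts[a] * term_counts[b]))
    return scores


# Hypothetical toy corpus: each sentence is a list of already-extracted terms.
toy_corpus = [
    ["lstm", "crf", "ner"],
    ["lstm", "crf", "ner"],
    ["transformer", "softmax", "classification"],
    ["transformer", "softmax", "ner"],
]

scores = pmi_collocations(toy_corpus)
# Weighted edge list (term_a, term_b, pmi), strongest associations first.
edges = sorted(((a, b, s) for (a, b), s in scores.items()), key=lambda e: -e[2])
for a, b, s in edges:
    print(f"{a} -- {b}: {s:.3f}")
```

Each `(term_a, term_b, pmi)` triple can be read as a weighted edge between two architectural terms. The `min_count` threshold is there because PMI tends to inflate scores for pairs seen only once.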

collocation, lemmatization, word embedding

Country:
- North America > United States
  - Hawaii > Honolulu County
    - Honolulu (0.04)
  - California > San Francisco County
    - San Francisco (0.14)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)

Genre:
- Research Report (0.67)
- Overview (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (0.58)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.47)