Dynamic Syntax Mapping: A New Approach to Unsupervised Syntax Parsing

Gohsh, Buvarp, Ali, Woods, Michael, Anders

Dec-18-2023–arXiv.org Artificial Intelligence

The advancement of large pretrained language models (LLMs) such as BERT [1] has improved performance across a wide range of natural language processing (NLP) tasks. This progress has spurred investigations into how these models process natural language, particularly with respect to their linguistic capabilities. Building on this foundation, our study tests the hypothesis that LLMs inherently encode syntactic dependencies that can be extracted without additional parameters or external guidance. Existing work on syntax in language models falls into two areas: (1) examining syntax-dependent behavior, such as the ability to handle complex subject-verb agreement [2, 3, 4, 5, 6, 7, 8], and (2) testing whether syntactic structures can be recovered from the models' internal representations or mechanisms [11, 12, 13]. The former approach, while agnostic to specific syntactic theories, lacks the clarity offered by explicitly inducing syntactic structures, which is the focus of our work. In a representative example of the latter, Hewitt and Manning [11] developed a probe that maps word representations into a new vector space, in which a maximum spanning tree (MST) algorithm can be applied for syntactic parsing.
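The probe-then-spanning-tree idea attributed to Hewitt and Manning [11] can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the method of [11] or of this paper: the function names `probe_distances` and `maximum_spanning_tree` are hypothetical, the matrix `B` stands in for a trained linear probe but is random here, and `H` is random data in place of real model representations. Edge scores are taken to be negated squared distances in the probe space, so a maximum spanning tree over the scores is the same tree as a minimum spanning tree over the distances.

```python
import numpy as np

def probe_distances(H, B):
    """Squared L2 distances between word vectors after the linear map B."""
    T = H @ B.T                              # (n_words, probe_dim)
    diff = T[:, None, :] - T[None, :, :]     # pairwise differences
    return (diff ** 2).sum(axis=-1)          # (n_words, n_words)

def maximum_spanning_tree(scores):
    """Prim's algorithm on a dense symmetric score matrix.

    Returns the undirected tree as a sorted list of (i, j) pairs with i < j.
    """
    n = scores.shape[0]
    in_tree = [0]                            # start growing the tree from word 0
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or scores[i, j] > best[0]):
                    best = (scores[i, j], i, j)
        _, i, j = best
        edges.append((min(i, j), max(i, j)))
        in_tree.append(j)
    return sorted(edges)

# Placeholder inputs: 5 "words" with 16-dimensional representations
# and a random 4 x 16 probe (a real probe would be trained).
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 16))
B = rng.normal(size=(4, 16))

D = probe_distances(H, B)
tree = maximum_spanning_tree(-D)             # closer pairs get higher scores
```

The resulting tree is undirected; turning it into a dependency parse would additionally require choosing a root and orienting the edges, which this sketch leaves out.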

computational linguistic, dynamic syntax mapping, proceedings

Country:
- North America
  - Dominican Republic (0.04)
  - United States
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
- Europe
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - Italy
    - Tuscany > Florence (0.04)
    - Liguria > Genoa (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - France
    - Île-de-France > Paris
      - Paris (0.04)
    - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
      - Marseille (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)