Learning Syntax from Naturally-Occurring Bracketings
Tianze Shi, Ozan İrsoy, Igor Malioutov, Lillian Lee
arXiv.org Artificial Intelligence
Naturally-occurring bracketings, such as answer fragments to natural language questions and hyperlinks on webpages, can reflect human syntactic intuition regarding phrasal boundaries. Their availability and approximate correspondence to syntax make them appealing as distant information sources to incorporate into unsupervised constituency parsing. But they are noisy and incomplete; to address this challenge, we develop a partial-brackets-aware structured ramp loss in learning. Experiments demonstrate that our distantly-supervised models trained on naturally-occurring bracketing data are more accurate in inducing syntactic structures than competing unsupervised systems. On the English WSJ corpus, our models achieve an unlabeled F1 score of 68.9 for constituency parsing.
Apr-28-2021
- Country:
- Oceania > Australia
- North America
- United States
- Massachusetts (0.04)
- Texas > Travis County
- Austin (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.15)
- Maryland > Prince George's County
- College Park (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- Delaware > New Castle County
- Newark (0.04)
- Georgia > Fulton County
- Atlanta (0.04)
- Pennsylvania
- Philadelphia County > Philadelphia (0.04)
- Allegheny County > Pittsburgh (0.04)
- California
- San Francisco County > San Francisco (0.14)
- San Diego County > San Diego (0.04)
- Colorado > Boulder County
- Boulder (0.04)
- New Jersey > Hudson County
- Hoboken (0.04)
- New York > New York County
- New York City (0.04)
- Canada
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- France (0.04)
- Czechia > Prague (0.04)
- Middle East > Malta
- Port Region > Southern Harbour District > Valletta (0.04)
- Germany
- Brandenburg > Potsdam (0.04)
- Berlin (0.04)
- North Rhine-Westphalia > Cologne Region
- Bonn (0.04)
- Spain
- Valencian Community > Valencia Province
- Valencia (0.04)
- Basque Country > Biscay Province
- Bilbao (0.04)
- Sweden > Uppsala County
- Uppsala (0.05)
- Bulgaria > Sofia City Province
- Sofia (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Italy
- Sardinia (0.04)
- Tuscany
- Florence (0.04)
- Pisa Province > Pisa (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Asia
- South Korea (0.04)
- Singapore (0.04)
- China > Hong Kong (0.04)
- Thailand > Chiang Mai
- Chiang Mai (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- Japan > Honshū
- Kansai > Osaka Prefecture > Osaka (0.04)
- Genre:
- Research Report (0.50)
- Technology: