Why Historical Language Is a Challenge for Artificial Intelligence

Nov-17-2021, 18:10:30 GMT–#artificialintelligence

One of the central challenges for Natural Language Processing (NLP) systems is deriving essential insights from a wide variety of written material. The sources contributing to a training dataset for a new NLP algorithm could be as linguistically diverse as Twitter, broadsheet newspapers, and scientific journals, with all the attendant eccentricities unique to each of just those three sources. When an NLP algorithm has to consider material from multiple eras, it typically struggles to reconcile the very different ways that people speak or write across national and sub-national communities, and especially across different periods in history. Yet using text data that straddles epochs (such as historical treatises and venerable scientific works) is a potentially useful way of generating a historical overview of a topic, and of formulating statistical timeline reconstructions that predate the adoption and maintenance of metrics for a domain. For example, the weather information that feeds AI models for predicting climate change was not adequately recorded around the world until 1880, while data-mining classical texts offers older records of major meteorological events that may be useful in providing pre-Victorian weather data.

artificial intelligence, natural language, temporal misalignment

AI-Alerts:
- 2021 > 2021-11 > AAAI AI-Alert for Nov 23, 2021 (1.00)
