Am Timan
Leave no Place Behind: Improved Geolocation in Humanitarian Documents
Belliardo, Enrico M., Kalimeri, Kyriaki, Mejova, Yelena
Geographical location is a crucial element of humanitarian response, outlining vulnerable populations, ongoing events, and available resources. Recent developments in Natural Language Processing may help in extracting vital information from the deluge of reports and documents produced by the humanitarian sector. However, the performance and biases of existing state-of-the-art information extraction tools are unknown. In this work, we develop annotated resources to fine-tune the popular Named Entity Recognition (NER) tools spaCy and RoBERTa to perform geotagging of humanitarian texts. We then propose a geocoding method, FeatureRank, which links the candidate locations to the GeoNames database. We find that humanitarian-domain data not only improves the performance of the classifiers (up to F1 = 0.92), but also alleviates some of the bias of the existing tools, which erroneously favor locations in Western countries. Thus, we conclude that more resources from non-Western documents are necessary to ensure that off-the-shelf NER systems are suitable for deployment in the humanitarian sector.
- Asia > Middle East > Syria (0.14)
- Europe > Portugal > Lisbon > Lisbon (0.05)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- (29 more...)
Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data
Naggita, Keziah, LaChance, Julienne, Xiang, Alice
Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.
- Asia > Brunei (0.14)
- North America > Canada > Quebec > Montreal (0.06)
- Africa > Sierra Leone (0.06)
- (142 more...)
- Health & Medicine (0.92)
- Information Technology > Services (0.75)
- Government > Regional Government (0.46)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)