AITopics | Maputo

Many studies have predicted SocioEconomic Position (SEP) for aggregated spatial units such as villages using satellite data, but SEP prediction at the household level and other sources of imagery have not been yet explored. We assembled a dataset of 975 households in a semi-rural district in southern Mozambique, consisting of self-reported asset, expenditure, and income SEP data, as well as multimodal imagery including satellite images and a ground-based photograph survey of 11 household elements. We fine-tuned a convolutional neural network to extract feature vectors from the images, which we then used in regression analyzes to model household SEP using different sets of image types. The best prediction performance was found when modeling asset-based SEP using random forest models with all image types, while the performance for expenditure- and income-based SEP was lower. Using SHAP, we observed clear differences between the images with the largest positive and negative effects, as well as identified the most relevant household elements in the predictions. Finally, we fitted an additional reduced model using only the identified relevant household elements, which had an only slightly lower performance compared to models using all images. Our results show how ground-based household photographs allow to zoom in from an area-level to an individual household prediction while minimizing the data collection effort by using explainable machine learning. The developed workflow can be potentially integrated into routine household surveys, where the collected household imagery could be used for other purposes, such as refined asset characterization and environmental exposure assessment.

image type, imagery, satellite and household imagery, (11 more...)

arXiv.org Artificial Intelligence

2411.08934

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Austria > Vienna (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Materials (1.00)
Health & Medicine > Epidemiology (0.69)
Government > Regional Government (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Low-resourced Languages and Online Knowledge Repositories: A Need-Finding Study

Nigatu, Hellina Hailu, Canny, John, Chasins, Sarah E.

arXiv.org Artificial IntelligenceMay-26-2024

Online Knowledge Repositories (OKRs) like Wikipedia offer communities a way to share and preserve information about themselves and their ways of living. However, for communities with low-resourced languages -- including most African communities -- the quality and volume of content available are often inadequate. One reason for this lack of adequate content could be that many OKRs embody Western ways of knowledge preservation and sharing, requiring many low-resourced language communities to adapt to new interactions. To understand the challenges faced by low-resourced language contributors on the popular OKR Wikipedia, we conducted (1) a thematic analysis of Wikipedia forum discussions and (2) a contextual inquiry study with 14 novice contributors. We focused on three Ethiopian languages: Afan Oromo, Amharic, and Tigrinya. Our analysis revealed several recurring themes; for example, contributors struggle to find resources to corroborate their articles in low-resourced languages, and language technology support, like translation systems and spellcheck, result in several errors that waste contributors' time. We hope our study will support designers in making online knowledge repositories accessible to low-resourced language speakers.

contributor, participant, wikipedia, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3613904.3642605

2405.16669

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Africa > Kenya > Nairobi City County > Nairobi (0.04)
(26 more...)

Genre:

Questionnaire & Opinion Survey (0.92)
Research Report > New Finding (0.68)

Industry:

Health & Medicine (1.00)
Information Technology (0.92)
Government (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Knowledge Management (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Leveraging Social Media Data to Identify Factors Influencing Public Attitude Towards Accessibility, Socioeconomic Disparity and Public Transportation

Momin, Khondhaker Al, Sadri, Arif Mohaimin, Hasnine, Md Sami

arXiv.org Artificial IntelligenceJan-22-2024

This study proposes a novel method to understand the factors affecting individuals' perception of transport accessibility, socioeconomic disparity, and public infrastructure. As opposed to the time consuming and expensive survey-based approach, this method can generate organic large-scale responses from social media and develop statistical models to understand individuals' perceptions of various transportation issues. This study retrieved and analyzed 36,098 tweets from New York City from March 19, 2020, to May 15, 2022. A state-of-the-art natural language processing algorithm is used for text mining and classification. A data fusion technique has been adopted to generate a series of socioeconomic traits that are used as explanatory variables in the model. The model results show that females and individuals of Asian origin tend to discuss transportation accessibility more than their counterparts, with those experiencing high neighborhood traffic also being more vocal. However, disadvantaged individuals, including the unemployed and those living in low-income neighborhoods or in areas with high natural hazard risks, tend to communicate less about such issues. As for socioeconomic disparity, individuals of Asian origin and those experiencing various types of air pollution are more likely to discuss these topics on Twitter, often with a negative sentiment. However, unemployed, or disadvantaged individuals, as well as those living in areas with high natural hazard risks or expected losses, are less inclined to tweet about this subject. Lack of internet accessibility could be a reason why many disadvantaged individuals do not tweet about transport accessibility and subsidized internet could be a possible solution.

accessibility, socioeconomic disparity, tweet, (10 more...)

arXiv.org Artificial Intelligence

2402.01682

Country:

North America > United States > New York (0.25)
North America > United States > Oklahoma > Cleveland County > Norman (0.04)
North America > United States > District of Columbia > Washington (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Health & Medicine (1.00)
Banking & Finance (1.00)
(3 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Taking it further: leveraging pseudo labels for field delineation across label-scarce smallholder regions

Rufin, Philippe, Wang, Sherrie, Lisboa, Sá Nogueira, Hemmerling, Jan, Tulbure, Mirela G., Meyfroidt, Patrick

arXiv.org Artificial IntelligenceDec-12-2023

Transfer learning allows for resource-efficient geographic transfer of pre-trained field delineation models. However, the scarcity of labeled data for complex and dynamic smallholder landscapes, particularly in Sub-Saharan Africa, remains a major bottleneck for large-area field delineation. This study explores opportunities of using sparse field delineation pseudo labels for fine-tuning models across geographies and sensor characteristics. We build on a FracTAL ResUNet trained for crop field delineation in India (median field size of 0.24 ha) and use this pre-trained model to generate pseudo labels in Mozambique (median field size of 0.06 ha). We designed multiple pseudo label selection strategies and compared the quantities, area properties, seasonal distribution, and spatial agreement of the pseudo labels against human-annotated training labels (n = 1,512). We then used the human-annotated labels and the pseudo labels for model fine-tuning and compared predictions against human field annotations (n = 2,199). Our results indicate i) a good baseline performance of the pre-trained model in both field delineation and field size estimation, and ii) the added value of regional fine-tuning with performance improvements in nearly all experiments. Moreover, we found iii) substantial performance increases when using only pseudo labels (up to 77% of the IoU increases and 68% of the RMSE decreases obtained by human labels), and iv) additional performance increases when complementing human annotations with pseudo labels. Pseudo labels can be efficiently generated at scale and thus facilitate domain adaptation in label-scarce settings. The workflow presented here is a stepping stone for overcoming the persisting data gaps in heterogeneous smallholder agriculture of Sub-Saharan Africa, where labels are commonly scarce.

field delineation, human label, pseudo label, (15 more...)

arXiv.org Artificial Intelligence

2312.08384

Country:

Africa > Sub-Saharan Africa (0.45)
Asia > India (0.25)
Africa > Kenya (0.04)
(20 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (1.00)
Food & Agriculture > Agriculture (1.00)
Media > Television (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Sem@$K$: Is my knowledge graph embedding model semantic-aware?

Hubert, Nicolas, Monnin, Pierre, Brun, Armelle, Monticolo, Davy

arXiv.org Artificial IntelligenceDec-7-2023

Using knowledge graph embedding models (KGEMs) is a popular approach for predicting links in knowledge graphs (KGs). Traditionally, the performance of KGEMs for link prediction is assessed using rank-based metrics, which evaluate their ability to give high scores to ground-truth entities. However, the literature claims that the KGEM evaluation procedure would benefit from adding supplementary dimensions to assess. That is why, in this paper, we extend our previously introduced metric Sem@K that measures the capability of models to predict valid entities w.r.t. domain and range constraints. In particular, we consider a broad range of KGs and take their respective characteristics into account to propose different versions of Sem@K. We also perform an extensive study to qualify the abilities of KGEMs as measured by our metric. Our experiments show that Sem@K provides a new perspective on KGEM quality. Its joint analysis with rank-based metrics offers different conclusions on the predictive power of models. Regarding Sem@K, some KGEMs are inherently better than others, but this semantic superiority is not indicative of their performance w.r.t. rank-based metrics. In this work, we generalize conclusions about the relative performance of KGEMs w.r.t. rank-based and semantic-oriented metrics at the level of families of models. The joint analysis of the aforementioned metrics gives more insight into the peculiarities of each model. This work paves the way for a more comprehensive evaluation of KGEM adequacy for specific downstream tasks.

knowledge graph, relation, sem, (12 more...)

arXiv.org Artificial Intelligence

2301.05601

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Africa > Mozambique > Maputo City > Maputo (0.04)
North America > United States > New York > New York County > New York City (0.04)
(39 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.69)

Add feedback

Regions are Who Walk Them: a Large Pre-trained Spatiotemporal Model Based on Human Mobility for Ubiquitous Urban Sensing

Zhang, Ruixing, Han, Liangzhe, Sun, Leilei, Liu, Yunqi, Wang, Jibin, Lv, Weifeng

arXiv.org Artificial IntelligenceNov-17-2023

User profiling and region analysis are two tasks of significant commercial value. However, in practical applications, modeling different features typically involves four main steps: data preparation, data processing, model establishment, evaluation, and optimization. This process is time-consuming and labor-intensive. Repeating this workflow for each feature results in abundant development time for tasks and a reduced overall volume of task development. Indeed, human mobility data contains a wealth of information. Several successful cases suggest that conducting in-depth analysis of population movement data could potentially yield meaningful profiles about users and areas. Nonetheless, most related works have not thoroughly utilized the semantic information within human mobility data and trained on a fixed number of the regions. To tap into the rich information within population movement, based on the perspective that Regions Are Who walk them, we propose a large spatiotemporal model based on trajectories (RAW). It possesses the following characteristics: 1) Tailored for trajectory data, introducing a GPT-like structure with a parameter count of up to 1B; 2) Introducing a spatiotemporal fine-tuning module, interpreting trajectories as collection of users to derive arbitrary region embedding. This framework allows rapid task development based on the large spatiotemporal model. We conducted extensive experiments to validate the effectiveness of our proposed large spatiotemporal model. It's evident that our proposed method, relying solely on human mobility data without additional features, exhibits a certain level of relevance in user profiling and region analysis. Moreover, our model showcases promising predictive capabilities in trajectory generation tasks based on the current state, offering the potential for further innovative work utilizing this large spatiotemporal model.

information, region analysis, trajectory, (12 more...)

arXiv.org Artificial Intelligence

2311.10471

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Beijing > Beijing (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry: Information Technology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data

Naggita, Keziah, LaChance, Julienne, Xiang, Alice

arXiv.org Artificial IntelligenceAug-16-2023

Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an ``othering'' phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.

artificial intelligence, geotagged image, social media, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3600211.3604659

2308.08656

Country:

Asia > Brunei (0.14)
North America > Canada > Quebec > Montreal (0.06)
Africa > Sierra Leone (0.06)
(142 more...)

Genre: Research Report > Experimental Study (0.66)

Industry:

Health & Medicine (0.92)
Information Technology > Services (0.75)
Government > Regional Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback