Cartography


A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

Linåker, Johan, Osborne, Cailean, Ding, Jennifer, Burtenshaw, Ben

arXiv.org Artificial Intelligence

The proliferation of open large language models (LLMs) is fostering a vibrant ecosystem of research and innovation in artificial intelligence (AI). However, the methods of collaboration used to develop open LLMs both before and after their public release have not yet been comprehensively studied, limiting our understanding of how open LLM projects are initiated, organized, and governed as well as what opportunities there are to foster this ecosystem even further. We address this gap through an exploratory analysis of open collaboration throughout the development and reuse lifecycle of open LLMs, drawing on semi-structured interviews with the developers of 14 open LLMs from grassroots projects, research institutes, startups, and Big Tech companies in North America, Europe, Africa, and Asia. We make three key contributions to research and practice. First, collaboration in open LLM projects extends far beyond the LLMs themselves, encompassing datasets, benchmarks, open source frameworks, leaderboards, knowledge sharing and discussion forums, and compute partnerships, among others. Second, open LLM developers have a variety of social, economic, and technological motivations, from democratizing AI access and promoting open science to building regional ecosystems and expanding language representation. Third, the sampled open LLM projects exhibit five distinct organizational models, ranging from single company projects to non-profit-sponsored grassroots projects, which vary in their centralization of control and community engagement strategies used throughout the open LLM lifecycle. We conclude with practical recommendations for stakeholders seeking to support the global community building a more open future for AI.


CartoAgent: a multimodal large language model-powered multi-agent cartographic framework for map style transfer and evaluation

Wang, Chenglong, Kang, Yuhao, Gong, Zhaoya, Zhao, Pengjun, Feng, Yu, Zhang, Wenjia, Li, Ge

arXiv.org Artificial Intelligence

The rapid development of generative artificial intelligence (GenAI) presents new opportunities to advance the cartographic process. Previous studies have either overlooked the artistic aspects of maps or faced challenges in creating both accurate and informative maps. In this study, we propose CartoAgent, a novel multi-agent cartographic framework powered by multimodal large language models (MLLMs). This framework simulates three key stages in cartographic practice: preparation, map design, and evaluation. At each stage, different MLLMs act as agents with distinct roles to collaborate, discuss, and utilize tools for specific purposes. In particular, CartoAgent leverages MLLMs' visual aesthetic capability and world knowledge to generate maps that are both visually appealing and informative. By separating style from geographic data, it can focus on designing stylesheets without modifying the vector-based data, thereby ensuring geographic accuracy. We applied CartoAgent to a specific task centered on map restyling, namely map style transfer and evaluation. The effectiveness of this framework was validated through extensive experiments and a human evaluation study. CartoAgent can be extended to support a variety of cartographic design decisions and inform future integrations of GenAI in cartography.
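To make the staged, stylesheet-centred workflow concrete, here is a minimal sketch of such a pipeline. The call_mllm() helper, the dict-based stylesheet, and the role prompts are illustrative assumptions rather than CartoAgent's actual implementation; only the three-stage structure and the principle of editing styles instead of vector data follow the abstract.

import json

def call_mllm(role: str, message: str, image_bytes: bytes | None = None) -> str:
    """Hypothetical MLLM call; replace with your provider's chat/vision API."""
    raise NotImplementedError

def restyle_map(style_goal: str, base_stylesheet: dict, rendered_png: bytes) -> dict:
    # Stage 1: preparation -- a "cartographer" agent turns the goal into design notes.
    notes = call_mllm(
        role="You are a cartographic design assistant.",
        message=f"Summarise design guidelines for the style goal: {style_goal}",
    )
    # Stage 2: map design -- a "stylist" agent edits only the stylesheet,
    # never the vector data, so geographic accuracy is preserved.
    proposal = call_mllm(
        role="You edit map stylesheets and return JSON only.",
        message=f"Guidelines:\n{notes}\nCurrent stylesheet:\n{json.dumps(base_stylesheet)}",
    )
    new_style = json.loads(proposal)
    # Stage 3: evaluation -- an "evaluator" agent critiques a rendering of the
    # proposed style; a full system would iterate this loop until approval.
    critique = call_mllm(
        role="You evaluate map aesthetics and legibility.",
        message="Critique this restyled map against the guidelines.",
        image_bytes=rendered_png,
    )
    print(critique)
    return new_style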


Artificial Intelligence and the Spatial Documentation of Languages

Ghanim, Hakam

arXiv.org Artificial Intelligence

The advancement in technology has made interdisciplinary research more accessible. Particularly, the breakthrough in Artificial Intelligence (AI) has given huge advantages to researchers working in interdisciplinary and multidisciplinary fields. This study investigates the ability of AI models, particularly GPT-4 and GPT Data Analyst, in creating language maps for language documentation. The study integrates documentary linguistics, linguistic geography, and AI by showcasing how AI models facilitate the spatial documentation of languages through the creation of language maps with minimal cartographic expertise. The study is conducted using a CSV file and a GeoJSON file, both obtained from HDX and from the researcher's fieldwork. The study data is then applied in real-time conversations with the AI models in order to generate the language distribution maps. The study highlights the two AI models' capabilities in generating high-quality static and interactive web maps and streamlining the map-making process, despite facing challenges like inconsistencies and difficulties in adding legends. The findings suggest a promising future for AI in generating language maps and enhancing the work of documentary linguists as they collect their data in the field, pointing towards the need for further development to fully harness AI's potential in this field. Keywords: language documentation, linguistic geography, geo-linguistics, cartography, artificial intelligence, ChatGPT.
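For illustration, below is a minimal sketch of the kind of interactive web map the study prompts the AI models to produce, written with the folium library. The file name languages.geojson, its "language" property, and the map centre are illustrative placeholders, not the study's actual data.

import folium

# Centre and zoom are placeholders; adjust to the region being documented.
m = folium.Map(location=[15.0, 44.0], zoom_start=6)

# Overlay the language polygons/points and show the language name on hover.
folium.GeoJson(
    "languages.geojson",
    name="Language distribution",
    tooltip=folium.GeoJsonTooltip(fields=["language"], aliases=["Language:"]),
).add_to(m)

folium.LayerControl().add_to(m)
m.save("language_map.html")  # interactive web map, openable in a browser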


Improving QA Model Performance with Cartographic Inoculation

Chen, Allen, Tanrikulu, Okan

arXiv.org Artificial Intelligence

QA models are faced with complex and open-ended contextual reasoning problems, but can often learn well-performing solution heuristics by exploiting dataset-specific patterns in their training data. These patterns, or "dataset artifacts", reduce the model's ability to generalize to real-world QA problems. Utilizing an ElectraSmallDiscriminator model trained for QA, we analyze the impacts and incidence of dataset artifacts using an adversarial challenge set designed to confuse models reliant on artifacts for prediction. Extending existing work on methods for mitigating artifact impacts, we propose cartographic inoculation, a novel method that fine-tunes models on an optimized subset of the challenge data to reduce model reliance on dataset artifacts. We show that by selectively fine-tuning a model on ambiguous adversarial examples from a challenge set, significant performance improvements can be made on the full challenge dataset with minimal loss of model generalizability to other contexts. (Figure 1: visualization depicting the inoculation-by-fine-tuning method and potential outcomes, adapted from Liu et al. (2019).)
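A rough sketch of the data-selection step implied by this approach, using the usual dataset-cartography notions of confidence and variability from training dynamics; the array layout, the 0.25 fraction, and the function name are assumptions for illustration, not the authors' code.

import numpy as np

def select_ambiguous(gold_probs: np.ndarray, top_fraction: float = 0.25) -> np.ndarray:
    """gold_probs: shape (n_epochs, n_examples), model probability of the gold answer per epoch."""
    confidence = gold_probs.mean(axis=0)   # mean P(gold) across epochs
    variability = gold_probs.std(axis=0)   # spread of P(gold) across epochs
    # Ambiguous examples are those the model keeps changing its mind about,
    # i.e. the ones with the highest variability.
    n_keep = int(top_fraction * gold_probs.shape[1])
    order = np.argsort(-variability)       # most variable first
    return order[:n_keep]                  # indices of examples to fine-tune on

# ambiguous_idx = select_ambiguous(gold_probs)
# fine-tune the QA model on challenge_set[ambiguous_idx] only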


Artificial Intelligence Studies in Cartography: A Review and Synthesis of Methods, Applications, and Ethics

Kang, Yuhao, Gao, Song, Roth, Robert E.

arXiv.org Artificial Intelligence

The past decade has witnessed the rapid development of geospatial artificial intelligence (GeoAI) primarily due to the ground-breaking achievements in deep learning and machine learning. A growing number of scholars from cartography have demonstrated successfully that GeoAI can accelerate previously complex cartographic design tasks and even enable cartographic creativity in new ways. Despite the promise of GeoAI, researchers and practitioners have growing concerns about the ethical issues of GeoAI for cartography. In this paper, we conducted a systematic content analysis and narrative synthesis of research studies integrating GeoAI and cartography to summarize current research and development trends regarding the usage of GeoAI for cartographic design. Based on this review and synthesis, we first identify dimensions of GeoAI methods for cartography such as data sources, data formats, map evaluations, and six contemporary GeoAI models, each of which serves a variety of cartographic tasks. These models include decision trees, knowledge graph and semantic web technologies, deep convolutional neural networks, generative adversarial networks, graph neural networks, and reinforcement learning. Further, we summarize seven cartographic design applications where GeoAI has been effectively employed: generalization, symbolization, typography, map reading, map interpretation, map analysis, and map production. We also raise five potential ethical challenges that need to be addressed in the integration of GeoAI for cartography: commodification, responsibility, privacy, bias, and (together) transparency, explainability, and provenance. We conclude by identifying four potential research directions for future cartographic research with GeoAI: GeoAI-enabled active cartographic symbolism, human-in-the-loop GeoAI for cartography, GeoAI-based mapping-as-a-service, and generative GeoAI for cartography.


Quantized Radio Map Estimation Using Tensor and Deep Generative Models

Timilsina, Subash, Shrestha, Sagar, Fu, Xiao

arXiv.org Artificial Intelligence

Spectrum cartography (SC), also known as radio map estimation (RME), aims at crafting multi-domain (e.g., frequency and space) radio power propagation maps from limited sensor measurements. While early methods often lacked theoretical support, recent works have demonstrated that radio maps can be provably recovered using low-dimensional models -- such as the block-term tensor decomposition (BTD) model and certain deep generative models (DGMs) -- of the high-dimensional multi-domain radio signals. However, these existing provable SC approaches assume that sensors send real-valued (full-resolution) measurements to the fusion center, which is unrealistic. This work puts forth a quantized SC framework that generalizes the BTD and DGM-based SC to scenarios where heavily quantized sensor measurements are used. A maximum likelihood estimation (MLE)-based SC framework under a Gaussian quantizer is proposed. Recoverability of the radio map using the MLE criterion is characterized under realistic conditions, e.g., imperfect radio map modeling and noisy measurements. Simulations and real-data experiments are used to showcase the effectiveness of the proposed approach.
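As a rough illustration of an MLE criterion under a quantizer, the sketch below scores a candidate radio map against binned Gaussian measurements: each sensor reports only the bin its reading fell into, and the likelihood of a candidate value is the Gaussian probability mass of that bin. The interface and variable names are assumptions, and the paper's BTD or DGM radio-map model is abstracted into the predicted values x.

import numpy as np
from scipy.stats import norm

def quantized_nll(x, bin_lo, bin_hi, sigma):
    """Negative log-likelihood of quantized Gaussian observations.

    x       : predicted (unquantized) power at the sensor locations, shape (m,)
    bin_lo  : lower edge of the reported bin per sensor, shape (m,)  (-inf allowed)
    bin_hi  : upper edge of the reported bin per sensor, shape (m,)  (+inf allowed)
    sigma   : measurement noise standard deviation
    """
    # Probability mass of each reported bin under the Gaussian noise model.
    p = norm.cdf((bin_hi - x) / sigma) - norm.cdf((bin_lo - x) / sigma)
    return -np.sum(np.log(np.clip(p, 1e-12, None)))  # clip avoids log(0)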


Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods

Jukić, Josip, Tutek, Martin, Šnajder, Jan

arXiv.org Artificial Intelligence

A popular approach to unveiling the black box of neural NLP models is to leverage saliency methods, which assign scalar importance scores to each input component. A common practice for evaluating whether an interpretability method is faithful has been to use evaluation-by-agreement -- if multiple methods agree on an explanation, its credibility increases. However, recent work has found that saliency methods exhibit weak rank correlations even when applied to the same model instance and advocated for the use of alternative diagnostic methods. In our work, we demonstrate that rank correlation is not a good fit for evaluating agreement and argue that Pearson-$r$ is a better-suited alternative. We further show that regularization techniques that increase faithfulness of attention explanations also increase agreement between saliency methods. By connecting our findings to instance categories based on training dynamics, we show that the agreement of saliency method explanations is very low for easy-to-learn instances. Finally, we connect the improvement in agreement across instance categories to local representation space statistics of instances, paving the way for work on analyzing which intrinsic model properties improve their predisposition to interpretability methods.
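A small sketch of the two agreement measures in question, computed over toy saliency scores from two hypothetical methods; the numbers are illustrative, chosen only to show how rank correlation and Pearson-r can diverge.

import numpy as np
from scipy.stats import pearsonr, spearmanr

saliency_a = np.array([0.02, 0.03, 0.91, 0.02, 0.02])   # e.g. gradient-based scores
saliency_b = np.array([0.05, 0.01, 0.80, 0.09, 0.05])   # e.g. attention-based scores

rho, _ = spearmanr(saliency_a, saliency_b)   # rank correlation: order only
r, _ = pearsonr(saliency_a, saliency_b)      # Pearson-r: sensitive to magnitudes

# Both methods agree that the third token dominates; Pearson-r rewards that shared
# magnitude structure, while rank correlation stays low because the near-zero
# scores happen to be ordered differently.
print(f"Spearman rho = {rho:.2f}, Pearson r = {r:.2f}")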


Explaining the ghosts: Feminist intersectional XAI and cartography as methods to account for invisible labour

Klumbyte, Goda, Piehl, Hannah, Draude, Claude

arXiv.org Artificial Intelligence

Contemporary automation through AI entails a substantial amount of behind-the-scenes human labour, which is often both invisibilised and underpaid. Since invisible labour, including labelling and maintenance work, is an integral part of contemporary AI systems, it remains important to sensitise users to its role. We suggest that this could be done through explainable AI (XAI) design, particularly feminist intersectional XAI. We propose the method of cartography, which stems from feminist intersectional research, to draw out a systemic perspective of AI and include dimensions of AI that pertain to invisible labour.


Discovering associations in COVID-19 related research papers

Fister, Iztok Jr., Fister, Karin, Fister, Iztok

arXiv.org Artificial Intelligence

The COVID-19 pandemic has already proven itself to be a global challenge. It has shown how vulnerable humanity can be. It has also mobilized researchers from different sciences and different countries in the search for a way to fight this potentially fatal disease. In line with this, our study analyses the abstracts of papers related to COVID-19 and coronavirus-related research using association rule text mining in order to find the most interesting words, on the one hand, and the relationships between them on the other. Then, a method called information cartography was applied to extract structured knowledge from the huge number of association rules. On the basis of these methods, the purpose of our study was to show how researchers have responded in similar epidemic/pandemic situations throughout history.
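As a rough sketch of the association-rule step (not the authors' pipeline), the following uses the mlxtend library on toy word sets standing in for pre-processed abstracts; the example documents and thresholds are illustrative.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each "transaction" is the set of (pre-filtered) words in one abstract.
docs = [
    ["coronavirus", "vaccine", "outbreak"],
    ["coronavirus", "outbreak", "quarantine"],
    ["vaccine", "trial", "coronavirus"],
]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(docs).transform(docs), columns=te.columns_)

frequent = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)

# Rules such as {outbreak} -> {coronavirus} are the raw material that
# information cartography then organises into structured summaries.
print(rules[["antecedents", "consequents", "support", "confidence"]])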


Functional Nonlinear Sparse Models

Chamon, Luiz F. O., Eldar, Yonina C., Ribeiro, Alejandro

arXiv.org Machine Learning

Signal processing is rich in inherently continuous and often nonlinear applications, such as radar, magnetic resonance imaging, and super-resolution microscopy, in which sparsity plays a key role in obtaining state-of-the-art results. Coping with the infinite dimensionality and non-convexity of these estimation problems typically involves discretization and convex relaxations, e.g., using atomic norms. Although successful, these approaches are not without issues. Discretization often leads to high-dimensional, potentially ill-conditioned optimization problems. Moreover, due to grid mismatch and other coherence issues, a sparse signal in the continuous domain may no longer be sparse when discretized. Finally, nonlinear problems remain non-convex even after relaxing the sparsity objective. And even in the linear case, performance guarantees for atomic norm relaxations hold under assumptions that may be hard to meet in practice. We propose to address these issues by directly tackling the continuous, nonlinear problem cast as a sparse functional optimization program. We prove that these problems have no duality gap and show that they can be solved efficiently using duality and a (stochastic) subgradient ascent-type algorithm. We illustrate the wide range of applications for this new approach by formulating and solving sparse problems in super-resolution (nonlinear line spectral estimation) and vector field estimation (spectrum cartography).
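A schematic of such a sparse functional program and of a dual subgradient ascent step, in generic notation that only illustrates the approach and is not the paper's exact formulation: the signal x lives on a continuous domain \Omega, the \Phi_i are (possibly nonlinear) measurement operators, and \ell_i are data-fit losses with tolerances \epsilon_i.

\min_{x} \ \int_{\Omega} \mathbb{1}\{x(\omega) \neq 0\}\, d\omega
\quad \text{s.t.} \quad \ell_i\big(y_i, \Phi_i(x)\big) \le \epsilon_i, \quad i = 1, \dots, m.

Absence of a duality gap means the problem can be solved through the dual function d(\lambda) = \min_x L(x, \lambda), maximized by a (stochastic) projected subgradient ascent of the form

x^{(k)} \in \arg\min_x L\big(x, \lambda^{(k)}\big), \qquad
\lambda_i^{(k+1)} = \Big[ \lambda_i^{(k)} + \alpha_k \big( \ell_i(y_i, \Phi_i(x^{(k)})) - \epsilon_i \big) \Big]_+ .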