Gao, Song
GeoAI-Enhanced Community Detection on Spatial Networks with Graph Deep Learning
Liang, Yunlei, Zhu, Jiawei, Ye, Wen, Gao, Song
Spatial networks are useful for modeling geographic phenomena where spatial interaction plays an important role. To analyze spatial networks and their internal structures, graph-based methods such as community detection have been widely used. Community detection aims to extract strongly connected components from a network and reveal the hidden relationships between nodes, but existing methods usually do not incorporate node attribute information. To consider edge-based interactions and node attributes together, this study proposes a family of GeoAI-enhanced unsupervised community detection methods called region2vec, based on Graph Attention Networks (GAT) and Graph Convolutional Networks (GCN). The region2vec methods generate node neural embeddings based on attribute similarity, geographic adjacency, and spatial interactions, and then extract network communities from the node embeddings using agglomerative clustering. The proposed GeoAI-based methods are compared with multiple baselines and perform the best when one wants to maximize node attribute similarity and spatial interaction intensity simultaneously within spatial network communities. The approach is further applied to the shortage area delineation problem in public health and demonstrates its promise for regionalization problems.
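A minimal sketch of this pipeline, assuming NumPy and scikit-learn: a parameter-free, GCN-style propagation of node attributes over the spatial adjacency stands in for the trained GAT/GCN encoder (the paper's spatial-interaction-guided training is omitted), and agglomerative clustering then groups the resulting embeddings into communities. The toy adjacency matrix and attributes are invented for illustration.

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    # Toy inputs: A is a geographic-adjacency matrix, X holds node attributes.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    X = np.array([[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.1, 0.9]])

    # Symmetrically normalized adjacency with self-loops (GCN-style propagation operator).
    A_hat = A + np.eye(len(A))
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    P = D_inv_sqrt @ A_hat @ D_inv_sqrt

    # Two propagation steps yield smoothed node embeddings (stand-in for learned embeddings).
    Z = P @ (P @ X)

    # Extract communities from the node embeddings with agglomerative clustering.
    labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(Z)
    print(labels)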
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
Chen, Ziru, Chen, Shijie, Ning, Yuting, Zhang, Qianheng, Wang, Boshi, Yu, Botao, Li, Yifei, Liao, Zeyi, Wei, Chen, Lu, Zitong, Dey, Vishal, Xue, Mingyi, Baker, Frazier N., Burns, Benjamin, Adu-Ampratwum, Daniel, Huang, Xuhui, Ning, Xia, Gao, Song, Su, Yu, Sun, Huan
The advancements of large language models (LLMs) have piqued growing interest in developing LLM-based language agents to automate scientific discovery end-to-end, which has sparked both excitement and skepticism about their true capabilities. In this work, we call for rigorous assessment of agents on individual tasks in a scientific workflow before making bold claims on end-to-end automation. To ensure the scientific authenticity and real-world relevance of our benchmark, we extract 102 tasks from 44 peer-reviewed publications in four disciplines and engage nine subject matter experts to validate them. We unify the target output for every task to a self-contained Python program file and employ an array of evaluation metrics to examine the generated programs, execution results, and costs. Each task goes through multiple rounds of manual validation by annotators and subject matter experts to ensure its annotation quality and scientific plausibility. We also propose two effective strategies to mitigate data contamination concerns. Using our benchmark, we evaluate five open-weight and proprietary LLMs, each with three frameworks: direct prompting, OpenHands CodeAct, and self-debug. Given three attempts for each task, the best-performing agent can only solve 32.4% of the tasks independently and 34.3% with expert-provided knowledge. In addition, we evaluate OpenAI o1 with direct prompting and self-debug, which demonstrates the effectiveness of increasing inference-time compute. Still, our results underscore the limitations of current language agents in generating code for data-driven discovery, let alone end-to-end automation for scientific research.
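As an illustration of how such a benchmark can be scored (a simplified sketch, not the ScienceAgentBench harness, whose metrics also inspect program outputs, figures, and cost): each generated program is executed as a self-contained Python file, and an agent counts a task as solved if any of its three attempts succeeds. The agent callable and the timeout value are hypothetical.

    import pathlib
    import subprocess
    import tempfile

    def run_candidate(code: str, timeout_s: int = 600) -> bool:
        """Write a generated self-contained program to disk, execute it, and report success."""
        path = pathlib.Path(tempfile.mkdtemp()) / "program.py"
        path.write_text(code)
        try:
            result = subprocess.run(["python", str(path)], capture_output=True, timeout=timeout_s)
            return result.returncode == 0  # the real harness also scores outputs, not just exit codes
        except subprocess.TimeoutExpired:
            return False

    def solve_rate(tasks, agent, attempts: int = 3) -> float:
        """agent is a hypothetical callable mapping a task description to program text."""
        solved = sum(any(run_candidate(agent(task)) for _ in range(attempts)) for task in tasks)
        return solved / len(tasks)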
Fine-Tuning is Fine, if Calibrated
Mai, Zheda, Chowdhury, Arpita, Zhang, Ping, Tu, Cheng-Hao, Chen, Hong-You, Pahuja, Vardaan, Berger-Wolf, Tanya, Gao, Song, Stewart, Charles, Su, Yu, Chao, Wei-Lun
Fine-tuning is arguably the most straightforward way to tailor a pre-trained model (e.g., a foundation model) to downstream applications, but it also comes with the risk of losing valuable knowledge the model had learned in pre-training. For example, fine-tuning a pre-trained classifier capable of recognizing a large number of classes to master a subset of classes at hand is shown to drastically degrade the model's accuracy in the other classes it had previously learned. As such, it is hard to further use the fine-tuned model when it encounters classes beyond the fine-tuning data. In this paper, we systematically dissect the issue, aiming to answer the fundamental question, "What has been damaged in the fine-tuned model?" To our surprise, we find that the fine-tuned model neither forgets the relationship among the other classes nor degrades the features to recognize these classes. Instead, the fine-tuned model often produces more discriminative features for these other classes, even if they were missing during fine-tuning! What really hurts the accuracy is the discrepant logit scales between the fine-tuning classes and the other classes, implying that a simple post-processing calibration would bring back the pre-trained model's capability and at the same time unveil the feature improvement over all classes. We conduct an extensive empirical study to demonstrate the robustness of our findings and provide preliminary explanations underlying them, suggesting new directions for future theoretical analysis.
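A minimal sketch of the post-processing calibration idea described above, assuming the fix amounts to adding a single bias to the logits of classes absent from fine-tuning (the paper's exact calibration procedure and the choice of the bias value may differ):

    import torch

    def calibrated_predict(logits: torch.Tensor, absent_class_ids: list[int], gamma: float) -> torch.Tensor:
        """logits: [batch, num_classes] from the fine-tuned model; returns calibrated predictions."""
        adjusted = logits.clone()
        adjusted[:, absent_class_ids] += gamma  # lift the deflated logits of classes unseen in fine-tuning
        return adjusted.argmax(dim=-1)

In practice, gamma could be tuned on a small held-out set so that accuracy on the absent classes is restored without hurting the fine-tuning classes.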
Bringing Back the Context: Camera Trap Species Identification as Link Prediction on Multimodal Knowledge Graphs
Pahuja, Vardaan, Luo, Weidi, Gu, Yu, Tu, Cheng-Hao, Chen, Hong-You, Berger-Wolf, Tanya, Stewart, Charles, Gao, Song, Chao, Wei-Lun, Su, Yu
Camera traps are valuable tools in animal ecology for biodiversity monitoring and conservation. However, challenges like poor generalization to deployment at new unseen locations limit their practical application. Images are naturally associated with heterogeneous forms of context possibly in different modalities. In this work, we leverage the structured context associated with the camera trap images to improve out-of-distribution generalization for the task of species identification in camera traps. For example, a photo of a wild animal may be associated with information about where and when it was taken, as well as structured biology knowledge about the animal species. While typically overlooked by existing work, bringing back such context offers several potential benefits for better image understanding, such as addressing data scarcity and enhancing generalization. However, effectively integrating such heterogeneous context into the visual domain is a challenging problem. To address this, we propose a novel framework that reformulates species classification as link prediction in a multimodal knowledge graph (KG). This framework seamlessly integrates various forms of multimodal context for visual recognition. We apply this framework for out-of-distribution species classification on the iWildCam2020-WILDS and Snapshot Mountain Zebra datasets and achieve competitive performance with state-of-the-art approaches. Furthermore, our framework successfully incorporates biological taxonomy for improved generalization and enhances sample efficiency for recognizing under-represented species.
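A toy illustration of the link-prediction formulation, assuming a dot-product score between an image embedding and learned species-node embeddings; the multimodal KG in the paper also links images to location, time, and taxonomic context, which is omitted here, and all names and sizes are invented.

    import torch

    num_species, dim = 50, 256
    species_emb = torch.nn.Embedding(num_species, dim)  # KG entity embeddings for species nodes (learned)

    def predict_species(image_emb: torch.Tensor) -> int:
        """image_emb: [dim] vector from a visual encoder; returns the top-scoring species id."""
        scores = species_emb.weight @ image_emb  # score each candidate (image, depicts, species) link
        return int(scores.argmax())

    print(predict_species(torch.randn(dim)))  # untrained embeddings, so this prediction is arbitrary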
Artificial Intelligence and Human Geography
Gao, Song
This paper examines the recent advances and applications of AI in human geography, especially the use of machine (deep) learning, including place representation and modeling, spatial analysis and predictive mapping, and urban planning and design. AI technologies have enabled deeper insights into complex human-environment interactions, contributing to more effective scientific exploration, understanding of social dynamics, and spatial decision-making. Furthermore, human geography offers crucial contributions to AI, particularly in context-aware model development, human-centered design, biases and ethical considerations, and data privacy. The synergy between AI and human geography is essential for addressing global challenges like disaster resilience, poverty, and equitable resource access. This interdisciplinary collaboration between AI and geography will help advance the development of GeoAI and promise a better, more sustainable world for all.
Artificial Intelligence Studies in Cartography: A Review and Synthesis of Methods, Applications, and Ethics
Kang, Yuhao, Gao, Song, Roth, Robert E.
The past decade has witnessed the rapid development of geospatial artificial intelligence (GeoAI) primarily due to the ground-breaking achievements in deep learning and machine learning. A growing number of scholars from cartography have demonstrated successfully that GeoAI can accelerate previously complex cartographic design tasks and even enable cartographic creativity in new ways. Despite the promise of GeoAI, researchers and practitioners have growing concerns about the ethical issues of GeoAI for cartography. In this paper, we conducted a systematic content analysis and narrative synthesis of research studies integrating GeoAI and cartography to summarize current research and development trends regarding the usage of GeoAI for cartographic design. Based on this review and synthesis, we first identify dimensions of GeoAI methods for cartography, such as data sources, data formats, map evaluations, and six contemporary GeoAI models, each of which serves a variety of cartographic tasks. These models include decision trees, knowledge graph and semantic web technologies, deep convolutional neural networks, generative adversarial networks, graph neural networks, and reinforcement learning. Further, we summarize seven cartographic design applications where GeoAI has been effectively employed: generalization, symbolization, typography, map reading, map interpretation, map analysis, and map production. We also raise five potential ethical challenges that need to be addressed in the integration of GeoAI for cartography: commodification, responsibility, privacy, bias, and (together) transparency, explainability, and provenance. We conclude by identifying four potential research directions for future cartographic research with GeoAI: GeoAI-enabled active cartographic symbolism, human-in-the-loop GeoAI for cartography, GeoAI-based mapping-as-a-service, and generative GeoAI for cartography.
Here Is Not There: Measuring Entailment-Based Trajectory Similarity for Location-Privacy Protection and Beyond
Liu, Zilong, Janowicz, Krzysztof, Currier, Kitty, Shi, Meilin, Rao, Jinmeng, Gao, Song, Cai, Ling, Graser, Anita
While the paths humans take play out in social as well as physical space, measures to describe and compare their trajectories are carried out in abstract, typically Euclidean, space. When these measures are applied to trajectories of actual individuals in an application area, alterations that are inconsequential in abstract space may suddenly become problematic once overlaid with geographic reality. In this work, we present a different view on trajectory similarity by introducing a measure that utilizes logical entailment. This is an inferential perspective that considers facts as triple statements deduced from the social and environmental context in which the travel takes place, and their practical implications. We suggest a formalization of entailment-based trajectory similarity, measured as the overlapping proportion of facts, which are spatial relation statements in our case study. With the proposed measure, we evaluate LSTM-TrajGAN, a privacy-preserving trajectory-generation model. The entailment-based model evaluation reveals potential consequences of disregarding the rich structure of geographic space (e.g., miscalculated insurance risk due to regional shifts in our toy example). Our work highlights the advantage of applying logical entailment to trajectory-similarity reasoning for location-privacy protection and beyond.
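A minimal sketch of the proposed measure, assuming each trajectory is reduced to the set of spatial-relation facts (triples) it entails and similarity is the proportion of overlapping facts (a Jaccard-style ratio is used here; the paper's exact normalization may differ). The facts shown are invented examples.

    def entailment_similarity(facts_a: set, facts_b: set) -> float:
        """Similarity as the overlapping proportion of entailed spatial-relation facts."""
        if not facts_a and not facts_b:
            return 1.0
        return len(facts_a & facts_b) / len(facts_a | facts_b)

    real = {("traj_1", "within", "FloodZone_A"), ("traj_1", "crosses", "River_X")}
    synthetic = {("traj_1", "within", "FloodZone_B"), ("traj_1", "crosses", "River_X")}
    print(entailment_similarity(real, synthetic))  # 0.33: a small geometric shift can entail different facts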
Holistic Transfer: Towards Non-Disruptive Fine-Tuning with Partial Target Data
Tu, Cheng-Hao, Chen, Hong-You, Mai, Zheda, Zhong, Jike, Pahuja, Vardaan, Berger-Wolf, Tanya, Gao, Song, Stewart, Charles, Su, Yu, Chao, Wei-Lun
We propose a learning problem involving adapting a pre-trained source model to the target domain for classifying all classes that appeared in the source data, using target data that covers only a partial label space. This problem is practical, as it is unrealistic for the target end-users to collect data for all classes prior to adaptation. However, it has received limited attention in the literature. To shed light on this issue, we construct benchmark datasets and conduct extensive experiments to uncover the inherent challenges. We found a dilemma -- on the one hand, adapting to the new target domain is important to claim better performance; on the other hand, we observe that preserving the classification accuracy of classes missing in the target adaptation data is highly challenging, let alone improving them. To tackle this, we identify two key directions: 1) disentangling domain gradients from classification gradients, and 2) preserving class relationships. We present several effective solutions that maintain the accuracy of the missing classes and enhance the overall performance, establishing solid baselines for holistic transfer of pre-trained models with partial target data.
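One plausible instantiation of the "preserving class relationships" direction, not necessarily the authors' solution: distill the frozen source model's full-class predictions into the adapted model while fitting the partially labeled target data. The loss weight and temperature below are arbitrary placeholders.

    import torch.nn.functional as F

    def adaptation_loss(student_logits, source_logits, targets, lam: float = 1.0, T: float = 2.0):
        """Cross-entropy on the available target classes plus distillation over all source classes."""
        ce = F.cross_entropy(student_logits, targets)              # fit the partial target label space
        kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),   # keep relationships among all classes
                      F.softmax(source_logits / T, dim=-1),
                      reduction="batchmean") * (T * T)
        return ce + lam * kd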
FLEE-GNN: A Federated Learning System for Edge-Enhanced Graph Neural Network in Analyzing Geospatial Resilience of Multicommodity Food Flows
Qu, Yuxiao, Rao, Jinmeng, Gao, Song, Zhang, Qianheng, Chao, Wei-Lun, Su, Yu, Miller, Michelle, Morales, Alfonso, Huber, Patrick
Understanding and measuring the resilience of food supply networks is a global imperative to tackle increasing food insecurity. However, the complexity of these networks, with their multidimensional interactions and decisions, presents significant challenges. This paper proposes FLEE-GNN, a novel Federated Learning System for Edge-Enhanced Graph Neural Network, designed to overcome these challenges and enhance the analysis of the geospatial resilience of multicommodity food flow networks, which are one type of spatial network. FLEE-GNN addresses the limitations of current methodologies, such as entropy-based methods, in terms of generalizability, scalability, and data privacy. It combines the robustness and adaptability of graph neural networks with the privacy-conscious and decentralized aspects of federated learning for food supply network resilience analysis across geographical regions.
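A schematic of the federated-learning side of this design (a sketch assuming standard federated averaging; the edge-enhanced GNN architecture and the resilience objective are omitted): each geographic region trains a local model on its own food-flow subgraph, and only model parameters are averaged centrally, so raw flow data never leaves the region.

    import copy
    import torch

    def federated_average(regional_models: list[torch.nn.Module]) -> dict:
        """Average the parameters of identically structured per-region models (FedAvg, equal weights)."""
        global_state = copy.deepcopy(regional_models[0].state_dict())
        for key in global_state:
            stacked = torch.stack([m.state_dict()[key].float() for m in regional_models])
            global_state[key] = stacked.mean(dim=0)
        return global_state

    # Each region would then reload the averaged weights and continue local training.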
Building Privacy-Preserving and Secure Geospatial Artificial Intelligence Foundation Models
Rao, Jinmeng, Gao, Song, Mai, Gengchen, Janowicz, Krzysztof
In recent years we have seen substantial advances in foundation models for artificial intelligence, including language, vision, and multimodal models. Recent studies have highlighted the potential of using foundation models in geospatial artificial intelligence, known as GeoAI Foundation Models, for geographic question answering, remote sensing image understanding, map generation, and location-based services, among others. However, the development and application of GeoAI foundation models can pose serious privacy and security risks, which have not been fully discussed or addressed to date. This paper introduces the potential privacy and security risks throughout the lifecycle of GeoAI foundation models and proposes a comprehensive blueprint for research directions and preventative and control strategies. Through this vision paper, we hope to draw the attention of researchers and policymakers in geospatial domains to these privacy and security risks inherent in GeoAI foundation models and advocate for the development of privacy-preserving and secure GeoAI foundation models.