South-East District
Global urban visual perception varies across demographics and personalities
Quintana, Matias, Gu, Youlong, Liang, Xiucheng, Hou, Yujun, Ito, Koichi, Zhu, Yihan, Abdelrahman, Mahmoud, Biljecki, Filip
Understanding people's preferences is crucial for urban planning, yet current approaches often combine responses from multi-cultural populations, obscuring demographic differences and risking amplifying biases. We conducted a largescale urban visual perception survey of streetscapes worldwide using street view imagery, examining how demographics -- including gender, age, income, education, race and ethnicity, and personality traits -- shape perceptions among 1,000 participants with balanced demographics from five countries and 45 nationalities. This dataset, Street Perception Evaluation Considering Socioeconomics (SPECS), reveals demographic- and personality-based differences across six traditional indicators -- safe, lively, wealthy, beautiful, boring, depressing -- and four new ones -- live nearby, walk, cycle, green. Location-based sentiments further shape these preferences. Machine learning models trained on existing global datasets tend to overestimate positive indicators and underestimate negative ones compared to human responses, underscoring the need for local context. Our study aspires to rectify the myopic treatment of street perception, which rarely considers demographics or personality traits.
- Asia > Singapore (0.07)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- (20 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (0.93)
- Banking & Finance (0.92)
- Education (0.67)
- Health & Medicine > Therapeutic Area (0.46)
SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation
Besrour, Ines, He, Jingbo, Schreieder, Tobias, Färber, Michael
We present SQuAI (https://squai.scads.ai/), a scalable and trustworthy multi-agent retrieval-augmented generation (RAG) framework for scientific question answering (QA) with large language models (LLMs). SQuAI addresses key limitations of existing RAG systems in the scholarly domain, where complex, open-domain questions demand accurate answers, explicit claims with citations, and retrieval across millions of scientific documents. Built on over 2.3 million full-text papers from arXiv.org, SQuAI employs four collaborative agents to decompose complex questions into sub-questions, retrieve targeted evidence via hybrid sparse-dense retrieval, and adaptively filter documents to improve contextual relevance. To ensure faithfulness and traceability, SQuAI integrates in-line citations for each generated claim and provides supporting sentences from the source documents. Our system improves faithfulness, answer relevance, and contextual relevance by up to +0.088 (12%) over a strong RAG baseline. We further release a benchmark of 1,000 scientific question-answer-evidence triplets to support reproducibility. With transparent reasoning, verifiable citations, and domain-wide scalability, SQuAI demonstrates how multi-agent RAG enables more trustworthy scientific QA with LLMs.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- (15 more...)
UrbanFusion: Stochastic Multimodal Fusion for Contrastive Learning of Robust Spatial Representations
Mühlematter, Dominik J., Che, Lin, Hong, Ye, Raubal, Martin, Wiedemann, Nina
Forecasting urban phenomena such as housing prices and public health indicators requires the effective integration of various geospatial data. Current methods primarily utilize task-specific models, while recent foundation models for spatial representations often support only limited modalities and lack multimodal fusion capabilities. To overcome these challenges, we present UrbanFusion, a Geo-Foundation Model (GeoFM) that features Stochastic Multimodal Fusion (SMF). The framework employs modality-specific encoders to process different types of inputs, including street view imagery, remote sensing data, cartographic maps, and points of interest (POIs) data. These multimodal inputs are integrated via a Transformer-based fusion module that learns unified representations. An extensive evaluation across 41 tasks in 56 cities worldwide demonstrates UrbanFusion's strong generalization and predictive performance compared to state-of-the-art GeoAI models. Specifically, it 1) outperforms prior foundation models on location-encoding, 2) allows multimodal input during inference, and 3) generalizes well to regions unseen during training. UrbanFusion can flexibly utilize any subset of available modalities for a given location during both pretraining and inference, enabling broad applicability across diverse data availability scenarios. All source code is available at https://github.com/DominikM198/UrbanFusion.
- North America > United States > District of Columbia > Washington (0.14)
- Europe > United Kingdom (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- (39 more...)
- Banking & Finance > Real Estate (0.66)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)
- Transportation > Ground > Road (0.45)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)
TRAJECT-Bench:A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
He, Pengfei, Dai, Zhenwei, He, Bing, Liu, Hui, Tang, Xianfeng, Lu, Hanqing, Li, Juanhui, Ding, Jiayuan, Mukherjee, Subhabrata, Wang, Suhang, Xing, Yue, Tang, Jiliang, Dumoulin, Benoit
Large language model (LLM)-based agents increasingly rely on tool use to complete real-world tasks. While existing works evaluate the LLMs' tool use capability, they largely focus on the final answers yet overlook the detailed tool usage trajectory, i.e., whether tools are selected, parameterized, and ordered correctly. We introduce TRAJECT-Bench, a trajectory-aware benchmark to comprehensively evaluate LLMs' tool use capability through diverse tasks with fine-grained evaluation metrics. TRAJECT-Bench pairs high-fidelity, executable tools across practical domains with tasks grounded in production-style APIs, and synthesizes trajectories that vary in breadth (parallel calls) and depth (interdependent chains). Besides final accuracy, TRAJECT-Bench also reports trajectory-level diagnostics, including tool selection and argument correctness, and dependency/order satisfaction. Analyses reveal failure modes such as similar tool confusion and parameter-blind selection, and scaling behavior with tool diversity and trajectory length where the bottleneck of transiting from short to mid-length trajectories is revealed, offering actionable guidance for LLMs' tool use.
- Europe > Austria > Vienna (0.14)
- Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
- Europe > France (0.04)
- (35 more...)
- Leisure & Entertainment (1.00)
- Consumer Products & Services > Travel (1.00)
- Media > Music (0.96)
- (2 more...)
Mechanistic Interpretability with SAEs: Probing Religion, Violence, and Geography in Large Language Models
Simbeck, Katharina, Mahran, Mariam
Despite growing research on bias in large language models (LLMs), most work has focused on gender and race, with little attention to religious identity. This paper explores how religion is internally represented in LLMs and how it intersects with concepts of violence and geography. Using mechanistic interpretability and Sparse Autoencoders (SAEs) via the Neuronpedia API, we analyze latent feature activations across five models. We measure overlap between religion- and violence-related prompts and probe semantic patterns in activation contexts. While all five religions show comparable internal cohesion, Islam is more frequently linked to features associated with violent language. In contrast, geographic associations largely reflect real-world religious demographics, revealing how models embed both factual distributions and cultural stereotypes. These findings highlight the value of structural analysis in auditing not just outputs but also internal representations that shape model behavior.
- North America > United States > New York > New York County > New York City (0.28)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.14)
- (225 more...)
Where on Earth Do Users Say They Are?: Geo-Entity Linking for Noisy Multilingual User Input
Masis, Tessa, O'Connor, Brendan
Geo-entity linking is the task of linking a location mention to the real-world geographic location. In this paper we explore the challenging task of geo-entity linking for noisy, multilingual social media data. There are few open-source multilingual geo-entity linking tools available and existing ones are often rule-based, which break easily in social media settings, or LLM-based, which are too expensive for large-scale datasets. We present a method which represents real-world locations as averaged embeddings from labeled user-input location names and allows for selective prediction via an interpretable confidence score. We show that our approach improves geo-entity linking on a global and multilingual social media dataset, and discuss progress and problems with evaluating at different geographic granularities.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- Asia > Middle East > Republic of Türkiye > İzmir Province > İzmir (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- (33 more...)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Yang, Sohee, Gribovskaya, Elena, Kassner, Nora, Geva, Mor, Riedel, Sebastian
We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such as "The mother of the singer of 'Superstition' is". We look for evidence of a latent reasoning pathway where an LLM (1) latently identifies "the singer of 'Superstition'" as Stevie Wonder, the bridge entity, and (2) uses its knowledge of Stevie Wonder's mother to complete the prompt. We analyze these two hops individually and consider their co-occurrence as indicative of latent multi-hop reasoning. For the first hop, we test if changing the prompt to indirectly mention the bridge entity instead of any other entity increases the LLM's internal recall of the bridge entity. For the second hop, we test if increasing this recall causes the LLM to better utilize what it knows about the bridge entity. We find strong evidence of latent multi-hop reasoning for the prompts of certain relation types, with the reasoning pathway used in more than 80% of the prompts. However, the utilization is highly contextual, varying across different types of prompts. Also, on average, the evidence for the second hop and the full multi-hop traversal is rather moderate and only substantial for the first hop. Moreover, we find a clear scaling trend with increasing model size for the first hop of reasoning but not for the second hop. Our experimental findings suggest potential challenges and opportunities for future development and applications of LLMs.
- Africa > Middle East > Somalia (0.14)
- North America > Nicaragua (0.14)
- Europe > Finland (0.14)
- (30 more...)
- Research Report > New Finding (0.66)
- Research Report > Experimental Study (0.46)
- Media (1.00)
- Leisure & Entertainment (1.00)
- Government > Regional Government > North America Government (0.46)
Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data
Naggita, Keziah, LaChance, Julienne, Xiang, Alice
Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an ``othering'' phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.
- Asia > Brunei (0.14)
- North America > Canada > Quebec > Montreal (0.06)
- Africa > Sierra Leone (0.06)
- (142 more...)
- Health & Medicine (0.92)
- Information Technology > Services (0.75)
- Government > Regional Government (0.46)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
How to Create Dummy Data in Python
Dummy data is randomly generated data that can be substituted for live data. Whether you are a Developer, Software Engineer, or Data Scientist, sometimes you need dummy data to test what you have built, it can be a web app, mobile app, or machine learning model. If you are using python language, you can use a faker python package to create dummy data of any type, for example, dates, transactions, names, texts, time, and others. Faker is a simple python package that generates fake data with different data types. Faker package is heavily inspired by PHP Faker, Perl Faker, and by Ruby Faker.
- Europe > Slovakia (0.16)
- South America > Brazil (0.05)
- Oceania > Wallis and Futuna (0.05)
- (10 more...)
Using Big Data Analytics for Transboundary Water Management
Southern Africa has experienced drought-flood cycles for the past decade that strain the ability of any country to properly manage water resources. This dynamic is exacerbated by human drivers such as the heavy reliance of sectors such as mining and agriculture on groundwater and surface water, as well as subsistence agriculture in rural areas along rivers. These factors have progressively depleted natural freshwater systems and contributed to an accumulation of sediment in river systems. In a region where two or more countries share many of the groundwater and surface resources, water security cuts across the socioeconomic divide and is both a rural and urban issue. For example, the City of Cape Town had to heavily ration all water uses in 2017 and 2018, as its dams were drying up.
- Africa > Southern Africa (0.31)
- Africa > South Africa > Western Cape > Cape Town (0.26)
- North America > United States (0.24)
- (3 more...)
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.79)