
Collaborating Authors: geoclip

GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization

Neural Information Processing Systems

Worldwide geo-localization aims to pinpoint the precise location of images taken anywhere on Earth. This task poses considerable challenges due to the immense variation in geographic landscapes. Image-to-image retrieval-based approaches fail at a global scale because it is not feasible to construct a gallery of images covering the entire world. Instead, existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task. However, their performance is limited by the predefined classes, and they often produce inaccurate localizations when an image's location deviates significantly from its class center.
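GeoCLIP's central idea, aligning image embeddings with GPS-location embeddings via a CLIP-style contrastive objective, can be sketched as a symmetric InfoNCE loss. The following is an illustrative pure-Python sketch under our own assumptions (function names and the toy embeddings are ours, not the paper's code):

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (as CLIP does before computing logits)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def clip_style_loss(image_embs, gps_embs, temperature=0.07):
    """Symmetric InfoNCE loss: the matched image/GPS pair at each index
    should score higher than every mismatched pair in the batch."""
    image_embs = [l2_normalize(v) for v in image_embs]
    gps_embs = [l2_normalize(v) for v in gps_embs]
    n = len(image_embs)
    # Cosine-similarity logits, sharpened by the temperature.
    logits = [[dot(image_embs[i], gps_embs[j]) / temperature for j in range(n)]
              for i in range(n)]

    def cross_entropy(row, target):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        return log_z - row[target]

    # Average the image->GPS and GPS->image directions.
    loss_i2g = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_g2i = sum(cross_entropy(cols[j], j) for j in range(n)) / n
    return (loss_i2g + loss_g2i) / 2
```

A batch whose image and GPS embeddings match index-by-index yields a lower loss than the same batch with the GPS embeddings permuted, which is the signal the alignment training exploits.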




Towards Interpretable Geo-localization: a Concept-Aware Global Image-GPS Alignment Framework

Jia, Furong, Liu, Lanxin, Hou, Ce, Zhang, Fan, Liu, Xinyan, Liu, Yu

arXiv.org Artificial Intelligence

Worldwide geo-localization involves determining the exact geographic location of images captured globally, typically guided by geographic cues such as climate, landmarks, and architectural styles. Despite advancements in geo-localization models like GeoCLIP, which leverages image-location alignment via contrastive learning for accurate predictions, the interpretability of these models remains insufficiently explored. Current concept-based interpretability methods fail to align effectively with geo-alignment image-location embedding objectives, resulting in suboptimal interpretability and performance. To address this gap, we propose a novel framework integrating global geo-localization with concept bottlenecks. Our method inserts a Concept-Aware Alignment Module that jointly projects image and location embeddings onto a shared bank of geographic concepts (e.g., tropical climate, mountain, cathedral) and minimizes a concept-level loss, enhancing alignment in a concept-specific subspace and enabling robust interpretability. To our knowledge, this is the first work to introduce interpretability into geo-localization. Extensive experiments demonstrate that our approach surpasses GeoCLIP in geo-localization accuracy and boosts performance across diverse geospatial prediction tasks, revealing richer semantic insights into geographic decision-making processes.
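The concept-level loss described above can be illustrated as follows: project both the image and the location embedding onto a shared bank of concept vectors, then penalize disagreement between the two concept profiles. This is a minimal sketch under our own assumptions (a simple dot-product projection and squared-error loss; the paper's actual module and loss may differ):

```python
def concept_scores(embedding, concept_bank):
    """Project an embedding onto each concept vector, yielding a
    concept-activation profile (one score per concept)."""
    return [sum(e * c for e, c in zip(embedding, concept))
            for concept in concept_bank]

def concept_alignment_loss(image_emb, gps_emb, concept_bank):
    """Mean squared difference between the image's and the location's
    concept profiles: a matched pair should activate the same concepts
    (e.g., both 'tropical climate' and 'cathedral')."""
    img_profile = concept_scores(image_emb, concept_bank)
    gps_profile = concept_scores(gps_emb, concept_bank)
    return sum((a - b) ** 2
               for a, b in zip(img_profile, gps_profile)) / len(concept_bank)
```

Because the loss is expressed over named concepts rather than raw embedding dimensions, the per-concept scores double as an explanation of the prediction.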


LocDiffusion: Identifying Locations on Earth by Diffusing in the Hilbert Space

Wang, Zhangyu, Zhang, Jielu, Zhou, Zhongliang, Cao, Qian, Wu, Nemin, Liu, Zeping, Mu, Lan, Song, Yang, Xie, Yiqun, Lao, Ni, Mai, Gengchen

arXiv.org Artificial Intelligence

Image geolocalization is a fundamental yet challenging task, aiming at inferring the geolocation on Earth where an image is taken. Existing methods approach it either via grid-based classification or via image retrieval. Their performance significantly suffers when the spatial distribution of test images does not align with such choices. To address these limitations, we propose to leverage diffusion as a mechanism for image geolocalization. To avoid the problematic manifold reprojection step in diffusion, we developed a novel spherical positional encoding-decoding framework, which encodes points on a spherical surface (e.g., geolocations on Earth) into a Hilbert space of Spherical Harmonics coefficients and decodes points (geolocations) by mode-seeking. We call this type of position encoding Spherical Harmonics Dirac Delta (SHDD) Representation. We also propose a novel SirenNet-based architecture called CS-UNet to learn the conditional backward process in the latent SHDD space by minimizing a latent KL-divergence loss. We train a conditional latent diffusion model called LocDiffusion that generates geolocations under the guidance of images -- to the best of our knowledge, the first generative model for image geolocalization by diffusing geolocation information in a hidden location embedding space. We evaluate our method against SOTA image geolocalization baselines. LocDiffusion achieves competitive geolocalization performance and demonstrates significantly stronger generalizability to unseen geolocations.
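The encode/decode mechanism above, representing a point on the sphere by its spherical-harmonics coefficients and recovering it by mode-seeking, can be shown in a toy low-degree form. This sketch uses real spherical harmonics only up to degree 2 and a coarse grid search for the mode, so it is an illustrative analogue of the SHDD idea, not the paper's representation or decoder:

```python
import math

def sph_to_cart(lat_deg, lon_deg):
    """Latitude/longitude in degrees to a unit vector on the sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def sh_basis(lat_deg, lon_deg):
    """Real spherical harmonics up to degree 2 (Cartesian form)."""
    x, y, z = sph_to_cart(lat_deg, lon_deg)
    return [0.282095,                               # l=0
            0.488603 * y, 0.488603 * z, 0.488603 * x,   # l=1
            1.092548 * x * y, 1.092548 * y * z,         # l=2
            0.315392 * (3 * z * z - 1),
            1.092548 * x * z, 0.546274 * (x * x - y * y)]

def encode(lat_deg, lon_deg):
    """A Dirac delta at a location has SH coefficients equal to the
    basis values at that point."""
    return sh_basis(lat_deg, lon_deg)

def decode(coeffs, step=5.0):
    """Mode-seeking: evaluate the synthesized function on a lat/lon grid
    and return the location where it peaks."""
    best, best_val = None, -float("inf")
    lat = -90.0
    while lat <= 90.0:
        lon = -180.0
        while lon < 180.0:
            val = sum(c * b for c, b in zip(coeffs, sh_basis(lat, lon)))
            if val > best_val:
                best, best_val = (lat, lon), val
            lon += step
        lat += step
    return best
```

The reconstructed function is a reproducing kernel that peaks at the encoded point, so the grid argmax recovers the original location (to grid resolution); higher SH degrees sharpen the peak.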


GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder

Cho, Seunghyuk, Qin, Zhenyue, Liu, Yang, Choi, Youngbin, Lee, Seungbeom, Kim, Dongwoo

arXiv.org Artificial Intelligence

We introduce GeoDANO, a geometric vision-language model (VLM) with a domain-agnostic vision encoder, for solving plane geometry problems. Although VLMs have been employed for solving geometry problems, their ability to recognize geometric features remains insufficiently analyzed. To address this gap, we propose a benchmark that evaluates the recognition of visual geometric features, including primitives such as dots and lines, and relations such as orthogonality. Our preliminary study shows that vision encoders often used in general-purpose VLMs, e.g., OpenCLIP, fail to detect these features and struggle to generalize across domains. To overcome this limitation, we develop GeoCLIP, a CLIP-based model trained on synthetic geometric diagram-caption pairs. Benchmark results show that GeoCLIP outperforms existing vision encoders in recognizing geometric features. We then propose our VLM, GeoDANO, which augments GeoCLIP with a domain adaptation strategy for unseen diagram styles. GeoDANO outperforms specialized methods for plane geometry problems and GPT-4o on MathVerse.
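Synthetic diagram-caption pairs of the kind described above need ground-truth labels for relations such as orthogonality, which can be computed directly from the diagram's geometry. This is a hypothetical labeling helper under our own assumptions (segment representation and caption template are ours; the paper's generation pipeline is not specified here):

```python
def direction(seg):
    """Direction vector of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return (x2 - x1, y2 - y1)

def is_orthogonal(seg_a, seg_b, tol=1e-6):
    """Two segments are orthogonal when their direction vectors have a
    (near-)zero dot product."""
    ax, ay = direction(seg_a)
    bx, by = direction(seg_b)
    return abs(ax * bx + ay * by) <= tol

def caption(seg_a, seg_b):
    """Emit a caption stating the relation, as a synthetic training pair
    generator might."""
    rel = "perpendicular to" if is_orthogonal(seg_a, seg_b) else "not perpendicular to"
    return f"line AB is {rel} line CD"
```

Pairing such exact, programmatically derived captions with rendered diagrams gives a contrastive model like GeoCLIP a clean supervision signal for geometric features.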


General Geospatial Inference with a Population Dynamics Foundation Model

Agarwal, Mohit, Sun, Mimi, Kamath, Chaitanya, Muslim, Arbaaz, Sarker, Prithul, Paul, Joydeep, Yee, Hector, Sieniek, Marcin, Jablonski, Kim, Mayer, Yael, Fork, David, de Guia, Sheila, McPike, Jamie, Boulanger, Adam, Shekel, Tomer, Schottlander, David, Xiao, Yao, Manukonda, Manjit Chakravarthy, Liu, Yun, Bulut, Neslihan, Abu-el-haija, Sami, Eigenwillig, Arno, Kothari, Parth, Perozzi, Bryan, Bharel, Monica, Nguyen, Von, Barrington, Luke, Efron, Niv, Matias, Yossi, Corrado, Greg, Eswaran, Krish, Prabhakara, Shruthi, Shetty, Shravya, Prasad, Gautam

arXiv.org Artificial Intelligence

Supporting the health and well-being of dynamic populations around the world requires governmental agencies, organizations and researchers to understand and reason over complex relationships between human behavior and local contexts in order to identify high-risk groups and strategically allocate limited resources. Traditional approaches to these classes of problems often entail developing manually curated, task-specific features and models to represent human behavior and the natural and built environment, which can be challenging to adapt to new or even related tasks. To address this, we introduce a Population Dynamics Foundation Model (PDFM) that aims to capture the relationships between diverse data modalities and is applicable to a broad range of geospatial tasks. We first construct a geo-indexed dataset for postal codes and counties across the United States, capturing rich aggregated information on human behavior from maps, busyness, and aggregated search trends, and environmental factors such as weather and air quality. We then model this data and the complex relationships between locations using a graph neural network, producing embeddings that can be adapted to a wide range of downstream tasks using relatively simple models. We evaluate the effectiveness of our approach by benchmarking it on 27 downstream tasks spanning three distinct domains: health indicators, socioeconomic factors, and environmental measurements. The approach achieves state-of-the-art performance on all 27 geospatial interpolation tasks, and on 25 out of the 27 extrapolation and super-resolution tasks. We combined the PDFM with a state-of-the-art forecasting foundation model, TimesFM, to predict unemployment and poverty, achieving performance that surpasses fully supervised forecasting. The full set of embeddings and sample code are publicly available for researchers.
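The graph-neural-network step above, propagating information between related locations to produce embeddings, can be sketched as a single mean-aggregation message-passing layer. This is a minimal illustrative sketch (the node names, adjacency structure, and aggregation rule are our assumptions, not the PDFM architecture):

```python
def gnn_layer(features, adjacency):
    """One mean-aggregation message-passing step: each location's new
    embedding averages its own features with those of its neighbors,
    so nearby or related places end up with similar representations."""
    out = {}
    for node, feats in features.items():
        neighbors = adjacency.get(node, [])
        stacked = [feats] + [features[n] for n in neighbors]
        out[node] = [sum(col) / len(stacked) for col in zip(*stacked)]
    return out
```

Stacking several such layers (with learned weights in a real model) yields the location embeddings that downstream tasks can then consume with relatively simple models, e.g., a linear regressor per health or socioeconomic indicator.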


Predicting Internet Connectivity in Schools: A Feasibility Study Leveraging Multi-modal Data and Location Encoders in Low-Resource Settings

Doerksen, Kelsey, Fibaek, Casper, Schneider, Rochelle, Kim, Do-Hyung, Tingzon, Isabelle

arXiv.org Artificial Intelligence

Internet connectivity in schools is critical to provide students with the digital literacy skills necessary to compete in modern economies. In order for governments to effectively implement digital infrastructure development in schools, accurate internet connectivity information is required. However, traditional survey-based methods can exceed the financial and capacity limits of governments. Open-source Earth Observation (EO) datasets have unlocked our ability to observe and understand socio-economic conditions on Earth from space, and in combination with Machine Learning (ML), can provide the tools to circumvent costly ground-based survey methods to support infrastructure development. In this paper, we present our work on school internet connectivity prediction using EO and ML. We detail the creation of our multi-modal, freely-available satellite imagery and survey information dataset, leverage the latest geographically-aware location encoders, and introduce the first results of using the new European Space Agency phi-lab geographically-aware foundational model to predict internet connectivity in Botswana and Rwanda. We find that ML with EO and ground-based auxiliary data yields the best performance in both countries in accuracy, F1 score, and false-positive rate, and highlight the challenges of internet connectivity prediction from space with a case study in Kigali, Rwanda. Our work showcases a practical approach to support data-driven digital infrastructure development in low-resource settings, leveraging freely available information, and provides cleaned and labelled datasets for future studies to the community through a unique collaboration between UNICEF and the European Space Agency phi-lab.
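The abstract evaluates connectivity prediction by accuracy, F1 score, and false-positive rate; for a binary connected/not-connected label these reduce to counts from the confusion matrix. A minimal sketch of those metrics (our own implementation, not the paper's evaluation code):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, F1 score, and false-positive rate for binary labels
    (1 = school predicted/observed connected, 0 = not connected)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return accuracy, f1, fpr
```

Reporting the false-positive rate alongside F1 matters in this setting: a false positive marks an unconnected school as connected, which could wrongly exclude it from infrastructure investment.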