AITopics | Chiang, Yao-Yi

Chiang, Yao-Yi

MapQA: Open-domain Geospatial Question Answering on Map Data

Li, Zekun, Grossman, Malcolm, Qasemi, Eric, Kulkarni, Mihir, Chen, Muhao, Chiang, Yao-Yi

arXiv.org Artificial Intelligence, Mar-10-2025

Geospatial question answering (QA) is a fundamental task in navigation and point of interest (POI) searches. Existing geospatial QA datasets are limited in both scale and diversity, often relying solely on textual descriptions of geo-entities without considering their geometries. A major challenge in scaling geospatial QA datasets for reasoning lies in the complexity of geospatial relationships, which require integrating spatial structures, topological dependencies, and multi-hop reasoning capabilities that most text-based QA datasets lack. To address these limitations, we introduce MapQA, a novel dataset that not only provides question-answer pairs but also includes the geometries of geo-entities referenced in the questions. MapQA is constructed using SQL query templates to extract question-answer pairs from OpenStreetMap (OSM) for two study regions: Southern California and Illinois. It consists of 3,154 QA pairs spanning nine question types that require geospatial reasoning, such as neighborhood inference and geo-entity type identification. Compared to existing datasets, MapQA expands both the number and diversity of geospatial question types. We explore two approaches to tackle this challenge: (1) a retrieval-based language model that ranks candidate geo-entities by embedding similarity, and (2) a large language model (LLM) that generates SQL queries from natural language questions and geo-entity attributes, which are then executed against an OSM database. Our findings indicate that retrieval-based methods effectively capture concepts like closeness and direction but struggle with questions that require explicit computations (e.g., distance calculations). LLMs (e.g., GPT and Gemini) excel at generating SQL queries for one-hop reasoning but face challenges with multi-hop reasoning, highlighting a key bottleneck in advancing geospatial QA systems.
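
As an illustration of the template idea in the abstract above, the following is a minimal sketch of how one SQL template could yield a question-answer pair. It is not MapQA's actual pipeline: the `poi` table, its columns, the demo rows, and the `nearest` helper are all hypothetical, SQLite stands in for an OSM database, and squared coordinate differences stand in for a real geographic distance.

```python
import sqlite3

# Hypothetical template: nearest entity of a given type to a query point.
# Squared degree differences are a crude stand-in for true distance.
NEAREST_TEMPLATE = (
    "SELECT name FROM poi WHERE type = :type "
    "ORDER BY (lat - :lat) * (lat - :lat) + (lon - :lon) * (lon - :lon) "
    "LIMIT 1"
)


def build_demo_db():
    """Create an in-memory table with a few made-up geo-entities."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE poi (name TEXT, type TEXT, lat REAL, lon REAL)")
    conn.executemany(
        "INSERT INTO poi VALUES (?, ?, ?, ?)",
        [
            ("Cafe A", "cafe", 34.05, -118.25),
            ("Cafe B", "cafe", 34.10, -118.30),
            ("Library C", "library", 34.05, -118.25),
        ],
    )
    return conn


def nearest(conn, poi_type, lat, lon):
    """Fill the template and return a (question, answer) pair."""
    params = {"type": poi_type, "lat": lat, "lon": lon}
    row = conn.execute(NEAREST_TEMPLATE, params).fetchone()
    question = f"What is the nearest {poi_type} to ({lat}, {lon})?"
    return question, (row[0] if row else None)


if __name__ == "__main__":
    db = build_demo_db()
    print(nearest(db, "cafe", 34.06, -118.26))
```

A production setup would more likely use a spatial database with proper distance functions; the sketch only shows how a parameterized query can produce both the question text and its answer.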

large language model, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2503.07871

Country:

North America > United States > California (0.88)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (1.00)
Consumer Products & Services (0.94)
Education (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MiTREE: Multi-input Transformer Ecoregion Encoder for Species Distribution Modelling

Chen, Theresa, Chiang, Yao-Yi

arXiv.org Artificial Intelligence, Dec-25-2024

Climate change poses an extreme threat to biodiversity, making it imperative to efficiently model the geographical range of different species. The availability of large-scale remote sensing images and environmental data has facilitated the use of machine learning in Species Distribution Models (SDMs), which aim to predict the presence of a species at any given location. Traditional SDMs, reliant on expert observation, are labor-intensive, but advancements in remote sensing and citizen science data have facilitated machine learning approaches to SDM development. However, these models often struggle with leveraging spatial relationships between different inputs -- for instance, learning how climate data should inform the data present in satellite imagery -- without upsampling or distorting the original inputs. Additionally, location information and ecological characteristics at a location play a crucial role in predicting species distributions, but these aspects have not yet been incorporated into state-of-the-art approaches. In this work, we introduce MiTREE: a multi-input Vision-Transformer-based model with an ecoregion encoder. MiTREE computes spatial cross-modal relationships without upsampling and integrates location and ecological context. We evaluate our model on the SatBird Summer and Winter datasets, the goal of which is to predict bird species encounter rates, and we find that our approach improves upon state-of-the-art baselines.

artificial intelligence, hotspot, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3687123.3698297

2412.18995

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.93)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.77)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Leveraging Large Language Models for Generating Labeled Mineral Site Record Linkage Data

Pyo, Jiyoon, Chiang, Yao-Yi

arXiv.org Artificial Intelligence, Nov-17-2024

Record linkage integrates diverse data sources by identifying records that refer to the same entity. In the context of mineral site records, accurate record linkage is crucial for identifying and mapping mineral deposits. Properly linking records that refer to the same mineral deposit helps define the spatial coverage of mineral areas, benefiting resource identification and site data archiving. Mineral site record linkage falls under the spatial record linkage category since the records contain information about the physical locations and non-spatial attributes in a tabular format. The task is particularly challenging due to the heterogeneity and vast scale of the data. While prior research employs pre-trained discriminative language models (PLMs) on spatial entity linkage, these models often require substantial amounts of curated ground-truth data for fine-tuning. Gathering and creating ground truth data is both time-consuming and costly. Therefore, such approaches are not always feasible in real-world scenarios where gold-standard data are unavailable. Although large generative language models (LLMs) have shown promising results in various natural language processing tasks, including record linkage, their high inference time and resource demand present challenges. We propose a method that leverages an LLM to generate training data and fine-tune a PLM to address the training data gap while preserving the efficiency of PLMs. Our approach achieves over 45% improvement in F1 score for record linkage compared to traditional PLM-based methods using ground truth data while reducing the inference time by nearly 18 times compared to relying on LLMs. Additionally, we offer an automated pipeline that eliminates the need for human intervention, highlighting this approach's potential to overcome record linkage challenges.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3687123.3698298

2412.03575

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.64)

Industry:

Materials > Metals & Mining (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations

Jeong, Minoh, Namgung, Min, Kim, Zae Myung, Kang, Dongyeop, Chiang, Yao-Yi, Hero, Alfred

arXiv.org Machine Learning, Oct-2-2024

Multimodal learning plays a crucial role in enabling machine learning models to fuse and utilize diverse data sources, such as text, images, and audio, to support a variety of downstream tasks. A unified representation across various modalities is particularly important for improving efficiency and performance. Recent binding methods, such as ImageBind (Girdhar et al., 2023), typically use a fixed anchor modality to align multimodal data in the anchor modal embedding space. In this paper, we mathematically analyze the fixed anchor binding methods and uncover notable limitations: (1) over-reliance on the choice of the anchor modality, (2) failure to capture intra-modal information, and (3) failure to account for inter-modal correlation among non-anchored modalities. To address these limitations, we propose CentroBind, a simple yet powerful approach that eliminates the need for a fixed anchor; instead, it employs dynamically adjustable centroid-based anchors generated from all available modalities, resulting in a balanced and rich representation space. We theoretically demonstrate that our method captures three crucial properties of multimodal learning: intra-modal learning, inter-modal learning, and multimodal alignment, while also constructing a robust unified representation across all modalities. Our experiments on both synthetic and real-world datasets demonstrate the superiority of the proposed method, showing that dynamic anchor methods outperform all fixed anchor binding methods as the former captures more nuanced multimodal interactions.
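
To make the centroid-anchor idea above concrete, here is a minimal sketch under simplifying assumptions. It is not CentroBind's training objective: it only shows that an anchor computed as the mean of all modality embeddings for one sample treats the modalities symmetrically, whereas a fixed anchor privileges one of them. The function names and toy vectors are hypothetical.

```python
import math


def centroid(embeddings):
    """Mean of the per-modality embeddings for one sample (the dynamic anchor)."""
    dim = len(embeddings[0])
    count = len(embeddings)
    return [sum(vec[i] for vec in embeddings) / count for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy embeddings of the same sample in two modalities (made-up values).
    text_emb = [1.0, 0.0]
    image_emb = [0.0, 1.0]

    anchor = centroid([text_emb, image_emb])
    # With a centroid anchor, both modalities are equally close to the anchor.
    print(cosine(text_emb, anchor), cosine(image_emb, anchor))
    # With a fixed anchor (here: text), the anchor modality is trivially aligned.
    print(cosine(text_emb, text_emb), cosine(image_emb, text_emb))
```

In a real binding method the encoders would be trained so that each modality's embedding moves toward the anchor, typically with a contrastive loss; that step is omitted here.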

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2410.02086

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Unveiling Population Heterogeneity in Health Risks Posed by Environmental Hazards Using Regression-Guided Neural Network

Nam, Jong Woo, Choi, Eun Young, Ailshire, Jennifer A., Chiang, Yao-Yi

arXiv.org Artificial Intelligence, Sep-20-2024

Environmental hazards place certain individuals at disproportionately higher risks. As these hazards increasingly endanger human health, precise identification of the most vulnerable population subgroups is critical for public health. Moderated multiple regression (MMR) offers a straightforward method for investigating this by adding interaction terms between the exposure to a hazard and other population characteristics to a linear regression model. However, when the vulnerabilities are hidden within a cross-section of many characteristics, MMR is often limited in its capabilities to find any meaningful discoveries. Here, we introduce a hybrid method, named regression-guided neural networks (ReGNN), which utilizes artificial neural networks (ANNs) to non-linearly combine predictors, generating a latent representation that interacts with a focal predictor (i.e. variable measuring exposure to an environmental hazard). We showcase the use of ReGNN for investigating the population heterogeneity in the health effects of exposure to air pollution (PM2.5) on cognitive functioning scores. We demonstrate that population heterogeneity that would otherwise be hidden using traditional MMR can be found using ReGNN by comparing its results to the fit results of the traditional MMR models. In essence, ReGNN is a novel tool that enhances traditional regression models by effectively summarizing and quantifying an individual's susceptibility to health risks.
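
For readers unfamiliar with moderated multiple regression, the interaction-term setup mentioned above can be written out as follows. The first line is the standard textbook form with a single moderator; the second is only one possible reading of the ReGNN description, not an equation taken from the paper. The symbols $x$ (focal exposure), $z$ (moderators), $s$ (latent score), and $f_\theta$ (neural network) are notation assumed here.

```latex
% Standard MMR with one moderator z: the effect of exposure x depends on z.
\[
  y = \beta_0 + \beta_1 x + \beta_2 z + \beta_3\, x z + \varepsilon,
  \qquad
  \frac{\partial y}{\partial x} = \beta_1 + \beta_3 z .
\]
% Assumed reading of the hybrid: a network f_theta summarizes many
% characteristics z_1, ..., z_k into a latent score s that moderates x.
\[
  s = f_\theta(z_1, \dots, z_k),
  \qquad
  y = \beta_0 + \beta_1 x + \beta_2 s + \beta_3\, x s + \varepsilon .
\]
```

Under this reading, $\beta_1 + \beta_3 s$ plays the role of an individual-level sensitivity to the exposure.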

artificial intelligence, coefficient, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2409.13205

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > Experimental Study (0.95)

Industry:

Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.90)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.80)

Dynamic GNNs for Precise Seizure Detection and Classification from EEG Data

Hajisafi, Arash, Lin, Haowen, Chiang, Yao-Yi, Shahabi, Cyrus

arXiv.org Artificial Intelligence, May-8-2024

Diagnosing epilepsy requires accurate seizure detection and classification, but traditional manual EEG signal analysis is resource-intensive. Meanwhile, automated algorithms often overlook EEG's geometric and semantic properties critical for interpreting brain activity. This paper introduces NeuroGNN, a dynamic Graph Neural Network (GNN) framework that captures the dynamic interplay between the EEG electrode locations and the semantics of their corresponding brain regions. The specific brain region where an electrode is placed critically shapes the nature of captured EEG signals. Each brain region governs distinct cognitive functions, emotions, and sensory processing, influencing both the semantic and spatial relationships within the EEG data. Understanding and modeling these intricate brain relationships are essential for accurate and meaningful insights into brain activity. This is precisely where the proposed NeuroGNN framework excels by dynamically constructing a graph that encapsulates these evolving spatial, temporal, semantic, and taxonomic correlations to improve precision in seizure detection and classification. Our extensive experiments with real-world data demonstrate that NeuroGNN significantly outperforms existing state-of-the-art models.

classification, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-981-97-2238-9_16

2405.09568

Country:

North America > United States > California (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (0.35)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding

Li, Zekun, Zhou, Wenxuan, Chiang, Yao-Yi, Chen, Muhao

arXiv.org Artificial Intelligence, Oct-22-2023

Humans subconsciously engage in geospatial reasoning when reading articles. We recognize place names and their spatial relations in text and mentally associate them with their physical locations on Earth. Although pretrained language models can mimic this cognitive process using linguistic context, they do not utilize valuable geospatial information in large, widely available geographical databases, e.g., OpenStreetMap. This paper introduces GeoLM, a geospatially grounded language model that enhances the understanding of geo-entities in natural language. GeoLM leverages geo-entity mentions as anchors to connect linguistic information in text corpora with geospatial information extracted from geographical databases. GeoLM connects the two types of context through contrastive learning and masked language modeling. It also incorporates a spatial coordinate embedding mechanism to encode distance and direction relations to capture geospatial context. In the experiment, we demonstrate that GeoLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing, which bridge the gap between natural language processing and geospatial sciences. The code is publicly available at https://github.com/knowledge-computing/geolm.

artificial intelligence, empowering language model, natural language, (1 more...)

arXiv.org Artificial Intelligence

2310.14478

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Learning Dynamic Graphs from All Contextual Information for Accurate Point-of-Interest Visit Forecasting

Hajisafi, Arash, Lin, Haowen, Shaham, Sina, Hu, Haoji, Siampou, Maria Despoina, Chiang, Yao-Yi, Shahabi, Cyrus

arXiv.org Artificial Intelligence, Sep-28-2023

Forecasting the number of visits to Points-of-Interest (POI) in an urban area is critical for planning and decision-making for various application domains, from urban planning and transportation management to public health and social studies. Although this forecasting problem can be formulated as a multivariate time-series forecasting task, the current approaches cannot fully exploit the ever-changing multi-context correlations among POIs. Therefore, we propose Busyness Graph Neural Network (BysGNN), a temporal graph neural network designed to learn and uncover the underlying multi-context correlations between POIs for accurate visit forecasting. Unlike other approaches where only time-series data is used to learn a dynamic graph, BysGNN utilizes all contextual information and time-series data to learn an accurate dynamic graph representation. By incorporating all contextual, temporal, and spatial signals, we observe a significant improvement in our forecasting accuracy over state-of-the-art forecasting models in our experiments with real-world datasets across the United States.

accurate point-of-interest visit forecasting, artificial intelligence, machine learning, (2 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3589132.3625567

2306.15927

Country: North America > United States (0.24)

Genre: Research Report (0.69)

Technology:

Information Technology > Modeling & Simulation (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.44)

The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps

Kim, Jina, Li, Zekun, Lin, Yijun, Namgung, Min, Jang, Leeje, Chiang, Yao-Yi

arXiv.org Artificial Intelligence, Jul-3-2023

Scanned historical maps in libraries and archives are valuable repositories of geographic data that often do not exist elsewhere. Despite the potential of machine learning tools like the Google Vision APIs for automatically transcribing text from these maps into machine-readable formats, they do not work well with large-sized images (e.g., high-resolution scanned documents), cannot infer the relation between the recognized text and other datasets, and are challenging to integrate with post-processing tools. This paper introduces the mapKurator system, an end-to-end system integrating machine learning models with a comprehensive data processing pipeline. mapKurator empowers automated extraction, post-processing, and linkage of text labels from large numbers of large-dimension historical map scans. The output data, comprising bounding polygons and recognized text, is in the standard GeoJSON format, making it easily modifiable within Geographic Information Systems (GIS). The proposed system allows users to quickly generate valuable data from large numbers of historical maps for in-depth analysis of the map content and, in turn, encourages map findability, accessibility, interoperability, and reusability (FAIR principles). We deployed the mapKurator system and enabled the processing of over 60,000 maps and over 100 million text/place names in the David Rumsey Historical Map collection. We also demonstrated a seamless integration of mapKurator with a collaborative web platform to enable accessing automated approaches for extracting and linking text labels from historical map scans and collective work to improve the results.
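
The abstract above describes output consisting of bounding polygons and recognized text in GeoJSON. A minimal sketch of what one such record could look like follows. The `text` property name, the helper function, and the sample coordinates are assumptions for illustration, not mapKurator's actual schema; only the Feature/Polygon structure comes from the GeoJSON standard.

```python
import json


def text_label_feature(polygon, text):
    """Build a GeoJSON Feature for one text label.

    polygon: list of [x, y] vertices of the bounding polygon.
    text: the recognized string (stored under a hypothetical "text" key).
    """
    ring = [list(point) for point in polygon]
    # GeoJSON linear rings must be closed: first and last positions match.
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"text": text},
    }


if __name__ == "__main__":
    # Made-up pixel coordinates for a label on a scanned map.
    feature = text_label_feature(
        [[10, 10], [90, 10], [90, 30], [10, 30]], "Los Angeles"
    )
    collection = {"type": "FeatureCollection", "features": [feature]}
    print(json.dumps(collection, indent=2))
```

Because the result is plain GeoJSON, it can be opened directly in common GIS tools, which is the interoperability benefit the abstract points to.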

artificial intelligence, machine learning, spatial reasoning, (17 more...)

arXiv.org Artificial Intelligence

2306.17059

Country: North America > United States > Minnesota (0.29)

Genre: Research Report (0.50)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.34)

Quantile Extreme Gradient Boosting for Uncertainty Quantification

Yin, Xiaozhe, Fallah-Shorshani, Masoud, McConnell, Rob, Fruin, Scott, Chiang, Yao-Yi, Franklin, Meredith

arXiv.org Artificial Intelligence, Apr-23-2023

As the availability, size and complexity of data have increased in recent years, machine learning (ML) techniques have become popular for modeling. Predictions resulting from applying ML models are often used for inference, decision-making, and downstream applications. A crucial yet often overlooked aspect of ML is uncertainty quantification, which can significantly impact how predictions from models are used and interpreted. Extreme Gradient Boosting (XGBoost) is one of the most popular ML methods given its simple implementation, fast computation, and sequential learning, which make its predictions highly accurate compared to other methods. However, techniques for uncertainty determination in ML models such as XGBoost have not yet been universally agreed upon across its varying applications. We propose enhancements to XGBoost whereby a modified quantile regression is used as the objective function to estimate uncertainty (QXGBoost). Specifically, we included the Huber norm in the quantile regression model to construct a differentiable approximation to the quantile regression error function. This key step allows XGBoost, which uses a gradient-based optimization algorithm, to make probabilistic predictions efficiently. QXGBoost was applied to create 90% prediction intervals for one simulated dataset and one real-world environmental dataset of measured traffic noise. Our proposed method had comparable or better performance than the uncertainty estimates generated for regular and quantile light gradient boosting. For both the simulated and traffic noise datasets, the overall performance of the prediction intervals from QXGBoost was better than that of other models based on the coverage width-based criterion.
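
The smoothing step described above can be sketched as follows. This is one common way to combine the Huber function with the quantile (pinball) loss so that the result is differentiable at zero; it is an assumed formulation for illustration and may differ from the exact objective used in QXGBoost. The function names and the `delta` default are hypothetical.

```python
def pinball_loss(residual, tau):
    """Standard quantile loss for residual = y_true - y_pred (kink at 0)."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def huber(residual, delta):
    """Huber function: quadratic within delta of zero, linear outside."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return residual * residual / (2.0 * delta)
    return magnitude - delta / 2.0


def huber_quantile_loss(residual, tau, delta=0.1):
    """Pinball loss with |residual| replaced by its Huber-smoothed version."""
    weight = tau if residual >= 0 else 1.0 - tau
    return weight * huber(residual, delta)


def huber_quantile_grad(residual, tau, delta=0.1):
    """Derivative of the smoothed loss with respect to the prediction."""
    weight = tau if residual >= 0 else 1.0 - tau
    if abs(residual) <= delta:
        slope = residual / delta
    else:
        slope = 1.0 if residual > 0 else -1.0
    # residual = y_true - y_pred, so d(residual)/d(y_pred) = -1.
    return -weight * slope


if __name__ == "__main__":
    for r in (-2.0, -0.05, 0.0, 0.05, 2.0):
        print(r, pinball_loss(r, 0.9), huber_quantile_loss(r, 0.9),
              huber_quantile_grad(r, 0.9))
```

Far from zero the smoothed loss tracks the pinball loss up to a constant offset, while near zero the gradient passes continuously through zero instead of jumping, which is what a gradient-based booster needs.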

artificial intelligence, machine learning, qxgboost, (15 more...)

arXiv.org Artificial Intelligence

2304.11732

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > California > Los Angeles County (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Energy (1.00)
Health & Medicine (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
