AITopics | Tian, Yuanyuan

Collaborating Authors

Tian, Yuanyuan

Advancing Large Language Models for Spatiotemporal and Semantic Association Mining of Similar Environmental Events

Tian, Yuanyuan, Li, Wenwen, Hu, Lei, Chen, Xiao, Brook, Michael, Brubaker, Michael, Zhang, Fan, Liljedahl, Anna K.

arXiv.org Artificial Intelligence, Nov-19-2024

Retrieval and recommendation are two essential tasks in modern search tools. This paper introduces a novel retrieval-reranking framework leveraging Large Language Models (LLMs) to enhance the spatiotemporal and semantic association mining and recommendation of relevant unusual climate and environmental events described in news articles and web posts. This framework uses advanced natural language processing techniques to address the limitations of traditional manual curation methods, namely high labor cost and lack of scalability. Specifically, we explore an optimized solution that employs cutting-edge embedding models to semantically analyze spatiotemporal events (news) and propose a Geo-Time Re-ranking (GT-R) strategy that integrates multi-faceted criteria, including spatial proximity, temporal association, semantic similarity, and category-instructed similarity, to rank and identify similar spatiotemporal events. We apply the proposed framework to a dataset of four thousand Local Environmental Observer (LEO) Network events, achieving top performance in recommending similar events among multiple cutting-edge dense retrieval models. The search and recommendation pipeline can be applied to a wide range of similar data search tasks dealing with geospatial and temporal data. We hope that by linking relevant events, we can better help the general public gain an enhanced understanding of climate change and its impact on different communities.
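
The multi-criteria ranking described in this abstract can be illustrated with a minimal sketch. This is not the authors' GT-R implementation: the Event fields, the exponential decay functions, the weights, and the scale parameters are all assumptions chosen for illustration, and the category-instructed similarity term is omitted.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    title: str
    lat: float
    lon: float
    day: date
    embedding: list[float]  # assumed to come from some text-embedding model


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events, in kilometres."""
    radius = 6371.0  # mean Earth radius
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(h))


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def combined_score(query: Event, cand: Event,
                   w_sem: float = 0.5, w_geo: float = 0.25, w_time: float = 0.25,
                   km_scale: float = 500.0, day_scale: float = 30.0) -> float:
    """Weighted sum of semantic, spatial, and temporal closeness.

    The weights and decay scales are hypothetical defaults, not values
    reported in the paper.
    """
    semantic = cosine(query.embedding, cand.embedding)
    spatial = math.exp(-haversine_km(query, cand) / km_scale)
    temporal = math.exp(-abs((query.day - cand.day).days) / day_scale)
    return w_sem * semantic + w_geo * spatial + w_time * temporal


def rerank(query: Event, candidates: list[Event], top_k: int = 3) -> list[Event]:
    """Order first-stage retrieval candidates by the combined score."""
    ranked = sorted(candidates, key=lambda c: combined_score(query, c), reverse=True)
    return ranked[:top_k]
```

In a two-stage pipeline of this kind, rerank would be applied to the candidates returned by a dense retriever, so the more expensive scoring only touches a short list.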

information retrieval, large language model, machine learning, (20 more...)

2411.12880

Country: North America > United States > Alaska (1.00)

Genre: Research Report > New Finding (1.00)

Industry: Media > News (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The KnowWhereGraph Ontology

Shimizu, Cogan, Stephen, Shirly, Barua, Adrita, Cai, Ling, Christou, Antrea, Currier, Kitty, Dalal, Abhilekha, Fisher, Colby K., Hitzler, Pascal, Janowicz, Krzysztof, Li, Wenwen, Liu, Zilong, Mahdavinejad, Mohammad Saeid, Mai, Gengchen, Rehberger, Dean, Schildhauer, Mark, Shi, Meilin, Norouzi, Sanaz Saki, Tian, Yuanyuan, Wang, Sizhe, Wang, Zhangyu, Zalewski, Joseph, Zhou, Lu, Zhu, Rui

arXiv.org Artificial Intelligence, Oct-17-2024

KnowWhereGraph is one of the largest fully publicly available geospatial knowledge graphs. It includes data from 30 layers on natural hazards (e.g., hurricanes, wildfires), climate variables (e.g., air temperature, precipitation), soil properties, crop and land-cover types, demographics, human health, and various place and region identifiers, among other themes. These have been leveraged through the graph by a variety of applications to address challenges in food security and agricultural supply chains; sustainability related to soil conservation practices and farm labor; and delivery of emergency humanitarian aid following a disaster. In this paper, we introduce the ontology that acts as the schema for KnowWhereGraph. This broad overview provides insight into the requirements and design specifications for the graph and its schema, including the development methodology (modular ontology modeling) and the resources utilized to implement, materialize, and deploy KnowWhereGraph with its end-user interfaces and public SPARQL query endpoint.

artificial intelligence, dataset, ontology, (17 more...)

2410.13948

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Sibyl: Forecasting Time-Evolving Query Workloads

Huang, Hanxian, Siddiqui, Tarique, Alotaibi, Rana, Curino, Carlo, Leeka, Jyoti, Jindal, Alekh, Zhao, Jishen, Camacho-Rodriguez, Jesus, Tian, Yuanyuan

arXiv.org Artificial Intelligence, Jan-8-2024

Database systems often rely on historical query traces to perform workload-based performance tuning. However, real production workloads are time-evolving, making historical queries ineffective for optimizing future workloads. To address this challenge, we propose Sibyl, an end-to-end machine learning-based framework that accurately forecasts a sequence of future queries, with the entire query statements, in various prediction windows. Drawing insights from real workloads, we propose template-based featurization techniques and develop a stacked-LSTM with an encoder-decoder architecture for accurate forecasting of query workloads. We also develop techniques to improve forecasting accuracy over large prediction …

For workload-based optimization, the input workload plays a crucial role and needs to be a good representation of the expected workload. Traditionally, historical query traces have been used as input workloads with the assumption that workloads are mostly static. However, as we discuss in Section 2, many real workloads exhibit highly recurring query structures with changing patterns in both their arrival intervals and data accesses. For instance, query templates are often shared across users, teams, and applications, but may be customized with different parameter values to access varying data at different points in time. Consider a log analysis query that reports errors for different devices and error types: "SELECT * FROM T WHERE deviceType = ? AND errorType = ? AND eventDate BETWEEN ? …
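
The idea of a shared query template with varying parameter values, as in the log analysis query quoted here, can be sketched in a few lines. This is a rough illustration and not Sibyl's featurization: the function names are hypothetical, and the regular expression only handles single-quoted strings and plain numeric literals, where a real system would use a SQL parser.

```python
import re
from collections import defaultdict

# Matches a single-quoted string literal (with '' as an escaped quote)
# or a standalone numeric literal. Digits inside identifiers such as T1
# are not matched because of the word-boundary anchors.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def to_template(sql: str) -> tuple[str, list[str]]:
    """Replace literals with '?' and return (template, parameter values)."""
    params: list[str] = []

    def _swap(match: re.Match) -> str:
        params.append(match.group(0).strip("'"))
        return "?"

    template = _LITERAL.sub(_swap, sql)
    template = " ".join(template.split())  # normalize whitespace
    return template, params


def group_by_template(queries: list[str]) -> dict[str, list[list[str]]]:
    """Group a query trace by template, keeping each query's parameters."""
    groups: dict[str, list[list[str]]] = defaultdict(list)
    for query in queries:
        template, params = to_template(query)
        groups[template].append(params)
    return dict(groups)
```

Grouping a trace this way turns each template into a sequence of parameter vectors over time, which is the kind of input a sequence model could then be trained on.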

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.1145/3639308

2401.03723

Country:

North America > United States > California (0.14)
North America > United States > Minnesota (0.14)
Africa > Middle East > Egypt (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.93)

GEqO: ML-Accelerated Semantic Equivalence Detection

Haynes, Brandon, Alotaibi, Rana, Pavlenko, Anna, Leeka, Jyoti, Jindal, Alekh, Tian, Yuanyuan

arXiv.org Artificial Intelligence, Jan-2-2024

Large-scale analytics engines have become a core dependency for modern data-driven enterprises to derive business insights and drive actions. These engines support a large number of analytic jobs processing huge volumes of data on a daily basis, and workloads are often inundated with overlapping computations across multiple jobs. Reusing common computation is crucial for efficient cluster resource utilization and reducing job execution time. Detecting common computation is the first and key step for reducing this computational redundancy. However, detecting equivalence on large-scale analytics engines requires efficient and scalable solutions that are fully automated. In addition, to maximize computation reuse, equivalence needs to be detected at the semantic level instead of just the syntactic level (i.e., the ability to detect semantic equivalence of seemingly different-looking queries). Unfortunately, existing solutions fall short of satisfying these requirements. In this paper, we take a major step towards filling this gap by proposing GEqO, a portable and lightweight machine-learning-based framework for efficiently identifying semantically equivalent computations at scale. GEqO introduces two machine-learning-based filters that quickly prune out nonequivalent subexpressions and employs a semi-supervised learning feedback loop to iteratively improve its model with an intelligent sampling mechanism. Further, with its novel database-agnostic featurization method, GEqO can transfer the learning from one workload and database to another. Our extensive empirical evaluation shows that, on TPC-DS-like queries, GEqO yields significant performance gains (up to 200x faster than automated verifiers) and finds up to 2x more equivalences than optimizer and signature-based equivalence detection approaches.

artificial intelligence, machine learning, natural language, (21 more...)

doi: 10.1145/3626710

2401.01280

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
(4 more...)

Semantic Similarity Measure of Natural Language Text through Machine Learning and a Keyword-Aware Cross-Encoder-Ranking Summarizer -- A Case Study Using UCGIS GIS&T Body of Knowledge

Tian, Yuanyuan, Li, Wenwen, Wang, Sizhe, Gu, Zhining

arXiv.org Artificial Intelligence, May-16-2023

Initiated by the University Consortium of Geographic Information Science (UCGIS), GIS&T Body of Knowledge (BoK) is a community-driven endeavor to define, develop, and document geospatial topics related to geographic information science and technologies (GIS&T). In recent years, GIS&T BoK has undergone rigorous development in terms of its topic re-organization and content updating, resulting in a new digital version of the project. While the BoK topics provide useful materials for researchers and students to learn about GIS, the semantic relationships among the topics, such as semantic similarity, should also be identified so that better, automated topic navigation can be achieved. Currently, the related topics are defined manually by editors or authors, which may result in an incomplete assessment of topic relationships. To address this challenge, our research evaluates the effectiveness of multiple natural language processing (NLP) techniques in extracting semantics from text, including both deep neural networks and traditional machine learning approaches. In addition, a novel text summarizer, KACERS (Keyword-Aware Cross-Encoder-Ranking Summarizer), is proposed to generate a semantic summary of scientific publications. By identifying the semantic linkages among key topics, this work provides guidance for future development and content organization of the GIS&T BoK project. It also offers a new perspective on the use of machine learning techniques for analyzing scientific publications, and demonstrates the potential of the KACERS summarizer in the semantic understanding of long text documents.

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.1111/tgis.13059

2305.09877

Country: North America > United States > Arizona (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)
Overview (0.93)

Industry: Education > Curriculum (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GeoAI for Knowledge Graph Construction: Identifying Causality Between Cascading Events to Support Environmental Resilience Research

Tian, Yuanyuan, Li, Wenwen

arXiv.org Artificial Intelligence, Nov-11-2022

Knowledge graph technology is considered a powerful and semantically enabled solution to link entities, allowing users to derive new knowledge by reasoning over data according to various types of rules. However, in building such a knowledge graph, event modeling, such as that of disasters, is often limited to single, isolated events. The linkages among cascading events are often missing in existing knowledge graphs. This paper introduces our GeoAI (Geospatial Artificial Intelligence) solutions to identify causality among events, in particular disaster events, based on a set of spatially and temporally enabled semantic rules. Through a use case of causal disaster event modeling, we demonstrate how these defined rules, including theme-based identification of correlated events, a spatiotemporal co-occurrence constraint, and text mining of event metadata, enable the automatic extraction of causal relationships between different events. Our solution enriches the event knowledge base and allows for the exploration of linked cascading events in large knowledge graphs, thereby empowering knowledge query and discovery.
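
Two of the rule types named in this abstract, theme-based pairing and a spatiotemporal co-occurrence constraint, can be sketched as a simple filter over event pairs. This is an assumption-laden illustration rather than the authors' rule set: the theme pairs, the use of a shared region identifier for spatial co-occurrence, and the 14-day lag window are all hypothetical, and the text-mining rule is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical (cause theme, effect theme) pairs, for illustration only.
CAUSAL_THEMES = {("hurricane", "flood"), ("wildfire", "landslide")}


@dataclass(frozen=True)
class Event:
    name: str
    theme: str
    region: str  # assumed place identifier, e.g. a county code
    start: date


def may_cause(cause: Event, effect: Event,
              max_lag: timedelta = timedelta(days=14)) -> bool:
    """Apply the theme rule and the spatiotemporal co-occurrence rule."""
    theme_ok = (cause.theme, effect.theme) in CAUSAL_THEMES
    same_place = cause.region == effect.region
    lag = effect.start - cause.start
    in_window = timedelta(0) <= lag <= max_lag  # the cause must not come later
    return theme_ok and same_place and in_window


def candidate_links(events: list[Event]) -> list[tuple[str, str]]:
    """Return (cause, effect) name pairs that satisfy both rules."""
    return [
        (a.name, b.name)
        for a in events
        for b in events
        if a is not b and may_cause(a, b)
    ]
```

Each returned pair could then be written to a knowledge graph as a candidate causal edge between the two event nodes, subject to further checks.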

artificial intelligence, disaster, natural language, (15 more...)

2211.06011

Country: North America > United States > Arizona (0.15)

Genre: Research Report (0.40)

Industry: Government > Regional Government (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)