Feng, Yu
HyperGraphRAG: Retrieval-Augmented Generation with Hypergraph-Structured Knowledge Representation
Luo, Haoran, E, Haihong, Chen, Guanting, Zheng, Yandan, Wu, Xiaobao, Guo, Yikai, Lin, Qika, Feng, Yu, Kuang, Zemin, Song, Meina, Zhu, Yifan, Tuan, Luu Anh
While standard Retrieval-Augmented Generation (RAG) retrieves knowledge as flat text chunks, GraphRAG structures knowledge as graphs to leverage the relations among entities. However, previous GraphRAG methods are limited to binary relations: one edge in the graph connects only two entities, which cannot adequately model the n-ary relations among more than two entities that widely exist in reality. To address this limitation, we propose HyperGraphRAG, a novel hypergraph-based RAG method that represents n-ary relational facts via hyperedges, modeling the complex n-ary relations of the real world. To retrieve and generate over hypergraphs, we introduce a complete pipeline with a hypergraph construction method, a hypergraph retrieval strategy, and a hypergraph-guided generation mechanism. Experiments across medicine, agriculture, computer science, and law demonstrate that HyperGraphRAG outperforms standard RAG and GraphRAG in accuracy and generation quality.
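To make the hyperedge idea concrete, here is a minimal sketch (not the authors' implementation) of storing n-ary facts as hyperedges and retrieving them by overlap with the entities mentioned in a query; all names and the scoring rule are illustrative assumptions.

```python
# Minimal sketch (assumptions, not HyperGraphRAG's code): n-ary facts as hyperedges,
# retrieved by how many query entities each hyperedge connects.
from dataclasses import dataclass, field

@dataclass
class Hyperedge:
    fact: str               # natural-language statement of the n-ary fact
    entities: frozenset     # all entities the fact connects (may be more than two)

@dataclass
class Hypergraph:
    edges: list = field(default_factory=list)

    def add_fact(self, fact, entities):
        self.edges.append(Hyperedge(fact, frozenset(entities)))

    def retrieve(self, query_entities, top_k=3):
        # Rank hyperedges by the number of query entities they connect.
        query_entities = set(query_entities)
        scored = [(len(e.entities & query_entities), e) for e in self.edges]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [e.fact for _, e in scored[:top_k]]

hg = Hypergraph()
hg.add_fact("Drug A interacts with Drug B in patients with condition C",
            {"Drug A", "Drug B", "condition C"})
print(hg.retrieve({"Drug A", "condition C"}))
```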
Interpreting core forms of urban morphology linked to urban functions with explainable graph neural network
Chen, Dongsheng, Feng, Yu, Li, Xun, Qu, Mingya, Luo, Peng, Meng, Liqiu
Understanding the high-order relationship between urban form and function is essential for modeling the underlying mechanisms of sustainable urban systems. Nevertheless, it is challenging to establish an accurate data representation for complex urban forms that is readily explicable in human terms. This study proposes the concept of a core urban morphology representation, which we call CoMo, and develops an explainable deep learning framework that symbolizes complex urban forms into this representation. By interpreting the well-trained deep learning model, which achieves a stable weighted F1-score of 89.14%, CoMo offers a promising approach for revealing links between urban function and urban form in terms of core urban morphology. Using Boston as a study area, we analyzed the core urban forms at the individual-building, block, and neighborhood levels that are important to the corresponding urban functions. The residential core forms follow a gradual morphological pattern along the urban spine, consistent with a center-urban-suburban transition. Furthermore, we show that urban morphology directly affects land use efficiency, which is strongly correlated with location (R²=0.721, p<0.001). Overall, CoMo can explicably symbolize urban forms, provide evidence for classic urban location theory, and offer mechanistic insights for digital twins.
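As a brief illustration of the kind of statistic reported above, a correlation such as R²=0.721 (p<0.001) between land-use efficiency and location can be computed as sketched below; the arrays are placeholders, not the study's data.

```python
# Illustrative only: computing a Pearson correlation and R^2 for a
# location-vs-efficiency relationship. Values below are hypothetical.
import numpy as np
from scipy.stats import pearsonr

distance_to_center = np.array([0.5, 1.2, 2.0, 3.1, 4.4, 5.0, 6.3, 7.8])        # km, hypothetical
land_use_efficiency = np.array([0.92, 0.88, 0.81, 0.74, 0.66, 0.63, 0.55, 0.49])  # hypothetical index

r, p_value = pearsonr(distance_to_center, land_use_efficiency)
print(f"R^2 = {r**2:.3f}, p = {p_value:.4f}")
```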
PM-MOE: Mixture of Experts on Private Model Parameters for Personalized Federated Learning
Feng, Yu, Geng, Yangli-ao, Zhu, Yifan, Han, Zongfu, Yu, Xie, Xue, Kaiwen, Luo, Haoran, Sun, Mengyang, Zhang, Guangwei, Song, Meina
Federated learning (FL) has gained widespread attention for its privacy-preserving and collaborative learning capabilities. Due to significant statistical heterogeneity, traditional FL struggles to generalize a shared model across diverse data domains. Personalized federated learning addresses this issue by dividing the model into a globally shared part and a locally private part, with the local model correcting representation biases introduced by the global model. Nevertheless, locally converged parameters capture domain-specific knowledge more accurately, and current methods overlook the potential benefits of these parameters. To address these limitations, we propose the PM-MoE architecture, which integrates a mixture of personalized modules with energy-based denoising of personalized modules, enabling each client to select beneficial personalized parameters from other clients. We applied PM-MoE to nine recent model-split-based personalized federated learning algorithms, achieving performance improvements with minimal additional training. Extensive experiments on six widely adopted datasets and two heterogeneity settings validate the effectiveness of our approach. The source code is available at https://github.com/dannis97500/PM-MOE.
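A minimal sketch of the mixture idea, under my own simplifying assumptions rather than the PM-MoE implementation: a client combines the outputs of its own and other clients' personalized heads through a softmax gate over learnable gating logits.

```python
# Sketch (assumptions, not PM-MoE's code): softmax-gated mixture over
# personalized heads collected from several clients.
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def mixture_of_personalized_heads(shared_repr, personalized_heads, gate_logits):
    """shared_repr: (d,) output of the globally shared backbone.
    personalized_heads: list of (d, c) head weight matrices from clients.
    gate_logits: (num_heads,) learnable gating logits for this client."""
    gate = softmax(gate_logits)
    return sum(g * shared_repr @ W for g, W in zip(gate, personalized_heads))

rng = np.random.default_rng(0)
d, c, num_clients = 16, 4, 3
heads = [rng.normal(size=(d, c)) for _ in range(num_clients)]
print(mixture_of_personalized_heads(rng.normal(size=d), heads, rng.normal(size=num_clients)))
```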
DiverseAgentEntropy: Quantifying Black-Box LLM Uncertainty through Diverse Perspectives and Multi-Agent Interaction
Feng, Yu, Htut, Phu Mon, Qi, Zheng, Xiao, Wei, Mager, Manuel, Pappas, Nikolaos, Halder, Kishaloy, Li, Yang, Benajiba, Yassine, Roth, Dan
Quantifying the uncertainty in the factual parametric knowledge of Large Language Models (LLMs), especially in a black-box setting, poses a significant challenge. Existing methods, which gauge a model's uncertainty by evaluating the self-consistency of its responses to the original query, do not always capture true uncertainty: a model might respond consistently to the original query with a wrong answer, yet respond correctly to varied questions about the same query posed from different perspectives, and vice versa. In this paper, we propose DiverseAgentEntropy, a novel method for evaluating a model's uncertainty through multi-agent interaction, under the assumption that if a model is certain, it should consistently recall the answer to the original query across a diverse collection of questions about that query. We further implement an abstention policy to withhold responses when uncertainty is high. Our method offers a more accurate prediction of the model's reliability and better detects hallucinations, outperforming other self-consistency-based methods. Additionally, it shows that existing models often fail to consistently retrieve the correct answer to the same query under diverse questions, even when they know the correct answer.
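A hedged sketch of the core idea (the exact scoring and abstention rule in the paper may differ): gather the model's answers to diverse rephrasings of one query, measure the entropy of the answer distribution, and abstain when that entropy is high.

```python
# Sketch of entropy-based abstention over answers to diverse perspective questions.
# Threshold and aggregation are illustrative assumptions, not the paper's values.
from collections import Counter
import math

def answer_entropy(answers):
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def answer_or_abstain(answers, threshold=1.0):
    if answer_entropy(answers) > threshold:
        return "ABSTAIN"
    return Counter(answers).most_common(1)[0][0]

# Hypothetical answers from agents asked about the same fact from different angles.
print(answer_or_abstain(["Paris", "Paris", "Paris", "Lyon"]))      # low entropy -> "Paris"
print(answer_or_abstain(["Paris", "Lyon", "Marseille", "Nice"]))   # high entropy -> "ABSTAIN"
```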
My Words Imply Your Opinion: Reader Agent-Based Propagation Enhancement for Personalized Implicit Emotion Analysis
Liao, Jian, Feng, Yu, Wang, Xiaoyu, Wang, Suge, Zheng, Jianxing, Li, Deyu
In implicit emotion analysis (IEA), the subtlety of emotional expressions makes the task particularly sensitive to user-specific characteristics. Existing studies often inject personalization into the analysis by focusing on the authorial dimension of the emotional text. However, these methods overlook the potential influence of the intended reader on the reception of implicit emotions. In this paper, we refine the IEA task into Personalized Implicit Emotion Analysis (PIEA) and introduce RAPPIE, a novel framework designed to address the missing user information in this task. Specifically, 1) we create reader agents based on Large Language Models to simulate reader reactions, addressing the spiral-of-silence and data-incompleteness challenges encountered when acquiring reader feedback. 2) We establish a reader propagation role system and develop a role-aware emotion propagation multi-view graph learning model, which effectively handles the sparsity of reader information by utilizing the distribution of propagation roles. 3) We annotate two Chinese PIEA datasets with detailed user metadata, addressing the limitation of prior datasets that focus primarily on textual content annotation. Extensive experiments on these datasets show that RAPPIE outperforms current state-of-the-art baselines, highlighting the significance and efficacy of incorporating reader feedback into the PIEA process.
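The reader-agent step could look roughly like the sketch below; the prompt wording, profile fields, and the stubbed LLM call are all illustrative assumptions rather than RAPPIE's actual components.

```python
# Illustrative sketch only: a "reader agent" prompt simulates how a reader with a
# given profile might react to a post, standing in for missing reader feedback.
def build_reader_agent_prompt(post_text, reader_profile):
    return (
        f"You are a social-media reader with this profile: {reader_profile}.\n"
        f"Read the post below and describe, in one sentence, your likely reaction "
        f"and the implicit emotion you perceive.\n\nPost: {post_text}"
    )

def call_llm(prompt):
    # Placeholder for an actual LLM call; returns a canned reaction here.
    return "I feel the author is quietly disappointed; my reaction is sympathy."

post = "Another Monday, another meeting that could have been an email."
profile = {"age_group": "25-34", "relationship_to_author": "colleague"}
print(call_llm(build_reader_agent_prompt(post, profile)))
```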
Human Multi-View Synthesis from a Single-View Model: Transferred Body and Face Representations
Feng, Yu, Zhang, Shunsi, Shu, Jian, Zhao, Hanfeng, Pang, Guoliang, Zhang, Chi, Wang, Hao
Generating multi-view human images from a single view is a complex and significant challenge. Although recent advancements in multi-view object generation have shown impressive results with diffusion models, novel view synthesis for humans remains constrained by the limited availability of 3D human datasets. Consequently, many existing models struggle to produce realistic human body shapes or capture fine-grained facial details accurately. To address these issues, we propose an innovative framework that leverages transferred body and facial representations for multi-view human synthesis. Specifically, we use a single-view model pretrained on a large-scale human dataset to develop a multi-view body representation, aiming to extend the 2D knowledge of the single-view model to a multi-view diffusion model. Additionally, to enhance the model's detail restoration capability, we integrate transferred multimodal facial features into our trained human diffusion model. Experimental evaluations on benchmark datasets demonstrate that our approach outperforms the current state-of-the-art methods, achieving superior performance in multi-view human synthesis.
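One plausible way to wire transferred features into a diffusion denoiser is via cross-attention over a conditioning sequence, sketched below; this is a hypothetical toy, not the paper's architecture, and the token shapes and encoders are assumed.

```python
# Hypothetical sketch: transferred body and face features are concatenated into the
# conditioning sequence that a diffusion denoiser attends to via cross-attention.
import torch
import torch.nn as nn

d_model = 64
cross_attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4, batch_first=True)

batch = 2
image_tokens = torch.randn(batch, 256, d_model)  # noisy-latent tokens being denoised
body_tokens = torch.randn(batch, 32, d_model)    # from a pretrained single-view body model (assumed)
face_tokens = torch.randn(batch, 8, d_model)     # from a multimodal face encoder (assumed)

context = torch.cat([body_tokens, face_tokens], dim=1)
out, _ = cross_attn(query=image_tokens, key=context, value=context)
print(out.shape)  # torch.Size([2, 256, 64])
```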
BLINK: Multimodal Large Language Models Can See but Not Perceive
Fu, Xingyu, Hu, Yushi, Li, Bangzheng, Feng, Yu, Wang, Haoyu, Lin, Xudong, Roth, Dan, Smith, Noah A., Ma, Wei-Chiu, Krishna, Ranjay
We introduce Blink, a new benchmark for multimodal large language models (LLMs) that focuses on core visual perception abilities not found in other evaluations. Most Blink tasks can be solved by humans "within a blink" (e.g., relative depth estimation, visual correspondence, forensics detection, and multi-view reasoning). However, we find that these perception-demanding tasks pose significant challenges for current multimodal LLMs because they resist mediation through natural language. Blink reformats 14 classic computer vision tasks into 3,807 multiple-choice questions, paired with single or multiple images and visual prompting. While humans achieve 95.70% accuracy on average, Blink is surprisingly challenging for existing multimodal LLMs: even the best-performing GPT-4V and Gemini achieve accuracies of only 51.26% and 45.72%, just 13.17% and 7.63% higher than random guessing, indicating that such perception abilities have not yet "emerged" in recent multimodal LLMs. Our analysis also highlights that specialist CV models could solve these problems much better, suggesting potential pathways for future improvement. We believe Blink will stimulate the community to help multimodal LLMs catch up with human-level visual perception.
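A minimal sketch of how a Blink-style multiple-choice evaluation and its random-guess baseline can be computed; the item fields and sample questions are assumptions, not the released data format.

```python
# Sketch (assumed schema): multiple-choice items over images, with accuracy
# compared against the random-guess baseline implied by the number of choices.
from dataclasses import dataclass

@dataclass
class BlinkItem:
    question: str
    image_paths: list
    choices: list
    answer: str          # e.g. "B"

def accuracy(predictions, items):
    correct = sum(p == it.answer for p, it in zip(predictions, items))
    return correct / len(items)

def random_guess_baseline(items):
    return sum(1 / len(it.choices) for it in items) / len(items)

items = [
    BlinkItem("Which point is closer to the camera?", ["img1.png"], ["A", "B"], "A"),
    BlinkItem("Which candidate matches the reference patch?",
              ["ref.png", "c1.png", "c2.png", "c3.png", "c4.png"],
              ["A", "B", "C", "D"], "C"),
]
preds = ["A", "B"]  # hypothetical model outputs
print(f"accuracy={accuracy(preds, items):.2f}, random={random_guess_baseline(items):.2f}")
```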
Test-Time Training on Graphs with Large Language Models (LLMs)
Zhang, Jiaxin, Wang, Yiqi, Yang, Xihong, Wang, Siwei, Feng, Yu, Shi, Yu, Ren, Ruicaho, Zhu, En, Liu, Xinwang
Graph Neural Networks (GNNs) have demonstrated great success in various fields of multimedia. However, the distribution shift between training and test data challenges the effectiveness of GNNs. To mitigate this challenge, Test-Time Training (TTT) has been proposed as a promising approach. Traditional TTT methods require a demanding unsupervised training strategy to capture information from the test data that benefits the main task. Inspired by the strong annotation ability of Large Language Models (LLMs) on Text-Attributed Graphs (TAGs), we propose to enhance test-time training on graphs with LLMs as annotators. In this paper, we design a novel test-time training pipeline, LLMTTT, which performs test-time adaptation using LLM annotations on a carefully selected node set. Specifically, LLMTTT introduces a hybrid active node selection strategy that considers not only node diversity and representativeness, but also prediction signals from the pre-trained model. Given annotations from LLMs, a two-stage training strategy is designed to tailor the test-time model to the limited and noisy labels. A theoretical analysis ensures the validity of our method, and extensive experiments demonstrate that LLMTTT achieves a significant performance improvement over existing Out-of-Distribution (OOD) generalization methods.
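A minimal sketch of one plausible hybrid selection criterion (my reading, not LLMTTT's exact formulation): combine prediction uncertainty from the pre-trained GNN with embedding diversity, then send the top-budget nodes to the LLM annotator.

```python
# Sketch (assumptions, not LLMTTT's code): greedy node selection mixing
# prediction entropy with distance-based diversity.
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def select_nodes(embeddings, pred_probs, budget, alpha=0.5):
    n = embeddings.shape[0]
    uncertainty = entropy(pred_probs)        # prediction signal from pre-trained model
    selected = []
    for _ in range(budget):
        if selected:
            # diversity: distance to the closest already-selected node
            d = np.linalg.norm(embeddings[:, None] - embeddings[selected][None], axis=-1).min(axis=1)
        else:
            d = np.ones(n)
        score = alpha * uncertainty + (1 - alpha) * d
        score[selected] = -np.inf
        selected.append(int(score.argmax()))
    return selected

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
probs = rng.dirichlet(np.ones(5), size=100)
print(select_nodes(emb, probs, budget=5))
```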
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
Feng, Yu, Zhou, Ben, Lin, Weidong, Roth, Dan
Large language models primarily rely on inductive reasoning for decision making, which results in unreliable decisions when applied to real-world tasks that often present incomplete contexts and conditions. Accurate probability estimation and appropriate interpretations are therefore required to enhance decision-making reliability. In this paper, we propose BIRD, a Bayesian inference framework for large language models. BIRD provides controllable and interpretable probability estimation for model decisions, based on abductive factors, LLM entailment, and learnable deductive Bayesian modeling. Experiments show that BIRD produces probability estimates that align with human judgments over 65% of the time using open-source Llama models, outperforming the state-of-the-art GPT-4 by 35%. We also show that BIRD can be applied directly to trustworthy decision making in many real-world applications.
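A minimal sketch of the underlying marginalization idea, under my own simplified reading rather than BIRD's exact model: the decision probability is a weighted sum over abductive factor states, with the factor weights and conditionals supplied elsewhere (e.g., by LLM entailment judgments and a learned Bayesian component).

```python
# Sketch (an assumed simplification of the Bayesian decomposition):
# P(decision | context) = sum_s P(decision | factor state s) * P(s | context).
def decision_probability(factor_probs, outcome_given_factor):
    """factor_probs: {factor_state: P(state | context)}, summing to 1.
    outcome_given_factor: {factor_state: P(decision | state)}."""
    return sum(p * outcome_given_factor[state] for state, p in factor_probs.items())

# Hypothetical example: deciding whether to carry an umbrella given a vague forecast.
factor_probs = {"heavy_rain": 0.2, "light_rain": 0.5, "no_rain": 0.3}
outcome_given_factor = {"heavy_rain": 0.95, "light_rain": 0.6, "no_rain": 0.05}
print(decision_probability(factor_probs, outcome_given_factor))  # ~0.505
```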
Integrating 3D City Data through Knowledge Graphs
Ding, Linfang, Xiao, Guohui, Pano, Albulen, Fumagalli, Mattia, Chen, Dongsheng, Feng, Yu, Calvanese, Diego, Fan, Hongchao, Meng, Liqiu
CityGML is a widely adopted standard by the Open Geospatial Consortium (OGC) for representing and exchanging 3D city models. The representation of semantic and topological properties in CityGML makes it possible to query such 3D city data to perform analysis in various applications, e.g., security management and emergency response, energy consumption and estimation, and occupancy measurement. However, the potential of querying CityGML data has not been fully exploited. The official GML/XML encoding of CityGML is only intended as an exchange format and is not suitable for query answering. The most common way of dealing with CityGML data is to store it in the 3DCityDB system as relational tables and then query it with the standard SQL query language. Nevertheless, for end users, it remains a challenging task to formulate queries over 3DCityDB directly for their ad-hoc analytical tasks, because there is a gap between the conceptual semantics of CityGML and the relational schema adopted in 3DCityDB. In fact, the semantics of CityGML itself can be modeled as a suitable ontology. The technology of Knowledge Graphs (KGs), where an ontology is at the core, is a good solution to bridge this gap. Moreover, embracing KGs makes it easier to integrate with other spatial data sources, e.g., OpenStreetMap and existing (Geo)KGs (e.g., Wikidata, DBpedia, and GeoNames), and to perform queries combining information from multiple data sources. In this work, we describe a CityGML KG framework that populates the concepts in the CityGML ontology using declarative mappings to 3DCityDB, thus exposing the CityGML data therein as a KG. To demonstrate the feasibility of our approach, we use CityGML data from the city of Munich as test data and integrate OpenStreetMap data from the same area.
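Once the CityGML data is exposed as a KG, end users can query it with SPARQL instead of writing SQL against the 3DCityDB schema. The sketch below shows what such a query might look like from Python; the endpoint URL and ontology prefix are placeholders, not the paper's actual deployment.

```python
# Hedged sketch: querying a CityGML knowledge graph with SPARQL via SPARQLWrapper.
# Endpoint and prefix are hypothetical placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:8080/sparql")  # hypothetical endpoint
sparql.setReturnFormat(JSON)
sparql.setQuery("""
PREFIX bldg: <http://example.org/citygml/building#>   # placeholder prefix
SELECT ?building ?height
WHERE {
  ?building a bldg:Building ;
            bldg:measuredHeight ?height .
  FILTER(?height > 50)
}
LIMIT 10
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["building"]["value"], row["height"]["value"])
```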