AITopics | Pan, Bo

Collaborating Authors

Pan, Bo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MEGL: Multimodal Explanation-Guided Learning

Zhang, Yifei, Jiang, Tianxu, Pan, Bo, Wang, Jingyu, Bai, Guangji, Zhao, Liang

arXiv.org Artificial IntelligenceNov-20-2024

Explaining the decision-making processes of Artificial Intelligence (AI) models is crucial for addressing their "black box" nature, particularly in tasks like image classification. Traditional eXplainable AI (XAI) methods typically rely on unimodal explanations, either visual or textual, each with inherent limitations. Visual explanations highlight key regions but often lack rationale, while textual explanations provide context without spatial grounding. Further, both explanation types can be inconsistent or incomplete, limiting their reliability. To address these challenges, we propose a novel Multimodal Explanation-Guided Learning (MEGL) framework that leverages both visual and textual explanations to enhance model interpretability and improve classification performance. Our Saliency-Driven Textual Grounding (SDTG) approach integrates spatial information from visual explanations into textual rationales, providing spatially grounded and contextually rich explanations. Additionally, we introduce Textual Supervision on Visual Explanations to align visual explanations with textual rationales, even in cases where ground truth visual annotations are missing. A Visual Explanation Distribution Consistency loss further reinforces visual coherence by aligning the generated visual explanations with dataset-level patterns, enabling the model to effectively learn from incomplete multimodal supervision. We validate MEGL on two new datasets, Object-ME and Action-ME, for image classification with multimodal explanations. Experimental results demonstrate that MEGL outperforms previous approaches in prediction accuracy and explanation quality across both visual and textual domains. Our code will be made available upon the acceptance of the paper.

explanation, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.13053

Country: Europe > Netherlands (0.14)

Genre:

Overview (0.68)
Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

TAGExplainer: Narrating Graph Explanations for Text-Attributed Graph Learning Models

Pan, Bo, Xiong, Zhen, Wu, Guanchen, Zhang, Zheng, Zhang, Yifei, Zhao, Liang

arXiv.org Artificial IntelligenceOct-19-2024

Representation learning of Text-Attributed Graphs (TAGs) has garnered significant attention due to its applications in various domains, including recommendation systems and social networks. Despite advancements in TAG learning methodologies, challenges remain in explainability due to the black-box nature of existing TAG representation learning models. This paper presents TAGExplainer, the first method designed to generate natural language explanations for TAG learning. TAGExplainer employs a generative language model that maps input-output pairs to explanations reflecting the model's decision-making process. To address the lack of annotated ground truth explanations in real-world scenarios, we propose first generating pseudo-labels that capture the model's decisions from saliency-based explanations, then the pseudo-label generator is iteratively trained based on three training objectives focusing on faithfulness and brevity via Expert Iteration, to improve the quality of generated pseudo-labels. The high-quality pseudo-labels are finally utilized to train an end-to-end explanation generator model. Extensive experiments are conducted to demonstrate the effectiveness of TAGExplainer in producing faithful and concise natural language explanations.

explanation, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2410.15268

Country:

Asia (0.46)
Europe (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Services (0.48)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

Add feedback

TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations

Zhang, Zheng, Hu, Yuntong, Pan, Bo, Ling, Chen, Zhao, Liang

arXiv.org Artificial IntelligenceMay-26-2024

Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions, enabling detailed representation of data and their relationships across a broad spectrum of real-world scenarios. Despite the potential for deeper insights, existing TAG representation learning primarily relies on supervised methods, necessitating extensive labeled data and limiting applicability across diverse contexts. This paper introduces a new self-supervised learning framework, Text-And-Graph Multi-View Alignment (TAGA), which overcomes these constraints by integrating TAGs' structural and semantic dimensions. TAGA constructs two complementary views: Text-of-Graph view, which organizes node texts into structured documents based on graph topology, and the Graph-of-Text view, which converts textual nodes and connections into graph data. By aligning representations from both views, TAGA captures joint textual and structural information. In addition, a novel structure-preserving random walk algorithm is proposed for efficient training on large-sized TAGs. Our framework demonstrates strong performance in zero-shot and few-shot scenarios across eight real-world datasets.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2405.168

Country: Asia (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Add feedback

GRAG: Graph Retrieval-Augmented Generation

Hu, Yuntong, Lei, Zhihan, Zhang, Zheng, Pan, Bo, Ling, Chen, Zhao, Liang

arXiv.org Artificial IntelligenceMay-26-2024

While Retrieval-Augmented Generation (RAG) enhances the accuracy and relevance of responses by generative language models, it falls short in graph-based contexts where both textual and topological information are important. Naive RAG approaches inherently neglect the structural intricacies of textual graphs, resulting in a critical gap in the generation process. To address this challenge, we introduce $\textbf{Graph Retrieval-Augmented Generation (GRAG)}$, which significantly enhances both the retrieval and generation processes by emphasizing the importance of subgraph structures. Unlike RAG approaches that focus solely on text-based entity retrieval, GRAG maintains an acute awareness of graph topology, which is crucial for generating contextually and factually coherent responses. Our GRAG approach consists of four main stages: indexing of $k$-hop ego-graphs, graph retrieval, soft pruning to mitigate the impact of irrelevant entities, and generation with pruned textual subgraphs. GRAG's core workflow-retrieving textual subgraphs followed by soft pruning-efficiently identifies relevant subgraph structures while avoiding the computational infeasibility typical of exhaustive subgraph searches, which are NP-hard. Moreover, we propose a novel prompting strategy that achieves lossless conversion from textual subgraphs to hierarchical text descriptions. Extensive experiments on graph multi-hop reasoning benchmarks demonstrate that in scenarios requiring multi-hop reasoning on textual graphs, our GRAG approach significantly outperforms current state-of-the-art RAG methods while effectively mitigating hallucinations.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.16506

Country: North America > United States (0.14)

Genre:

Overview (0.68)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Causal Generative Models with Property Control

Zhao, Qilong, Wang, Shiyu, Bai, Guangji, Pan, Bo, Qin, Zhaohui, Zhao, Liang

arXiv.org Machine LearningMay-25-2024

Generating data with properties of interest by external users while following the right causation among its intrinsic factors is important yet has not been well addressed jointly. This is due to the long-lasting challenge of jointly identifying key latent variables, their causal relations, and their correlation with properties of interest, as well as how to leverage their discoveries toward causally controlled data generation. To address these challenges, we propose a novel deep generative framework called the Correlation-aware Causal Variational Auto-encoder (C2VAE). This framework simultaneously recovers the correlation and causal relationships between properties using disentangled latent vectors. Specifically, causality is captured by learning the causal graph on latent variables through a structural causal model, while correlation is learned via a novel correlation pooling algorithm. Extensive experiments demonstrate C2VAE's ability to accurately recover true causality and correlation, as well as its superiority in controllable data generation compared to baseline models.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2405.16219

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

ELAD: Explanation-Guided Large Language Models Active Distillation

Zhang, Yifei, Pan, Bo, Ling, Chen, Hu, Yuntong, Zhao, Liang

arXiv.org Artificial IntelligenceFeb-20-2024

The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs to smaller models, often fail to determine whether the knowledge has been sufficiently transferred, potentially resulting in high costs or incomplete distillation. In this paper, we propose an Explanation-Guided LLMs Active Distillation (ELAD) framework that employs an active learning strategy to optimize the balance between annotation costs and model performance. To improve efficient sample selection, we introduce an explanation-guided sample selection method that identifies samples challenging its reasoning by exploiting uncertainties in explanation steps. Additionally, we present a customized LLM-annotated explanation revision technique where the teacher model detects and corrects flaws in the student model's reasoning. Our experiments across various reasoning datasets demonstrate that our framework significantly enhances the efficiency of LLM knowledge distillation.

explanation, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2402.13098

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

AgentLens: Visual Analysis for Agent Behaviors in LLM-based Autonomous Systems

Lu, Jiaying, Pan, Bo, Chen, Jieyi, Feng, Yingchaojie, Hu, Jingyuan, Peng, Yuchen, Chen, Wei

arXiv.org Artificial IntelligenceFeb-14-2024

Recently, Large Language Model based Autonomous system(LLMAS) has gained great popularity for its potential to simulate complicated behaviors of human societies. One of its main challenges is to present and analyze the dynamic events evolution of LLMAS. In this work, we present a visualization approach to explore detailed statuses and agents' behavior within LLMAS. We propose a general pipeline that establishes a behavior structure from raw LLMAS execution events, leverages a behavior summarization algorithm to construct a hierarchical summary of the entire structure in terms of time sequence, and a cause trace method to mine the causal relationship between agent behaviors. We then develop AgentLens, a visual analysis system that leverages a hierarchical temporal visualization for illustrating the evolution of LLMAS, and supports users to interactively investigate details and causes of agents' behaviors. Two usage scenarios and a user study demonstrate the effectiveness and usability of our AgentLens.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2402.08995

Country:

Europe (0.67)
North America > United States > Illinois (0.14)
Asia > Middle East > Israel (0.14)

Genre: Questionnaire & Opinion Survey (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Explaining latent representations of generative models with large multimodal models

Zhu, Mengdan, Liu, Zhenke, Pan, Bo, Angirekula, Abhinav, Zhao, Liang

arXiv.org Artificial IntelligenceFeb-2-2024

Learning interpretable representations of data generative latent factors is an important topic for the development of artificial intelligence. With the rise of the large multimodal model, it can align images with text to generate answers. In this work, we propose a framework to comprehensively explain each latent factor in the generative models using a large multimodal model. We further measure the uncertainty of our generated explanations, quantitatively evaluate the performance of explanation generation among multiple large multimodal models, and qualitatively visualize the variations of each latent factor to learn the disentanglement effects of different generative models on explanations. Finally, we discuss the explanatory capabilities and limitations of state-of-the-art large multimodal models.

explanation, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2402.01858

Country:

North America > United States > Illinois (0.14)
North America > United States > Pennsylvania (0.14)
North America > United States > Michigan (0.14)

Genre: Research Report (0.64)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Controllable Data Generation Via Iterative Data-Property Mutual Mappings

Pan, Bo, Qin, Muran, Wang, Shiyu, Zhang, Yifei, Zhao, Liang

arXiv.org Artificial IntelligenceOct-11-2023

Deep generative models have been widely used for their ability to generate realistic data samples in various areas, such as images, molecules, text, and speech. One major goal of data generation is controllability, namely to generate new data with desired properties. Despite growing interest in the area of controllable generation, significant challenges still remain, including 1) disentangling desired properties with unrelated latent variables, 2) out-of-distribution property control, and 3) objective optimization for out-of-distribution property control. To address these challenges, in this paper, we propose a general framework to enhance VAE-based data generators with property controllability and ensure disentanglement. Our proposed objective can be optimized on both data seen and unseen in the training set. We propose a training procedure to train the objective in a semi-supervised manner by iteratively conducting mutual mappings between the data and properties. The proposed framework is implemented on four VAE-based controllable generators to evaluate its performance on property error, disentanglement, generation quality, and training time. The results indicate that our proposed framework enables more precise control over the properties of generated samples in a short training time, ensuring the disentanglement and keeping the validity of the generated samples.

artificial intelligence, iterative data-property mutual mapping, machine learning

arXiv.org Artificial Intelligence

2310.07683

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

XNLI: Explaining and Diagnosing NLI-based Visual Data Analysis

Feng, Yingchaojie, Wang, Xingbo, Pan, Bo, Wong, Kam Kwai, Ren, Yi, Liu, Shi, Yan, Zihan, Ma, Yuxin, Qu, Huamin, Chen, Wei

arXiv.org Artificial IntelligenceJan-24-2023

Natural language interfaces (NLIs) enable users to flexibly specify analytical intentions in data visualization. However, diagnosing the visualization results without understanding the underlying generation process is challenging. Our research explores how to provide explanations for NLIs to help users locate the problems and further revise the queries. We present XNLI, an explainable NLI system for visual data analysis. The system introduces a Provenance Generator to reveal the detailed process of visual transformations, a suite of interactive widgets to support error adjustments, and a Hint Generator to provide query revision hints based on the analysis of user queries and interactions. Two usage scenarios of XNLI and a user study verify the effectiveness and usability of the system. Results suggest that XNLI can significantly enhance task accuracy without interrupting the NLI-based analysis process.

artificial intelligence, natural language, query, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TVCG.2023.3240003

2301.10385

Country:

North America > United States (1.00)
Asia > China (1.00)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.48)

Industry:

Education > Educational Setting (0.67)
Media > Film (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Human Computer Interaction > Interfaces (0.89)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.67)

Add feedback