Meng, Siyuan
Aligning Vision to Language: Text-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning
Liu, Junming, Meng, Siyuan, Gao, Yanting, Mao, Song, Cai, Pinlong, Yan, Guohang, Chen, Yirong, Bian, Zilin, Shi, Botian, Wang, Ding
Multimodal reasoning in Large Language Models (LLMs) struggles with incomplete knowledge and hallucination artifacts, challenges that textual Knowledge Graphs (KGs) only partially mitigate due to their modality isolation. While Multimodal Knowledge Graphs (MMKGs) promise enhanced cross-modal understanding, their practical construction is impeded by semantic narrowness of manual text annotations and inherent noise in visual-semantic entity linkages. In this paper, we propose Vision-align-to-Language integrated Knowledge Graph (VaLiK), a novel approach for constructing MMKGs that enhances LLMs reasoning through cross-modal information supplementation. Specifically, we cascade pre-trained Vision-Language Models (VLMs) to align image features with text, transforming them into descriptions that encapsulate image-specific information. Furthermore, we developed a cross-modal similarity verification mechanism to quantify semantic consistency, effectively filtering out noise introduced during feature alignment. Even without manually annotated image captions, the refined descriptions alone suffice to construct the MMKG. Compared to conventional MMKGs construction paradigms, our approach achieves substantial storage efficiency gains while maintaining direct entity-to-image linkage capability. Experimental results on multimodal reasoning tasks demonstrate that LLMs augmented with VaLiK outperform previous state-of-the-art models. Our code is published at https://github.com/Wings-Of-Disaster/VaLiK.
LinkThief: Combining Generalized Structure Knowledge with Node Similarity for Link Stealing Attack against GNN
Zhang, Yuxing, Meng, Siyuan, Chen, Chunchun, Peng, Mengyao, Gu, Hongyan, Huang, Xinli
Graph neural networks(GNNs) have a wide range of applications in multimedia.Recent studies have shown that Graph neural networks(GNNs) are vulnerable to link stealing attacks,which infers the existence of edges in the target GNN's training graph.Existing attacks are usually based on the assumption that links exist between two nodes that share similar posteriors;however,they fail to focus on links that do not hold under this assumption.To this end,we propose LinkThief,an improved link stealing attack that combines generalized structure knowledge with node similarity,in a scenario where the attackers' background knowledge contains partially leaked target graph and shadow graph.Specifically,to equip the attack model with insights into the link structure spanning both the shadow graph and the target graph,we introduce the idea of creating a Shadow-Target Bridge Graph and extracting edge subgraph structure features from it.Through theoretical analysis from the perspective of privacy theft,we first explore how to implement the aforementioned ideas.Building upon the findings,we design the Bridge Graph Generator to construct the Shadow-Target Bridge Graph.Then,the subgraph around the link is sampled by the Edge Subgraph Preparation Module.Finally,the Edge Structure Feature Extractor is designed to obtain generalized structure knowledge,which is combined with node similarity to form the features provided to the attack model.Extensive experiments validate the correctness of theoretical analysis and demonstrate that LinkThief still effectively steals links without extra assumptions.