Siem Reap Province
Hierarchical Memory Organization for Wikipedia Generation
Yu, Eugene J., Zhu, Dawei, Song, Yifan, Wong, Xiangyu, Zhang, Jiebin, Shi, Wenxuan, Li, Xiaoguang, Liu, Qun, Li, Sujian
Generating Wikipedia articles autonomously is a challenging task requiring the integration of accurate, comprehensive, and well-structured information from diverse sources. This paper introduces the Memory Organization-based Generation (MOG) framework, a novel approach to address these challenges by leveraging a hierarchical memory architecture. MOG extracts fine-grained memory units from web documents, recursively organizes them into a Wikipedia-style hierarchical structure, and uses this structure to guide the generation process. This ensures alignment between memory and the article outline, improving both informativeness and verifiability while minimizing hallucinations. Additionally, a citation module is implemented to enhance traceability by linking every generated sentence to specific memory units. Evaluations on our newly created WikiStart dataset demonstrate that MOG outperforms baseline methods in producing informative and reliable articles, making it particularly robust in real-world scenarios.
GAEA: A Geolocation Aware Conversational Model
Campos, Ron, Vayani, Ashmal, Kulkarni, Parth Parag, Gupta, Rohit, Dutta, Aritra, Shah, Mubarak
Image geolocalization, in which, traditionally, an AI model predicts the precise GPS coordinates of an image is a challenging task with many downstream applications. However, the user cannot utilize the model to further their knowledge other than the GPS coordinate; the model lacks an understanding of the location and the conversational ability to communicate with the user. In recent days, with tremendous progress of large multimodal models (LMMs) proprietary and open-source researchers have attempted to geolocalize images via LMMs. However, the issues remain unaddressed; beyond general tasks, for more specialized downstream tasks, one of which is geolocalization, LMMs struggle. In this work, we propose to solve this problem by introducing a conversational model GAEA that can provide information regarding the location of an image, as required by a user. No large-scale dataset enabling the training of such a model exists. Thus we propose a comprehensive dataset GAEA with 800K images and around 1.6M question answer pairs constructed by leveraging OpenStreetMap (OSM) attributes and geographical context clues. For quantitative evaluation, we propose a diverse benchmark comprising 4K image-text pairs to evaluate conversational capabilities equipped with diverse question types. We consider 11 state-of-the-art open-source and proprietary LMMs and demonstrate that GAEA significantly outperforms the best open-source model, LLaVA-OneVision by 25.69% and the best proprietary model, GPT-4o by 8.28%. Our dataset, model and codes are available
Dynamic Localisation of Spatial-Temporal Graph Neural Network
Duan, Wenying, Guo, Shujun, huang, Wei, Rao, Hong, He, Xiaoxi
Spatial-temporal data, fundamental to many intelligent applications, reveals dependencies indicating causal links between present measurements at specific locations and historical data at the same or other locations. Within this context, adaptive spatial-temporal graph neural networks (ASTGNNs) have emerged as valuable tools for modelling these dependencies, especially through a data-driven approach rather than pre-defined spatial graphs. While this approach offers higher accuracy, it presents increased computational demands. Addressing this challenge, this paper delves into the concept of localisation within ASTGNNs, introducing an innovative perspective that spatial dependencies should be dynamically evolving over time. We introduce \textit{DynAGS}, a localised ASTGNN framework aimed at maximising efficiency and accuracy in distributed deployment. This framework integrates dynamic localisation, time-evolving spatial graphs, and personalised localisation, all orchestrated around the Dynamic Graph Generator, a light-weighted central module leveraging cross attention. The central module can integrate historical information in a node-independent manner to enhance the feature representation of nodes at the current moment. This improved feature representation is then used to generate a dynamic sparse graph without the need for costly data exchanges, and it supports personalised localisation. Performance assessments across two core ASTGNN architectures and nine real-world datasets from various applications reveal that \textit{DynAGS} outshines current benchmarks, underscoring that the dynamic modelling of spatial dependencies can drastically improve model expressibility, flexibility, and system efficiency, especially in distributed settings.
CONTINUUM: Detecting APT Attacks through Spatial-Temporal Graph Neural Networks
Bahar, Atmane Ayoub Mansour, Ferrahi, Kamel Soaid, Messai, Mohamed-Lamine, Seba, Hamida, Amrouche, Karima
Advanced Persistent Threats (APTs) represent a significant challenge in cybersecurity due to their sophisticated and stealthy nature. Traditional Intrusion Detection Systems (IDS) often fall short in detecting these multi-stage attacks. Recently, Graph Neural Networks (GNNs) have been employed to enhance IDS capabilities by analyzing the complex relationships within networked data. However, existing GNN-based solutions are hampered by high false positive rates and substantial resource consumption. In this paper, we present a novel IDS designed to detect APTs using a Spatio-Temporal Graph Neural Network Autoencoder. Our approach leverages spatial information to understand the interactions between entities within a graph and temporal information to capture the evolution of the graph over time. This dual perspective is crucial for identifying the sequential stages of APTs. Furthermore, to address privacy and scalability concerns, we deploy our architecture in a federated learning environment. This setup ensures that local data remains on-premise while encrypted model-weights are shared and aggregated using homomorphic encryption, maintaining data privacy and security. Our evaluation shows that this system effectively detects APTs with lower false positive rates and optimized resource usage compared to existing methods, highlighting the potential of spatio-temporal analysis and federated learning in enhancing cybersecurity defenses.
Dynamic Graph Communication for Decentralised Multi-Agent Reinforcement Learning
This work presents a novel communication framework for decentralized multi-agent systems operating in dynamic network environments. Integrated into a multi-agent reinforcement learning system, the framework is designed to enhance decision-making by optimizing the network's collective knowledge through efficient communication. Key contributions include adapting a static network packet-routing scenario to a dynamic setting with node failures, incorporating a graph attention network layer in a recurrent message-passing framework, and introducing a multi-round communication targeting mechanism. This approach enables an attention-based aggregation mechanism to be successfully trained within a sparse-reward, dynamic network packet-routing environment using only reinforcement learning. Experimental results show improvements in routing performance, including a 9.5 percent increase in average rewards and a 6.4 percent reduction in communication overhead compared to a baseline system. The study also examines the ethical and legal implications of deploying such systems in critical infrastructure and military contexts, identifies current limitations, and suggests potential directions for future research.