Collaborating Authors

 Luo, Junliang


Optimizing Blockchain Analysis: Tackling Temporality and Scalability with an Incremental Approach with Metropolis-Hastings Random Walks

arXiv.org Machine Learning

Blockchain technology, with implications in the financial domain, offers data in the form of large-scale transaction networks. Analyzing transaction networks facilitates fraud detection, market analysis, and supports government regulation. Despite many graph representation learning methods for transaction network analysis, we pinpoint two salient limitations that merit more investigation. Existing methods predominantly focus on the snapshots of transaction networks, sidelining the evolving nature of blockchain transaction networks. Existing methodologies may not sufficiently emphasize efficient, incremental learning capabilities, which are essential for addressing the scalability challenges in ever-expanding large-scale transaction networks. To address these challenges, we employed an incremental approach for random walk-based node representation learning in transaction networks. Further, we proposed a Metropolis-Hastings-based random walk mechanism for improved efficiency. The empirical evaluation conducted on blockchain transaction datasets reveals comparable performance in node classification tasks while reducing computational overhead. Potential applications include transaction network monitoring, the efficient classification of blockchain addresses for fraud detection or the identification of specialized address types within the network.
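As a rough illustration of the random walk mechanism mentioned above, the sketch below implements the textbook Metropolis-Hastings random walk on an undirected graph with a uniform target distribution over nodes. It is not the authors' implementation: the function name `mh_random_walk`, the adjacency-dict format, and the toy graph are assumptions made for this example, and the abstract does not state which target distribution or acceptance rule the paper uses.

```python
import random


def mh_random_walk(adj, start, length, rng):
    """Return a walk of `length` steps starting at `start`.

    `adj` maps each node to a list of its neighbors (undirected graph).
    Assumption: the target stationary distribution is uniform over nodes,
    so a proposed move u -> v is accepted with probability
    min(1, deg(u) / deg(v)).
    """
    walk = [start]
    current = start
    for _ in range(length):
        neighbors = adj[current]
        if not neighbors:
            # Isolated node: nothing to propose, so the walker stays put.
            walk.append(current)
            continue
        candidate = rng.choice(neighbors)
        accept_prob = min(1.0, len(neighbors) / len(adj[candidate]))
        if rng.random() < accept_prob:
            current = candidate
        # On rejection the current node is repeated in the walk.
        walk.append(current)
    return walk


# Hypothetical toy graph: one hub address connected to three leaf addresses.
toy_graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
walk = mh_random_walk(toy_graph, "hub", 20, random.Random(0))
print(walk)
```

In this setup a move from a leaf to the hub is accepted with probability 1/3, while a move from the hub to a leaf is always accepted, so the walker is pulled less strongly toward high-degree nodes than a simple random walk would be. The resulting walks could then be fed to a skip-gram style embedding model, which is a common choice for random walk-based representation learning.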


Investigating Similarities Across Decentralized Financial (DeFi) Services

arXiv.org Artificial Intelligence

We explore the adoption of graph representation learning (GRL) algorithms to investigate similarities across services offered by Decentralized Finance (DeFi) protocols. Following existing literature, we use Ethereum transaction data to identify the DeFi building blocks. These are sets of protocol-specific smart contracts that are utilized in combination within single transactions and encapsulate the logic to conduct specific financial services such as swapping or lending cryptoassets. We propose a method to categorize these blocks into clusters based on their smart contract attributes and the graph structure of their smart contract calls. We employ GRL to create embedding vectors from building blocks and agglomerative models for clustering them. To evaluate whether they are effectively grouped in clusters of similar functionalities, we associate them with eight financial functionality categories and use this information as the target label. We find that in the best-case scenario purity reaches .888. We use additional information to associate the building blocks with protocol-specific target labels, obtaining comparable purity (.864) but higher V-Measure (.571); we discuss plausible explanations for this difference. In summary, this method helps categorize existing financial products offered by DeFi protocols, and can effectively automatize the detection of similar DeFi services, especially within protocols.
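To make the purity figure concrete, here is a minimal sketch of how clustering purity is usually computed: each cluster is credited with the count of its most frequent target label, and the credited counts are summed and divided by the number of items. The function name `purity` and the toy cluster and label values are hypothetical and do not come from the paper's data.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, labels):
    """Fraction of items that carry the majority label of their cluster."""
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    if not labels:
        raise ValueError("at least one item is required")
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, labels):
        members[cluster].append(label)
    # Count of the most common label in each cluster, summed over clusters.
    majority_total = sum(
        Counter(cluster_labels).most_common(1)[0][1]
        for cluster_labels in members.values()
    )
    return majority_total / len(labels)


# Hypothetical example: six building blocks, two clusters, two categories.
clusters = [0, 0, 0, 1, 1, 1]
categories = ["swap", "swap", "lend", "lend", "lend", "swap"]
print(purity(clusters, categories))  # 4 of 6 items match their cluster majority
```

Purity reaches 1.0 trivially when every item sits in its own cluster, which is one reason it is commonly reported together with a measure such as V-Measure that also accounts for completeness. scikit-learn provides `sklearn.metrics.v_measure_score` for the latter.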


Hallucination Detection and Hallucination Mitigation: An Investigation

arXiv.org Artificial Intelligence

Large language models (LLMs), including ChatGPT, Bard, and Llama, have achieved remarkable successes over the last two years in a range of different applications. In spite of these successes, there exist concerns that limit the wide application of LLMs. A key problem is hallucination: in addition to correct responses, LLMs can also generate seemingly correct but factually incorrect responses. This report aims to present a comprehensive review of the current literature on both hallucination detection and hallucination mitigation. We hope that this report can serve as a good reference for both engineers and researchers who are interested in LLMs and in applying them to real-world tasks.


Adaptive Dynamic Programming for Energy-Efficient Base Station Cell Switching

arXiv.org Artificial Intelligence

Energy saving in wireless networks is growing in importance due to increasing demand for evolving new-gen cellular networks, environmental and regulatory concerns, and potential energy crises arising from geopolitical tensions. In this work, we propose an approximate dynamic programming (ADP)-based method coupled with online optimization to switch the cells of base stations on and off, reducing network power consumption while maintaining adequate Quality of Service (QoS) metrics. A multilayer perceptron (MLP) predicts the power consumption for each state-action pair, approximating the value function in ADP so that the action with the optimal expected power saving can be selected. To save as much power as possible without deteriorating QoS, we include another MLP to predict QoS and a long short-term memory (LSTM) network to predict handovers; both are incorporated into an online optimization algorithm that produces an adaptive QoS threshold for filtering cell switching actions based on the overall QoS history. The performance of the method is evaluated using a practical network simulator with various real-world scenarios with dynamic traffic patterns.
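The selection step described above can be sketched as a filter followed by a greedy choice: discard switching actions whose predicted QoS falls below a threshold, then pick the remaining action with the lowest predicted power. Everything in this sketch is an assumption for illustration: `select_action`, the two toy predictor functions (which stand in for the trained MLPs), the fixed threshold (the paper adapts it online), and the fallback rule when no action passes the filter.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]   # hypothetical: (traffic load,)
Action = Tuple[int, ...]    # hypothetical: 1 = cell on, 0 = cell off


def select_action(
    state: State,
    actions: Sequence[Action],
    predict_power: Callable[[State, Action], float],
    predict_qos: Callable[[State, Action], float],
    qos_threshold: float,
) -> Action:
    """Pick the lowest-power action among those predicted to meet the QoS threshold."""
    feasible = [a for a in actions if predict_qos(state, a) >= qos_threshold]
    if not feasible:
        # Assumed fallback: if nothing passes, prefer the best predicted QoS.
        return max(actions, key=lambda a: predict_qos(state, a))
    return min(feasible, key=lambda a: predict_power(state, a))


# Toy stand-ins for the learned predictors (not trained models).
def toy_power(state: State, action: Action) -> float:
    return 50.0 + 100.0 * sum(action)            # fixed cost plus cost per active cell


def toy_qos(state: State, action: Action) -> float:
    load = state[0]
    return min(1.0, 40.0 * sum(action) / load)   # served fraction of the load


candidate_actions = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
best = select_action((70.0,), candidate_actions, toy_power, toy_qos, qos_threshold=0.9)
print(best)
```

With a load of 70, keeping two cells on still serves the full load under the toy QoS model, so the filter keeps the first two actions and the greedy step chooses the cheaper one.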


Towards Improved Illicit Node Detection with Positive-Unlabelled Learning

arXiv.org Artificial Intelligence

The anonymous and decentralized nature of blockchain systems is changing the finance industry through its immutability, transparency, and automation [1]. Such decentralized systems, however, are currently in a temporarily unregulated environment [2], [3] with a variety of abnormal usages and security concerns. The abnormal usages include both the illicit activities clearly defined by traditional finance, such as phishing scams, Ponzi schemes, and money laundering [4], and activities whose lawfulness has no clear definition or has only just been defined, such as mixing services, i.e., mixer nodes that handle funds so as to obscure the trace of the transfers from the original source; e.g., the US Department of the Treasury declared Tornado Cash a sanctioned entity [5], [6].

We demonstrate the difference between the estimated values of evaluation metrics and the actual values through an engineered PU dataset from the Ethereum transaction dataset proposed in [15] to show the concerns of assuming unlabeled data to be normal. We conduct experiments to show that applying various PU classifiers can help in improving the classification performance on two real-world datasets with limited positive labels. The PU classifiers estimate a potentially identifiable class prior or treat the unlabeled examples as negative samples with label noise and learn with biased models. We also compare various graph representation methods for extracting node embedding vectors as the input to get diverse data distribution for the same data to obtain
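The gap between estimated and actual metric values can be seen in a small worked example. The sketch below scores the same predictions twice: once against the true labels and once against observed labels in which some positives are hidden among the unlabeled nodes and counted as normal. The label vectors are hypothetical toy data, not drawn from the Ethereum dataset, and the helper name `precision_recall` is an assumption for this example.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: 1 = illicit, 0 = normal.
actual = [1, 1, 1, 1, 0, 0, 0, 0]      # ground truth, unknown in practice
observed = [1, 1, 0, 0, 0, 0, 0, 0]    # two illicit nodes are unlabeled, treated as normal
predicted = [1, 1, 1, 0, 0, 0, 0, 1]   # classifier output

estimated = precision_recall(observed, predicted)
true_values = precision_recall(actual, predicted)
print("estimated (unlabeled assumed normal):", estimated)
print("actual:", true_values)
```

Here the correctly flagged but unlabeled illicit node is counted as a false positive under the observed labels, and the missed unlabeled illicit node is not counted as a false negative, so the estimated precision is lower and the estimated recall is higher than the actual values.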