Cloned Identity Detection in Social-Sensor Clouds based on Incomplete Profiles
Alharbi, Ahmed; Dong, Hai; Yi, Xun; Abeysekara, Prabath
arXiv.org Artificial Intelligence
We propose a novel approach for detecting cloned identities of social-sensor cloud service providers (i.e., social media users) when only incomplete, non-privacy-sensitive profile data is available. The proposed approach, named ICD-IPD, first extracts account pairs with similar usernames or screen names from a given set of user accounts collected from a social media platform. It then learns a multi-view representation for each account and extracts two categories of features per account: profile features and Weighted Generalised Canonical Correlation Analysis (WGCCA)-based features, both of which may contain missing values. To counter the impact of these missing values, a missing value imputer then imputes them. Next, the approach extracts two categories of augmented features for each previously identified account pair: 1) similarity-based and 2) difference-based features. Finally, these features are concatenated and fed into a Light Gradient Boosting Machine classifier to detect identity cloning. We evaluated the proposed approach against existing state-of-the-art identity cloning detection approaches and other machine learning and deep learning models on a real-world dataset. The experimental results show that the proposed approach outperforms these approaches and models in terms of Precision, Recall and F1-score.
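The early stages of the pipeline described above (candidate pair extraction, imputation, and pair-level feature construction) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy accounts, the field names, the use of `difflib.SequenceMatcher` as the username similarity measure, the 0.8 threshold, and the mean imputer. The WGCCA-based features and the Light Gradient Boosting Machine classifier are omitted.

```python
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean

# Hypothetical toy accounts; None marks a missing profile value.
accounts = [
    {"username": "alice_smith", "followers": 120, "friends": 80},
    {"username": "alice_smlth", "followers": None, "friends": 75},
    {"username": "bob_jones", "followers": 300, "friends": None},
]
FIELDS = ["followers", "friends"]  # assumed numeric profile features


def name_similarity(a, b):
    """Character-level similarity in [0, 1] (an assumed stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(accts, threshold=0.8):
    """Return index pairs whose usernames are at least `threshold` similar."""
    return [
        (i, j)
        for i, j in combinations(range(len(accts)), 2)
        if name_similarity(accts[i]["username"], accts[j]["username"]) >= threshold
    ]


def impute_mean(accts, fields):
    """Fill missing values with the per-field mean of the observed values."""
    filled = [dict(a) for a in accts]  # copy so the input is left unchanged
    for f in fields:
        observed = [a[f] for a in accts if a[f] is not None]
        fill = mean(observed) if observed else 0.0
        for a in filled:
            if a[f] is None:
                a[f] = fill
    return filled


def pair_features(a, b, fields):
    """One similarity feature followed by absolute-difference features."""
    return [name_similarity(a["username"], b["username"])] + [
        abs(a[f] - b[f]) for f in fields
    ]


pairs = candidate_pairs(accounts)
filled = impute_mean(accounts, FIELDS)
features = [pair_features(filled[i], filled[j], FIELDS) for i, j in pairs]
```

In a fuller version, each row of `features` would be concatenated with the per-account features and passed to a gradient boosting classifier; the imputer and similarity measure would be replaced by whichever ones the paper actually uses.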
Nov-2-2024
- Genre:
- Overview > Innovation (0.54)
- Research Report
- New Finding (0.66)
- Promising Solution (0.54)
- Industry:
- Information Technology
- Security & Privacy (1.00)
- Services (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (1.00)
- Statistical Learning (1.00)
- Natural Language (1.00)
- Communications > Social Media (1.00)
- Data Science > Data Mining (1.00)