Brunei
MuCoS: Efficient Drug-Target Discovery via Multi-Context-Aware Sampling in Knowledge Graphs
Gul, Haji, Naim, Abdul Ghani, Bhat, Ajaz Ahmad
Accurate prediction of drug-target interactions is critical for accelerating drug discovery and elucidating complex biological mechanisms. In this work, we frame drug-target prediction as a link prediction task on heterogeneous biomedical knowledge graphs (KGs) that integrate drugs, proteins, diseases, pathways, and other relevant entities. Conventional KG embedding methods such as TransE and ComplEx-SE are hindered by their reliance on computationally intensive negative sampling and their limited generalization to unseen drug-target pairs. To address these challenges, we propose Multi-Context-Aware Sampling (MuCoS), a novel framework that prioritizes high-density neighbours to capture salient structural patterns and integrates these with contextual embeddings derived from BERT. By unifying structural and textual modalities and selectively sampling highly informative patterns, MuCoS circumvents the need for negative sampling, significantly reducing computational overhead while enhancing predictive accuracy for novel drug-target associations. Extensive experiments on the KEGG50k dataset demonstrate that MuCoS outperforms state-of-the-art baselines, achieving up to a 13% improvement in mean reciprocal rank (MRR) in predicting any relation in the dataset and a 6% improvement in dedicated drug-target relation prediction.
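The density-prioritized sampling described above can be illustrated with a minimal sketch. All entity names, the `sample_context` function, and the degree-based density heuristic here are hypothetical stand-ins, not the paper's implementation:

```python
from collections import defaultdict

def sample_context(triples, entity, k=3):
    """Pick the k neighbours of `entity` whose nodes have the highest
    degree -- a simple stand-in for MuCoS's density prioritization."""
    degree = defaultdict(int)
    neighbours = defaultdict(set)
    for h, r, t in triples:
        degree[h] += 1
        degree[t] += 1
        neighbours[h].add((r, t))
        neighbours[t].add((r, h))
    ranked = sorted(neighbours[entity], key=lambda rt: degree[rt[1]], reverse=True)
    return ranked[:k]

# Toy biomedical KG (hypothetical entities and relations).
kg = [
    ("aspirin", "targets", "COX1"),
    ("aspirin", "treats", "pain"),
    ("ibuprofen", "targets", "COX1"),
    ("COX1", "in_pathway", "prostaglandin_synthesis"),
    ("pain", "symptom_of", "inflammation"),
]

context = sample_context(kg, "aspirin", k=2)
```

The selected high-density context would then be serialized and passed to BERT alongside the query, so no negative triplets are ever generated.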
MuCo-KGC: Multi-Context-Aware Knowledge Graph Completion
Gul, Haji, Bhat, Ajaz Ahmad, Naim, Abdul Ghani Haji
Knowledge graph completion (KGC) seeks to predict missing entities (e.g., heads or tails) or relationships in knowledge graphs (KGs), which often contain incomplete data. Traditional embedding-based methods, such as TransE and ComplEx, have improved tail entity prediction but struggle to generalize to unseen entities during testing. Textual-based models mitigate this issue by leveraging additional semantic context; however, their reliance on negative triplet sampling introduces high computational overhead, semantic inconsistencies, and data imbalance. Recent approaches, like KG-BERT, show promise but depend heavily on entity descriptions, which are often unavailable in KGs. Critically, existing methods overlook valuable structural information in the KG related to the entities and relationships. To address these challenges, we propose Multi-Context-Aware Knowledge Graph Completion (MuCo-KGC), a novel model that utilizes contextual information from linked entities and relations within the graph to predict tail entities. MuCo-KGC eliminates the need for entity descriptions and negative triplet sampling, significantly reducing computational complexity while enhancing performance. Our experiments on standard datasets, including FB15k-237, WN18RR, CoDEx-S, and CoDEx-M, demonstrate that MuCo-KGC outperforms state-of-the-art methods on three datasets. Notably, MuCo-KGC improves MRR on the WN18RR, CoDEx-S, and CoDEx-M datasets by 1.63%, 3.77%, and 20.15% respectively, demonstrating its effectiveness for KGC tasks.
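A description-free input of the kind described above can be assembled purely from the graph. The exact serialization below (separator tokens, sorted ordering, the `build_kgc_input` helper) is an assumed sketch, not MuCo-KGC's actual format:

```python
def build_kgc_input(triples, head, relation):
    """Assemble a BERT-style input sequence from the head entity, its
    graph context (its relations and neighbours), and the query
    relation -- no entity descriptions, no negative triplets."""
    head_relations = sorted({r for h, r, t in triples if h == head})
    head_neighbours = sorted({t for h, r, t in triples if h == head})
    context = " ".join(head_relations + head_neighbours)
    return f"[CLS] {head} [SEP] {context} [SEP] {relation} [SEP]"

# Toy KG with hypothetical entities.
kg = [
    ("lion", "is_a", "mammal"),
    ("lion", "lives_in", "savanna"),
    ("mammal", "is_a", "animal"),
]
seq = build_kgc_input(kg, "lion", "preys_on")
```

The model then treats tail prediction as classification over the entity vocabulary given this contextualized sequence.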
MuCoS: Efficient Drug-Target Prediction through Multi-Context-Aware Sampling
Gul, Haji, Naim, Abdul Gani Haji, Bhat, Ajaz A.
Drug-target interactions are critical for understanding biological processes and advancing drug discovery. However, traditional methods such as ComplEx-SE, TransE, and DistMult struggle with unseen relationships and negative triplets, which limits their effectiveness in drug-target prediction. To address these challenges, we propose Multi-Context-Aware Sampling (MuCoS), an efficient and accurate method for drug-target prediction. MuCoS reduces computational complexity by prioritizing neighbors of higher density to capture informative structural patterns. These optimized neighborhood representations are integrated with BERT, enabling contextualized embeddings for accurate prediction of missing relationships or tail entities. MuCoS avoids the need for negative triplet sampling, reducing computation while improving performance on unseen entities and relations. Experiments on the KEGG50k biomedical dataset show that MuCoS improved over existing models by 13% on MRR, 7% on Hits@1, 4% on Hits@3, and 18% on Hits@10 for general relation prediction, and by 6% on MRR, 1% on Hits@1, 3% on Hits@3, and 12% on Hits@10 for drug-target relation prediction.
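The MRR and Hits@k figures reported above are computed from the rank each model assigns to the true tail entity; a standard implementation of these two metrics:

```python
def mrr(ranks):
    """Mean reciprocal rank over a list of 1-based ranks of the
    correct answer for each test query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Example: ranks of the true tail over five test queries.
ranks = [1, 3, 2, 1, 10]
score_mrr = mrr(ranks)        # ~0.587
score_h1 = hits_at_k(ranks, 1)  # 0.4
score_h3 = hits_at_k(ranks, 3)  # 0.8
```

Both metrics need only the rank of the positive triple, which is why avoiding negative sampling at training time does not affect evaluation.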
A Contextualized BERT model for Knowledge Graph Completion
Gul, Haji, Naim, Abdul Ghani, Bhat, Ajaz A.
Knowledge graphs (KGs) are valuable for representing structured, interconnected information across domains, enabling tasks like semantic search, recommendation systems, and inference. A pertinent challenge with KGs, however, is that many entities (i.e., heads, tails) or relationships are unknown. Knowledge Graph Completion (KGC) addresses this by predicting these missing nodes or links, enhancing the graph's informational depth and utility. Traditional methods like TransE and ComplEx predict tail entities but struggle with unseen entities. Textual-based models leverage additional semantics but come with high computational costs, semantic inconsistencies, and data imbalance issues. Recent LLM-based models show improvement but overlook contextual information and rely heavily on entity descriptions. In this study, we introduce a contextualized BERT model for KGC that overcomes these limitations by utilizing the contextual information from neighbouring entities and relationships to predict tail entities. Our model eliminates the need for entity descriptions and negative triplet sampling, reducing computational demands while improving performance. Our model outperforms state-of-the-art methods on standard datasets, improving Hits@1 by 5.3% and 4.88% on FB15k-237 and WN18RR respectively, setting a new benchmark in KGC.
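Casting tail prediction as ranking over candidate entities, as this family of models does, can be sketched as follows. The toy score table stands in for the BERT model's output logits; the entities and the `predict_tail` helper are illustrative assumptions:

```python
def predict_tail(score_fn, candidates, head, relation, k=1):
    """Rank every candidate entity by its model score and return the
    top k -- the classification-over-entities view of tail prediction."""
    ranked = sorted(candidates,
                    key=lambda t: score_fn(head, relation, t),
                    reverse=True)
    return ranked[:k]

# Toy scores standing in for a trained model's logits.
toy_scores = {
    ("paris", "capital_of", "france"): 0.9,
    ("paris", "capital_of", "spain"): 0.2,
    ("paris", "capital_of", "river"): 0.1,
}
score = lambda h, r, t: toy_scores.get((h, r, t), 0.0)

top = predict_tail(score, ["france", "spain", "river"], "paris", "capital_of")
```

A Hits@1 evaluation simply checks whether `top[0]` equals the gold tail for each test triple.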
Predicting the usability of mobile applications using AI tools: the rise of large user interface models, opportunities, and challenges
Namoun, Abdallah, Alrehaili, Ahmed, Nisa, Zaib Un, Almoamari, Hani, Tufail, Ali
In 2022, 255 billion new app downloads were registered, and a whopping 167 billion USD was spent on app stores, a drastic increase from 230 billion app downloads in 2021. Interestingly, artificial intelligence is projected to increase mobile app downloads by 10% in 2024. To continue fueling their revenues in a highly competitive and volatile market, mobile app companies need to dedicate significant efforts to the design of user-friendly interfaces and the usability of their applications. Usability testing of mobile applications is inherently a complex and expensive process [1], yet rewarding in elaborating user requirements, identifying usability issues, and improving the quality of user experience [2]. Mobile usability testing encompasses several intertwined and laborious phases, including planning and designing the evaluation sessions, recruiting the intended users, conducting the testing sessions, and analyzing testing data to extract actionable insights [1].
Communication Traffic Characteristics Reveal an IoT Device's Identity
Chowdhury, Rajarshi Roy, Roy, Debashish, Abas, Pg Emeroylariffion
The Internet of Things (IoT) is one of the technological advancements of the twenty-first century which can improve living standards. However, it also imposes new types of security challenges in the network domain, including device authentication, traffic type classification, and malicious traffic identification. Traditionally, internet protocol (IP) and media access control (MAC) addresses are utilized for identifying network-connected devices, but these addressing schemes are prone to being compromised through spoofing attacks and MAC randomization. Therefore, device identification using only explicit identifiers is a challenging task. Accurate device identification plays a key role in securing a network. In this paper, a supervised machine learning-based device fingerprinting (DFP) model has been proposed for identifying network-connected IoT devices using only communication traffic characteristics (or implicit identifiers). Features from a single transmission control protocol/internet protocol (TCP/IP) packet header have been utilized for generating unique fingerprints, with each fingerprint represented as a vector of 22 features. Experimental results have shown that the proposed DFP method achieves over 98% accuracy in classifying individual IoT devices using the UNSW dataset with 22 smart-home IoT devices. This signifies that the proposed approach is invaluable to network operators in making their networks more secure.
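The fingerprint-then-classify pipeline can be sketched with a handful of header fields. The four fields below stand in for the paper's 22 features, and a nearest-centroid rule stands in for its supervised classifier; all field names and device labels are illustrative:

```python
import math

# Illustrative header fields standing in for the paper's 22 features.
FIELDS = ["ip_ttl", "ip_len", "tcp_window", "tcp_flags"]

def fingerprint(packet):
    """Turn one TCP/IP packet header into a fixed-length feature vector."""
    return [float(packet[f]) for f in FIELDS]

def train_centroids(labelled_packets):
    """Average the fingerprints per device label (a simple stand-in
    for training a supervised classifier)."""
    sums, counts = {}, {}
    for device, pkt in labelled_packets:
        vec = fingerprint(pkt)
        acc = sums.setdefault(device, [0.0] * len(FIELDS))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[device] = counts.get(device, 0) + 1
    return {d: [v / counts[d] for v in acc] for d, acc in sums.items()}

def identify(centroids, packet):
    """Label an unseen packet with the closest device centroid."""
    vec = fingerprint(packet)
    return min(centroids, key=lambda d: math.dist(centroids[d], vec))

train = [
    ("camera", {"ip_ttl": 64, "ip_len": 60, "tcp_window": 29200, "tcp_flags": 2}),
    ("camera", {"ip_ttl": 64, "ip_len": 60, "tcp_window": 29200, "tcp_flags": 2}),
    ("plug",   {"ip_ttl": 255, "ip_len": 40, "tcp_window": 5840, "tcp_flags": 2}),
]
centroids = train_centroids(train)
device = identify(centroids, {"ip_ttl": 64, "ip_len": 60,
                              "tcp_window": 29000, "tcp_flags": 2})
```

The appeal of such implicit identifiers is that, unlike IP or MAC addresses, header statistics are hard to spoof consistently across an entire flow.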
Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments
Kinjo, Akira R., Lai, Daphne Teck Ching
With the membership function being strictly positive, the conventional fuzzy c-means clustering method sometimes causes imbalanced influence when clusters of vastly different sizes exist. That is, an outstandingly large cluster drags to its center all the other clusters, however far they are separated. To solve this problem, we propose a hybrid fuzzy-crisp clustering algorithm based on a target function combining linear and quadratic terms of the membership function. In this algorithm, the membership of a data point to a cluster is automatically set to exactly zero if the data point is "sufficiently" far from the cluster center. In this paper, we present a new algorithm for hybrid fuzzy-crisp clustering along with its geometric interpretation. The algorithm is tested on twenty simulated datasets and five real-world datasets from the UCI repository, and compared with conventional fuzzy and crisp clustering methods. The proposed algorithm is demonstrated to outperform the conventional methods on imbalanced datasets and can be competitive on more balanced datasets.
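The qualitative behaviour described above (memberships that vanish exactly beyond a distance threshold, unlike strictly positive fuzzy c-means memberships) can be illustrated in one dimension. The clipping rule below is an illustrative assumption, not the update rule derived from the paper's linear-plus-quadratic objective:

```python
def hybrid_memberships(point, centers, cutoff):
    """Fuzzy-like memberships that become exactly zero for clusters
    whose centre is farther than `cutoff` from the point
    (illustrative only; not the paper's derived update rule)."""
    raw = [max(0.0, 1.0 - abs(point - c) / cutoff) for c in centers]
    total = sum(raw)
    if total == 0.0:
        return [0.0] * len(centers)   # point is far from every cluster
    return [u / total for u in raw]

centers = [0.0, 10.0]
u_near = hybrid_memberships(0.5, centers, cutoff=3.0)  # near cluster 0
u_far = hybrid_memberships(9.0, centers, cutoff=3.0)   # near cluster 1
```

Because a distant cluster receives membership exactly zero, a very large cluster can no longer exert a pull on points that clearly belong elsewhere, which is the imbalance problem the paper targets.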
Internet of Things: Digital Footprints Carry A Device Identity
Chowdhury, Rajarshi Roy, Idris, Azam Che, Abas, Pg Emeroylariffion
The usage of technologically advanced devices has seen a boom in many domains, including education, automation, and healthcare, with most of the services requiring Internet connectivity. To secure a network, device identification plays a key role. In this paper, a device fingerprinting (DFP) model, which is able to distinguish between Internet of Things (IoT) and non-IoT devices, as well as uniquely identify individual devices, has been proposed. Four statistical features have been extracted from five consecutive device-originated packets to generate individual device fingerprints. The method has been evaluated using the Random Forest (RF) classifier and different datasets. Experimental results have shown that the proposed method achieves up to 99.8% accuracy in distinguishing between IoT and non-IoT devices and over 97.6% in classifying individual devices. These signify that the proposed method is useful in assisting operators in making their networks more secure and robust to security breaches and unauthorised access.
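Deriving a small statistical fingerprint from a short burst of packets, as described above, might look like the sketch below. The specific four statistics chosen here are an assumption, since the abstract does not name them:

```python
import statistics

def packet_fingerprint(sizes):
    """Four statistical features from five consecutive
    device-originated packet sizes (the choice of min/max/mean/stdev
    is illustrative; the paper does not list its features here)."""
    assert len(sizes) == 5
    return [
        float(min(sizes)),
        float(max(sizes)),
        statistics.mean(sizes),
        statistics.pstdev(sizes),
    ]

# Sizes (bytes) of five packets from one device.
fp = packet_fingerprint([60, 60, 1514, 60, 66])
```

Each such four-element vector, labelled with its device, would then be fed to the Random Forest classifier for training.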
Device identification using optimized digital footprints
Chowdhury, Rajarshi Roy, Idris, Azam Che, Abas, Pg Emeroylariffion
The rapidly increasing number of internet of things (IoT) and non-IoT devices has imposed new security challenges on network administrators. Accurate device identification in the increasingly complex network structures is necessary. In this paper, a device fingerprinting (DFP) method has been proposed for device identification, based on digital footprints, which devices use for communication over a network. A subset of nine features has been selected from the network and transport layers of a single transmission control protocol/internet protocol packet, based on attribute evaluators in Weka, to generate device-specific signatures. The method has been evaluated on two online datasets and an experimental dataset, using different supervised machine learning (ML) algorithms. Results have shown that the method is able to distinguish device type with up to 100% precision using the random forest (RF) classifier, and classify individual devices with up to 95.7% precision. These results demonstrate the applicability of the proposed DFP method for device identification, in order to provide a more secure and robust network.
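Selecting a compact feature subset before classification can be sketched as below. A simple variance ranking stands in for Weka's attribute evaluators; the feature names and sample values are illustrative:

```python
def select_features(samples, names, keep=2):
    """Rank features by their variance across samples and keep the top
    `keep` -- a simple stand-in for Weka's attribute evaluators."""
    n = len(samples)
    scores = []
    for j, name in enumerate(names):
        col = [s[j] for s in samples]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        scores.append((var, name))
    scores.sort(reverse=True)
    return [name for _, name in scores[:keep]]

# Toy header features from three packets: (TTL, TCP window, IP length).
names = ["ttl", "win", "len"]
samples = [(64, 29200, 60), (64, 5840, 60), (255, 29200, 1514)]
picked = select_features(samples, names, keep=2)
```

Shrinking 22 candidate features down to nine in this spirit reduces both fingerprint size and classifier training cost while keeping the most discriminative signals.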
Neural Machine Translation model for University Email Application
Aneja, Sandhya, Mazid, Siti Nur Afikah Bte Abdul, Aneja, Nagender
Machine translation has many applications, such as news translation, email translation, and official letter translation. Commercial translators, e.g. Google Translate, lag in regional vocabulary and are unable to learn the bilingual text in the source and target languages within the input. In this paper, a regional vocabulary-based, application-oriented Neural Machine Translation (NMT) model is proposed over a dataset of emails used at the University for communication over a period of three years. A state-of-the-art sequence-to-sequence neural network for ML -> EN and EN -> ML translations, using a Gated Recurrent Unit recurrent neural network machine translation model with an attention decoder, is compared with Google Translate. The low BLEU score of Google Translate in comparison to our model indicates that application-based regional models are better. The low BLEU scores of both our model and Google Translate on EN -> ML indicate that the Malay language has complex language features compared to English.
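The gating mechanism at the heart of such a GRU-based encoder can be shown in a minimal scalar form. The weights below are random, the dimensions are reduced to one, and the attention decoder is omitted; this is a sketch of the GRU equations, not the trained translation model:

```python
import math
import random

def gru_cell(x, h, W):
    """One GRU step: update gate z, reset gate r, candidate state."""
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    z = sig(W["z_x"] * x + W["z_h"] * h)                      # update gate
    r = sig(W["r_x"] * x + W["r_h"] * h)                      # reset gate
    h_tilde = math.tanh(W["c_x"] * x + W["c_h"] * (r * h))    # candidate
    return (1.0 - z) * h + z * h_tilde                        # new hidden state

random.seed(0)
W = {k: random.uniform(-1, 1) for k in ["z_x", "z_h", "r_x", "r_h", "c_x", "c_h"]}

# Encode a toy sequence of scalar "token embeddings".
h = 0.0
for x in [0.3, -0.7, 0.5]:
    h = gru_cell(x, h, W)
```

In the full model, vectors replace these scalars, the encoder's hidden states feed an attention decoder, and BLEU is computed on the decoded EN/ML output against reference translations.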