AITopics | Jameel, Shoaib

Enforcing Consistency and Fairness in Multi-level Hierarchical Classification with a Mask-based Output Layer

Chen, Shijing, Jameel, Shoaib, Bouadjenek, Mohamed Reda, Tang, Feilong, Naseem, Usman, Suleiman, Basem, Hacid, Hakim, Salim, Flora D., Razzak, Imran

arXiv.org Artificial Intelligence, Mar-19-2025

Traditional Multi-level Hierarchical Classification (MLHC) classifiers often rely on backbone models with $n$ independent output layers. This structure tends to overlook the hierarchical relationships between classes, leading to inconsistent predictions that violate the underlying taxonomy. Additionally, once a backbone architecture for an MLHC classifier is selected, adapting the model to accommodate new tasks can be challenging. For example, incorporating fairness to protect sensitive attributes within a hierarchical classifier necessitates complex adjustments to maintain the class hierarchy while enforcing fairness constraints. In this paper, we extend this concept to hierarchical classification by introducing a fair, model-agnostic layer designed to enforce taxonomy and optimize specific objectives, including consistency, fairness, and exact match. Our evaluations demonstrate that the proposed layer not only improves the fairness of predictions but also enforces the taxonomy, resulting in consistent predictions and superior performance. Compared to Large Language Models (LLMs) employing in-processing de-biasing techniques and models without any bias correction, our approach achieves better outcomes in both fairness and accuracy, making it particularly valuable in sectors like e-commerce, healthcare, and education, where predictive reliability is crucial.
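The idea of masking an output layer so that predictions respect a taxonomy can be illustrated with a minimal sketch. This is not the authors' layer: the two-level taxonomy `PARENT_OF`, the scores, and the function `masked_predict` are all hypothetical, and fairness objectives are left out entirely. The sketch only shows how restricting the child-level choice to children of the predicted parent rules out inconsistent label pairs.

```python
import math

# Hypothetical two-level taxonomy: each child class belongs to exactly one parent.
PARENT_OF = {
    "laptop": "electronics",
    "phone": "electronics",
    "shirt": "clothing",
    "jeans": "clothing",
}

def softmax(scores):
    """Numerically stable softmax over a dict of label -> raw score."""
    peak = max(scores.values())
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def masked_predict(parent_scores, child_scores):
    """Pick the top parent, then pick the top child among that parent's children only."""
    parent = max(parent_scores, key=parent_scores.get)
    # The mask: children of other parents are dropped before normalisation.
    allowed = {c: s for c, s in child_scores.items() if PARENT_OF[c] == parent}
    child_probs = softmax(allowed)
    child = max(child_probs, key=child_probs.get)
    return parent, child, child_probs

# Illustrative scores: independent heads would pair "electronics" with "shirt".
parent_scores = {"electronics": 2.0, "clothing": 0.5}
child_scores = {"laptop": 0.1, "phone": 0.4, "shirt": 1.5, "jeans": 0.2}

unmasked_child = max(child_scores, key=child_scores.get)
parent, child, child_probs = masked_predict(parent_scores, child_scores)
print("independent heads:", parent, unmasked_child)
print("masked output:   ", parent, child)
```

With independent output layers the highest-scoring child here belongs to a different parent than the one predicted, which is the kind of taxonomy violation the abstract describes; the masked version cannot produce such a pair by construction.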

classifier, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2503.15566

Country:

Oceania > Australia > New South Wales (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Long Context Modeling with Ranked Memory-Augmented Retrieval

Alselwi, Ghadir, Xue, Hao, Jameel, Shoaib, Suleiman, Basem, Salim, Flora D., Razzak, Imran

arXiv.org Artificial Intelligence, Mar-18-2025

Effective long-term memory management is crucial for language models handling extended contexts. We introduce a novel framework that dynamically ranks memory entries based on relevance. Unlike previous works, our model introduces a novel relevance scoring and a pointwise re-ranking model for key-value embeddings, inspired by learning-to-rank techniques in information retrieval. Enhanced Ranked Memory Augmented Retrieval (ERMAR) achieves state-of-the-art results on standard benchmarks.
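Pointwise ranking of memory entries can be sketched as follows, under loud assumptions: cosine similarity stands in for the learned relevance scorer, the memory is a plain list of hypothetical (key vector, value) pairs, and `rank_memory` is an illustrative name rather than anything from ERMAR. "Pointwise" here means each entry is scored against the query on its own, and the scores are then sorted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_memory(query, memory, top_k=2):
    """Score every (key, value) entry independently, then return the top_k by score."""
    scored = [(cosine(query, key), value) for key, value in memory]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical memory of key embeddings and their stored values.
memory = [
    ([1.0, 0.0], "entry A"),
    ([0.0, 1.0], "entry B"),
    ([0.7, 0.7], "entry C"),
]

top = rank_memory([1.0, 0.1], memory, top_k=2)
for score, value in top:
    print(f"{value}: {score:.3f}")
```

A learned scorer would replace `cosine`, but the surrounding retrieve-score-sort-truncate loop would look much the same.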

arxiv, ermar, retrieval, (12 more...)

arXiv.org Artificial Intelligence

2503.148

Country:

Oceania > Australia > New South Wales (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification

Chen, Shijing, Bouadjenek, Mohamed Reda, Jameel, Shoaib, Naseem, Usman, Suleiman, Basem, Salim, Flora D., Hacid, Hakim, Razzak, Imran

arXiv.org Artificial Intelligence, Jan-12-2025

Multi-level Hierarchical Classification (MLHC) tackles the challenge of categorizing items within a complex, multi-layered class structure. However, traditional MLHC classifiers often rely on a backbone model with independent output layers, which tend to ignore the hierarchical relationships between classes. This oversight can lead to inconsistent predictions that violate the underlying taxonomy. Leveraging Large Language Models (LLMs), we propose a novel taxonomy-embedded transitional LLM-agnostic framework for multimodality classification. The cornerstone of this advancement is the ability of models to enforce consistency across hierarchical levels. Our evaluations on the MEP-3M dataset - a multi-modal e-commerce product dataset with various hierarchical levels - demonstrated a significant performance improvement compared to conventional LLM structures.

classification, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.06827

Country:

Oceania > Australia (0.28)
Europe > United Kingdom (0.28)
Asia (0.28)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

GAMED: Knowledge Adaptive Multi-Experts Decoupling for Multimodal Fake News Detection

Shen, Lingzhi, Long, Yunfei, Cai, Xiaohao, Razzak, Imran, Chen, Guanming, Liu, Kang, Jameel, Shoaib

arXiv.org Artificial Intelligence, Dec-11-2024

Multimodal fake news detection often involves modelling heterogeneous data sources, such as vision and language. Existing detection methods typically rely on fusion effectiveness and cross-modal consistency to model the content, which complicates understanding how each modality affects prediction accuracy. Additionally, these methods are primarily based on static feature modelling, making it difficult to adapt to the dynamic changes and relationships between different data modalities. This paper develops a novel approach, GAMED, for multimodal modelling, which focuses on generating distinctive and discriminative features through modal decoupling to enhance cross-modal synergies, thereby optimizing overall performance in the detection process. GAMED leverages multiple parallel expert networks to refine features and pre-embed semantic knowledge to improve the experts' ability in information selection and viewpoint sharing. Subsequently, the feature distribution of each modality is adaptively adjusted based on the respective experts' opinions. GAMED also introduces a novel classification technique to dynamically manage contributions from different modalities, while improving the explainability of decisions. Experimental results on the Fakeddit and Yang datasets demonstrate that GAMED performs better than recently developed state-of-the-art models. The source code can be accessed at https://github.com/slz0925/GAMED.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.12164

Country:

Asia (0.67)
Europe (0.48)
Oceania > Australia (0.28)

Genre: Research Report > Promising Solution (0.68)

Industry:

Media > News (1.00)
Information Technology (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality?

Huang, Guangming, Li, Yingya, Jameel, Shoaib, Long, Yunfei, Papanastasiou, Giorgos

arXiv.org Artificial Intelligence, May-9-2024

Deep learning (DL) has substantially enhanced natural language processing (NLP) in healthcare research. However, the increasing complexity of DL-based NLP necessitates transparent model interpretability, or at least explainability, for reliable decision-making. This work presents a thorough scoping review of explainable and interpretable DL in healthcare NLP. The term "eXplainable and Interpretable Artificial Intelligence" (XIAI) is introduced to distinguish XAI from IAI. Different models are further categorized based on their functionality (model-, input-, output-based) and scope (local, global). Our analysis shows that attention mechanisms are the most prevalent emerging IAI technique. The use of IAI is growing, distinguishing it from XAI. The major challenges identified are that most XIAI does not explore "global" modelling processes, the lack of best practices, and the lack of systematic evaluation and benchmarks. One important opportunity is to use attention mechanisms to enhance multi-modal XIAI for personalized medicine. Additionally, combining DL with causal logic holds promise. Our discussion encourages the integration of XIAI in Large Language Models (LLMs) and domain-specific smaller models. In conclusion, XIAI adoption in healthcare requires dedicated in-house expertise. Collaboration with domain experts, end-users, and policymakers can lead to ready-to-use XIAI methods across NLP and medical tasks. While challenges exist, XIAI techniques offer a valuable foundation for interpretable NLP algorithms in healthcare.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.csbj.2024.05.004

2403.11894

Country:

Europe > United Kingdom (0.14)
Asia > China (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Would You Trust an AI Doctor? Building Reliable Medical Predictions with Kernel Dropout Uncertainty

Azam, Ubaid, Razzak, Imran, Vishwakarma, Shelly, Hacid, Hakim, Zhang, Dell, Jameel, Shoaib

arXiv.org Artificial Intelligence, Apr-16-2024

The growing capabilities of AI raise questions about its trustworthiness in healthcare, particularly due to opaque decision-making and limited data availability. This paper proposes a novel approach to address these challenges, introducing a Bayesian Monte Carlo Dropout model with kernel modelling. Our model is designed to enhance reliability on small medical datasets, a crucial barrier to the wider adoption of AI in healthcare. This model leverages existing language models for improved effectiveness and seamlessly integrates with current workflows. We demonstrate significant improvements in reliability, even with limited data, offering a promising step towards building trust in AI-driven medical predictions and unlocking its potential to improve patient care.
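Monte Carlo dropout in general (not this paper's kernel-based model) can be sketched with a toy linear predictor: dropout stays switched on at prediction time, the same input is passed through many times, and the spread of the outputs serves as an uncertainty estimate. The weights, features, and function names below are hypothetical, and the kernel component described in the abstract is omitted.

```python
import random
import statistics

def stochastic_forward(features, weights, drop_rate, rng):
    """One forward pass of a linear model with dropout left active.

    Kept terms are rescaled by 1/keep (inverted dropout) so the expected
    output matches the deterministic prediction.
    """
    keep = 1.0 - drop_rate
    total = 0.0
    for x, w in zip(features, weights):
        if rng.random() < keep:
            total += (x * w) / keep
    return total

def mc_dropout_predict(features, weights, drop_rate=0.5, passes=200, seed=0):
    """Run many stochastic passes; return the sample mean and standard deviation."""
    rng = random.Random(seed)
    samples = [stochastic_forward(features, weights, drop_rate, rng)
               for _ in range(passes)]
    return statistics.mean(samples), statistics.stdev(samples)

# Toy input and weights; the deterministic prediction would be 0.5 - 0.4 + 0.3 = 0.4.
mean, spread = mc_dropout_predict([1.0, 2.0, 3.0], [0.5, -0.2, 0.1])
print(f"mean prediction: {mean:.3f}, uncertainty (std): {spread:.3f}")
```

A large spread relative to the mean signals a prediction that deserves less trust, which is the practical use of the uncertainty estimate in a clinical workflow.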

data mining, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2404.10483

Country: Oceania > Australia > New South Wales (0.14)

Genre: Research Report > Promising Solution (0.66)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BayesJudge: Bayesian Kernel Language Modelling with Confidence Uncertainty in Legal Judgment Prediction

Azam, Ubaid, Razzak, Imran, Vishwakarma, Shelly, Hacid, Hakim, Zhang, Dell, Jameel, Shoaib

arXiv.org Artificial Intelligence, Apr-16-2024

Predicting legal judgments with reliable confidence is paramount for responsible legal AI applications. While transformer-based deep neural networks (DNNs) like BERT have demonstrated promise in legal tasks, accurately assessing their prediction confidence remains crucial. We present a novel Bayesian approach called BayesJudge that harnesses the synergy between deep learning and deep Gaussian Processes to quantify uncertainty through Bayesian kernel Monte Carlo dropout. Our method leverages informative priors and flexible data modelling via kernels, surpassing existing methods in both predictive accuracy and confidence estimation, as indicated by the Brier score. Extensive evaluations on public legal datasets showcase our model's superior performance across diverse tasks. We also introduce an optimal solution to automate the scrutiny of unreliable predictions, resulting in a significant increase in the accuracy of the model's predictions by up to 27%. By empowering judges and legal professionals with more reliable information, our work paves the way for trustworthy and transparent legal AI applications that facilitate informed decisions grounded in both knowledge and quantified uncertainty.
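For readers unfamiliar with the metric, the binary Brier score is the mean squared difference between predicted probabilities and 0/1 outcomes, so lower is better. The sketch below computes it on made-up numbers and adds a simple confidence threshold for setting aside uncertain predictions; the threshold rule and the value 0.75 are illustrative assumptions, not the scrutiny procedure proposed in the paper.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def accuracy(probs, labels):
    """Fraction of cases where thresholding the probability at 0.5 matches the label."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

def confident_subset(probs, labels, threshold=0.75):
    """Keep only predictions whose confidence max(p, 1 - p) reaches the threshold."""
    kept = [(p, y) for p, y in zip(probs, labels) if max(p, 1.0 - p) >= threshold]
    return [p for p, _ in kept], [y for _, y in kept]

# Made-up predicted probabilities of the positive class and true outcomes.
probs = [0.9, 0.8, 0.3, 0.6]
labels = [1, 1, 0, 0]

score = brier_score(probs, labels)
kept_probs, kept_labels = confident_subset(probs, labels)
print(f"Brier score: {score:.3f}")
print(f"accuracy on all cases: {accuracy(probs, labels):.2f}")
print(f"accuracy on confident cases: {accuracy(kept_probs, kept_labels):.2f}")
```

The cases filtered out by the threshold are the ones that would be routed to a human for review in a deferral setup of this kind.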

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2404.10481

Country:

Asia (0.28)
North America > United States (0.28)
Oceania > Australia > New South Wales (0.14)

Genre: Research Report (1.00)

Industry: Law > Litigation (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification

Alsuhaibani, Abdullah, Zogan, Hamad, Razzak, Imran, Jameel, Shoaib, Xu, Guandong

arXiv.org Artificial Intelligence, Jan-8-2024

Language models such as Bidirectional Encoder Representations from Transformers (BERT) have been very effective in various Natural Language Processing (NLP) and text mining tasks including text classification. However, some tasks still pose challenges for these models, including text classification with limited labels. This can result in a cold-start problem. Although some approaches have attempted to address this problem through single-stage clustering as an intermediate training step coupled with a pre-trained language model, which generates pseudo-labels to improve classification, these methods are often error-prone due to the limitations of the clustering algorithms. To overcome this, we have developed a novel two-stage intermediate clustering with subsequent fine-tuning that models the pseudo-labels reliably, resulting in reduced prediction errors. The key novelty in our model, IDoFew, is that the two-stage clustering coupled with two different clustering algorithms helps exploit the advantages of the complementary algorithms that reduce the errors in generating reliable pseudo-labels for fine-tuning. Our approach has shown significant improvements compared to strong comparative models.

machine learning, natural language, text classification, (15 more...)

arXiv.org Artificial Intelligence

2401.04025

Country:

Europe (1.00)
Oceania > Australia > New South Wales (0.14)
North America > United States > Minnesota (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Topics in Contextualised Attention Embeddings

Talebpour, Mozhgan, de Herrera, Alba Garcia Seco, Jameel, Shoaib

arXiv.org Artificial Intelligence, Jan-11-2023

Contextualised word vectors obtained via pre-trained language models encode a variety of knowledge that has already been exploited in applications. Complementary to these language models are probabilistic topic models that learn thematic patterns from the text. Recent work has demonstrated that conducting clustering on the word-level contextual representations from a language model emulates word clusters that are discovered in latent topics of words from Latent Dirichlet Allocation. The important question is how such topical word clusters are automatically formed, through clustering, in the language model when it has not been explicitly designed to model latent topics. To address this question, we design different probe experiments. Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters. We strongly believe that our work paves the way for further research into the relationships between probabilistic topic models and pre-trained language models.

artificial intelligence, natural language, topic model, (18 more...)

arXiv.org Artificial Intelligence

2301.04339

Country: Europe (0.68)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

CLUE: Contextualised Unified Explainable Learning of User Engagement in Video Lectures

Roy, Sujit, Gorle, Gnaneswara Rao, Gaur, Vishal, Raza, Haider, Jameel, Shoaib

arXiv.org Artificial Intelligence, Jan-14-2022

Predicting contextualised engagement in videos is a long-standing problem that has been popularly attempted by exploiting the number of views or the associated likes using different computational methods. The recent decade has seen a boom in online learning resources, and during the pandemic, there has been an exponential rise of online teaching videos without much quality control. The quality of the content could be improved if the creators could get constructive feedback on their content. Employing an army of domain expert volunteers to provide feedback on the videos might not scale. As a result, there has been a steep rise in developing computational methods to predict a user engagement score that is indicative of some form of possible user engagement, i.e., to what level a user would tend to engage with the content. A drawback of current methods is that they model various features separately, in a cascaded approach that is prone to error propagation. Besides, most of them do not provide crucial explanations of how the creator could improve their content. In this paper, we propose a new unified model, CLUE, for the educational domain, which learns from the features extracted from freely available public online teaching videos and provides explainable feedback on the video along with a user engagement score. Given the complexity of the task, our unified framework employs different pre-trained models working together as an ensemble of classifiers. Our model exploits various multi-modal features to model the complexity of language, context-agnostic information, textual emotion of the delivered content, animation, speaker's pitch and speech emotions. Under a transfer learning setup, the overall model, in the unified space, is fine-tuned for downstream applications.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2201.05651

Country: Europe > United Kingdom (0.28)

Genre:

Research Report > New Finding (0.68)
Instructional Material > Course Syllabus & Notes (0.66)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
