Exploring Bias in over 100 Text-to-Image Generative Models
Vice, Jordan, Akhtar, Naveed, Hartley, Richard, Mian, Ajmal
We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. While these platforms democratize AI, they also facilitate the spread of inherently biased models, often shaped by task-specific fine-tuning. Ensuring ethical and transparent AI deployment requires robust evaluation frameworks and quantifiable bias metrics. To this end, we assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Analyzing over 100 models, we reveal how bias patterns evolve over time and across generative tasks. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased. By identifying these systemic trends, we contribute a large-scale evaluation corpus to inform bias research and mitigation strategies, fostering more responsible AI development.
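The abstract does not give closed forms for the three bias dimensions, so the following is a minimal, hypothetical sketch of how distribution bias, generative hallucination, and generative miss-rate could be quantified over a batch of generated images. The helpers `detect_objects` (an object detector returning a set of labels) and the attribute-count input are assumptions, not the authors' implementation.

```python
import numpy as np

def distribution_bias(attribute_counts):
    # KL divergence of the observed attribute distribution from uniform;
    # 0 means perfectly balanced output. Illustrative metric only.
    p = np.asarray(attribute_counts, dtype=float)
    p = p / p.sum()
    u = np.full_like(p, 1.0 / len(p))
    return float(np.sum(p * np.log((p + 1e-12) / u)))

def hallucination_and_miss_rate(images, prompt_objects, detect_objects):
    # detect_objects(image) -> set of object labels (placeholder detector).
    # Hallucination: image contains objects not requested in the prompt.
    # Miss: image lacks at least one requested object.
    halluc = miss = 0
    for im in images:
        found = detect_objects(im)
        halluc += bool(found - prompt_objects)
        miss += bool(prompt_objects - found)
    n = max(len(images), 1)
    return halluc / n, miss / n
```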
Novel Object 6D Pose Estimation with a Single Reference View
Liu, Jian, Sun, Wei, Zeng, Kai, Zheng, Jin, Yang, Hui, Wang, Lin, Rahmani, Hossein, Mian, Ajmal
Existing novel object 6D pose estimation methods typically rely on CAD models or dense reference views, which are both difficult to acquire. Using only a single reference view is more scalable, but challenging due to large pose discrepancies and limited geometric and spatial information. To address these issues, we propose a Single-Reference-based novel object 6D (SinRef-6D) pose estimation method. Our key idea is to iteratively establish point-wise alignment in the camera coordinate system based on state space models (SSMs). Specifically, iterative camera-space point-wise alignment can effectively handle large pose discrepancies, while our proposed RGB and Points SSMs can capture long-range dependencies and spatial information from a single view, offering linear complexity and superior spatial modeling capability. Once pre-trained on synthetic data, SinRef-6D can estimate the 6D pose of a novel object using only a single reference view, without requiring retraining or a CAD model. Extensive experiments on six popular datasets and real-world robotic scenes demonstrate that we achieve on-par performance with CAD-based and dense reference view-based methods, despite operating in the more challenging single reference setting. Code will be released at https://github.com/CNJianLiu/SinRef-6D.
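The SSM-based iterative alignment itself is not specified in the abstract. As a loose, generic illustration of camera-space point-wise alignment, the sketch below recovers a rigid transform from corresponding point sets with the classic Kabsch/Umeyama closed form; it is not a reproduction of SinRef-6D.

```python
import numpy as np

def rigid_align(src, dst):
    # Kabsch/Umeyama closed-form rigid transform (R, t) mapping src -> dst.
    # src, dst: (N, 3) corresponding points. Generic illustration only;
    # SinRef-6D's SSM-based iterative alignment is not reproduced here.
    src_c, dst_c = src.mean(0), dst.mean(0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```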
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation
Liu, Jian, Sun, Wei, Yang, Hui, Deng, Pengchao, Liu, Chongpei, Sebe, Nicu, Rahmani, Hossein, Mian, Ajmal
Nine-degrees-of-freedom (9-DoF) object pose and size estimation is crucial for enabling augmented reality and robotic manipulation. Category-level methods have received extensive research attention due to their potential for generalization to intra-class unknown objects. However, these methods require manual collection and labeling of large-scale real-world training data. To address this problem, we introduce a diffusion-based paradigm for domain-generalized category-level 9-DoF object pose estimation. Our motivation is to leverage the latent generalization ability of the diffusion model to address the domain generalization challenge in object pose estimation. This entails training the model exclusively on rendered synthetic data to achieve generalization to real-world scenes. We propose an effective diffusion model to redefine 9-DoF object pose estimation from a generative perspective. Our model does not require any 3D shape priors during training or inference. By employing the Denoising Diffusion Implicit Model, we demonstrate that the reverse diffusion process can be executed in as few as 3 steps, achieving near real-time performance. Finally, we design a robotic grasping system comprising both hardware and software components. Through comprehensive experiments on two benchmark datasets and the real-world robotic system, we show that our method achieves state-of-the-art domain generalization performance. Our code will be made public at https://github.com/CNJianLiu/Diff9D.
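The abstract notes that using DDIM lets the reverse diffusion run in as few as 3 steps. A generic, deterministic DDIM (eta = 0) sampling loop is sketched below; `eps_model`, `alpha_bar`, and the step indices are placeholders, and the sketch does not reflect Diff9D's actual network or pose parameterization.

```python
import torch

@torch.no_grad()
def ddim_sample(eps_model, x_T, alpha_bar, steps=(999, 666, 333)):
    # Deterministic DDIM (eta = 0) with only a few reverse steps.
    # eps_model(x, t) predicts the noise; alpha_bar is a 1-D tensor holding
    # the cumulative noise schedule. Placeholder names, illustrative only.
    x = x_T
    timesteps = list(steps) + [0]
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        a_t, a_prev = alpha_bar[t], alpha_bar[t_prev]
        eps = eps_model(x, t)
        x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        x = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * eps
    return x
```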
Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction
Vice, Jordan, Akhtar, Naveed, Hartley, Richard, Mian, Ajmal
Training multimodal generative models on large, uncurated datasets can expose users to harmful, unsafe, controversial, or culturally inappropriate outputs. While model editing has been proposed to remove or filter undesirable concepts in embedding and latent spaces, it can inadvertently damage learned manifolds, distorting concepts in close semantic proximity. We identify limitations in current model editing techniques, showing that even benign, proximal concepts may become misaligned. To address the need for safe content generation, we propose a modular, dynamic solution that leverages safety-context embeddings and a dual reconstruction process using tunable weighted summation in the latent space to generate safer images. Our method preserves global context without compromising the structural integrity of the learned manifolds. We achieve state-of-the-art results on safe image generation benchmarks, while offering controllable variation of model safety. We identify trade-offs between safety and censorship, which provide a necessary perspective in the development of ethical AI models. We will release our code. Keywords: Text-to-Image Models, Generative AI, Safety, Reliability, Model Editing
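The core blending operation described in the abstract is a tunable weighted summation of two latent reconstructions (context-preserving and safety-guided). A minimal sketch of that idea is below; the two latent tensors are assumed to come from hypothetical encoders, and this is not the authors' full pipeline.

```python
import torch

def blend_latents(z_context, z_safety, w=0.7):
    # Tunable weighted summation in latent space: w controls how strongly
    # the safety-guided reconstruction overrides the original context.
    # z_context / z_safety are assumed latent tensors of the same shape.
    return w * z_safety + (1.0 - w) * z_context

# Usage with toy latents; a smaller w preserves more of the original context.
z_safe = blend_latents(torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64), w=0.5)
```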
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models
Vice, Jordan, Akhtar, Naveed, Hartley, Richard, Mian, Ajmal
The widespread availability of multimodal generative models has sparked critical discussions on their fairness, reliability, and potential for misuse. While text-to-image models can produce high-fidelity, user-guided images, they also exhibit unpredictable behavior and vulnerabilities, which can be exploited to manipulate class or concept representations. To address this, we propose an evaluation framework designed to assess model reliability through their responses to globally- and locally-applied 'semantic' perturbations in the embedding space, pinpointing inputs that trigger unreliable behavior. Our approach offers deeper insights into two essential aspects: (i) generative diversity, evaluating the breadth of visual representations for learned concepts, and (ii) generative fairness, examining how removing concepts from input prompts affects semantic guidance. Beyond these evaluations, our method lays the groundwork for detecting unreliable, bias-injected models and retrieving bias provenance. We will release our code. Keywords: Fairness, Reliability, AI Ethics, Bias, Text-to-Image Models
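One way to read the evaluation is that a prompt embedding is perturbed either globally (all tokens) or locally (selected tokens), and the spread of the resulting generations is then measured. The sketch below shows only the perturbation step, with an assumed token-embedding tensor; it is an illustration of the idea, not the authors' framework.

```python
import torch

def perturb_embedding(prompt_emb, sigma=0.05, token_idx=None):
    # Apply a Gaussian 'semantic' perturbation to a prompt embedding of shape
    # (num_tokens, dim). token_idx=None perturbs all tokens (global); a list
    # of indices perturbs only those tokens (local). Illustrative sketch only.
    noise = sigma * torch.randn_like(prompt_emb)
    if token_idx is not None:
        mask = torch.zeros_like(prompt_emb)
        mask[token_idx] = 1.0
        noise = noise * mask
    return prompt_emb + noise
```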
Artificial intelligence techniques in inherited retinal diseases: A review
Trinh, Han, Vice, Jordan, Charng, Jason, Tajbakhsh, Zahra, Alam, Khyber, Chen, Fred K., Mian, Ajmal
Inherited retinal diseases (IRDs) are a diverse group of genetic disorders that lead to progressive vision loss and are a major cause of blindness in working-age adults. The complexity and heterogeneity of IRDs pose significant challenges in diagnosis, prognosis, and management. Recent advancements in artificial intelligence (AI) offer promising solutions to these challenges. However, the rapid development of AI techniques and their varied applications have led to fragmented knowledge in this field. This review consolidates existing studies, identifies gaps, and provides an overview of AI's potential in diagnosing and managing IRDs. It aims to structure pathways for advancing clinical applications by exploring AI techniques like machine learning and deep learning, particularly in disease detection, progression prediction, and personalized treatment planning. Special focus is placed on the effectiveness of convolutional neural networks in these areas. Additionally, the integration of explainable AI is discussed, emphasizing its importance in clinical settings to improve transparency and trust in AI-based systems. The review addresses the need to bridge existing gaps in focused studies on AI's role in IRDs, offering a structured analysis of current AI techniques and outlining future research directions. It concludes with an overview of the challenges and opportunities in deploying AI for IRDs, highlighting the need for interdisciplinary collaboration and the continuous development of robust, interpretable AI models to advance clinical applications.
Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density
Yang, Peiyu, Akhtar, Naveed, Shah, Mubarak, Mian, Ajmal
Trustworthy machine learning necessitates meticulous regulation of model reliance on non-robust features. We propose a framework to delineate and regulate such features by attributing model predictions to the input. Within our approach, robust feature attributions exhibit a certain consistency, while non-robust feature attributions are susceptible to fluctuations. This behavior allows us to identify a correlation between model reliance on non-robust features and the smoothness of the marginal density of the input samples. Hence, we uniquely regularize the gradients of the marginal density w.r.t. the input features for robustness. We also devise an efficient implementation of our regularization to address the potential numerical instability of the underlying optimization process. Moreover, we analytically reveal that, as opposed to our marginal density smoothing, the prevalent input gradient regularization smooths the conditional or joint density of the input, which can lead to limited robustness. Our experiments validate the effectiveness of the proposed method, providing clear evidence of its capability to address the feature leakage problem and mitigate spurious correlations. Extensive results further establish that our technique enables the model to exhibit robustness against perturbations in pixel values, input gradients, and density.
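Since the marginal density p(x) is not directly available, a loose illustration of the regularization idea is sketched below, assuming a score estimate `score_model(x) ≈ ∇_x log p(x)` (an assumption, not the paper's estimator) whose norm is penalized alongside the task loss.

```python
import torch

def smoothed_density_loss(model, score_model, x, y, lam=0.1):
    # Task loss plus a penalty that discourages sharp variations of the
    # (estimated) marginal input density around the training samples.
    # score_model(x) is assumed to approximate grad_x log p(x); this is a
    # loose illustration of the idea, not the paper's exact regularizer.
    task = torch.nn.functional.cross_entropy(model(x), y)
    penalty = score_model(x).flatten(1).norm(dim=1).pow(2).mean()
    return task + lam * penalty
```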
Fast-PGM: Fast Probabilistic Graphical Model Learning and Inference
Jiang, Jiantong, Wen, Zeyi, Yang, Peiyu, Mansoor, Atif, Mian, Ajmal
Probabilistic graphical models (PGMs) serve as a powerful framework for modeling complex systems with uncertainty and extracting valuable insights from data. However, users face efficiency and usability challenges when applying PGMs to their problems. This paper presents Fast-PGM, an efficient and open-source library for PGM learning and inference. Fast-PGM supports comprehensive tasks on PGMs, including structure and parameter learning as well as exact and approximate inference, and enhances the efficiency of these tasks through computational and memory optimizations and parallelization techniques.
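Fast-PGM's own API is not shown in the abstract, so rather than invent it, the snippet below gives a tiny, generic example of the kind of task such a library targets: exact inference in a two-node discrete Bayesian network by brute-force enumeration.

```python
# Exact inference P(Disease | Test = positive) in a tiny Bayesian network
# Disease -> Test, computed by enumeration. Generic PGM example only;
# it does not use or illustrate Fast-PGM's actual API.
p_disease = {True: 0.01, False: 0.99}
p_test_pos = {True: 0.95, False: 0.05}   # P(Test = positive | Disease)

joint_pos = {d: p_disease[d] * p_test_pos[d] for d in (True, False)}
evidence = sum(joint_pos.values())                      # P(Test = positive)
posterior = {d: joint_pos[d] / evidence for d in joint_pos}
print(posterior)   # posterior over Disease given a positive test
```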
Text-guided 3D Human Motion Generation with Keyframe-based Parallel Skip Transformer
Geng, Zichen, Han, Caren, Hayder, Zeeshan, Liu, Jian, Shah, Mubarak, Mian, Ajmal
Text-driven human motion generation is an emerging task in animation and humanoid robot design. Existing algorithms directly generate the full sequence, which is computationally expensive and error-prone because it pays no special attention to key poses, a process that has been the cornerstone of animation for decades. We propose KeyMotion, which generates plausible human motion sequences corresponding to input text by first generating keyframes and then in-filling between them. We use a Variational Autoencoder (VAE) with Kullback-Leibler regularization to project the keyframes into a latent space, reducing dimensionality and further accelerating the subsequent diffusion process. For the reverse diffusion, we propose a novel Parallel Skip Transformer that performs cross-modal attention between the keyframe latents and the text condition. To complete the motion sequence, we propose a text-guided Transformer designed to perform motion in-filling, ensuring both fidelity and adherence to the physical constraints of human motion. Experiments show that our method achieves state-of-the-art results on the HumanML3D dataset, outperforming others on all R-precision metrics and MultiModal Distance. KeyMotion also achieves competitive performance on the KIT dataset, achieving the best results on Top-3 R-precision, FID, and Diversity metrics.
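The Parallel Skip Transformer is not specified in the abstract; the sketch below illustrates only the cross-modal attention step it mentions, with keyframe latents attending to text-condition tokens via standard PyTorch modules. Shapes and module names are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class KeyframeTextCrossAttention(nn.Module):
    # Keyframe latents (queries) attend to text-condition tokens (keys/values).
    # A generic cross-attention block, not the paper's Parallel Skip Transformer.
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, keyframe_latents, text_tokens):
        out, _ = self.attn(keyframe_latents, text_tokens, text_tokens)
        return self.norm(keyframe_latents + out)   # residual + layer norm

# Usage with hypothetical shapes: batch=2, 8 keyframe latents, 77 text tokens.
x = torch.randn(2, 8, 256)
c = torch.randn(2, 77, 256)
print(KeyframeTextCrossAttention()(x, c).shape)   # torch.Size([2, 8, 256])
```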
Visual Attention Methods in Deep Learning: An In-Depth Survey
Hassanin, Mohammed, Anwar, Saeed, Radwan, Ibrahim, Khan, Fahad S, Mian, Ajmal
Inspired by the human cognitive system, attention is a mechanism that imitates human cognitive awareness of specific information, amplifying critical details so that models focus on the essential aspects of the data. Deep learning has employed attention to boost performance in many applications. Interestingly, the same attention design can suit different data modalities and can easily be incorporated into large networks. Furthermore, multiple complementary attention mechanisms can be incorporated into one network. Hence, attention techniques have become extremely attractive. However, the literature lacks a comprehensive survey of attention techniques to guide researchers in employing attention in their deep models. Note that transformers, besides being demanding in terms of training data and computational resources, cover only a single self-attention category out of the many available. We fill this gap and provide an in-depth survey of 50 attention techniques, categorizing them by their most prominent features. We initiate our discussion by introducing the fundamental concepts behind the success of the attention mechanism. Next, we outline the strengths and limitations of each attention category, describe their fundamental building blocks and basic formulations, and discuss their primary usage and applications, specifically in computer vision. We also discuss the challenges and general open questions related to attention mechanisms. Finally, we recommend possible future research directions for deep attention. All the information about visual attention methods in deep learning is provided at https://github.com/saeed-anwar/VisualAttention
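For readers new to the area, the computation underlying most surveyed techniques is scaled dot-product attention; a minimal reference implementation follows. This is the standard formulation, not tied to any single surveyed method.

```python
import torch

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V -- the basic building
    # block that most visual attention variants specialize or extend.
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5
    return torch.softmax(scores, dim=-1) @ V

q = k = v = torch.randn(1, 4, 64)                     # toy self-attention example
print(scaled_dot_product_attention(q, k, v).shape)    # torch.Size([1, 4, 64])
```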