AITopics | Ali, Hassan

Collaborating Authors

Ali, Hassan

Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks

Gajdošech, Lukáš, Ali, Hassan, Habekost, Jan-Gerrit, Madaras, Martin, Kerzel, Matthias, Wermter, Stefan

arXiv.org Artificial Intelligence, Mar-6-2025

Datasets for object detection often do not account for enough variety of glasses, due to their transparent and reflective properties. Specifically, open-vocabulary object detectors, widely used in embodied robotic agents, fail to distinguish subclasses of glasses. This scientific gap poses an issue to robotic applications that suffer from accumulating errors between detection, planning, and action execution. The paper introduces a novel method for the acquisition of real-world data from RGB-D sensors that minimizes human effort. We propose an auto-labeling pipeline that generates labels for all the acquired frames based on the depth measurements. We provide a novel real-world glass object dataset that was collected on the Neuro-Inspired COLlaborator (NICOL), a humanoid robot platform. The data set consists of 7850 images recorded from five different cameras. We show that our trained baseline model outperforms state-of-the-art open-vocabulary approaches. In addition, we deploy our baseline model in an embodied agent approach to the NICOL platform, on which it achieves a success rate of 81% in a human-robot bartending scenario.

large language model, machine learning, natural language, (22 more...)

2503.04308

Country:

Europe > Slovakia (0.14)
Europe > Germany (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.91)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

Adversarially Guided Stateful Defense Against Backdoor Attacks in Federated Deep Learning

Ali, Hassan, Nepal, Surya, Kanhere, Salil S., Jha, Sanjay

arXiv.org Artificial Intelligence, Oct-14-2024

Recent works have shown that Federated Learning (FL) is vulnerable to backdoor attacks. Existing defenses cluster submitted updates from clients and select the best cluster for aggregation. However, they often rely on unrealistic assumptions regarding client submissions and the sampled client population while choosing the best cluster. We show that in realistic FL settings, state-of-the-art (SOTA) defenses struggle to perform well against backdoor attacks in FL. To address this, we highlight that backdoored submissions are adversarially biased and overconfident compared to clean submissions. We, therefore, propose an Adversarially Guided Stateful Defense (AGSD) against backdoor attacks on Deep Neural Networks (DNNs) in FL scenarios. AGSD employs adversarial perturbations to a small held-out dataset to compute a novel metric, called the trust index, that guides the cluster selection without relying on any unrealistic assumptions regarding client submissions. Moreover, AGSD maintains a trust state history of each client that adaptively penalizes backdoored clients and rewards clean clients. In realistic FL settings, where SOTA defenses mostly fail to resist attacks, AGSD mostly outperforms all SOTA defenses with minimal drop in clean accuracy (5% in the worst-case compared to best accuracy) even when (a) given a very small held-out dataset -- typically AGSD assumes 50 samples (<= 0.1% of the training data) and (b) no held-out dataset is available, and out-of-distribution data is used instead. For reproducibility, our code will be openly available at: https://github.com/hassanalikhatim/AGSD.

agsd, artificial intelligence, machine learning, (16 more...)

2410.11205

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation

Ali, Hassan, Allgeuer, Philipp, Mazzola, Carlo, Belgiovine, Giulia, Kaplan, Burak Can, Wermter, Stefan

arXiv.org Artificial Intelligence, Jul-18-2024

Large Language Models (LLMs) have been recently used in robot applications for grounding LLM commonsense reasoning with the robot's perception and physical abilities. In humanoid robots, memory also plays a critical role in fostering real-world embodiment and facilitating long-term interactive capabilities, especially in multi-task setups where the robot must remember previous task states, environment states, and executed actions. In this paper, we address incorporating memory processes with LLMs for generating cross-task robot actions, while the robot effectively switches between tasks. Our proposed dual-layered architecture features two LLMs, utilizing their complementary skills of reasoning and following instructions, combined with a memory model inspired by human cognition. Our results show a significant improvement in performance over a baseline of five robotic tasks, demonstrating the potential of integrating memory with LLMs for combining the robot's action and perception for adaptive task execution.

I. INTRODUCTION. Despite the physical limitations due to their embodiment, humanoid robots are particularly effective tools because of their anthropomorphic shape, which can significantly improve [...] environments designed for human interaction [1]. Moreover, the humanoid physical shape supports collaborating with humans whose legibility and predictability of robot actions are [...] Nevertheless, LLM reasoning alone is not yet sufficient for implementing the cognitive system of embodied artificial agents, capable of solving complex tasks and interacting with humans.

large language model, machine learning, natural language, (19 more...)

2407.13505

Country:

Europe > Italy (0.14)
Europe > Germany (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration

Allgeuer, Philipp, Ali, Hassan, Wermter, Stefan

arXiv.org Artificial Intelligence, Jun-29-2024

We investigate the use of Large Language Models (LLMs) to equip neural robotic agents with human-like social and cognitive competencies, for the purpose of open-ended human-robot conversation and collaboration. We introduce a modular and extensible methodology for grounding an LLM with the sensory perceptions and capabilities of a physical robot, and integrate multiple deep learning models throughout the architecture in a form of system integration. The integrated models encompass various functions such as speech recognition, speech generation, open-vocabulary object detection, human pose estimation, and gesture detection, with the LLM serving as the central text-based coordinating unit. The qualitative and quantitative results demonstrate the huge potential of LLMs in providing emergent cognition and interactive language-oriented control of robots in a natural and social manner.

large language model, machine learning, natural language, (21 more...)

2407.00518

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task

Ali, Hassan, Allgeuer, Philipp, Wermter, Stefan

arXiv.org Artificial Intelligence, Apr-12-2024

Intention-based Human-Robot Interaction (HRI) systems allow robots to perceive and interpret user actions to proactively interact with humans and adapt to their behavior. Therefore, intention prediction is pivotal in creating a natural interactive collaboration between humans and robots. In this paper, we examine the use of Large Language Models (LLMs) for inferring human intention during a collaborative object categorization task with a physical robot. We introduce a hierarchical approach for interpreting user non-verbal cues, like hand gestures, body poses, and facial expressions and combining them with environment states and user verbal cues captured using an existing Automatic Speech Recognition (ASR) system. Our evaluation demonstrates the potential of LLMs to interpret non-verbal cues and to combine them with their context-understanding capabilities and real-world knowledge to support intention prediction during human-robot interaction.

large language model, machine learning, natural language, (19 more...)

2404.08424

Country: Europe (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Snapture -- A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition

Ali, Hassan, Jirak, Doreen, Wermter, Stefan

arXiv.org Artificial Intelligence, Feb-27-2024

As robots are expected to get more involved in people's everyday lives, frameworks that enable intuitive user interfaces are in demand. Hand gesture recognition systems provide a natural way of communication and, thus, are an integral part of seamless Human-Robot Interaction (HRI). Recent years have witnessed an immense evolution of computational models powered by deep learning. However, state-of-the-art models fall short in expanding across different gesture domains, such as emblems and co-speech. In this paper, we propose a novel hybrid hand gesture recognition system. Our architecture enables learning both static and dynamic gestures: by capturing a so-called "snapshot" of the gesture performance at its peak, we integrate the hand pose along with the dynamic movement. Moreover, we present a method for analyzing the motion profile of a gesture to uncover its dynamic characteristics and which allows regulating a static channel based on the amount of motion. Our evaluation demonstrates the superiority of our approach on two gesture benchmarks compared to a CNNLSTM baseline. We also provide an analysis on a gesture class basis that unveils the potential of our Snapture architecture for performance improvements. Thanks to its modular implementation, our framework allows the integration of other multimodal data like facial expressions and head tracking, which are important cues in HRI scenarios, into one architecture. Thus, our work contributes both to gesture recognition research and machine learning applications for non-verbal communication with robots.

artificial intelligence, machine learning, pattern recognition, (18 more...)

doi: 10.1007/s12559-023-10174-z

2205.15862

Country:

Europe (0.92)
North America > United States > Wisconsin (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adversarial Machine Learning for Social Good: Reframing the Adversary as an Ally

Al-Maliki, Shawqi, Qayyum, Adnan, Ali, Hassan, Abdallah, Mohamed, Qadir, Junaid, Hoang, Dinh Thai, Niyato, Dusit, Al-Fuqaha, Ala

arXiv.org Artificial Intelligence, Oct-5-2023

Deep Neural Networks (DNNs) have been the driving force behind many of the recent advances in machine learning. However, research has shown that DNNs are vulnerable to adversarial examples -- input samples that have been perturbed to force DNN-based models to make errors. As a result, Adversarial Machine Learning (AdvML) has gained a lot of attention, and researchers have investigated these vulnerabilities in various settings and modalities. In addition, DNNs have also been found to incorporate embedded bias and often produce unexplainable predictions, which can result in anti-social AI applications. The emergence of new AI technologies that leverage Large Language Models (LLMs), such as ChatGPT and GPT-4, increases the risk of producing anti-social applications at scale. AdvML for Social Good (AdvML4G) is an emerging field that repurposes the AdvML bug to invent pro-social applications. Regulators, practitioners, and researchers should collaborate to encourage the development of pro-social applications and hinder the development of anti-social ones. In this work, we provide the first comprehensive review of the emerging field of AdvML4G. This paper encompasses a taxonomy that highlights the emergence of AdvML4G, a discussion of the differences and similarities between AdvML4G and AdvML, a taxonomy covering social good-related concepts and aspects, an exploration of the motivations behind the emergence of AdvML4G at the intersection of ML4G and AdvML, and an extensive summary of the works that utilize AdvML4G as an auxiliary tool for innovating pro-social applications. Finally, we elaborate upon various challenges and open research issues that require significant attention from the research community.

artificial intelligence, large language model, natural language, (6 more...)

2310.03614

Genre: Research Report (0.40)

Industry: Social Sector (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Membership Inference Attacks on DNNs using Adversarial Perturbations

Ali, Hassan, Qayyum, Adnan, Al-Fuqaha, Ala, Qadir, Junaid

arXiv.org Artificial Intelligence, Jul-11-2023

Several membership inference (MI) attacks have been proposed to audit a target DNN. Given a set of subjects, MI attacks tell which subjects the target DNN has seen during training. This work focuses on the post-training MI attacks emphasizing high confidence membership detection -- True Positive Rates (TPR) at low False Positive Rates (FPR). Current works in this category -- likelihood ratio attack (LiRA) and enhanced MI attack (EMIA) -- only perform well on complex datasets (e.g., CIFAR-10 and ImageNet) where the target DNN overfits its train set, but perform poorly on simpler datasets (0% TPR by both attacks on Fashion-MNIST, 2% and 0% TPR respectively by LiRA and EMIA on MNIST at 1% FPR). To address this, firstly, we unify current MI attacks by presenting a framework divided into three stages -- preparation, indication and decision. Secondly, we utilize the framework to propose two novel attacks: (1) Adversarial Membership Inference Attack (AMIA) efficiently utilizes the membership and the non-membership information of the subjects while adversarially minimizing a novel loss function, achieving 6% TPR on both Fashion-MNIST and MNIST datasets; and (2) Enhanced AMIA (E-AMIA) combines EMIA and AMIA to achieve 8% and 4% TPRs on Fashion-MNIST and MNIST datasets respectively, at 1% FPR. Thirdly, we introduce two novel augmented indicators that positively leverage the loss information in the Gaussian neighborhood of a subject. This improves TPR of all four attacks on average by 2.5% and 0.25% respectively on Fashion-MNIST and MNIST datasets at 1% FPR. Finally, we propose a simple yet novel evaluation metric, the running TPR average (RTA) at a given FPR, that better distinguishes different MI attacks in the low FPR region. We also show that AMIA and E-AMIA are more transferable to the unknown DNNs (other than the target DNN) and are more robust to DP-SGD training as compared to LiRA and EMIA.
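
The headline metric in this abstract, TPR at a fixed low FPR, can be illustrated with a minimal sketch. This is not the paper's code: the function name `tpr_at_fpr`, the convention that a higher score means "more likely a member", and the toy scores are all assumptions made for illustration.

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01):
    """Return the true positive rate at a threshold whose false positive
    rate does not exceed target_fpr.

    Assumption: a higher score means the attack believes the subject is
    more likely to be a training member.
    """
    negatives = sorted(nonmember_scores, reverse=True)
    # How many non-members may be wrongly flagged as members.
    # The small tolerance guards against float rounding in the product.
    allowed_fp = int(target_fpr * len(negatives) + 1e-9)
    if allowed_fp >= len(negatives):
        threshold = float("-inf")  # every subject may be flagged
    else:
        # Flag only scores strictly above the (allowed_fp + 1)-th largest
        # non-member score, so at most allowed_fp non-members are flagged.
        threshold = negatives[allowed_fp]
    true_positives = sum(1 for s in member_scores if s > threshold)
    return true_positives / len(member_scores)


# Toy example with hypothetical attack scores.
members = [0.95, 0.5, 0.25, 0.05]
nonmembers = [0.1, 0.2, 0.3, 0.9]
print(tpr_at_fpr(members, nonmembers, target_fpr=0.25))
```

Choosing the threshold from the non-member scores alone keeps the false positive budget fixed, which is why two attacks with similar overall accuracy can still differ sharply on this metric.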

artificial intelligence, machine learning, mi attack, (15 more...)

2307.05193

Country:

Asia > Middle East > Qatar (0.14)
Asia > Pakistan (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Robust Surgical Tools Detection in Endoscopic Videos with Noisy Data

Qayyum, Adnan, Ali, Hassan, Caputo, Massimo, Vohra, Hunaid, Akinosho, Taofeek, Abioye, Sofiat, Berrou, Ilhem, Capik, Paweł, Qadir, Junaid, Bilal, Muhammad

arXiv.org Artificial Intelligence, Jul-3-2023

Over the past few years, surgical data science has attracted substantial interest from the machine learning (ML) community. Various studies have demonstrated the efficacy of emerging ML techniques in analysing surgical data, particularly recordings of procedures, for digitizing clinical and non-clinical functions like preoperative planning, context-aware decision-making, and operating skill assessment. However, this field is still in its infancy and lacks representative, well-annotated datasets for training robust models in intermediate ML tasks. Also, existing datasets suffer from inaccurate labels, hindering the development of reliable models. In this paper, we propose a systematic methodology for developing robust models for surgical tool detection using noisy data. Our methodology introduces two key innovations: (1) an intelligent active learning strategy for minimal dataset identification and label correction by human experts; and (2) an assembling strategy for a student-teacher model-based self-training framework to achieve the robust classification of 14 surgical tools in a semi-supervised fashion. Furthermore, we employ weighted data loaders to handle difficult class labels and address class imbalance issues. The proposed methodology achieves an average F1-score of 85.88% for the ensemble model-based self-training with class weights, and 80.88% without class weights for noisy labels. Also, our proposed method significantly outperforms existing approaches, which demonstrates its effectiveness.

artificial intelligence, deep learning, machine learning, (15 more...)

2307.01232

Country:

Europe > United Kingdom > England (0.14)
Asia > Middle East > Qatar (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Consistent Valid Physically-Realizable Adversarial Attack against Crowd-flow Prediction Models

Ali, Hassan, Butt, Muhammad Atif, Filali, Fethi, Al-Fuqaha, Ala, Qadir, Junaid

arXiv.org Artificial Intelligence, Mar-5-2023

Recent works have shown that deep learning (DL) models can effectively learn city-wide crowd-flow patterns, which can be used for more effective urban planning and smart city management. However, DL models have been known to perform poorly on inconspicuous adversarial perturbations. Although many works have studied these adversarial perturbations in general, the adversarial vulnerabilities of deep crowd-flow prediction models in particular have remained largely unexplored. In this paper, we perform a rigorous analysis of the adversarial vulnerabilities of DL-based crowd-flow prediction models under multiple threat settings, making three-fold contributions. (1) We propose CaV-detect by formally identifying two novel properties - Consistency and Validity - of the crowd-flow prediction inputs that enable the detection of standard adversarial inputs with 0% false acceptance rate (FAR). (2) We leverage universal adversarial perturbations and an adaptive adversarial loss to present adaptive adversarial attacks to evade CaV-detect defense. (3) We propose CVPR, a Consistent, Valid and Physically-Realizable adversarial attack, that explicitly inducts the consistency and validity priors in the perturbation generation mechanism. We find that although the crowd-flow models are vulnerable to adversarial perturbations, it is extremely challenging to simulate these perturbations in physical settings, notably when CaV-detect is in place. We also show that the CVPR attack considerably outperforms the adaptively modified standard attacks in FAR and adversarial loss metrics. We conclude with useful insights emerging from our work and highlight promising future research directions.

artificial intelligence, machine learning, perturbation, (17 more...)

2303.02669

Country:

Asia (1.00)
North America > Canada (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)
