AITopics | Gedeon, Tom

Gedeon, Tom

Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro

Hasan, Md Rakibul, Yao, Yue, Hossain, Md Zakir, Krishna, Aneesh, Rudas, Imre, Rahman, Shafin, Gedeon, Tom

arXiv.org Artificial Intelligence | Dec-31-2024

Large language models (LLMs) have revolutionised numerous fields, with LLM-as-a-service (LLMSaaS) having a strong generalisation ability that offers accessible solutions directly without the need for costly training. In contrast to the widely studied prompt engineering for solving tasks directly (in vivo), this paper explores the potential of LLMs in in-vitro applications. These involve using an LLM to generate labels that help the supervised training of mainstream models through (1) noisy label correction and (2) training data augmentation with LLM-generated labels. In this paper, we evaluate this approach in the emerging field of empathy computing, that is, automating the prediction of psychological questionnaire outcomes from inputs like text sequences. Specifically, crowdsourced datasets in this domain often suffer from noisy labels that misrepresent underlying empathy. By leveraging LLM-generated labels to train pre-trained language models (PLMs) like RoBERTa, we achieve statistically significant accuracy improvements over baselines and a state-of-the-art Pearson correlation coefficient of 0.648 on NewsEmp benchmarks. In addition, we discuss current challenges in empathy computing, data biases in training data and evaluation metric selection. Code and LLM-generated data are available at https://github.com/hasan-rakibul/LLMPathy (available once the paper is accepted).
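The abstract reports its headline result as a Pearson correlation coefficient between predicted and reference empathy scores. As a reminder of what that metric measures, here is a minimal plain-Python sketch; the function name `pearson_r` and the toy numbers are illustrative assumptions and are not taken from the paper or its code.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Covariance term and the two standard-deviation terms (unnormalised;
    # the 1/n factors cancel in the ratio).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical toy scores, only to show the call.
predicted = [1.0, 2.0, 3.0, 4.0]
reference = [1.5, 1.9, 3.2, 3.8]
r = pearson_r(predicted, reference)
```

The value lies in [-1, 1]; it rewards predictions that rise and fall with the reference scores, regardless of any constant offset or scale.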

large language model, machine learning, natural language, (19 more...)

2501.00691

Country:

Asia (0.68)
Oceania > Australia (0.28)
Europe > Middle East > Malta (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

When Spatial meets Temporal in Action Recognition

Chen, Huilin, Wang, Lei, Chen, Yifan, Gedeon, Tom, Koniusz, Piotr

arXiv.org Artificial Intelligence | Nov-22-2024

Video action recognition has made significant strides, but challenges remain in effectively using both spatial and temporal information. While existing methods often focus on either spatial features (e.g., object appearance) or temporal dynamics (e.g., motion), they rarely address the need for a comprehensive integration of both. Capturing the rich temporal evolution of video frames, while preserving their spatial details, is crucial for improving accuracy. In this paper, we introduce the Temporal Integration and Motion Enhancement (TIME) layer, a novel preprocessing technique designed to incorporate temporal information. The TIME layer generates new video frames by rearranging the original sequence, preserving temporal order while embedding $N^2$ temporally evolving frames into a single spatial grid of size $N \times N$. This transformation creates new frames that balance both spatial and temporal information, making them compatible with existing video models. When $N=1$, the layer captures rich spatial details, similar to existing methods. As $N$ increases ($N\geq2$), temporal information becomes more prominent, while the spatial information decreases to ensure compatibility with model inputs. We demonstrate the effectiveness of the TIME layer by integrating it into popular action recognition models, such as ResNet-50, Vision Transformer, and Video Masked Autoencoders, for both RGB and depth video data. Our experiments show that the TIME layer enhances recognition accuracy, offering valuable insights for video processing tasks.
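The rearrangement of $N^2$ frames into one $N \times N$ grid can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the name `time_grid`, the row-major placement of frames in the grid, and the strided subsampling used to shrink each frame are all hypothetical choices made so that the output keeps the size of a single input frame.

```python
def time_grid(frames, n):
    """Tile n*n consecutive frames into a single n-by-n grid frame.

    frames: list of n*n frames, each a list of rows (H x W), with H and W
            divisible by n.
    Each frame is shrunk by keeping every n-th pixel (an assumption), so the
    returned grid is again H x W.
    """
    if len(frames) != n * n:
        raise ValueError("expected exactly n*n frames")
    height, width = len(frames[0]), len(frames[0][0])
    if height % n or width % n:
        raise ValueError("frame size must be divisible by n")
    # Strided subsampling: each frame becomes (H/n) x (W/n).
    small = [[row[::n] for row in frame[::n]] for frame in frames]
    grid = []
    for grid_row in range(n):
        for r in range(height // n):
            out_row = []
            for grid_col in range(n):
                # Row-major order keeps frames in temporal order, left to
                # right and then top to bottom.
                out_row.extend(small[grid_row * n + grid_col][r])
            grid.append(out_row)
    return grid

# Four constant 4x4 frames with pixel value equal to the time step.
frames = [[[t] * 4 for _ in range(4)] for t in range(4)]
grid = time_grid(frames, 2)
```

With `n = 1` the function returns the single input frame unchanged, which matches the abstract's remark that $N=1$ keeps full spatial detail; larger `n` trades spatial resolution for more time steps per grid.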

artificial intelligence, machine learning, time layer, (18 more...)

2411.15284

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Position-Sensing Graph Neural Networks: Proactively Learning Nodes Relative Positions

Qin, Zhenyue, Zhang, Yiqun, Anwar, Saeed, Kim, Dongwoo, Liu, Yang, Ji, Pan, Gedeon, Tom

arXiv.org Artificial Intelligence | Nov-3-2024

Most existing graph neural networks (GNNs) learn node embeddings using the framework of message passing and aggregation. Such GNNs are incapable of learning relative positions between nodes within a graph. To empower GNNs with the awareness of node positions, some nodes are set as anchors. Then, using the distances from a node to the anchors, GNNs can infer relative positions between nodes. However, position-aware GNNs (P-GNNs) select anchors arbitrarily, which compromises position-awareness and feature extraction. To eliminate this compromise, we demonstrate that selecting evenly distributed and asymmetric anchors is essential. On the other hand, we show that choosing anchors that can aggregate embeddings of all the nodes within a graph is NP-complete. Therefore, devising an efficient, deterministic, optimal algorithm is practically infeasible. To ensure position-awareness and bypass NP-completeness, we propose Position-Sensing Graph Neural Networks (PSGNNs), which learn how to choose anchors in a back-propagatable fashion. Experiments verify the effectiveness of PSGNNs against state-of-the-art GNNs, substantially improving performance on various synthetic and real-world graph datasets while enjoying stable scalability. Specifically, PSGNNs on average boost AUC by more than 14% for pairwise node classification and by 18% for link prediction over the existing state-of-the-art position-aware methods. Our source code is publicly available at: https://github.com/ZhenyueQin/PSGNN.
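The anchor idea the abstract builds on, describing each node by its distances to a few anchor nodes, can be sketched as follows. This shows only the distance-to-anchor features with hand-picked anchors; it does not implement the learned, back-propagatable anchor selection that PSGNNs propose. The function names and the toy path graph are illustrative assumptions.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def anchor_features(adj, anchors):
    """Map each node to its list of hop distances to the given anchors.

    Unreachable anchors are reported as infinity.
    """
    per_anchor = [bfs_distances(adj, a) for a in anchors]
    return {node: [d.get(node, float("inf")) for d in per_anchor] for node in adj}

# Toy path graph 0 - 1 - 2 - 3 - 4 as an adjacency list.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
two_anchors = anchor_features(path, [0, 4])
one_central_anchor = anchor_features(path, [2])
```

On this toy graph, the single central anchor gives nodes 1 and 3 the same feature, so they cannot be told apart, while the two end anchors give every node a distinct feature. That is one small illustration of why the choice of anchors matters.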

artificial intelligence, data mining, machine learning, (16 more...)

doi: 10.1109/TNNLS.2024.3374464

2105.11346

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

LEGO: Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion

Ding, Dexuan, Wang, Lei, Zhu, Liyun, Gedeon, Tom, Koniusz, Piotr

arXiv.org Artificial Intelligence | Oct-3-2024

In computer vision tasks, features often come from diverse representations, domains, and modalities, such as text, images, and videos. Effectively fusing these features is essential for robust performance, especially with the availability of powerful pre-trained models like vision-language models. However, common fusion methods, such as concatenation, element-wise operations, and non-linear techniques, often fail to capture structural relationships and deep feature interactions, and suffer from inefficiency or misalignment of features across domains. In this paper, we shift from high-dimensional feature space to a lower-dimensional, interpretable graph space by constructing similarity graphs that encode feature relationships at different levels, e.g., clip, frame, patch, and token. To capture deeper interactions, we use graph power expansions and introduce a learnable graph fusion operator to combine these graph powers for more effective fusion. Our approach is relationship-centric, operates in a homogeneous space, and is mathematically principled, resembling element-wise similarity score aggregation via multilinear polynomials. We demonstrate the effectiveness of our graph-based fusion method on video anomaly detection, showing strong performance across multi-representational, multi-modal, and multi-domain feature fusion tasks.
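To make "similarity graph" and "graph power expansion" concrete, the following plain-Python sketch builds a cosine-similarity graph from a few feature vectors and combines its first few matrix powers with scalar weights. It is a simplified illustration, not the paper's operator: fixed scalar weights stand in for the learnable fusion operator, only one graph is used, and the function names are hypothetical. Feature vectors are assumed to be non-zero.

```python
import math

def cosine_similarity_graph(features):
    """Dense graph whose edge (i, j) is the cosine similarity of features i and j."""
    norms = [math.sqrt(sum(x * x for x in f)) for f in features]
    n = len(features)
    return [[sum(a * b for a, b in zip(features[i], features[j])) / (norms[i] * norms[j])
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def fuse_graph_powers(graph, weights):
    """Return sum_k weights[k] * graph^(k+1), a weighted mix of graph powers."""
    n = len(graph)
    fused = [[0.0] * n for _ in range(n)]
    power = graph
    for idx, w in enumerate(weights):
        if idx > 0:
            power = matmul(power, graph)  # next power: graph^(idx+1)
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * power[i][j]
    return fused

# Three toy feature vectors (e.g. one per frame); weights are placeholders
# for values that would be learned.
graph = cosine_similarity_graph([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
fused = fuse_graph_powers(graph, [0.6, 0.3, 0.1])
```

Higher powers of the graph accumulate similarity along multi-step paths, which is one way to read the abstract's "deeper interactions"; the fused matrix has the same size as the original graph regardless of the feature dimension.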

data mining, machine learning, natural language, (21 more...)

2410.01506

Country:

North America > United States (0.14)
Oceania > Australia (0.14)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

TrackNetV4: Enhancing Fast Sports Object Tracking with Motion Attention Maps

Raj, Arjun, Wang, Lei, Gedeon, Tom

arXiv.org Artificial Intelligence | Sep-22-2024

Accurately detecting and tracking high-speed, small objects, such as balls in sports videos, is challenging due to factors like motion blur and occlusion. Although recent deep learning frameworks like TrackNetV1, V2, and V3 have advanced tennis ball and shuttlecock tracking, they often struggle in scenarios with partial occlusion or low visibility. This is primarily because these models rely heavily on visual features without explicitly incorporating motion information, which is crucial for precise tracking and trajectory prediction. In this paper, we introduce an enhancement to the TrackNet family by fusing high-level visual features with learnable motion attention maps through a motion-aware fusion mechanism, effectively emphasizing the moving ball's location and improving tracking performance. Our approach leverages frame differencing maps, modulated by a motion prompt layer, to highlight key motion regions over time. Experimental results on the tennis ball and shuttlecock datasets show that our method enhances the tracking performance of both TrackNetV2 and V3. We refer to our lightweight, plug-and-play solution, built on top of the existing TrackNet, as TrackNetV4.

artificial intelligence, machine learning, natural language, (20 more...)

2409.14543

Country: Oceania > Australia > Western Australia > Perth (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Sports > Tennis (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Machine Learning to Detect Anxiety Disorders from Error-Related Negativity and EEG Signals

Chandrasekar, Ramya, Hasan, Md Rakibul, Ghosh, Shreya, Gedeon, Tom, Hossain, Md Zakir

arXiv.org Artificial Intelligence | Sep-16-2024

Anxiety is endemic to every person, with an occurrence rate of approximately 20% [World Health Organization, 2017]. Between 2020 and 2022, over one in six people (17.2% or 3.4 million people) aged 16 to 85 years experienced an anxiety disorder [Australian Bureau of Statistics]. Anxiety is caused by changes in the situation, nervousness and common symptoms, including sweating, trembling and excessive worrying, which affect a person's daily life. Anxiety disorders encompass a range of conditions, such as generalised anxiety disorder (GAD), panic disorder (PD), social anxiety disorder (SAD), obsessive-compulsive disorder (OCD), various phobia-related disorders, physical pain related protective behaviour [Li et al., 2020, 2021] and depression [Ghosh and Anwar, 2021]. Current clinical approaches for diagnosing these disorders often suffer from limitations in accuracy and objectivity, relying heavily on self-reports, patient histories and clinical observations. These methods can be subjective and may not capture the nuanced neural and behavioural patterns associated with anxiety, leading to potential misdiagnoses. Recent research has shown promising results in using machine learning techniques to detect anxiety through physiological analysis [Abd-Alrazaq et al., 2023], such as respiration, electrocardiogram (ECG), photoplethysmography (PPG), electrodermal response (EDA) and electroencephalography (EEG), to identify patterns associated with anxiety states [Abd-Alrazaq et al., 2023].

artificial intelligence, disorder, machine learning, (15 more...)

2410.00028

Country:

Asia (0.68)
Oceania > Australia (0.48)
Europe (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Anxiety Disorder (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Motion meets Attention: Video Motion Prompts

Chen, Qixiang, Wang, Lei, Koniusz, Piotr, Gedeon, Tom

arXiv.org Artificial Intelligence | Jul-3-2024

Videos contain rich spatio-temporal information. Traditional methods for extracting motion, used in tasks such as action recognition, often rely on visual contents rather than precise motion features. This phenomenon is referred to as 'blind motion extraction' behavior, which proves inefficient in capturing motions of interest due to a lack of motion-guided cues. Recently, attention mechanisms have enhanced many computer vision tasks by effectively highlighting salient visual areas. Inspired by this, we propose using a modified Sigmoid function with learnable slope and shift parameters as an attention mechanism to activate and modulate motion signals derived from frame differencing maps. This approach generates a sequence of attention maps that enhance the processing of motion-related video content. To ensure temporal continuity and smoothness of the attention maps, we apply pair-wise temporal attention variation regularization to remove unwanted motions (e.g., noise) while preserving important ones. We then perform the Hadamard product between each pair of attention maps and the original video frames to highlight the evolving motions of interest over time. These highlighted motions, termed video motion prompts, are subsequently used as inputs to the model instead of the original video frames. We formalize this process as a motion prompt layer and incorporate the regularization term into the loss function to learn better motion prompts. This layer serves as an adapter between the model and the video data, bridging the gap between traditional 'blind motion extraction' and the extraction of relevant motions of interest.
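The core computation described here, a sigmoid with slope and shift applied to a frame-difference map followed by an element-wise product with the frame, can be sketched in plain Python. Several details are assumptions made for illustration rather than the paper's exact formulation: the sigmoid is written as 1 / (1 + exp(-slope * (d - shift))), the difference d is taken as an absolute value, pixel values are assumed to lie in [0, 1], and `slope` and `shift` are fixed numbers here whereas the abstract describes them as learnable. The regularization term is not shown.

```python
import math

def motion_attention(prev_frame, next_frame, slope, shift):
    """Attention map in (0, 1) from the absolute difference of two frames."""
    return [[1.0 / (1.0 + math.exp(-slope * (abs(b - a) - shift)))
             for a, b in zip(row_prev, row_next)]
            for row_prev, row_next in zip(prev_frame, next_frame)]

def motion_prompt(frame, attention):
    """Hadamard (element-wise) product of a frame with its attention map."""
    return [[pixel * weight for pixel, weight in zip(frame_row, att_row)]
            for frame_row, att_row in zip(frame, attention)]

# Toy 1x2 grayscale frames: the left pixel is static, the right pixel changes.
prev_frame = [[0.0, 0.0]]
next_frame = [[0.0, 1.0]]
attention = motion_attention(prev_frame, next_frame, slope=10.0, shift=0.5)
prompt = motion_prompt(next_frame, attention)
```

A larger slope makes the map closer to a hard motion mask, and the shift sets how large a change must be before a pixel is emphasised; static regions receive weights near zero and are suppressed in the prompt.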

artificial intelligence, machine learning, weight model, (15 more...)

2407.03179

Country: Oceania > Australia > Western Australia (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What Does Softmax Probability Tell Us about Classifiers Ranking Across Diverse Test Conditions?

Tu, Weijie, Deng, Weijian, Zheng, Liang, Gedeon, Tom

arXiv.org Artificial Intelligence | Jun-14-2024

This work aims to develop a measure that can accurately rank the performance of various classifiers when they are tested on unlabeled out-of-distribution (OOD) data. We commence by demonstrating that conventional uncertainty metrics, notably the maximum Softmax prediction probability, possess inherent utility in forecasting model generalization across certain OOD contexts. Building on this insight, we introduce a new measure called Softmax Correlation (SoftmaxCorr). It calculates the cosine similarity between a class-class correlation matrix, constructed from Softmax output vectors across an unlabeled test dataset, and a predefined reference matrix that embodies ideal class correlations. A high resemblance of predictions to the reference matrix signals that the model delivers confident and uniform predictions across all categories, reflecting minimal uncertainty and confusion. Through rigorous evaluation across a suite of datasets, including ImageNet, CIFAR-10, and WILDS, we affirm the predictive validity of SoftmaxCorr in accurately forecasting model performance within both in-distribution (ID) and OOD settings. Furthermore, we discuss the limitations of our proposed measure and suggest avenues for future research.
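The measure described above can be sketched in plain Python. The abstract does not spell out the two matrices, so this sketch makes explicit assumptions: the class-class correlation matrix is taken as the average outer product of the softmax vectors, and the reference matrix is taken as the identity scaled by 1/K, which corresponds to fully confident predictions spread evenly over K classes. The function name is hypothetical.

```python
import math

def softmax_corr(probs):
    """Cosine similarity between a class-class correlation matrix and a reference.

    probs: list of softmax output vectors over K classes, one per unlabeled
           test sample.
    """
    n, k = len(probs), len(probs[0])
    # Assumed construction: C[i][j] = mean over samples of p[i] * p[j].
    corr = [[sum(p[i] * p[j] for p in probs) / n for j in range(k)] for i in range(k)]
    # Assumed reference: identity / K (confident and class-balanced predictions).
    ref = [[(1.0 / k if i == j else 0.0) for j in range(k)] for i in range(k)]
    dot = sum(corr[i][j] * ref[i][j] for i in range(k) for j in range(k))
    norm_corr = math.sqrt(sum(v * v for row in corr for v in row))
    norm_ref = math.sqrt(sum(v * v for row in ref for v in row))
    return dot / (norm_corr * norm_ref)

# Confident, balanced predictions versus maximally uncertain ones.
confident = softmax_corr([[1.0, 0.0], [0.0, 1.0]])
uncertain = softmax_corr([[0.5, 0.5], [0.5, 0.5]])
```

No labels are needed: the score depends only on the model's outputs, which is what lets it be used to rank classifiers on unlabeled test sets. Under the assumptions above, confident and balanced predictions score higher than uniform ones.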

artificial intelligence, generalization, machine learning, (14 more...)

2406.09908

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

FunnelNet: An End-to-End Deep Learning Framework to Monitor Digital Heart Murmur in Real-Time

Jobayer, Md, Shawon, Md. Mehedi Hasan, Hasan, Md Rakibul, Ghosh, Shreya, Gedeon, Tom, Hossain, Md Zakir

arXiv.org Artificial Intelligence | May-9-2024

Objective: Heart murmurs are abnormal sounds caused by turbulent blood flow within the heart. Several diagnostic methods are available to detect heart murmurs and their severity, such as cardiac auscultation, echocardiography, phonocardiogram (PCG), etc. However, these methods have limitations, including extensive training and experience among healthcare providers, cost and accessibility of echocardiography, as well as noise interference and PCG data processing. This study aims to develop a novel end-to-end real-time heart murmur detection approach using traditional and depthwise separable convolutional networks. Methods: Continuous wavelet transform (CWT) was applied to extract meaningful features from the PCG data. The proposed network has three parts: the Squeeze net, the Bottleneck, and the Expansion net. The Squeeze net generates a compressed data representation, whereas the Bottleneck layer reduces computational complexity using a depthwise-separable convolutional network. The Expansion net is responsible for up-sampling the compressed data to a higher dimension, capturing tiny details of the representative data. Results: For evaluation, we used four publicly available datasets and achieved state-of-the-art performance in all datasets. Furthermore, we tested our proposed network on two resource-constrained devices: a Raspberry PI and an Android device, stripping it down into a tiny machine learning model (TinyML), achieving a maximum of 99.70%. Conclusion: The proposed model offers a deep learning framework for real-time accurate heart murmur detection within limited resources. Significance: It will significantly result in more accessible and practical medical services and reduced diagnosis time to assist medical professionals. The code is publicly available at TBA.

artificial intelligence, deep learning, machine learning, (18 more...)

2405.0957

Country:

Europe (0.93)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Deep Learning Approach to Diabetes Diagnosis

Zhang, Zeyu, Ahmed, Khandaker Asif, Hasan, Md Rakibul, Gedeon, Tom, Hossain, Md Zakir

arXiv.org Artificial Intelligence | Mar-12-2024

Diabetes, resulting from inadequate insulin production or utilization, causes extensive harm to the body. Existing diagnostic methods are often invasive and come with drawbacks, such as cost constraints. Although there are machine learning models like Classwise k Nearest Neighbor (CkNN) and General Regression Neural Network (GRNN), they struggle with imbalanced data and result in under-performance. Leveraging advancements in sensor technology and machine learning, we propose a non-invasive diabetes diagnosis using a Back Propagation Neural Network (BPNN) with batch normalization, incorporating data re-sampling and normalization for class balancing. Our method addresses existing challenges such as limited performance associated with traditional machine learning. Experimental results on three datasets show significant improvements in overall accuracy, sensitivity, and specificity compared to traditional methods. Notably, we achieve accuracies of 89.81% in Pima diabetes dataset, 75.49% in CDC BRFSS2015 dataset, and 95.28% in Mesra Diabetes dataset. This underscores the potential of deep learning models for robust diabetes diagnosis. See project website https://steve-zeyu-zhang.github.io/DiabetesDiagnosis/

artificial intelligence, deep learning, machine learning, (14 more...)

2403.07483

Country:

North America > United States > California (0.14)
North America > United States > Minnesota (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
