Collaborating Authors

 Sitaula, Chiranjibi


Neonatal Face and Facial Landmark Detection from Video Recordings

arXiv.org Artificial Intelligence

This paper explores automated face and facial landmark detection of neonates, which is an important first step in many video-based neonatal health applications, such as vital sign estimation, pain assessment, sleep-wake classification, and jaundice detection. From three publicly available datasets of neonates in the clinical environment, 366 images (258 subjects) and 89 images (66 subjects) were annotated for training and testing, respectively. Transfer learning was applied to two YOLO-based models, with input training images augmented with random horizontal flipping, photometric colour distortion, translation and scaling during each training epoch. Additionally, the re-orientation of input images and the fusion of trained deep learning models were explored. Our proposed model based on YOLOv7Face outperformed existing methods with a mean average precision of 84.8% for face detection, and a normalised mean error of 0.072 for facial landmark detection. Overall, this will assist in the development of fully automated neonatal health assessment algorithms.
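The normalised mean error (NME) reported above is the mean Euclidean distance between predicted and ground-truth landmarks, divided by a normalising distance. A minimal sketch of the metric, with hypothetical five-landmark coordinates and a normalising distance that are not taken from the paper:

```python
import numpy as np

def normalised_mean_error(pred, gt, norm_dist):
    """Normalised mean error (NME) for facial landmark detection.

    pred, gt : (N, 2) arrays of predicted / ground-truth landmark
    coordinates; norm_dist is a normalising distance (e.g. a face
    bounding-box dimension). Lower is better.
    """
    errors = np.linalg.norm(pred - gt, axis=1)  # per-landmark Euclidean error
    return float(errors.mean() / norm_dist)

# Hypothetical 5-landmark face (eyes, nose, mouth corners), in pixels
gt = np.array([[30, 40], [70, 40], [50, 60], [35, 80], [65, 80]], dtype=float)
pred = gt + 2.0  # every landmark off by (2, 2) pixels
nme = normalised_mean_error(pred, gt, norm_dist=100.0)
```

With each landmark displaced by (2, 2) pixels and a normalising distance of 100, the NME is √8 / 100 ≈ 0.028.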


Multi-channel CNN to classify nepali covid-19 related tweets using hybrid features

arXiv.org Artificial Intelligence

The current COVID-19 pandemic, with its increasing fears among people, has triggered several health complications such as depression and anxiety. Such complications have affected not only developed countries but also developing countries such as Nepal. These complications can be understood from people's tweets/comments posted online after proper analysis and sentiment classification. Nevertheless, owing to the limited number of tokens/words in each tweet, it is always crucial to capture multiple kinds of information associated with them for better understanding. In this study, we first represent each tweet by combining both syntactic and semantic information, called hybrid features. The syntactic information is generated from the bag-of-words method, whereas the semantic information is generated from the combination of the fastText-based (ft) and domain-specific (ds) methods. Second, we design a novel multi-channel convolutional neural network (MCNN), which ensembles multiple CNNs, to capture multi-scale information for better classification. Last, we evaluate the efficacy of both the proposed feature extraction method and the MCNN model in classifying tweets into three sentiment classes (positive, neutral and negative) on the NepCOV19Tweets dataset, which is the only public COVID-19 tweets dataset in the Nepali language. The evaluation results show that the proposed hybrid features outperform individual feature extraction methods with the highest classification accuracy of 69.7%, and the MCNN model outperforms the existing methods with the highest classification accuracy of 71.3%.
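The hybrid representation concatenates syntactic counts with semantic embeddings. A minimal illustrative sketch, where the toy vocabulary and the 4-dimensional embeddings standing in for the fastText/domain-specific vectors are entirely hypothetical:

```python
import numpy as np

# Hypothetical toy vocabulary; emb maps each word to a 4-d vector
# standing in for its fastText / domain-specific embedding.
vocab = ["covid", "lockdown", "hope", "fear"]
emb = {w: np.full(4, i, dtype=float) for i, w in enumerate(vocab)}

def hybrid_features(tokens):
    """Concatenate syntactic (bag-of-words counts) and semantic
    (mean word-embedding) information into one feature vector."""
    bow = np.array([tokens.count(w) for w in vocab], dtype=float)
    vecs = [emb[t] for t in tokens if t in emb]
    sem = np.mean(vecs, axis=0) if vecs else np.zeros(4)
    return np.concatenate([bow, sem])

feat = hybrid_features(["covid", "covid", "fear"])
# syntactic half: counts [2, 0, 0, 1]; semantic half: mean embedding
```

In the paper the concatenated vector would then feed the multi-channel CNN, whose parallel branches use different kernel sizes to capture multi-scale information.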


Noisy Neonatal Chest Sound Separation for High-Quality Heart and Lung Sounds

arXiv.org Artificial Intelligence

Stethoscope-recorded chest sounds provide the opportunity for remote cardio-respiratory health monitoring of neonates. However, reliable monitoring requires high-quality heart and lung sounds. This paper presents novel Non-negative Matrix Factorisation (NMF) and Non-negative Matrix Co-Factorisation (NMCF) methods for neonatal chest sound separation. To assess these methods and compare them with existing single-source separation methods, an artificial mixture dataset was generated comprising heart, lung and noise sounds. Signal-to-noise ratios were then calculated for these artificial mixtures. The methods were also tested on real-world noisy neonatal chest sounds and assessed based on vital sign estimation error and a 1-5 signal quality score developed in our previous works. Additionally, the computational cost of all methods was assessed to determine their applicability for real-time processing. Overall, both the proposed NMF and NMCF methods outperform the next best existing method by 2.7 dB to 11.6 dB on the artificial dataset and by 0.40 to 1.12 in signal quality improvement on the real-world dataset. The median processing time for the sound separation of a 10 s recording was found to be 28.3 s for NMCF and 342 ms for NMF. Because of their stable and robust performance, we believe that our proposed methods are useful for denoising neonatal heart and lung sounds in a real-world environment. Code for the proposed and existing methods can be found at: https://github.com/egrooby-monash/Heart-and-Lung-Sound-Separation.
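At its core, NMF-based source separation factorises a non-negative magnitude spectrogram into spectral bases and time activations, from which each source's spectrogram can be reconstructed separately. A minimal sketch using scikit-learn on a synthetic rank-2 "spectrogram" (the two sources are stand-ins for heart and lung spectral patterns, not the paper's actual data or algorithm):

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy magnitude "spectrogram" mixed from two non-negative sources
rng = np.random.default_rng(0)
W_true = rng.random((64, 2))    # spectral bases (freq x source)
H_true = rng.random((2, 100))   # activations   (source x time)
V = W_true @ H_true             # mixed spectrogram (freq x time)

model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(V)      # learned spectral bases
H = model.components_           # learned activations

# Each source's spectrogram is reconstructed from its own basis/activation
source0 = np.outer(W[:, 0], H[0])
recon = W @ H
err = np.linalg.norm(V - recon) / np.linalg.norm(V)  # relative error
```

NMCF extends this idea by factorising the mixture jointly with reference recordings so that components are attributed to heart, lung or noise rather than clustered after the fact.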


Tag-based Semantic Features for Scene Image Classification

arXiv.org Artificial Intelligence

Existing image feature extraction methods are primarily based on the content and structure information of images, and rarely consider contextual semantic information. For some types of images, such as scenes and objects, the annotations and descriptions available on the web may provide reliable contextual semantic information for feature extraction. In this paper, we introduce novel semantic features of an image based on the annotations and descriptions of its similar images available on the web. Specifically, we propose a new method consisting of two consecutive steps to extract these semantic features. For each image in the training set, we first search for the top $k$ most similar images on the internet and extract their annotations/descriptions (e.g., tags or keywords). The annotation information is employed to design a filter bank for each image category and to generate filter words (a codebook). Finally, each image is represented by the histogram of the occurrences of filter words across all categories. We evaluate the performance of the proposed features in scene image classification on three commonly used scene image datasets (i.e., MIT-67, Scene15 and Event8). Our method typically produces a lower feature dimension than existing feature extraction methods. Experimental results show that the proposed features achieve better classification accuracies than vision-based and tag-based features, and results comparable to deep-learning-based features.
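The final representation is a per-category histogram of filter-word occurrences among the tags of an image's similar web images. A minimal sketch, where the two categories and their filter words are hypothetical examples rather than the paper's learned codebook:

```python
from collections import Counter

# Hypothetical per-category filter words (codebook); in the paper these
# are derived from web annotations of each category's training images.
codebook = {
    "kitchen": ["stove", "sink", "fridge"],
    "beach": ["sand", "sea", "sun"],
}

def tag_histogram(tags, codebook):
    """Represent an image by the occurrence counts of each category's
    filter words among the tags of its top-k similar web images."""
    counts = Counter(tags)
    return [sum(counts[w] for w in words) for words in codebook.values()]

# Tags retrieved for one query image (illustrative)
hist = tag_histogram(["sink", "stove", "sun", "stove"], codebook)
```

The histogram length equals the number of categories, which is why the resulting feature dimension is typically lower than that of dense visual descriptors.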