Goto

Collaborating Authors

 Calgary


Swin Transformer for Fast MRI

arXiv.org Artificial Intelligence

Magnetic resonance imaging (MRI) is an important non-invasive clinical tool that can produce high-resolution and reproducible images. However, a long scanning time is required for high-quality MR images, which leads to exhaustion and discomfort of patients, inducing more artefacts due to voluntary movements of the patients and involuntary physiological movements. To accelerate the scanning process, methods by k-space undersampling and deep learning based reconstruction have been popularised. This work introduced SwinMR, a novel Swin transformer based method for fast MRI reconstruction. The whole network consisted of an input module (IM), a feature extraction module (FEM) and an output module (OM). The IM and OM were 2D convolutional layers and the FEM was composed of a cascaded of residual Swin transformer blocks (RSTBs) and 2D convolutional layers. The RSTB consisted of a series of Swin transformer layers (STLs). The shifted windows multi-head self-attention (W-MSA/SW-MSA) of STL was performed in shifted windows rather than the multi-head self-attention (MSA) of the original transformer in the whole image space. A novel multi-channel loss was proposed by using the sensitivity maps, which was proved to reserve more textures and details. We performed a series of comparative studies and ablation studies in the Calgary-Campinas public brain MR dataset and conducted a downstream segmentation experiment in the Multi-modal Brain Tumour Segmentation Challenge 2017 dataset. The results demonstrate our SwinMR achieved high-quality reconstruction compared with other benchmark methods, and it shows great robustness with different undersampling masks, under noise interruption and on different datasets. The code is publicly available at https://github.com/ayanglab/SwinMR.


Machine Learning: Algorithms, Models, and Applications

arXiv.org Artificial Intelligence

Recent times are witnessing rapid development in machine learning algorithm systems, especially in reinforcement learning, natural language processing, computer and robot vision, image processing, speech, and emotional processing and understanding. In tune with the increasing importance and relevance of machine learning models, algorithms, and their applications, and with the emergence of more innovative uses cases of deep learning and artificial intelligence, the current volume presents a few innovative research works and their applications in real world, such as stock trading, medical and healthcare systems, and software automation. The chapters in the book illustrate how machine learning and deep learning algorithms and models are designed, optimized, and deployed. The volume will be useful for advanced graduate and doctoral students, researchers, faculty members of universities, practicing data scientists and data engineers, professionals, and consultants working on the broad areas of machine learning, deep learning, and artificial intelligence.


Biometrics in the Time of Pandemic: 40% Masked Face Recognition Degradation can be Reduced to 2%

arXiv.org Artificial Intelligence

In this study of the face recognition on masked versus unmasked faces generated using Flickr-Faces-HQ and SpeakingFaces datasets, we report 36.78% degradation of recognition performance caused by the mask-wearing at the time of pandemics, in particular, in border checkpoint scenarios. We have achieved better performance and reduced the degradation to 1.79% using advanced deep learning approaches in the cross-spectral domain.


Domain-Agnostic Clustering with Self-Distillation

arXiv.org Artificial Intelligence

Recent advancements in self-supervised learning have reduced the gap between supervised and unsupervised representation learning. However, most self-supervised and deep clustering techniques rely heavily on data augmentation, rendering them ineffective for many learning tasks where insufficient domain knowledge exists for performing augmentation. We propose a new self-distillation based algorithm for domain-agnostic clustering. Our method builds upon the existing deep clustering frameworks and requires no separate student model. The proposed method outperforms existing domain agnostic (augmentation-free) algorithms on CIFAR-10. We empirically demonstrate that knowledge distillation can improve unsupervised representation learning by extracting richer `dark knowledge' from the model than using predicted labels alone. Preliminary experiments also suggest that self-distillation improves the convergence of DeepCluster-v2.


Edge-Enhanced Dual Discriminator Generative Adversarial Network for Fast MRI with Parallel Imaging Using Multi-view Information

arXiv.org Artificial Intelligence

In clinical medicine, magnetic resonance imaging (MRI) is one of the most important tools for diagnosis, triage, prognosis, and treatment planning. However, MRI suffers from an inherent slow data acquisition process because data is collected sequentially in k-space. In recent years, most MRI reconstruction methods proposed in the literature focus on holistic image reconstruction rather than enhancing the edge information. This work steps aside this general trend by elaborating on the enhancement of edge information. Specifically, we introduce a novel parallel imaging coupled dual discriminator generative adversarial network (PIDD-GAN) for fast multi-channel MRI reconstruction by incorporating multi-view information. The dual discriminator design aims to improve the edge information in MRI reconstruction. One discriminator is used for holistic image reconstruction, whereas the other one is responsible for enhancing edge information. An improved U-Net with local and global residual learning is proposed for the generator. Frequency channel attention blocks (FCA Blocks) are embedded in the generator for incorporating attention mechanisms. Content loss is introduced to train the generator for better reconstruction quality. We performed comprehensive experiments on Calgary-Campinas public brain MR dataset and compared our method with state-of-the-art MRI reconstruction methods. Ablation studies of residual learning were conducted on the MICCAI13 dataset to validate the proposed modules. Results show that our PIDD-GAN provides high-quality reconstructed MR images, with well-preserved edge information. The time of single-image reconstruction is below 5ms, which meets the demand of faster processing.


Announcing support for extracting data from identity documents using Amazon Textract

#artificialintelligence

Creating efficiencies in your business is at the top of your list. You want your employees to be more productive, have them focus on high impact tasks, or find ways to implement better processes to improve the outcomes to your customers. There are various ways to solve this problem, and more companies are turning to artificial intelligence (AI) and machine learning (ML) to help. In the financial services sector, there is the creation of new accounts online, or in healthcare there are new digital platforms to schedule and manage appointments, which require users to fill out forms. These can be error prone, time consuming, and certainly improved upon.


How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey

arXiv.org Artificial Intelligence

Deepfake is content or material that is synthetically generated or manipulated using artificial intelligence (AI) methods, to be passed off as real and can include audio, video, image, and text synthesis. This survey has been conducted with a different perspective compared to existing survey papers, that mostly focus on just video and image deepfakes. This survey not only evaluates generation and detection methods in the different deepfake categories, but mainly focuses on audio deepfakes that are overlooked in most of the existing surveys. This paper critically analyzes and provides a unique source of audio deepfake research, mostly ranging from 2016 to 2020. To the best of our knowledge, this is the first survey focusing on audio deepfakes in English. This survey provides readers with a summary of 1) different deepfake categories 2) how they could be created and detected 3) the most recent trends in this domain and shortcomings in detection methods 4) audio deepfakes, how they are created and detected in more detail which is the main focus of this paper. We found that Generative Adversarial Networks(GAN), Convolutional Neural Networks (CNN), and Deep Neural Networks (DNN) are common ways of creating and detecting deepfakes. In our evaluation of over 140 methods we found that the majority of the focus is on video deepfakes and in particular in the generation of video deepfakes. We found that for text deepfakes there are more generation methods but very few robust methods for detection, including fake news detection, which has become a controversial area of research because of the potential of heavy overlaps with human generation of fake content. This paper is an abbreviated version of the full survey and reveals a clear need to research audio deepfakes and particularly detection of audio deepfakes.


AI is getting better at writing students' essays

#artificialintelligence

The dramatic rise of online learning during the COVID-19 pandemic has spotlit concerns about the role of technology in exam surveillance -- and also in student cheating. Some universities have reported more cheating during the pandemic, and such concerns are unfolding in a climate where technologies that allow for the automation of writing continue to improve. Over the past two years, the ability of artificial intelligence to generate writing has leaped forward significantly, particularly with the development of what's known as the language generator GPT-3. With this, companies such as Google, Microsoft, and NVIDIA can now produce "human-like" text. AI-generated writing has raised the stakes of how universities and schools will gauge what constitutes academic misconduct, such as plagiarism.


Deep Learning Approach for Aggressive Driving Behaviour Detection

arXiv.org Artificial Intelligence

Driving behaviour is one of the primary causes of road crashes and accidents, and these can be decreased by identifying and minimizing aggressive driving behaviour. This study identifies the timesteps when a driver in different circumstances (rush, mental conflicts, reprisal) begins to drive aggressively. An observer (real or virtual) is needed to examine driving behaviour to discover aggressive driving occasions; we overcome this problem by using a smartphone's GPS sensor to detect locations and classify drivers' driving behaviour every three minutes. To detect timeseries patterns in our dataset, we employ RNN (GRU, LSTM) algorithms to identify patterns during the driving course. The algorithm is independent of road, vehicle, position, or driver characteristics. We conclude that three minutes (or more) of driving (120 seconds of GPS data) is sufficient to identify driver behaviour. The results show high accuracy and a high F1 score.


Could Artificial Intelligence Be About To Replace Your Doctor?

#artificialintelligence

The US health and medical insurance industry is a $1.1-trillion maze that is impossible to navigate. And in the bigger scenario of a massive $11-trillion-plus global healthcare industry, America is definitely not first. Americans are fed up, and a digital revolution that goes way beyond telemedicine is the only thing that will restore control. Fixing it is a highly disruptive, multi-trillion-dollar opportunity. Telemedicine, or'Telehealth', will soar from $42 billion in 2018 to $400 billion in 2026. Healthcare IT alone will be worth $390 billion by 2024.