Collaborating Authors

 Yamac, Mehmet


Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network for Motion Deblurring

arXiv.org Artificial Intelligence

Event cameras differ from conventional RGB cameras in that they produce asynchronous data sequences. While RGB cameras capture every frame at a fixed rate, event cameras only capture changes in the scene, resulting in sparse and asynchronous data output. Although event data carries information that is useful for deblurring RGB images, integrating event and image information remains a challenge. Recent state-of-the-art CNN-based deblurring solutions produce multiple 2-D event frames by accumulating event data over a time period. In most of these techniques, however, the number of event frames is fixed and predefined, which drastically reduces temporal resolution, particularly when fast-moving objects are present or when longer exposure times are required. Modern cameras (e.g., cameras in mobile phones) also set the exposure time of the image dynamically, which presents an additional problem for networks developed for a fixed number of event frames. To address these challenges, we developed a Long Short-Term Memory (LSTM)-based event feature extraction module that enables the use of a dynamically varying number of event frames. Using these modules, we constructed a state-of-the-art deblurring network, Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network (DLEFNet). It is particularly useful for scenarios in which exposure times vary depending on factors such as lighting conditions or the presence of fast-moving objects in the scene. Evaluation results demonstrate that the proposed method can outperform existing state-of-the-art networks for the deblurring task on synthetic and real-world datasets.
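
As an illustration of what "accumulating event data into 2-D event frames" can look like, the sketch below bins events from one exposure window into a caller-chosen number of frames. It is a minimal example under stated assumptions, not the DLEFNet pipeline: the event layout `(timestamp, x, y, polarity)`, the equal-width time bins, and the function name `accumulate_events` are all hypothetical choices made here.

```python
from typing import List, Sequence, Tuple

# Assumed event layout: (timestamp, x, y, polarity), polarity in {-1, +1}.
Event = Tuple[float, int, int, int]


def accumulate_events(
    events: Sequence[Event],
    t_start: float,
    t_end: float,
    n_frames: int,
    height: int,
    width: int,
) -> List[List[List[int]]]:
    """Sum event polarities per pixel into n_frames equal-width time bins."""
    if n_frames < 1 or t_end <= t_start:
        raise ValueError("need n_frames >= 1 and t_end > t_start")
    frames = [[[0] * width for _ in range(height)] for _ in range(n_frames)]
    bin_width = (t_end - t_start) / n_frames
    for t, x, y, polarity in events:
        if not (t_start <= t <= t_end):
            continue  # event lies outside the exposure window
        # An event exactly at t_end is assigned to the last bin.
        idx = min(int((t - t_start) / bin_width), n_frames - 1)
        frames[idx][y][x] += polarity
    return frames


events = [(0.0, 0, 0, 1), (0.4, 1, 0, -1), (0.9, 1, 1, 1)]
coarse = accumulate_events(events, 0.0, 1.0, n_frames=2, height=2, width=2)
fine = accumulate_events(events, 0.0, 1.0, n_frames=4, height=2, width=2)
```

Because `n_frames` is an argument rather than a constant, the same exposure window can be split into 2 or 4 frames; a network with a fixed input depth could consume only one of these, which is the limitation the abstract describes.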


Super Neurons

arXiv.org Artificial Intelligence

Operational Neural Networks (ONNs) are new-generation network models that can perform any (non-linear) transformation with a proper combination of "nodal" and "pool" operators. However, they still have a certain restriction: a single nodal operator is used for all (synaptic) connections of each neuron. The idea behind "generative neurons" was born as a remedy for this restriction, where each nodal operator can be "customized" during training in order to maximize the learning performance. Self-Organized ONNs (Self-ONNs) composed of generative neurons can achieve an utmost level of diversity even with a compact configuration; however, they still suffer from the last property inherited from CNNs: localized kernel operations, which impose a severe limitation on the information flow between layers. It is, therefore, desirable for the neurons to gather information from a larger area in the previous layer maps without increasing the kernel size. For certain applications, it might be even more desirable "to learn" the kernel locations of each connection during the training process along with the customized nodal operators so that both can be optimized simultaneously. This study introduces the super (generative) neuron models that can accomplish this without altering the kernel sizes and will enable a significant diversity in terms of information flow. The two models of super neurons proposed in this study differ in how the kernels are localized: i) randomly localized kernels within a bias range set for each layer, ii) optimized locations of each kernel during the Back-Propagation (BP) training. The extensive set of comparative evaluations shows that Self-ONNs with super neurons can indeed achieve a superior learning and generalization capability without any significant rise in computational complexity.
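
The idea of moving a kernel's location without enlarging it can be sketched in one dimension. The example below is a rough illustration, not the authors' formulation: it assumes the nodal operator is a polynomial in the input (one weight vector per power, as is commonly described for generative neurons in the Self-ONN literature), treats out-of-range samples as zero, and uses a hypothetical integer `shift` to stand in for a kernel location that would be drawn randomly or learned.

```python
from typing import List, Sequence


def generative_response(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    shift: int = 0,
) -> List[float]:
    """1-D response of one connection with a polynomial nodal operator.

    weights[q][k] multiplies x ** (q + 1) at kernel tap k (assumed form).
    `shift` offsets where the kernel reads from; the kernel size is unchanged.
    """
    n = len(x)
    k_size = len(weights[0])
    out = []
    for i in range(n - k_size + 1):
        acc = 0.0
        for q, w_q in enumerate(weights):
            for k, w in enumerate(w_q):
                j = i + k + shift
                v = x[j] if 0 <= j < n else 0.0  # zero outside the map
                acc += w * v ** (q + 1)
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # x[i] + x[i + 1] ** 2
centered = generative_response(signal, kernel)
shifted = generative_response(signal, kernel, shift=1)
```

With `shift=1` the same two-tap kernel reads one sample further along the input, so the neuron sees a different region of the previous layer at the same cost per output.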


BM3D vs 2-Layer ONN

arXiv.org Artificial Intelligence

Despite their recent success in image denoising, the need for deep and complex architectures still hinders the practical usage of CNNs. Older but computationally more efficient methods such as BM3D remain a popular choice, especially in resource-constrained scenarios. In this study, we aim to find out whether compact neural networks can learn to produce results competitive with BM3D for AWGN image denoising. To this end, we configure networks with only two hidden layers and employ different neuron models and layer widths to compare their performance with BM3D across different AWGN noise levels. Our results conclusively show that the recently proposed self-organized variant of operational neural networks based on a generative neuron model (Self-ONNs) is not only a better choice than CNNs, but also provides results competitive with BM3D and even significantly surpasses it for high noise levels.
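
For readers unfamiliar with the AWGN setting, the sketch below shows how a noisy input at a given noise level is typically produced and how denoising quality is commonly scored with PSNR. The abstract does not name its evaluation metric, so the use of PSNR, the 8-bit peak value of 255, and the function names are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence


def add_awgn(pixels: Sequence[float], sigma: float, seed: int = 0) -> List[float]:
    """Add zero-mean Gaussian noise with standard deviation sigma (no clipping)."""
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, sigma) for p in pixels]


def psnr(reference: Sequence[float], estimate: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [128.0] * 1000  # a flat gray "image", flattened to a list
mild = add_awgn(clean, sigma=10.0)
severe = add_awgn(clean, sigma=50.0)
psnr_mild = psnr(clean, mild)
psnr_severe = psnr(clean, severe)
```

A higher `sigma` gives a lower input PSNR, which is the sense in which the abstract speaks of "high noise levels".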


COVID-19 Infection Map Generation and Detection from Chest X-Ray Images

arXiv.org Artificial Intelligence

Computer-aided diagnosis has become a necessity for accurate and immediate coronavirus disease 2019 (COVID-19) detection to aid treatment and prevent the spread of the virus. Numerous studies have proposed using Deep Learning techniques for COVID-19 diagnosis. However, they have used very limited chest X-ray (CXR) image repositories for evaluation, with only a few hundred COVID-19 samples. Moreover, these methods can neither localize nor grade the severity of COVID-19 infection. For this purpose, recent studies proposed exploring the activation maps of deep networks. However, these maps remain inaccurate for localizing the actual infection, making them unreliable for clinical use. This study proposes a novel method for the joint localization, severity grading, and detection of COVID-19 from CXR images by generating the so-called infection maps. To accomplish this, we have compiled the largest dataset with 119,316 CXR images including 2,951 COVID-19 samples, where the annotation of the ground-truth segmentation masks is performed on CXRs by a novel collaborative human-machine approach. Furthermore, we publicly release the first CXR dataset with the ground-truth segmentation masks of the COVID-19 infected regions. A detailed set of experiments shows that state-of-the-art segmentation networks can learn to localize COVID-19 infection with an F1-score of 83.20%, which is significantly superior to the activation maps created by the previous methods. Finally, the proposed approach achieved a COVID-19 detection performance with 94.96% sensitivity and 99.88% specificity.
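
The three figures quoted above (F1-score, sensitivity, specificity) are all derived from confusion-matrix counts. The sketch below uses their standard definitions; the counts in the usage lines are made-up illustrative numbers, not values from the study, and the function name is hypothetical.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from true/false positive and negative counts."""

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0  # avoid division by zero

    return {
        "sensitivity": ratio(tp, tp + fn),  # recall on positive cases
        "specificity": ratio(tn, tn + fp),  # recall on negative cases
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Illustrative counts only (not from the paper).
scores = detection_metrics(tp=8, fp=2, tn=88, fn=2)
```

For segmentation, the same F1 formula is applied to pixel counts of the predicted and ground-truth masks; for detection, the counts are per image.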


Compressively Sensed Image Recognition

arXiv.org Machine Learning

Compressive Sensing (CS) theory asserts that sparse signal reconstruction is possible from a small number of linear measurements. Although CS enables low-cost linear sampling, it requires non-linear and costly reconstruction. Recent works show that compressive image classification is possible in the CS domain without reconstructing the signal. In this work, we introduce a DCT-based method that extracts binary discriminative features directly from CS measurements. These CS measurements can be obtained by using (i) a random or pseudo-random measurement matrix, or (ii) a measurement matrix whose elements are learned from the training data to optimize the given classification task. We further introduce feature fusion by concatenating the Bag of Words (BoW) representation of our binary features with one of two state-of-the-art CNN-based feature vectors. We show that our fused feature outperforms the state of the art in both cases.
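
The "small number of linear measurements" in option (i) can be written as y = Φx, where Φ has far fewer rows than x has entries. The sketch below builds a random Gaussian Φ and applies it to a sparse vector. The Gaussian entries, the 1/sqrt(m) scaling, and the function names are assumptions chosen for illustration; the paper's DCT-based feature extraction is not reproduced here.

```python
import random
from typing import List, Sequence


def gaussian_measurement_matrix(m: int, n: int, seed: int = 0) -> List[List[float]]:
    """m x n matrix with i.i.d. Gaussian entries scaled by 1/sqrt(m)."""
    rng = random.Random(seed)
    scale = 1.0 / (m ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def measure(phi: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Linear CS measurement y = phi @ x."""
    return [sum(a * b for a, b in zip(row, x)) for row in phi]


n, m = 16, 4                 # 16-sample signal, 4 measurements
x = [0.0] * n
x[3], x[10] = 1.5, -2.0      # sparse: only two non-zero entries
phi = gaussian_measurement_matrix(m, n)
y = measure(phi, x)
```

Sampling is just a matrix-vector product, which is why the abstract calls it low-cost; recovering `x` from `y` is the expensive non-linear step that classification in the CS domain avoids.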