Accuracy
Efficient Graph-Friendly COCO Metric Computation for Train-Time Model Evaluation
Evaluating the COCO mean average precision (mAP) and COCO recall metrics as part of the static computation graph of modern deep learning frameworks poses a unique set of challenges. These challenges include the need to maintain a dynamically sized state to compute mean average precision, reliance on global dataset-level statistics to compute the metrics, and handling differing numbers of bounding boxes between images in a batch. As a consequence, it is common practice for researchers and practitioners to evaluate COCO metrics as a post-training evaluation step. With a graph-friendly algorithm to compute COCO mean average precision and recall, these metrics could be evaluated at training time, improving visibility into the evolution of the metrics through training curve plots and decreasing iteration time when prototyping new model versions. Our contributions include an accurate approximation algorithm for mean average precision, an open-source implementation of both COCO mean average precision and COCO recall, extensive numerical benchmarks to verify the accuracy of our implementations, and an open-source training loop that includes train-time evaluation of mean average precision and recall.
CLIP2TV: Align, Match and Distill for Video-Text Retrieval
Gao, Zijian, Liu, Jingyu, Sun, Weiqi, Chen, Sheng, Chang, Dedan, Zhao, Lili
Modern video-text retrieval frameworks consist of three parts: a video encoder, a text encoder and a similarity head. With the success of both visual and textual representation learning, transformer-based encoders and fusion methods have also been adopted in the field of video-text retrieval. In this paper, we propose a new CLIP-based framework called CLIP2TV, which consists of a video-text alignment module and a video-text matching module. The two modules are trained end-to-end in a coordinated manner and boost each other's performance. Moreover, to address the impairment caused by data noise, especially false negatives introduced by vague descriptions in some datasets, we propose similarity distillation. Extensive experimental results on various datasets validate the effectiveness of the proposed methods. Finally, on common datasets with video clips of various lengths, CLIP2TV achieves better or competitive results compared with previous SOTA methods.
A machine learning based approach to gravitational lens identification with the International LOFAR Telescope
Rezaei, S., McKean, J. P., Biehl, M., de Roo, W., Lafontaine, A.
We present a novel machine learning based approach for detecting galaxy-scale gravitational lenses from interferometric data, specifically those taken with the International LOFAR Telescope (ILT), which is observing the northern radio sky at a frequency of 150 MHz, an angular resolution of 350 mas and a sensitivity of 90 μJy beam⁻¹ (1σ). We develop and test several Convolutional Neural Networks to determine the probability and uncertainty of a given sample being classified as a lensed or non-lensed event. By training and testing on a simulated interferometric imaging data set that includes realistic lensed and non-lensed radio sources, we find that it is possible to recover 95.3 per cent of the lensed samples (true positive rate), with a contamination of just 0.008 per cent from non-lensed samples (false positive rate). Taking the expected lensing probability into account results in a predicted sample purity for lensed events of 92.2 per cent. We find that the network structure is most robust when the maximum image separation between the lensed images is greater than 3 times the synthesized beam size, and the lensed images have a total flux density that is equivalent to at least a 20σ (point-source) detection. For the ILT, this corresponds to a lens sample with Einstein radii greater than 0.5 arcsec and a radio source population with 150 MHz flux densities more than 2 mJy. By applying these criteria and our lens detection algorithm we expect to discover the vast majority of galaxy-scale gravitational lens systems contained within the LOFAR Two Metre Sky Survey.
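The step from true and false positive rates to a sample purity follows from Bayes' rule. The sketch below is an illustration only: the prior lensing probability of 1 in 1000 is a hypothetical value (the abstract does not state the prior that was used), and the helper name `sample_purity` is not taken from the paper.

```python
def sample_purity(tpr: float, fpr: float, prior: float) -> float:
    """Fraction of samples classified as lensed that really are lensed."""
    true_pos = tpr * prior            # lensed sources that are recovered
    false_pos = fpr * (1.0 - prior)   # non-lensed sources that contaminate
    return true_pos / (true_pos + false_pos)


# Rates quoted in the abstract, converted from per cent to fractions.
TPR = 0.953
FPR = 0.00008

# Hypothetical prior: roughly one lens per thousand sources (assumption).
ASSUMED_PRIOR = 1e-3

purity = sample_purity(TPR, FPR, ASSUMED_PRIOR)
print(f"purity = {purity:.3f}")
```

Under this assumed prior the result lands close to the quoted purity, which illustrates how strongly a rare-event prior penalises even a very small false positive rate.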
Modeling User Behavior With Interaction Networks for Spam Detection
Agarwal, Prabhat, Srivastava, Manisha, Singh, Vishwakarma, Rosenberg, Charles
Spam is a serious problem plaguing web-scale digital platforms which facilitate user content creation and distribution. It compromises the platform's integrity, the performance of services like recommendation and search, and the overall business. Spammers engage in a variety of abusive and evasive behaviors that are distinct from those of non-spammers. Users' complex behavior can be well represented by a heterogeneous graph rich with node and edge attributes. Learning to identify spammers in such a graph for a web-scale platform is challenging because of its structural complexity and size. In this paper, we propose SEINE (Spam DEtection using Interaction NEtworks), a spam detection model over a novel graph framework. Our graph simultaneously captures rich user details and behavior and enables learning on a billion-scale graph. Our model considers a node's neighborhood along with edge types and attributes, allowing it to capture a wide range of spammers. SEINE, trained on a real dataset of tens of millions of nodes and billions of edges, achieves 80% recall at a 1% false positive rate. SEINE achieves performance comparable to state-of-the-art techniques on a public dataset while remaining practical for use in a large-scale production system.
Bootstrapping a User-Centered Task-Oriented Dialogue System
Chen, Shijie, Chen, Ziru, Deng, Xiang, Lewis, Ashley, Mo, Lingbo, Stevens, Samuel, Wang, Zhen, Yue, Xiang, Zhang, Tianshu, Su, Yu, Sun, Huan
We present TacoBot, a task-oriented dialogue system built for the inaugural Alexa Prize TaskBot Challenge, which assists users in completing multi-step cooking and home improvement tasks. TacoBot is designed with a user-centered principle and aspires to deliver a collaborative and accessible dialogue experience. Towards that end, it is equipped with accurate language understanding, flexible dialogue management, and engaging response generation. Furthermore, TacoBot is backed by a strong search engine and an automated end-to-end test suite. In bootstrapping the development of TacoBot, we explore a series of data augmentation strategies to train advanced neural language processing models and continuously improve the dialogue experience with collected real conversations. At the end of the semifinals, TacoBot achieved an average rating of 3.55/5.0.
Algorithm developed by Lithuanian researchers can predict possible Alzheimer's with nearly 100 per cent accuracy
Researchers from Kaunas universities in Lithuania developed a deep learning-based method that can predict the possible onset of Alzheimer's disease from brain images with an accuracy of over 99 per cent. The method was developed while analysing functional MRI images obtained from 138 subjects and performed better in terms of accuracy, sensitivity and specificity than previously developed methods. According to the World Health Organisation, Alzheimer's disease is the most frequent cause of dementia, contributing to up to 70 per cent of dementia cases. Worldwide, approximately 24 million people are affected, and this number is expected to double every 20 years. Owing to societal ageing, the disease will become a costly public health burden in the years to come.
GBDF: Gender Balanced DeepFake Dataset Towards Fair DeepFake Detection
Nadimpalli, Aakash Varma, Rattani, Ajita
Facial forgery by deepfakes has raised severe societal concerns. Several solutions have been proposed by the vision community to effectively combat misinformation on the internet via automated deepfake detection systems. Recent studies have demonstrated that facial analysis-based deep learning models can discriminate based on protected attributes. For the commercial adoption and massive roll-out of deepfake detection technology, it is vital to evaluate and understand the fairness (the absence of any prejudice or favoritism) of deepfake detectors across demographic variations such as gender and race, because a performance differential between demographic subgroups would affect millions of people in the disadvantaged subgroup. This paper aims to evaluate the fairness of deepfake detectors across males and females. However, existing deepfake datasets are not annotated with demographic labels to facilitate fairness analysis. To this end, we manually annotated existing popular deepfake datasets with gender labels and evaluated the performance differential of current deepfake detectors across gender. Our analysis on the gender-labeled version of the datasets suggests (a) current deepfake datasets have a skewed distribution across gender, and (b) commonly adopted deepfake detectors obtain unequal performance across gender, with males mostly outperforming females. Finally, we contribute a gender-balanced and annotated deepfake dataset, GBDF, to mitigate the performance differential and to promote research and development towards fairness-aware deepfake detectors. The GBDF dataset is publicly available at: https://github.com/aakash4305/GBDF
ExoSGAN and ExoACGAN: Exoplanet Detection using Adversarial Training Algorithms
Agnes, Cicy K, Naveed, Akthar V, Chacko, Anitha Mary M O
Exoplanet detection opens the door to the discovery of new habitable worlds and helps us understand how planets were formed. With the objective of finding Earth-like habitable planets, NASA launched the Kepler space telescope and its follow-up mission K2. The advancement of observation capabilities has increased the range of fresh data available for research, and handling it manually is both time-consuming and difficult. Machine learning and deep learning techniques can greatly reduce the human effort needed to process the vast array of data produced by the modern instruments of these exoplanet programs in an economical and unbiased manner. However, care should be taken to detect all the exoplanets precisely while simultaneously minimizing the misclassification of non-exoplanet stars. In this paper, we utilize two variations of generative adversarial networks, namely semi-supervised generative adversarial networks and auxiliary classifier generative adversarial networks, to detect transiting exoplanets in K2 data. We find that these models can be helpful for the classification of stars with exoplanets. Both of our techniques are able to categorize the light curves with a recall and precision of 1.00 on the test data. Our semi-supervised technique helps with the cumbersome task of creating a labeled dataset.
Many-to-One Knowledge Distillation of Real-Time Epileptic Seizure Detection for Low-Power Wearable Internet of Things Systems
Baghersalimi, Saleh, Amirshahi, Alireza, Forooghifar, Farnaz, Teijeiro, Tomas, Aminifar, Amir, Atienza, David
Integrating low-power wearable Internet of Things (IoT) systems into routine health monitoring is an ongoing challenge. Recent advances in the computation capabilities of wearables make it possible to target complex scenarios by exploiting multiple biosignals and using high-performance algorithms, such as Deep Neural Networks (DNNs). There is, however, a trade-off between the performance of the algorithms and the low-power requirements of IoT platforms with limited resources. Besides, physically larger and multi-biosignal-based wearables bring significant discomfort to patients. Consequently, reducing power consumption and discomfort is necessary for patients to use IoT devices continuously during everyday life. To overcome these challenges, in the context of epileptic seizure detection, we propose a many-to-one signals knowledge distillation approach targeting single-biosignal processing in IoT wearable systems. The starting point is to obtain a highly accurate multi-biosignal DNN, then apply our approach to develop a single-biosignal DNN solution for IoT systems that achieves an accuracy comparable to the original multi-biosignal DNN. To assess the practicality of our approach in real-life scenarios, we perform a comprehensive simulation-based analysis on several state-of-the-art edge computing platforms, such as Kendryte K210 and Raspberry Pi Zero.
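The abstract does not detail the many-to-one formulation, so the sketch below shows only the generic temperature-scaled distillation loss that such approaches commonly build on, not the authors' method. It assumes the teacher logits would come from the multi-biosignal DNN and the student logits from the single-biosignal DNN; the function names, the temperature and the weighting `alpha` are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence."""
    # Hard-label term: cross-entropy of the student against the true class.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-target term: KL(teacher || student) on temperature-softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0.0)

    # The temperature**2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * cross_entropy + (1.0 - alpha) * temperature ** 2 * kl


# Hypothetical two-class logits (non-seizure, seizure) for one window.
loss = distillation_loss(student_logits=[0.4, 1.1],
                         teacher_logits=[-0.5, 2.3],
                         label=1)
print(f"distillation loss = {loss:.4f}")
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the hard-label term remains, which is the behaviour a distilled single-biosignal model would be trained towards.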
Lesion detection in contrast enhanced spectral mammography
Jailin, Clément, Milioni, Pablo, Li, Zhijin, Iordache, Răzvan, Muller, Serge
Background & purpose: The recent emergence of neural network models for the analysis of breast images has been a breakthrough in computer-aided diagnosis. This approach has not yet been developed for Contrast Enhanced Spectral Mammography (CESM), where access to large databases is complex. This work proposes a deep-learning-based computer-aided diagnosis (CAD) tool for CESM recombined images that is able to detect lesions and classify cases. Material & methods: A large CESM diagnostic dataset with biopsy-proven lesions was collected from various hospitals and different acquisition systems. The annotated data were split on a patient level for the training (55%), validation (15%) and test (30%) of a deep neural network with a state-of-the-art detection architecture. Free Receiver Operating Characteristic (FROC) was used to evaluate the model for the detection of 1) all lesions, 2) biopsied lesions and 3) malignant lesions. The ROC curve was used to evaluate breast cancer classification. The metrics were finally compared to clinical results. Results: For the evaluation of malignant lesion detection, at high sensitivity (Se>0.95), the false positive rate was 0.61 per image. For the classification of malignant cases, the model reached an Area Under the Curve (AUC) in the range of clinical CESM diagnostic results. Conclusion: This CAD is the first development of a lesion detection and classification model for CESM images. Trained on a large dataset, it has the potential to be used to help manage biopsy decisions and to help radiologists detect complex lesions that could modify the clinical treatment.
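A patient-level split keeps every image of a given patient in a single subset, so that no patient contributes to both training and evaluation. The sketch below is one possible way to do this and rests on assumptions: the abstract gives only the 55/15/30 proportions, so applying them to patient counts, the shuffling, the rounding and the seed are illustrative choices, and `split_by_patient` is a hypothetical helper.

```python
import random


def split_by_patient(patient_ids, fractions=(0.55, 0.15, 0.30), seed=0):
    """Assign each unique patient to exactly one of train / validation / test."""
    unique = sorted(set(patient_ids))       # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(unique)

    n_train = round(fractions[0] * len(unique))
    n_val = round(fractions[1] * len(unique))

    train = set(unique[:n_train])
    val = set(unique[n_train:n_train + n_val])
    test = set(unique[n_train + n_val:])    # remainder goes to the test set
    return train, val, test


# Hypothetical records: (patient id, image name), several images per patient.
records = [(f"P{i:02d}", f"img_{i:02d}_{k}.png") for i in range(20) for k in range(3)]
train_ids, val_ids, test_ids = split_by_patient([pid for pid, _ in records])

# Images follow their patient, so no patient appears in two subsets.
train_images = [name for pid, name in records if pid in train_ids]
print(len(train_ids), len(val_ids), len(test_ids), len(train_images))
```

Splitting on image identifiers instead would let images from one patient land on both sides of the split and could inflate the reported test metrics.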