Accuracy
Online Fake Review Detection Using Supervised Machine Learning And BERT Model
Mir, Abrar Qadir, Khan, Furqan Yaqub, Chishti, Mohammad Ahsan
Online shopping stores have grown steadily over the past few years. Due to the massive growth of these businesses, the detection of fake reviews has attracted attention. Fake reviews mislead customers and thereby undermine the honesty and authenticity of online shopping environments. So far, various fake review classifiers have been proposed that take into account the actual content of the review. To improve the accuracy of existing fake review classification or detection approaches, we propose using the BERT (Bidirectional Encoder Representations from Transformers) model to extract word embeddings from texts (i.e., reviews). The word embeddings are then used as input to several baseline classifiers, such as SVM (support vector machine), Random Forests, Naive Bayes, and others. A confusion matrix was also used to evaluate and graphically represent the results. The results indicate that the SVM classifier outperforms the others in terms of accuracy and F1-score, with an accuracy of 87.81%, which is 7.6% higher than the classifier used in the previous study [5].
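The evaluation step mentioned in this abstract (confusion matrix, accuracy, F1-score) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the "fake"/"real" label strings, and the toy predictions are assumptions made for the example, and the numbers are unrelated to the reported 87.81%.

```python
def confusion_counts(y_true, y_pred, positive="fake"):
    """Return (tp, fp, fn, tn) for a binary task with one positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels, invented purely for illustration.
y_true = ["fake", "fake", "real", "real", "fake", "real"]
y_pred = ["fake", "real", "real", "fake", "fake", "real"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion matrix [[tp, fn], [fp, tn]]:", [[tp, fn], [fp, tn]])
print("accuracy:", accuracy(tp, fp, fn, tn))
print("f1:", f1_score(tp, fp, fn))
```

In practice the predictions would come from a classifier trained on the extracted embeddings; only the scoring logic is shown here.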
Leveraging Contextual Relatedness to Identify Suicide Documentation in Clinical Notes through Zero Shot Learning
Workman, Terri Elizabeth, Goulet, Joseph L., Brandt, Cynthia A., Warren, Allison R., Eleazer, Jacob, Skanderson, Melissa, Lindemann, Luke, Blosnich, John R., Leary, John O, Treitler, Qing Zeng
Identifying suicidality, including suicidal ideation, attempts, and risk factors, in the clinical notes of electronic health records is difficult. A major difficulty is the lack of training samples, given the small number of true positive instances among the increasingly large number of patients being screened. This paper describes a novel methodology that identifies suicidality in clinical notes by addressing this data sparsity issue through zero-shot learning. U.S. Veterans Affairs clinical notes served as data. The training dataset label was determined using diagnostic codes of suicide attempt and self-harm. A base string associated with the target label of suicidality was used to provide auxiliary information by narrowing the positive training cases to those containing the base string. A deep neural network was trained by mapping the training documents' contents to a semantic space. For comparison, we trained another deep neural network using the identical training dataset labels and bag-of-words features. The zero-shot learning model outperformed the baseline model in terms of AUC, sensitivity, specificity, and positive predictive value at multiple probability thresholds. When a 0.90 probability threshold was applied, the methodology identified notes that documented suicidality but were not associated with a relevant ICD-10-CM code, with 94 percent accuracy. This new method can effectively identify suicidality without requiring manual annotation.
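To make the threshold-based metrics named in this abstract concrete, the sketch below computes sensitivity, specificity, and positive predictive value at a chosen probability threshold. It is a hedged illustration only: the function name, the toy probabilities, and the labels are invented for the example and have nothing to do with the Veterans Affairs data or the reported results.

```python
def threshold_metrics(probs, labels, threshold=0.90):
    """Binarise predicted probabilities at `threshold` and score them.

    `labels` holds 1 for a positive note and 0 for a negative one.
    A metric is None when its denominator is zero.
    """
    tp = fp = fn = tn = 0
    for prob, label in zip(probs, labels):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "ppv": tp / (tp + fp) if (tp + fp) else None,
    }


# Toy model outputs, invented purely for illustration.
probs = [0.95, 0.91, 0.40, 0.92, 0.10, 0.85]
labels = [1, 1, 0, 0, 0, 1]

for t in (0.50, 0.90):
    print(t, threshold_metrics(probs, labels, threshold=t))
```

Raising the threshold typically trades sensitivity for positive predictive value, which is why the abstract reports results at multiple thresholds.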
Smart Application for Fall Detection Using Wearable ECG & Accelerometer Sensors
Timely and reliable detection of falls is a large and rapidly growing field of research due to the medical and financial demand of caring for a constantly growing elderly population. Within the past two decades, the availability of high-quality hardware (high-quality sensors and AI microchips) and software (machine learning algorithms) technologies has served as a catalyst for this research by giving developers the capabilities to develop such systems. This study developed multiple application components in order to investigate the development challenges and choices for fall detection systems, and to provide materials for future research. The smart application developed using this methodology was validated by the results from fall detection modelling experiments and model mobile deployment. The best performing model overall was the ResNet152 on a standardised and shuffled dataset with a 2 s window size, which achieved 92.8% AUC, 87.28% sensitivity, and 98.33% specificity. Given these results, it is evident that accelerometer and ECG sensors are beneficial for fall detection and allow for the discrimination between falls and other activities. This study leaves a significant amount of room for improvement due to weaknesses identified in the resultant dataset. These improvements include using a labelling protocol for the critical phase of a fall, increasing the number of dataset samples, improving the test subject representation, and experimenting with frequency domain preprocessing.
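The preprocessing named in this abstract (standardisation and a 2 s window) might look roughly like the sketch below. The abstract does not state the sampling rate, the window overlap, or how standardisation was scoped, so the `sample_rate_hz` and `step_s` parameters, the whole-signal z-scoring, and the toy signal are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def standardise(signal):
    """Z-score a 1-D signal: subtract the mean, divide by the population std."""
    mu = mean(signal)
    sigma = pstdev(signal)
    if sigma == 0:
        return [0.0 for _ in signal]
    return [(x - mu) / sigma for x in signal]


def sliding_windows(signal, sample_rate_hz, window_s=2.0, step_s=1.0):
    """Cut a signal into fixed-length windows (window and step in seconds)."""
    size = int(round(window_s * sample_rate_hz))
    step = int(round(step_s * sample_rate_hz))
    if size <= 0 or step <= 0:
        raise ValueError("window and step must cover at least one sample")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


# Toy accelerometer-like trace sampled at a hypothetical 2 Hz.
raw = [float(v) for v in range(10)]
windows = sliding_windows(standardise(raw), sample_rate_hz=2)
print(len(windows), "windows of", len(windows[0]), "samples each")
```

Each window would then be passed to the classifier as one sample; shuffling, when used, is applied to the list of windows.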
Check Your Other Door! Creating Backdoor Attacks in the Frequency Domain
Hammoud, Hasan Abed Al Kader, Ghanem, Bernard
Deep Neural Networks (DNNs) are ubiquitous and span a variety of applications ranging from image classification to real-time object detection. As DNN models become more sophisticated, the computational cost of training these models becomes a burden. For this reason, outsourcing the training process has been the go-to option for many DNN users. Unfortunately, this comes at the cost of vulnerability to backdoor attacks. These attacks aim to establish hidden backdoors in the DNN so that it performs well on clean samples, but outputs a particular target label when a trigger is applied to the input. Existing backdoor attacks either generate triggers in the spatial domain or naively poison frequencies in the Fourier domain. In this work, we propose a pipeline based on Fourier heatmaps to generate a spatially dynamic and invisible backdoor attack in the frequency domain. The proposed attack is extensively evaluated on various datasets and network architectures. Unlike most existing backdoor attacks, the proposed attack can achieve high attack success rates with low poisoning rates and little to no drop in performance while remaining imperceptible to the human eye. Moreover, we show that the models poisoned by our attack are resistant to various state-of-the-art (SOTA) defenses, so we contribute two possible defenses that can evade the attack.
Fair Multi-Exit Framework for Facial Attribute Classification
Chiu, Ching-Hao, Chung, Hao-Wei, Chen, Yu-Jen, Shi, Yiyu, Ho, Tsung-Yi
Fairness has become increasingly pivotal in facial recognition. Without bias mitigation, deploying unfair AI would harm the interests of the underprivileged population. In this paper, we observe that although features from the deeper layers of a neural network generally offer higher accuracy, fairness conditions deteriorate as we extract features from deeper layers. This phenomenon motivates us to extend the concept of the multi-exit framework. Unlike existing works that mainly focus on accuracy, our multi-exit framework is fairness-oriented, where the internal classifiers are trained to be more accurate and fairer. During inference, any instance with high confidence from an internal classifier is allowed to exit early. Moreover, our framework can be applied to most existing fairness-aware frameworks. Experimental results show that the proposed framework can largely improve the fairness condition over the state of the art on the CelebA and UTK Face datasets.
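The early-exit inference rule described in this abstract can be sketched as follows. This is only a schematic under stated assumptions: confidence is taken to be the maximum class probability, the 0.9 threshold is arbitrary, and `early_exit_predict` is a hypothetical name; the abstract does not specify how the authors measure confidence or train the internal classifiers.

```python
def early_exit_predict(exit_probs, threshold=0.9):
    """Return (exit_index, predicted_class, confidence).

    `exit_probs` lists the class-probability vectors produced by each
    internal classifier, shallowest first. The first exit whose top
    probability reaches `threshold` is used; otherwise the final exit is.
    """
    if not exit_probs:
        raise ValueError("at least one exit is required")
    last = len(exit_probs) - 1
    for idx, probs in enumerate(exit_probs):
        confidence = max(probs)
        if confidence >= threshold or idx == last:
            return idx, probs.index(confidence), confidence


# Toy per-exit outputs for one input, invented for illustration.
outputs = [[0.60, 0.40], [0.05, 0.95], [0.50, 0.50]]
print(early_exit_predict(outputs))
print(early_exit_predict(outputs, threshold=0.99))
```

In a real network the deeper exits would only be computed when the shallower ones are not confident enough, which is where the compute saving comes from; here all outputs are precomputed for simplicity.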
Deepfake CAPTCHA: A Method for Preventing Fake Calls
Yasur, Lior, Frankovits, Guy, Grabovski, Fred M., Mirsky, Yisroel
Deep learning technology has made it possible to generate realistic content of specific individuals. These "deepfakes" can now be generated in real time, which enables attackers to impersonate people over audio and video calls. Moreover, some methods only need a few images or seconds of audio to steal an identity. Existing defenses perform passive analysis to detect fake content. However, with the rapid progress of deepfake quality, this may be a losing game. In this paper, we propose D-CAPTCHA: an active defense against real-time deepfakes. The approach is to force the adversary into the spotlight by challenging the deepfake model to generate content which exceeds its capabilities. By doing so, passive detection becomes easier since the content will be distorted. In contrast to existing CAPTCHAs, we challenge the AI's ability to create content as opposed to its ability to classify content. In this work we focus on real-time audio deepfakes and present preliminary results on video. In our evaluation we found that D-CAPTCHA outperforms state-of-the-art audio deepfake detectors with an accuracy of 91-100% depending on the challenge (compared to 71% without challenges). We also performed a study on 41 volunteers to understand how threatening current real-time deepfake attacks are. We found that the majority of the volunteers could not tell the difference between real and fake audio.
Homeland Security develops new portable gunshot detection system
The Department of Homeland Security said its Science and Technology Directorate has developed a portable gunshot detection system in collaboration with the Massachusetts-based Shooter Detection Systems company. The department said that the system, known as SDS Outdoor, could provide "critical information about outdoor shooting incidents almost instantaneously to first responders." The new system is reportedly an enhancement to the commercial, off-the-shelf Guardian Indoor Active Shooter Detection System.
Machine Learning Key Terminologies
Supervised learning: This involves training a model on labeled data, where the correct output is provided for each example in the training set. Common supervised learning algorithms include linear regression, logistic regression, and support vector machines.
Unsupervised learning: This involves training a model on unlabeled data, with the goal of discovering patterns or structures in the data. Common unsupervised learning algorithms include k-means clustering and principal component analysis.
Reinforcement learning: This involves training an agent to make decisions in an environment in order to maximize a reward. The agent receives feedback in the form of rewards or penalties for its actions.
Batch learning: This involves training a model on the entire dataset at once.
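As a small illustration of two of these terms together, the sketch below fits a supervised linear regression with batch learning: every update uses the gradient computed over the entire labeled dataset. The function name, learning rate, epoch count, and toy data are assumptions chosen for the example, not part of the definitions above.

```python
def fit_line_batch(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by full-batch gradient descent on mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Residuals over the whole dataset: this is what makes it "batch".
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Labeled toy data generated from y = 2x + 1 (supervised: each x has a known y).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_batch(xs, ys)
print(round(w, 4), round(b, 4))
```

An online or mini-batch variant would instead update `w` and `b` after each example or small group of examples.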
When Spectral Modeling Meets Convolutional Networks: A Method for Discovering Reionization-era Lensed Quasars in Multi-band Imaging Data
Andika, Irham Taufik, Jahnke, Knud, van der Wel, Arjen, Bañados, Eduardo, Bosman, Sarah E. I., Davies, Frederick B., Eilers, Anna-Christina, Jaelani, Anton Timur, Mazzucchelli, Chiara, Onoue, Masafusa, Schindler, Jan-Torge
Over the last two decades, around 300 quasars have been discovered at $z\gtrsim6$, yet only one has been identified as being strongly gravitationally lensed. We explore a new approach -- enlarging the permitted spectral parameter space, while introducing a new spatial geometry veto criterion -- which is implemented via image-based deep learning. We first apply this approach to a systematic search for reionization-era lensed quasars, using data from the Dark Energy Survey, the Visible and Infrared Survey Telescope for Astronomy Hemisphere Survey, and the Wide-field Infrared Survey Explorer. Our search method consists of two main parts: (i) the preselection of the candidates based on their spectral energy distributions (SEDs) using catalog-level photometry and (ii) the calculation of the relative probabilities of the candidates being a lens or some contaminant, utilizing a convolutional neural network (CNN) classification. The training data sets are constructed by painting deflected point-source lights over actual galaxy images, to generate realistic galaxy-quasar lens models, optimized to find systems with small image separations, i.e., Einstein radii of $\theta_\mathrm{E} \leq 1$ arcsec. Visual inspection is then performed for sources with CNN scores of $P_\mathrm{lens} > 0.1$, which leads us to obtain 36 newly selected lens candidates, which are awaiting spectroscopic confirmation. These findings show that automated SED modeling and deep learning pipelines, supported by modest human input, are a promising route for detecting strong lenses from large catalogs that can overcome the veto limitations of primarily dropout-based SED selection approaches.
Fitness Dependent Optimizer with Neural Networks for COVID-19 patients
Abdulkhaleq, Maryam T., Rashid, Tarik A., Hassan, Bryar A., Alsadoon, Abeer, Bacanin, Nebojsa, Chhabra, Amit, Vimal, S.
The Coronavirus, known as COVID-19, which appeared in 2019 in China, has significantly affected global health and become a huge burden on health institutions all over the world. These effects are continuing today. One strategy for limiting the virus's transmission is to have an early diagnosis of suspected cases and take appropriate measures before the disease spreads further. This work aims to diagnose and show the probability of getting infected by the disease according to textual clinical data. In this work, we used five machine learning techniques (GWO_MLP, GWO_CMLP, MGWO_MLP, FDO_MLP, FDO_CMLP), all of which aim to classify COVID-19 patients into two categories (Positive and Negative). Experiments showed promising results for all used models. The applied methods showed very similar performance, typically in terms of accuracy. However, in each tested dataset, FDO_MLP and FDO_CMLP produced the best results with 100% accuracy. The other models' results varied from one experiment to the other. It is concluded that the models that used the FDO algorithm as a learning algorithm were able to obtain higher accuracy. However, it is found that FDO has the longest runtime compared to the other algorithms. The link to the COVID-19 models is found here: https://github.com/Tarik4Rashid4/covid19models