Performance Analysis
Evaluating the Effectiveness of Margin Parameter when Learning Knowledge Embedding Representation for Domain-specific Multi-relational Categorized Data
Chung, Matthew Wai Heng, Tissot, Hegler
Learning knowledge representation is an increasingly important technology that supports a variety of machine learning related applications. However, the choice of hyperparameters is seldom justified and usually relies on exhaustive search. Understanding the effect of hyperparameter combinations on embedding quality is crucial to avoid this inefficient process and enhance the practicality of vector representation methods. We evaluate the effects of distinct values of the margin parameter, focusing on translational embedding representation models for multi-relational categorized data. We assess the influence of the margin on the quality of embedding models by contrasting accuracy on the traditional link prediction task against accuracy on a classification task. The findings provide evidence that lower values of margin are not rigorous enough to help with the learning process, whereas larger values produce considerable noise, pushing the entities out to the surface of the hyperspace and thus requiring constant regularization. Finally, the correlation between link prediction and classification accuracy shows that the traditional validation protocol for embedding models is a weak metric of the quality of an embedding representation.
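For readers unfamiliar with the margin parameter, the following minimal sketch shows a margin-based ranking loss of the kind commonly used by translational models such as TransE. It is an illustration only, not the authors' implementation: the function names, the choice of L2 distance, and the toy vectors are all assumptions.

```python
import math

def l2_distance(head, relation, tail):
    """Distance ||head + relation - tail||_2 used by a translational model."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_ranking_loss(positive, negative, margin):
    """Hinge loss max(0, margin + d(positive) - d(negative)).

    `positive` and `negative` are (head, relation, tail) triples of vectors.
    """
    d_pos = l2_distance(*positive)
    d_neg = l2_distance(*negative)
    return max(0.0, margin + d_pos - d_neg)

# Toy 2-d embeddings (hypothetical values chosen for illustration).
head = [0.0, 0.0]
relation = [1.0, 0.0]
true_tail = [1.0, 0.0]      # head + relation lands exactly on this tail
corrupt_tail = [1.0, 2.0]   # corrupted tail, at distance 2.0 from head + relation

positive = (head, relation, true_tail)
negative = (head, relation, corrupt_tail)

small_margin_loss = margin_ranking_loss(positive, negative, margin=1.0)
large_margin_loss = margin_ranking_loss(positive, negative, margin=5.0)
```

In this toy case the small margin is already satisfied, so the pair contributes no loss, while the large margin still penalizes the same pair and would keep pushing the embeddings apart.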
Massive errors found in facial recognition tech, especially in case of nonwhites: U.S. study
WASHINGTON - Facial recognition systems can produce wildly inaccurate results, especially for nonwhites, according to a U.S. government study released Thursday that is likely to raise fresh doubts on deployment of the artificial intelligence technology. The study of dozens of facial recognition algorithms showed "false positives" rates for Asians and African-Americans as much as 100 times higher than for whites. The researchers from the National Institute of Standards and Technology (NIST), a government research center, also found two algorithms assigned the wrong gender to black females almost 35 percent of the time. The study comes amid widespread deployment of facial recognition for law enforcement, airports, border security, banking, retailing, schools and for personal technology such as unlocking smartphones. Some activists and researchers have claimed the potential for errors is too great and that mistakes could result in the jailing of innocent people, and that the technology could be used to create databases that may be hacked or inappropriately used.
Data science for cybersecurity: A probabilistic time series model for detecting RDP inbound brute force attacks - Microsoft Security
Our approach to time series anomaly detection is computationally efficient, automatically learns how to update probabilities and adapt to changes in data. As we describe in the next section, this approach has yielded successful attack detection at high precision. The proposed time series anomaly detection model was deployed and utilized by Microsoft Threat Experts to detect RDP brute force attacks during threat hunting activities. A list that ranks machines across enterprises with the lowest anomaly scores (indicating the likelihood of observing a value at least as large under expected conditions in all signals considered) is updated and reviewed every day. See Table 1 for an example.
Destruction of Image Steganography using Generative Adversarial Networks
Corley, Isaac, Lwowski, Jonathan, Hoffman, Justin
Digital image steganalysis, or the detection of image steganography, has been studied in depth for years and is driven by the use of steganographic techniques by Advanced Persistent Threat (APT) groups, such as APT37 Reaper, to transmit additional malware for further post-exploitation activity on a compromised host. However, many steganalysis algorithms are constrained to work with only a subset of all possible images in the wild or are known to produce a high false positive rate. As a result, blocking every suspected image is an unreasonable policy. A more feasible policy is to filter suspicious images prior to reception by the host machine. However, how does one optimally filter specifically to obfuscate or remove image steganography while avoiding degradation of visual image quality in the case that detection of the image was a false positive? We propose the Deep Digital Steganography Purifier (DDSP), a Generative Adversarial Network (GAN) which is optimized to destroy steganographic content without compromising the perceptual quality of the original image. As verified by experimental results, our model is capable of providing a high rate of destruction of steganographic image content while maintaining a high visual quality in comparison to other state-of-the-art filtering methods. Additionally, we test the model's transfer learning capability by generalizing it to obfuscate real malware payloads embedded into different image file formats and types using an unseen steganographic algorithm, and prove that our model can in fact be deployed to provide adequate results.
Shareable Representations for Search Query Understanding
Kumar, Mukul, Hu, Youna, Headden, Will, Goutam, Rahul, Lin, Heran, Yin, Bing
Understanding search queries is critical for shopping search engines to deliver a satisfying customer experience. Popular shopping search engines receive billions of unique queries yearly, each of which can depict any of hundreds of user preferences or intents. In order to get the right results to customers, it must be known that queries like "inexpensive prom dresses" are intended to surface not only results of a certain product type but also products with a low price. Such preferences are referred to as query intents; other examples include preferences for author, brand, or age group, or simply a need for customer service. Recent works such as BERT have demonstrated the success of a large transformer encoder architecture with language model pre-training on a variety of NLP tasks. We adapt such an architecture to learn intents for search queries and describe methods to account for the noisiness and sparseness of search query data. We also describe cost-effective ways of hosting transformer encoder models in contexts with low-latency requirements. With the right domain-specific training we can build a shareable deep learning model whose internal representation can be reused for a variety of query understanding tasks, including query intent identification. Model sharing means fewer large models need to be served at inference time and provides a platform to quickly build and roll out new search query classifiers.
Prediction of Physical Load Level by Machine Learning Analysis of Heart Activity after Exercises
Gang, Peng, Zeng, Wei, Gordienko, Yuri, Rokovyi, Oleksandr, Alienin, Oleg, Stirenko, Sergii
The assessment of energy expenditure in real life is of great importance for monitoring the current physical state of people, especially in work, sport, elderly care, health care, and even everyday life. This work reports on the application of several machine learning methods (linear regression, linear discriminant analysis, k-nearest neighbors, decision tree, random forest, Gaussian naive Bayes, support-vector machine) for monitoring energy expenditure in athletes. The classification problem was to predict the known level of the in-exercise loads (in three categories by calories) from the heart rate activity features measured during a short period of time (1 minute only) after training, i.e., from features of the post-exercise load. The results obtained show that the post-exercise heart activity features preserve the information about the in-exercise training loads and allow us to predict their actual in-exercise levels. The best performance was obtained by the random forest classifier with all 8 heart rate features (micro-averaged area under curve value AUCmicro = 0.87 and macro-averaged one AUCmacro = 0.88) and the k-nearest neighbors classifier with the 4 most important heart rate features (AUCmicro = 0.91 and AUCmacro = 0.89). The limitations and perspectives of the ML methods used are outlined, and some practical advice is proposed as to their improvement and implementation for better prediction of in-exercise energy expenditure.
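The difference between the micro-averaged and macro-averaged AUC values quoted above can be illustrated with a small sketch. It assumes the usual one-vs-rest definitions (macro: mean of per-class AUCs; micro: one AUC over all pooled class indicators); the function names and the toy scores are hypothetical and do not come from the study.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def macro_auc(y_true, y_score, n_classes):
    """Unweighted mean of the one-vs-rest AUC of each class."""
    per_class = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in y_score]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / n_classes

def micro_auc(y_true, y_score, n_classes):
    """Single AUC over all (sample, class) indicator/score pairs pooled together."""
    labels, scores = [], []
    for y, row in zip(y_true, y_score):
        for c in range(n_classes):
            labels.append(1 if y == c else 0)
            scores.append(row[c])
    return binary_auc(labels, scores)

# Toy three-class problem (hypothetical load levels 0, 1, 2).
y_true = [0, 1, 2, 2]
y_score = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.3, 0.2],  # a class-2 sample that the scores mistake for class 0
]

auc_macro = macro_auc(y_true, y_score, n_classes=3)
auc_micro = micro_auc(y_true, y_score, n_classes=3)
```

The two averages can differ because macro-averaging weights every class equally, whereas micro-averaging weights every (sample, class) decision equally.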
Robust Data Preprocessing for Machine-Learning-Based Disk Failure Prediction in Cloud Production Environments
Han, Shujie, Wu, Jun, Xu, Erci, He, Cheng, Lee, Patrick P. C., Qiang, Yi, Zheng, Qixing, Huang, Tao, Huang, Zixi, Li, Rui
To provide proactive fault tolerance for modern cloud data centers, extensive studies have proposed machine learning (ML) approaches to predict imminent disk failures for early remedy and evaluated their approaches directly on public datasets (e.g., Backblaze SMART logs). However, in real-world production environments, the data quality is imperfect (e.g., inaccurate labeling, missing data samples, and complex failure types), thereby degrading the prediction accuracy. We present RODMAN, a robust data preprocessing pipeline that refines data samples before feeding them into ML models. We start with a large-scale trace-driven study of over three million disks from Alibaba Cloud's data centers, and motivate the practical challenges in ML-based disk failure prediction. We then design RODMAN with three data preprocessing techniques, namely failure-type filtering, spline-based data filling, and automated pre-failure backtracking, that are applicable to general ML models. Evaluation on both the Alibaba and Backblaze datasets shows that RODMAN improves the prediction accuracy under various settings compared to using no data preprocessing.
Machine learning to predict the long-term risk of myocardial infarction and cardiac death based on clinical risk, coronary calcium, and epicardial adipose tissue: a prospective study
Our aim was to evaluate the performance of machine learning (ML), integrating clinical parameters with coronary artery calcium (CAC), and automated epicardial adipose tissue (EAT) quantification, for the prediction of long-term risk of myocardial infarction (MI) and cardiac death in asymptomatic subjects. Our study included 1912 asymptomatic subjects [1117 (58.4%) male, age: 55.8 ± 9.1 years] from the prospective EISNER trial with long-term follow-up after CAC scoring. EAT volume and density were quantified using a fully automated deep learning method. ML extreme gradient boosting was trained using clinical co-variates, plasma lipid panel measurements, risk factors, CAC, aortic calcium, and automated EAT measures, and validated using repeated 10-fold cross validation. During mean follow-up of 14.5 ± 2 years, 76 events of MI and/or cardiac death occurred. ML obtained a significantly higher AUC than atherosclerotic cardiovascular disease (ASCVD) risk and CAC score for predicting events (ML: 0.82; ASCVD: 0.77; CAC: 0.77, P < 0.05 for all). Subjects with a higher ML score (by Youden's index) had high hazard of suffering events (HR: 10.38, P < 0.001); the relationships persisted in multivariable analysis including ASCVD-risk and CAC measures (HR: 2.94, P = 0.005). Age, ASCVD-risk, and CAC were prognostically important for both genders. Systolic blood pressure was more important than cholesterol in women, and the opposite in men.
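The abstract splits subjects into higher and lower ML scores by Youden's index. As a rough sketch of that idea, the code below picks the score cutoff that maximizes J = sensitivity + specificity - 1. The function name, the "score >= threshold" convention, and the toy data are assumptions for illustration; they are not taken from the study.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A subject is predicted positive when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j

# Hypothetical risk scores; label 1 marks a subject who had an event.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]

threshold, j_stat = youden_threshold(labels, scores)
```

On this toy data the chosen cutoff keeps every event above the threshold while misclassifying one event-free subject, which is the trade-off that J balances.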
Practical Solutions for Machine Learning Safety in Autonomous Vehicles
Mohseni, Sina, Pitale, Mandar, Singh, Vasu, Wang, Zhangyang
Autonomous vehicles rely on machine learning to solve challenging tasks in perception and motion planning. However, automotive software safety standards have not fully evolved to address the challenges of machine learning safety such as interpretability, verification, and performance limitations. In this paper, we review and organize practical machine learning safety techniques that can complement engineering safety for machine learning based software in autonomous vehicles. Our organization maps safety strategies to state-of-the-art machine learning techniques in order to enhance dependability and safety of machine learning algorithms. We also discuss security limitations and user experience aspects of machine learning components in autonomous vehicles.
Features or Shape? Tackling the False Dichotomy of Time Series Classification
Alaee, Sara, Abdoli, Alireza, Shelton, Christian, Murillo, Amy C., Gerry, Alec C., Keogh, Eamonn
Time series classification is an important task in its own right, and it is often a precursor to further downstream analytics. To date, virtually all works in the literature have used either shape-based classification using a distance measure or feature-based classification after finding some suitable features for the domain. It seems to be underappreciated that in many datasets it is the case that some classes are best discriminated with features, while others are best discriminated with shape. Thus, making the shape vs. feature choice will condemn us to poor results, at least for some classes. In this work, we propose a new model for classifying time series that allows the use of both shape and feature-based measures, when warranted. Our algorithm automatically decides which approach is best for which class, and at query time chooses which classifier to trust the most. We evaluate our idea on real world datasets and demonstrate that our ideas produce statistically significant improvement in classification accuracy.