Results


Outlier Detection Techniques in Python

#artificialintelligence

Outlier detection, which is the process of identifying extreme values in data, has many applications across a wide variety of industries including finance, insurance, cybersecurity and healthcare. In finance, for example, it can detect malicious events like credit card fraud. In insurance, it can identify forged or fabricated documents. In cybersecurity, it is used for identifying malicious behaviors like password theft and phishing. Finally, outlier detection has been used for rare disease detection in a healthcare context.
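As a rough illustration of what identifying extreme values can look like in Python, the sketch below applies the common interquartile range (IQR) rule using only the standard library. The article excerpt does not say which techniques it covers, so the choice of rule, the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample transaction amounts are all assumptions made for this example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is a conventional default,
    not a value taken from the article.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up card transaction amounts with one obviously extreme value.
amounts = [12.0, 15.5, 14.2, 13.8, 16.1, 15.0, 14.7, 980.0]
print(iqr_outliers(amounts))
```

On this made-up sample only the 980.0 transaction falls outside the fences. A rule like this is a univariate baseline; real fraud detection would likely need multivariate methods.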


Cyber Criminals vs Robots

#artificialintelligence

What happens when cyber criminals face robots? What happens when they use robots? How will offensive and defensive cybersecurity strategies evolve as artificial intelligence continues to grow? Both artificial intelligence and cybersecurity have consistently ranked among the fastest-growing industries year after year¹². The two fields overlap in many areas and will undoubtedly continue to do so for years to come. For this article, I have narrowed my scope to a specific use case: intrusion detection. An Intrusion Detection System (IDS) is software that monitors a company's network for malicious activity. I dive into AI's role in Intrusion Detection Systems, code my own IDS using machine learning, and demonstrate how it can be used to assist threat hunters.


Detect Malicious JavaScript Code Using Machine Learning

#artificialintelligence

In this article, we will consider approaches to detecting obfuscated JavaScript code snippets using machine learning. Most websites use JavaScript (JS) code to generate dynamic content; as a result, JS code has become a valuable attack vector against browsers, browser plug-ins, email clients, and other JS applications. Common JS-based attacks include drive-by downloads, cross-site scripting (XSS), cross-site request forgery (XSRF), and malvertising (malicious advertising). Most malicious JS code is obfuscated in order to hide what it is doing and to avoid detection by signature-based security systems. In other words, obfuscation is a sequence of confusing code transformations that makes the code hard to understand while preserving its functionality.
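To make the idea of feeding JS snippets to a machine learning model more concrete, here is a minimal sketch of turning a snippet into numeric features. The excerpt does not describe the article's actual features, so character-level Shannon entropy, snippet length, and symbol ratio are assumptions chosen for illustration, and `shannon_entropy` and `simple_features` are hypothetical helper names.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def simple_features(js_source):
    """Hypothetical feature dictionary that a classifier could consume."""
    length = len(js_source)
    # Count characters that are neither alphanumeric nor whitespace.
    symbols = sum(1 for ch in js_source if not ch.isalnum() and not ch.isspace())
    return {
        "entropy": shannon_entropy(js_source),
        "length": length,
        "symbol_ratio": symbols / length if length else 0.0,
    }


print(simple_features("var total = price * quantity;"))
```

Features like these are only weak signals on their own, since minified but benign code can look similar to obfuscated code; a classifier would normally be trained on many labeled snippets.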


Identifying Cyber Threats Before They Happen: Deep Learning

#artificialintelligence

Crypto.com, Microsoft, NVIDIA, and Okta all got hacked this year. In some hacks, attackers are looking to take data, while in others they are just trying things out. Either way, it is in companies' interest to patch the holes in their security systems as more attackers learn to take advantage of them. The project I am working on now aims to prevent cyber threats like these from happening. When a company is hacked, there is a lot at stake.


Know The Top Machine Learning Algorithms For Business

#artificialintelligence

It's never been easier for businesses of all sizes to harness the power of data, thanks to the development of free, open-source machine learning and artificial intelligence tools like Google's TensorFlow and scikit-learn, as well as "ML-as-a-service" products like Google's cloud prediction API and Microsoft's Azure machine learning platform. On the other hand, machine learning is a large and complicated field. Where do you begin to learn how to apply it to your company? Machine learning is a branch of study that trains machines to perform cognitive tasks as humans do. While machines have far fewer cognitive abilities than ordinary people, they can quickly process large amounts of data and extract significant commercial insights.


Using AI to Reduce IoT Vulnerability

#artificialintelligence

This article considers the use of artificial intelligence to help security professionals protect IoT systems. The Internet of Things (IoT) is still in its infancy, but threats to IoT systems and their potential for harm have become quite sophisticated. There are two reasons for this: the value of data and systems that IoT vulnerabilities can give access to; and the high number of potential attack vectors – discrete elements of IoT networks that are vulnerable to foul play. Artificial intelligence (AI) software and algorithms help security professionals to wrest control of this technological battleground back from hackers and protect the IoT as it reaches maturity. Only introduced in 2008, the Internet of Things and IoT systems are still fairly nebulous concepts, subjects of numerous and sometimes conflicting definitions.


A Comprehensive Survey on Radio Frequency (RF) Fingerprinting: Traditional Approaches, Deep Learning, and Open Challenges

arXiv.org Artificial Intelligence

Fifth generation (5G) and beyond networks envision a massive Internet of Things (IoT) rollout to support disruptive applications such as extended reality (XR), augmented/virtual reality (AR/VR), industrial automation, autonomous driving, and smart everything, which brings together massive and diverse IoT devices occupying the radio frequency (RF) spectrum. Along with spectrum crunch and throughput challenges, such a massive scale of wireless devices exposes unprecedented threat surfaces. RF fingerprinting is heralded as a candidate technology that can be combined with cryptographic and zero-trust security measures to ensure data privacy, confidentiality, and integrity in wireless networks. Motivated by the relevance of this subject to future communication networks, in this work we present a comprehensive survey of RF fingerprinting approaches ranging from a traditional view to the most recent deep learning (DL) based algorithms. Existing surveys have mostly focused on a constrained presentation of wireless fingerprinting approaches, leaving many aspects untold. In this work, we mitigate this by addressing every aspect necessary to elucidate this topic to the reader in an encyclopedic manner: background on signal intelligence (SIGINT), applications, relevant DL algorithms, a systematic literature review of RF fingerprinting techniques spanning the past two decades, a discussion of datasets, and potential research avenues.


Adversarial Attacks against Windows PE Malware Detection: A Survey of the State-of-the-Art

arXiv.org Artificial Intelligence

Malware has been one of the most damaging threats to computers, spanning multiple operating systems and various file formats. To defend against the ever-increasing and ever-evolving threats of malware, tremendous efforts have been made to propose a variety of malware detection methods that attempt to detect malware effectively and efficiently. Recent studies have shown that, on the one hand, existing ML and DL methods enable superior detection of newly emerging and previously unseen malware. On the other hand, ML and DL models are inherently vulnerable to adversarial attacks in the form of adversarial examples, which are maliciously generated by slightly and carefully perturbing legitimate inputs to confuse the targeted models. Adversarial attacks were initially studied extensively in the domain of computer vision, and some quickly expanded to other domains, including NLP, speech recognition and even malware detection. In this paper, we focus on malware with the file format of portable executable (PE) in the family of Windows operating systems, namely Windows PE malware, as a representative case to study the adversarial attack methods in such adversarial settings. Specifically, we start by outlining the general learning framework of Windows PE malware detection based on ML/DL and then highlight three unique challenges of performing adversarial attacks in the context of PE malware. We then conduct a comprehensive and systematic review to categorize the state-of-the-art adversarial attacks against PE malware detection, as well as corresponding defenses to increase the robustness of PE malware detection. We conclude the paper by first presenting other related attacks against Windows PE malware detection beyond the adversarial attacks and then shedding light on future research directions and opportunities.


Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties

arXiv.org Artificial Intelligence

In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Thus, the prediction model is unreliable although the overall model accuracy can be acceptable. Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class. However, their effectiveness depends on several factors mainly related to data intrinsic characteristics, such as imbalance ratio, dataset size and dimensionality, overlapping between classes or borderline examples. In this work, the impact of these factors is analyzed through a comprehensive comparative study involving 40 datasets from different application areas. The objective is to obtain models for automatic selection of the best resampling strategy for any dataset based on its characteristics. These models allow us to check several factors simultaneously considering a wide range of values since they are induced from very varied datasets that cover a broad spectrum of conditions. This differs from most studies that focus on the individual analysis of the characteristics or cover a small range of values. In addition, the study encompasses both basic and advanced resampling strategies that are evaluated by means of eight different performance metrics, including new measures specifically designed for imbalanced data classification. The general nature of the proposal allows the choice of the most appropriate method regardless of the domain, avoiding the search for special purpose techniques that could be valid for the target data.
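For readers unfamiliar with resampling, the sketch below shows the most basic oversampling strategy, random oversampling, which duplicates minority-class examples until every class matches the size of the largest one. It is an illustrative baseline rather than any of the specific strategies evaluated in the paper; the function name `random_oversample`, the fixed seed, and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority examples until all classes are equal in size.

    Hypothetical helper for illustration; the seed only makes the demo repeatable.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy dataset: six majority examples (label 0) and two minority examples (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.5]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_res, y_res = random_oversample(X, y)
print(Counter(y), Counter(y_res))
```

The imbalance ratio here goes from 6:2 to 6:6. Resampling should be applied only to the training split, so that duplicated examples do not leak into evaluation data.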