AITopics

2207.07089

Country:

Europe > Finland > Pirkanmaa > Tampere (0.04)
Asia > Middle East > Qatar (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Vimalajeewa, Dixon, Bruce, Scott Alan, Vidakovic, Brani

Early Detection of Ovarian Cancer by Wavelet Analysis of Protein Mass Spectra

arXiv.org Artificial IntelligenceJul-14-2022

Accurate and efficient detection of ovarian cancer at early stages is critical to ensure proper treatments for patients. Among the first-line modalities investigated in studies of early diagnosis are features distilled from protein mass spectra. This method, however, considers only a specific subset of spectral responses and ignores the interplay among protein expression levels, which can also contain diagnostic information. We propose a new modality that automatically searches protein mass spectra for discriminatory features by considering the self-similar nature of the spectra. Self-similarity is assessed by taking a wavelet decomposition of protein mass spectra and estimating the rate of level-wise decay in the energies of the resulting wavelet coefficients. Level-wise energies are estimated in a robust manner using distance variance, and rates are estimated locally via a rolling window approach. This results in a collection of rates that can be used to characterize the interplay among proteins, which can be indicative of cancer presence. Discriminatory descriptors are then selected from these evolutionary rates and used as classifying features. The proposed wavelet-based features are used in conjunction with features proposed in the existing literature for early stage diagnosis of ovarian cancer using two datasets published by the American National Cancer Institute. Including the wavelet-based features from the new modality results in improvements in diagnostic performance for early-stage ovarian cancer detection. This demonstrates the ability of the proposed modality to characterize new ovarian cancer diagnostic information.

mass spectra, protein mass spectra, spectra, (15 more...)

2207.07028

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)

Genre: Research Report > Experimental Study (0.70)

Industry: Health & Medicine > Therapeutic Area > Oncology > Ovarian Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Yasaei, Rozhin, Faezi, Sina, Faruque, Mohammad Abdullah Al

Golden Reference-Free Hardware Trojan Localization using Graph Convolutional Network

arXiv.org Artificial IntelligenceJul-14-2022

The globalization of the Integrated Circuit (IC) supply chain has moved most of the design, fabrication, and testing process from a single trusted entity to various untrusted third-party entities worldwide. The risk of using untrusted third-Party Intellectual Property (3PIP) is the possibility for adversaries to insert malicious modifications known as Hardware Trojans (HTs). These HTs can compromise the integrity, deteriorate the performance, deny the service, and alter the functionality of the design. While numerous HT detection methods have been proposed in the literature, the crucial task of HT localization is overlooked. Moreover, a few existing HT localization methods have several weaknesses: reliance on a golden reference, inability to generalize for all types of HT, lack of scalability, low localization resolution, and manual feature engineering/property definition. To overcome their shortcomings, we propose a novel, golden reference-free HT localization method at the pre-silicon stage by leveraging Graph Convolutional Network (GCN). In this work, we convert the circuit design to its intrinsic data structure, graph and extract the node attributes. Afterward, the graph convolution performs automatic feature extraction for nodes to classify the nodes as Trojan or benign. Our automated approach does not burden the designer with manual code review. It locates the Trojan signals with 99.6% accuracy, 93.1% F1-score, and a false-positive rate below 0.009%.

graph, localization, node, (17 more...)

2207.06664

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Explainable Intrusion Detection Systems (X-IDS): A Survey of Current Methods, Challenges, and Opportunities

Neupane, Subash, Ables, Jesse, Anderson, William, Mittal, Sudip, Rahimi, Shahram, Banicescu, Ioana, Seale, Maria

The application of Artificial Intelligence (AI) and Machine Learning (ML) to cybersecurity challenges has gained traction in industry and academia, partially as a result of widespread malware attacks on critical systems such as cloud infrastructures and government institutions. Intrusion Detection Systems (IDS), using some forms of AI, have received widespread adoption due to their ability to handle vast amounts of data with a high prediction accuracy. These systems are hosted in the organizational Cyber Security Operation Center (CSoC) as a defense tool to monitor and detect malicious network flow that would otherwise impact the Confidentiality, Integrity, and Availability (CIA). CSoC analysts rely on these systems to make decisions about the detected threats. However, IDSs designed using Deep Learning (DL) techniques are often treated as black box models and do not provide a justification for their predictions. This creates a barrier for CSoC analysts, as they are unable to improve their decisions based on the model's predictions. One solution to this problem is to design explainable IDS (X-IDS). This survey reviews the state-of-the-art in explainable AI (XAI) for IDS, its current challenges, and discusses how these challenges span to the design of an X-IDS. In particular, we discuss black box and white box approaches comprehensively. We also present the tradeoff between these approaches in terms of their performance and ability to produce explanations. Furthermore, we propose a generic architecture that considers human-in-the-loop which can be used as a guideline when designing an X-IDS. Research recommendations are given from three critical viewpoints: the need to define explainability for IDS, the need to create explanations tailored to various stakeholders, and the need to design metrics to evaluate explanations.

data mining, explanation, machine learning, (22 more...)

2207.06236

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Mississippi > Warren County > Vicksburg (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Promising Solution (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
(9 more...)

Dablain, Damien, Krawczyk, Bartosz, Chawla, Nitesh

Towards A Holistic View of Bias in Machine Learning: Bridging Algorithmic Fairness and Imbalanced Learning

Machine learning (ML) is playing an increasingly important role in rendering decisions that affect a broad range of groups in society. ML models inform decisions in criminal justice, the extension of credit in banking, and the hiring practices of corporations. This posits the requirement of model fairness, which holds that automated decisions should be equitable with respect to protected features (e.g., gender, race, or age) that are often under-represented in the data. We postulate that this problem of under-representation has a corollary to the problem of imbalanced data learning. This class imbalance is often reflected in both classes and protected features. For example, one class (those receiving credit) may be over-represented with respect to another class (those not receiving credit) and a particular group (females) may be under-represented with respect to another group (males). A key element in achieving algorithmic fairness with respect to protected groups is the simultaneous reduction of class and protected group imbalance in the underlying training data, which facilitates increases in both model accuracy and fairness. We discuss the importance of bridging imbalanced learning and group fairness by showing how key concepts in these fields overlap and complement each other; and propose a novel oversampling algorithm, Fair Oversampling, that addresses both skewed class distributions and protected features. Our method: (i) can be used as an efficient pre-processing algorithm for standard ML algorithms to jointly address imbalance and group equity; and (ii) can be combined with fairness-aware learning algorithms to improve their robustness to varying levels of class imbalance. Additionally, we take a step toward bridging the gap between fairness and imbalanced learning with a new metric, Fair Utility, that combines balanced accuracy with fairness.

artificial intelligence, fairness, machine learning, (17 more...)

2207.06084

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)
Asia > Singapore (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.67)
Banking & Finance > Credit (0.46)
Law > Civil Rights & Constitutional Law (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Makwana, Dhruv, Nag, Subhrajit, Susladkar, Onkar, Deshmukh, Gayatri, R, Sai Chandra Teja, Mittal, Sparsh, Mohan, C Krishna

ACLNet: An Attention and Clustering-based Cloud Segmentation Network

We propose a novel deep learning model named ACLNet, for cloud segmentation from ground images. ACLNet uses both deep neural network and machine learning (ML) algorithm to extract complementary features. Specifically, it uses EfficientNet-B0 as the backbone, "`a trous spatial pyramid pooling" (ASPP) to learn at multiple receptive fields, and "global attention module" (GAM) to extract finegrained details from the image. ACLNet also uses k-means clustering to extract cloud boundaries more precisely. ACLNet is effective for both daytime and nighttime images. It provides lower error rate, higher recall and higher F1-score than state-of-art cloud segmentation models. The source-code of ACLNet is available here: https://github.com/ckmvigil/ACLNet.

aclnet, feature map, segmentation, (16 more...)

doi: 10.1080/2150704X.2022.2097031

2207.06277

Country:

Asia > India > Uttarakhand > Roorkee (0.04)
Asia > India > Telangana > Hyderabad (0.04)
Asia > Japan (0.04)
Asia > India > Maharashtra > Pune (0.04)

Genre: Research Report (0.82)

Industry: Energy (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sun, Jiayin, Dong, Qiulei

Orthogonal-Coding-Based Feature Generation for Transductive Open-Set Recognition via Dual-Space Consistent Sampling

Open-set recognition (OSR) aims to simultaneously detect unknown-class samples and classify known-class samples. Most of the existing OSR methods are inductive methods, which generally suffer from the domain shift problem that the learned model from the known-class domain might be unsuitable for the unknown-class domain. Addressing this problem, inspired by the success of transductive learning for alleviating the domain shift problem in many other visual tasks, we propose an Iterative Transductive OSR framework, called IT-OSR, which implements three explored modules iteratively, including a reliability sampling module, a feature generation module, and a baseline update module. Specifically, at each iteration, a dual-space consistent sampling approach is presented in the explored reliability sampling module for selecting some relatively more reliable ones from the test samples according to their pseudo labels assigned by a baseline method, which could be an arbitrary inductive OSR method. Then, a conditional dual-adversarial generative network under an orthogonal coding condition is designed in the feature generation module to generate discriminative sample features of both known and unknown classes according to the selected test samples with their pseudo labels. Finally, the baseline method is updated for sample re-prediction in the baseline update module by jointly utilizing the generated features, the selected test samples with pseudo labels, and the training samples. Extensive experimental results on both the standard-dataset and the cross-dataset settings demonstrate that the derived transductive methods, by introducing two typical inductive OSR methods into the proposed IT-OSR framework, achieve better performances than 15 state-of-the-art methods in most cases.

module, recognition, test sample, (15 more...)

2207.05957

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

PhishSim: Aiding Phishing Website Detection with a Feature-Free Tool

Purwanto, Rizka, Pal, Arindam, Blair, Alan, Jha, Sanjay

In this paper, we propose a feature-free method for detecting phishing websites using the Normalized Compression Distance (NCD), a parameter-free similarity measure which computes the similarity of two websites by compressing them, thus eliminating the need to perform any feature extraction. It also removes any dependence on a specific set of website features. This method examines the HTML of webpages and computes their similarity with known phishing websites, in order to classify them. We use the Furthest Point First algorithm to perform phishing prototype extractions, in order to select instances that are representative of a cluster of phishing webpages. We also introduce the use of an incremental learning algorithm as a framework for continuous and adaptive detection without extracting new features when concept drift occurs. On a large dataset, our proposed method significantly outperforms previous methods in detecting phishing websites, with an AUC score of 98.68%, a high true positive rate (TPR) of around 90%, while maintaining a low false positive rate (FPR) of 0.58%. Our approach uses prototypes, eliminating the need to retain long term data in the future, and is feasible to deploy in real systems with a processing time of roughly 0.3 seconds.

doi: 10.1109/TIFS.2022.3164212.

2207.10801

Country:

North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Pombal, José, Cruz, André F., Bravo, João, Saleiro, Pedro, Figueiredo, Mário A. T., Bizarro, Pedro

Understanding Unfairness in Fraud Detection through Model and Data Bias Interactions

In recent years, machine learning algorithms have become ubiquitous in a multitude of high-stakes decision-making applications. The unparalleled ability of machine learning algorithms to learn patterns from data also enables them to incorporate biases embedded within. A biased model can then make decisions that disproportionately harm certain groups in society -- limiting their access to financial services, for example. The awareness of this problem has given rise to the field of Fair ML, which focuses on studying, measuring, and mitigating unfairness in algorithmic prediction, with respect to a set of protected groups (e.g., race or gender). However, the underlying causes for algorithmic unfairness still remain elusive, with researchers divided between blaming either the ML algorithms or the data they are trained on. In this work, we maintain that algorithmic unfairness stems from interactions between models and biases in the data, rather than from isolated contributions of either of them. To this end, we propose a taxonomy to characterize data bias and we study a set of hypotheses regarding the fairness-accuracy trade-offs that fairness-blind ML algorithms exhibit under different data bias settings. On our real-world account-opening fraud use case, we find that each setting entails specific trade-offs, affecting fairness in expected value and variance -- the latter often going unnoticed. Moreover, we show how algorithms compare differently in terms of accuracy and fairness, depending on the biases affecting the data. Finally, we note that under specific data bias conditions, simple pre-processing interventions can successfully balance group-wise error rates, while the same techniques fail in more complex settings.

disparity, fairness, prevalence, (14 more...)

2207.06273

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > District of Columbia > Washington (0.05)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Banking & Finance > Financial Services (0.66)
Law Enforcement & Public Safety > Fraud (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Bándi, Péter, Balkenhol, Maschenka, van Dijk, Marcory, van Ginneken, Bram, van der Laak, Jeroen, Litjens, Geert

Domain adaptation strategies for cancer-independent detection of lymph node metastases

Recently, large, high-quality public datasets have led to the development of convolutional neural networks that can detect lymph node metastases of breast cancer at the level of expert pathologists. Many cancers, regardless of the site of origin, can metastasize to lymph nodes. However, collecting and annotating high-volume, high-quality datasets for every cancer type is challenging. In this paper we investigate how to leverage existing high-quality datasets most efficiently in multi-task settings for closely related tasks. Specifically, we will explore different training and domain adaptation strategies, including prevention of catastrophic forgetting, for colon and head-and-neck cancer metastasis detection in lymph nodes. Our results show state-of-the-art performance on both cancer metastasis detection tasks. Furthermore, we show the effectiveness of repeated adaptation of networks from one cancer type to another to obtain multi-task metastasis detection networks. Last, we show that leveraging existing high-quality datasets can significantly boost performance on new target tasks and that catastrophic forgetting can be effectively mitigated using regularization.

dataset, metastasis, source task, (14 more...)

2207.06193

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Netherlands > Gelderland > Arnhem (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)