AITopics

2208.08882

Country:

North America > United States (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceJun-15-2023

How Robust is your Fair Model? Exploring the Robustness of Diverse Fairness Strategies

Small, Edward, Shao, Wei, Zhang, Zeliang, Liu, Peihan, Chan, Jeffrey, Sokol, Kacper, Salim, Flora

With the introduction of machine learning in high-stakes decision making, ensuring algorithmic fairness has become an increasingly important problem to solve. In response to this, many mathematical definitions of fairness have been proposed, and a variety of optimisation techniques have been developed, all designed to maximise a defined notion of fairness. However, fair solutions are reliant on the quality of the training data, and can be highly sensitive to noise. Recent studies have shown that robustness (the ability for a model to perform well on unseen data) plays a significant role in the type of strategy that should be used when approaching a new problem and, hence, measuring the robustness of these strategies has become a fundamental problem. In this work, we therefore propose a new criterion to measure the robustness of various fairness optimisation strategies - the robustness ratio. We conduct multiple extensive experiments on five bench mark fairness data sets using three of the most popular fairness strategies with respect to four of the most popular definitions of fairness. Our experiments empirically show that fairness methods that rely on threshold optimisation are very sensitive to noise in all the evaluated data sets, despite mostly outperforming other methods. This is in contrast to the other two methods, which are less fair for low noise scenarios but fairer for high noise ones. To the best of our knowledge, we are the first to quantitatively evaluate the robustness of fairness optimisation strategies. This can potentially can serve as a guideline in choosing the most suitable fairness strategy for various data sets.

data mining, machine learning, natural language, (16 more...)

2207.04581

Country:

Oceania > Australia (0.14)
North America > United States > Michigan (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre:

Research Report > New Finding (0.68)
Overview > Growing Problem (0.54)

Industry:

Education (0.95)
Government > Regional Government > North America Government > United States Government (0.93)
Law > Statutes (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Ünal, Ali Burak, Pfeifer, Nico, Akgün, Mete

ppAURORA: Privacy Preserving Area Under Receiver Operating Characteristic and Precision-Recall Curves

arXiv.org Artificial IntelligenceJun-15-2023

Computing an AUC as a performance measure to compare the quality of different machine learning models is one of the final steps of many research projects. Many of these methods are trained on privacy-sensitive data and there are several different approaches like $\epsilon$-differential privacy, federated machine learning and cryptography if the datasets cannot be shared or used jointly at one place for training and/or testing. In this setting, it can also be a problem to compute the global AUC, since the labels might also contain privacy-sensitive information. There have been approaches based on $\epsilon$-differential privacy to address this problem, but to the best of our knowledge, no exact privacy preserving solution has been introduced. In this paper, we propose an MPC-based solution, called ppAURORA, with private merging of individually sorted lists from multiple sources to compute the exact AUC as one could obtain on the pooled original test samples. With ppAURORA, the computation of the exact area under precision-recall and receiver operating characteristic curves is possible even when ties between prediction confidence values exist. We use ppAURORA to evaluate two different models predicting acute myeloid leukemia therapy response and heart disease, respectively. We also assess its scalability via synthetic data experiments. All these experiments show that we efficiently and privately compute the exact same AUC with both evaluation metrics as one can obtain on the pooled test samples in plaintext according to the semi-honest adversary setting.

artificial intelligence, computation, machine learning, (17 more...)

2102.08788

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.88)
Health & Medicine > Therapeutic Area > Hematology (0.88)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.54)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Towards trustworthy seizure onset detection using workflow notes

Saab, Khaled, Tang, Siyi, Taha, Mohamed, Lee-Messer, Christopher, Ré, Christopher, Rubin, Daniel

A major barrier to deploying healthcare AI models is their trustworthiness. One form of trustworthiness is a model's robustness across different subgroups: while existing models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trustworthy seizure onset detection from EEG, we propose to leverage annotations that are produced by healthcare personnel in routine clinical workflows -- which we refer to as workflow notes -- that include multiple event descriptions beyond seizures. Using workflow notes, we first show that by scaling training data to an unprecedented level of 68,920 EEG hours, seizure onset detection performance significantly improves (+12.3 AUROC points) compared to relying on smaller training sets with expensive manual gold-standard labels. Second, we reveal that our binary seizure onset detection model underperforms on clinically relevant subgroups (e.g., up to a margin of 6.5 AUROC points between pediatrics and adults), while having significantly higher false positives on EEG clips showing non-epileptiform abnormalities compared to any EEG clip (+19 FPR points). To improve model robustness to hidden subgroups, we train a multilabel model that classifies 26 attributes other than seizures, such as spikes, slowing, and movement artifacts. We find that our multilabel model significantly improves overall seizure onset detection performance (+5.9 AUROC points) while greatly improving performance among subgroups (up to +8.3 AUROC points), and decreases false positives on non-epileptiform abnormalities by 8 FPR points. Finally, we propose a clinical utility metric based on false positives per 24 EEG hours and find that our multilabel model improves this clinical utility metric by a factor of 2x across different clinical settings.

artificial intelligence, machine learning, workflow note, (18 more...)

2306.08728

Country:

North America > United States > California > Santa Clara County > Stanford (0.05)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Workflow (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pereira, Ramon Fraga, Fuggitti, Francesco, Meneguzzi, Felipe, De Giacomo, Giuseppe

Temporally Extended Goal Recognition in Fully Observable Non-Deterministic Domain Models

Goal Recognition is the task of discerning the correct intended goal that an agent aims to achieve, given a set of goal hypotheses, a domain model, and a sequence of observations (i.e., a sample of the plan executed in the environment). Existing approaches assume that goal hypotheses comprise a single conjunctive formula over a single final state and that the environment dynamics are deterministic, preventing the recognition of temporally extended goals in more complex settings. In this paper, we expand goal recognition to temporally extended goals in Fully Observable Non-Deterministic (FOND) planning domain models, focusing on goals on finite traces expressed in Linear Temporal Logic (LTLf) and Pure Past Linear Temporal Logic (PLTLf). We develop the first approach capable of recognizing goals in such settings and evaluate it using different LTLf and PLTLf goals over six FOND planning domain models. Empirical results show that our approach is accurate in recognizing temporally extended goals in different recognition settings.

artificial intelligence, machine learning, temporally, (17 more...)

2306.0868

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > Canada (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.68)

Intrator, Yotam, Aizenberg, Natalie, Livne, Amir, Rivlin, Ehud, Goldenberg, Roman

Self-Supervised Polyp Re-Identification in Colonoscopy

Computer-aided polyp detection (CADe) is becoming a standard, integral part of any modern colonoscopy system. A typical colonoscopy CADe detects a polyp in a single frame and does not track it through the video sequence. Yet, many downstream tasks including polyp characterization (CADx), quality metrics, automatic reporting, require aggregating polyp data from multiple frames. In this work we propose a robust long term polyp tracking method based on re-identification by visual appearance. Our solution uses an attention-based self-supervised ML model, specifically designed to leverage the temporal nature of video input. We quantitatively evaluate method's performance and demonstrate its value for the CADx task.

artificial intelligence, machine learning, tracklet, (17 more...)

2306.08591

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Pudyel, Megh, Atay, Mustafa

An Exploratory Study of Masked Face Recognition with Machine Learning Algorithms

Automated face recognition is a widely adopted machine learning technology for contactless identification of people in various processes such as automated border control, secure login to electronic devices, community surveillance, tracking school attendance, workplace clock in and clock out. Using face masks have become crucial in our daily life with the recent world-wide COVID-19 pandemic. The use of face masks causes the performance of conventional face recognition technologies to degrade considerably. The effect of mask-wearing in face recognition is yet an understudied issue. In this paper, we address this issue by evaluating the performance of a number of face recognition models which are tested by identifying masked and unmasked face images. We use six conventional machine learning algorithms, which are SVC, KNN, LDA, DT, LR and NB, to find out the ones which perform best, besides the ones which poorly perform, in the presence of masked face images. Local Binary Pattern (LBP) is utilized as the feature extraction operator. We generated and used synthesized masked face images. We prepared unmasked, masked, and half-masked training datasets and evaluated the face recognition performance against both masked and unmasked images to present a broad view of this crucial problem. We believe that our study is unique in elaborating the mask-aware facial recognition with almost all possible scenarios including half_masked-to-masked and half_masked-to-unmasked besides evaluating a larger number of conventional machine learning algorithms compared the other studies in the literature.

artificial intelligence, face image, machine learning, (14 more...)

doi: 10.1109/SoutheastCon51012.2023.10115205

2306.08549

Country:

North America > United States > North Carolina > Forsyth County > Winston-Salem (0.05)
North America > United States > Florida > Orange County > Orlando (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.89)
Government (0.69)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Enhancing COVID-19 Diagnosis through Vision Transformer-Based Analysis of Chest X-ray Images

Zavrak, Sultan

The advent of 2019 Coronavirus (COVID-19) has engendered a momentous global health crisis, necessitating the identification of the ailment in individuals through diverse diagnostic modalities. Radiological imaging, particularly the deployment of X-ray imaging, has been recognized as a pivotal instrument in the detection and characterization of COVID-19. Recent investigations have unveiled invaluable insights pertaining to the virus within X-ray images, instigating the exploration of methodologies aimed at augmenting diagnostic accuracy through the utilization of artificial intelligence (AI) techniques. The current research endeavor posits an innovative framework for the automated diagnosis of COVID-19, harnessing raw chest X-ray images, specifically by means of fine-tuning pre-trained Vision Transformer (ViT) models. The developed models were appraised in terms of their binary classification performance, discerning COVID-19 from Normal cases, as well as their ternary classification performance, discriminating COVID-19 from Pneumonia and Normal instances, and lastly, their quaternary classification performance, discriminating COVID-19 from Bacterial Pneumonia, Viral Pneumonia, and Normal conditions, employing distinct datasets. The proposed model evinced extraordinary precision, registering results of 99.92% and 99.84% for binary classification, 97.95% and 86.48% for ternary classification, and 86.81% for quaternary classification, respectively, on the respective datasets.

artificial intelligence, deep learning, machine learning, (20 more...)

2306.06914

Country:

Asia > Middle East > Republic of Türkiye > Duzce Province > Duzce (0.05)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Chen, Yizheng, Ding, Zhoujie, Wagner, David

Continuous Learning for Android Malware Detection

Machine learning methods can detect Android malware with very high accuracy. However, these classifiers have an Achilles heel, concept drift: they rapidly become out of date and ineffective, due to the evolution of malware apps and benign apps. Our research finds that, after training an Android malware classifier on one year's worth of data, the F1 score quickly dropped from 0.99 to 0.76 after 6 months of deployment on new test samples. In this paper, we propose new methods to combat the concept drift problem of Android malware classifiers. Since machine learning technique needs to be continuously deployed, we use active learning: we select new samples for analysts to label, and then add the labeled samples to the training set to retrain the classifier. Our key idea is, similarity-based uncertainty is more robust against concept drift. Therefore, we combine contrastive learning with active learning. We propose a new hierarchical contrastive learning scheme, and a new sample selection technique to continuously train the Android malware classifier. Our evaluation shows that this leads to significant improvements, compared to previously published methods for active learning. Our approach reduces the false negative rate from 14% (for the best baseline) to 9%, while also reducing the false positive rate (from 0.86% to 0.48%). Also, our approach maintains more consistent performance across a seven-year time period than past methods.

artificial intelligence, learning, machine learning, (18 more...)

2302.04332

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Continuing Education (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Bansal, Parikshit, Prabhu, Yashoteja, Kiciman, Emre, Sharma, Amit

Using Interventions to Improve Out-of-Distribution Generalization of Text-Matching Recommendation Systems

Given a user's input text, text-matching recommender systems output relevant items by comparing the input text to available items' description, such as product-to-product recommendation on e-commerce platforms. As users' interests and item inventory are expected to change, it is important for a text-matching system to generalize to data shifts, a task known as out-of-distribution (OOD) generalization. However, we find that the popular approach of fine-tuning a large, base language model on paired item relevance data (e.g., user clicks) can be counter-productive for OOD generalization. For a product recommendation task, fine-tuning obtains worse accuracy than the base model when recommending items in a new category or for a future time period. To explain this generalization failure, we consider an intervention-based importance metric, which shows that a fine-tuned model captures spurious correlations and fails to learn the causal features that determine the relevance between any two text inputs. Moreover, standard methods for causal regularization do not apply in this setting, because unlike in images, there exist no universally spurious features in a text-matching task (the same token may be spurious or causal depending on the text it is being matched to). For OOD generalization on text inputs, therefore, we highlight a different goal: avoiding high importance scores for certain features. We do so using an intervention-based regularizer that constraints the causal effect of any token on the model's relevance score to be similar to the base model. Results on Amazon product and 3 question recommendation datasets show that our proposed regularizer improves generalization for both in-distribution and OOD evaluation, especially in difficult scenarios when the base model is not accurate.

artificial intelligence, machine learning, natural language, (17 more...)

2210.10636

Country:

North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)