AITopics

2303.01201

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Nesaragi, Naimahmed, Høiseth, Lars Øivind, Qadir, Hemin Ali, Rosseland, Leiv Arne, Halvorsen, Per Steinar, Balasingham, Ilangko

Non-invasive Waveform Analysis for Emergency Triage via Simulated Hemorrhage: An Experimental Study using Novel Dynamic Lower Body Negative Pressure Model

The extent to which advanced waveform analysis of non-invasive physiological signals can diagnose levels of hypovolemia remains insufficiently explored. The present study explores the discriminative ability of a deep learning (DL) framework to classify levels of ongoing hypovolemia, simulated via novel dynamic lower body negative pressure (LBNP) model among healthy volunteers. We used a dynamic LBNP protocol as opposed to the traditional model, where LBNP is applied in a predictable step-wise, progressively descending manner. This dynamic LBNP version assists in circumventing the problem posed in terms of time dependency, as in real-life pre-hospital settings, intravascular blood volume may fluctuate due to volume resuscitation. A supervised DL-based framework for ternary classification was realized by segmenting the underlying noninvasive signal and labeling segments with corresponding LBNP target levels. The proposed DL model with two inputs was trained with respective time-frequency representations extracted on waveform segments to classify each of them into blood volume loss: Class 1 (mild); Class 2 (moderate); or Class 3 (severe). At the outset, the latent space derived at the end of the DL model via late fusion among both inputs assists in enhanced classification performance. When evaluated in a 3-fold cross-validation setup with stratified subjects, the experimental findings demonstrated PPG to be a potential surrogate for variations in blood volume with average classification performance, AUROC: 0.8861, AUPRC: 0.8141, $F1$-score:72.16%, Sensitivity:79.06 %, and Specificity:89.21 %. Our proposed DL algorithm on PPG signal demonstrates the possibility of capturing the complex interplay in physiological responses related to both bleeding and fluid resuscitation using this challenging LBNP setup.

classification, negative pressure, waveform segment, (15 more...)

2303.06064

Country:

Europe > Norway > Eastern Norway > Oslo (0.05)
Oceania > Australia (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FedScore: A privacy-preserving framework for federated scoring system development

Li, Siqi, Ning, Yilin, Ong, Marcus Eng Hock, Chakraborty, Bibhas, Hong, Chuan, Xie, Feng, Yuan, Han, Liu, Mingxuan, Buckland, Daniel M., Chen, Yong, Liu, Nan

We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess FedScore's performance, we built a hypothetical global scoring system for mortality prediction within 30 days after a visit to an emergency department using 10 simulated sites divided from a tertiary hospital in Singapore. We employed a pre-existing score generator to construct 10 local scoring systems independently at each site and we also developed a scoring system using centralized data for comparison. We compared the acquired FedScore model's performance with that of other scoring models using the receiver operating characteristic (ROC) analysis. The FedScore model achieved an average area under the curve (AUC) value of 0.763 across all sites, with a standard deviation (SD) of 0.020. We also calculated the average AUC values and SDs for each local model, and the FedScore model showed promising accuracy and stability with a high average AUC value which was closest to the one of the pooled model and SD which was lower than that of most local models. This study demonstrates that FedScore is a privacy-preserving scoring system generator with potentially good generalizability.

artificial intelligence, data mining, machine learning, (17 more...)

doi: 10.1016/j.jbi.2023.104485

2303.00282

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Asia > Singapore > Central Region > Singapore (0.05)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > Canada > Ontario (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness

Ashktorab, Zahra, Hoover, Benjamin, Agarwal, Mayank, Dugan, Casey, Geyer, Werner, Yang, Hao Bang, Yurochkin, Mikhail

Mitigating algorithmic bias is a critical task in the development and deployment of machine learning models. While several toolkits exist to aid machine learning practitioners in addressing fairness issues, little is known about the strategies practitioners employ to evaluate model fairness and what factors influence their assessment, particularly in the context of text classification. Two common approaches of evaluating the fairness of a model are group fairness and individual fairness. We run a study with Machine Learning practitioners (n=24) to understand the strategies used to evaluate models. Metrics presented to practitioners (group vs. individual fairness) impact which models they consider fair. Participants focused on risks associated with underpredicting/overpredicting and model sensitivity relative to identity token manipulations. We discover fairness assessment strategies involving personal experiences or how users form groups of identity tokens to test model fairness. We provide recommendations for interactive tools for evaluating fairness in text classification.

machine learning, natural language, text classification, (14 more...)

doi: 10.1145/3544548.3581227

2303.00673

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Germany > Hamburg (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.82)

Piland, Jacob, Czajka, Adam, Sweet, Christopher

Improving Model's Focus Improves Performance of Deep Learning-Based Synthetic Face Detectors

Deep learning-based models generalize better to unknown data samples after being guided "where to look" by incorporating human perception into training strategies. We made an observation that the entropy of the model's salience trained in that way is lower when compared to salience entropy computed for models training without human perceptual intelligence. Thus the question: does further increase of model's focus, by lowering the entropy of model's class activation map, help in further increasing the performance? In this paper we propose and evaluate several entropy-based new loss function components controlling the model's focus, covering the full range of the level of such control, from none to its "aggressive" minimization. We show, using a problem of synthetic face detection, that improving the model's focus, through lowering entropy, leads to models that perform better in an open-set scenario, in which the test samples are synthesized by unknown generative models. We also show that optimal performance is obtained when the model's loss function blends three aspects: regular classification, low-entropy of the model's focus, and human-guided saliency.

artificial intelligence, entropy, machine learning, (18 more...)

2303.00818

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
North America > United States > Hawaii (0.04)
North America > United States > Alaska > Anchorage Municipality > Anchorage (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias

Wu, Shangxi, He, Qiuyang, Wu, Fangzhao, Sang, Jitao, Wang, Yaowei, Xu, Changsheng

With the swift advancement of deep learning, state-of-the-art algorithms have been utilized in various social situations. Nonetheless, some algorithms have been discovered to exhibit biases and provide unequal results. The current debiasing methods face challenges such as poor utilization of data or intricate training requirements. In this work, we found that the backdoor attack can construct an artificial bias similar to the model bias derived in standard training. Considering the strong adjustability of backdoor triggers, we are motivated to mitigate the model bias by carefully designing reverse artificial bias created from backdoor attack. Based on this, we propose a backdoor debiasing framework based on knowledge distillation, which effectively reduces the model bias from original data and minimizes security risks from the backdoor attack. The proposed solution is validated on both image and structured datasets, showing promising results. This work advances the understanding of backdoor attacks and highlights its potential for beneficial applications. The code for the study can be found at \url{https://anonymous.4open.science/r/DwB-BC07/}.

artificial intelligence, deep learning, machine learning, (17 more...)

2303.01504

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Nguyen, Thy, Chaturvedi, Anamay, Nguyen, Huy Lê

Improved Learning-augmented Algorithms for k-means and k-medians Clustering

We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space, and a label for each data point given by an oracle indicating what subsets of points should be clustered together. This setting captures situations where we have access to some auxiliary information about the data set relevant for our clustering objective, for instance the labels output by a neural network. Following prior work, we assume that there are at most an $\alpha \in (0,c)$ for some $c<1$ fraction of false positives and false negatives in each predicted cluster, in the absence of which the labels would attain the optimal clustering cost $\mathrm{OPT}$. For a dataset of size $m$, we propose a deterministic $k$-means algorithm that produces centers with improved bound on clustering cost compared to the previous randomized algorithm while preserving the $O( d m \log m)$ runtime. Furthermore, our algorithm works even when the predictions are not very accurate, i.e. our bound holds for $\alpha$ up to $1/2$, an improvement over $\alpha$ being at most $1/7$ in the previous work. For the $k$-medians problem we improve upon prior work by achieving a biquadratic improvement in the dependence of the approximation factor on the accuracy parameter $\alpha$ to get a cost of $(1+O(\alpha))\mathrm{OPT}$, while requiring essentially just $O(md \log^3 m/\alpha)$ runtime.

algorithm, artificial intelligence, machine learning, (18 more...)

2210.17028

Country:

North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)

Iselborn, Kevin, Stricker, Marco, Miyamoto, Takashi, Nuske, Marlon, Dengel, Andreas

On the Importance of Feature Representation for Flood Mapping using Classical Machine Learning Approaches

Climate change has increased the severity and frequency of weather disasters all around the world. Flood inundation mapping based on earth observation data can help in this context, by providing cheap and accurate maps depicting the area affected by a flood event to emergency-relief units in near-real-time. Building upon the recent development of the Sen1Floods11 dataset, which provides a limited amount of hand-labeled high-quality training data, this paper evaluates the potential of five traditional machine learning approaches such as gradient boosted decision trees, support vector machines or quadratic discriminant analysis. By performing a grid-search-based hyperparameter optimization on 23 feature spaces we can show that all considered classifiers are capable of outperforming the current state-of-the-art neural network-based approaches in terms of total IoU on their best-performing feature spaces. With total and mean IoU values of 0.8751 and 0.7031 compared to 0.70 and 0.5873 as the previous best-reported results, we show that a simple gradient boosting classifier can significantly improve over deep neural network based approaches, despite using less training data. Furthermore, an analysis of the regional distribution of the Sen1Floods11 dataset reveals a problem of spatial imbalance. We show that traditional machine learning models can learn this bias and argue that modified metric evaluations are required to counter artifacts due to spatial imbalance. Lastly, a qualitative analysis shows that this pixel-wise classifier provides highly-precise surface water classifications indicating that a good choice of a feature space and pixel-wise classification can generate high-quality flood maps using optical and SAR data. We make our code publicly available at: https://github.com/DFKI-Earth-And-Space-Applications/Flood_Mapping_Feature_Space_Importance

artificial intelligence, feature space, machine learning, (16 more...)

2303.00691

Country:

South America > Bolivia (0.05)
Asia > India (0.04)
North America > United States > Colorado (0.04)
(19 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.93)
Energy (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Maddock, Samuel, Sablayrolles, Alexandre, Stock, Pierre

CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning

Federated Learning (FL) is a setting for training machine learning models in distributed environments where the clients do not share their raw data but instead send model updates to a server. However, model updates can be subject to attacks and leak private information. Differential Privacy (DP) is a leading mitigation strategy which involves adding noise to clipped model updates, trading off performance for strong theoretical privacy guarantees. Previous work has shown that the threat model of DP is conservative and that the obtained guarantees may be vacuous or may overestimate information leakage in practice. In this paper, we aim to achieve a tighter measurement of the model exposure by considering a realistic threat model. We propose a novel method, CANIFE, that uses canaries - carefully crafted samples by a strong adversary to evaluate the empirical privacy of a training round. We apply this attack to vision models trained on CIFAR-10 and CelebA and to language models trained on Sent140 and Shakespeare. In particular, in realistic FL scenarios, we demonstrate that the empirical per-round epsilon obtained with CANIFE is 4-5x lower than the theoretical bound.

artificial intelligence, deep learning, machine learning, (17 more...)

2210.02912

Country: North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Maan, Jitendra, Maan, Harsh

Customer Churn Prediction Model using Explainable Machine Learning

It becomes a significant challenge to predict customer behavior and retain an existing customer with the rapid growth of digitization which opens up more opportunities for customers to choose from subscription-based products and services model. Since the cost of acquiring a new customer is five-times higher than retaining an existing customer, henceforth, there is a need to address the customer churn problem which is a major threat across the Industries. Considering direct impact on revenues, companies identify the factors that increases the customer churn rate. Here, key objective of the paper is to develop a unique Customer churn prediction model which can help to predict potential customers who are most likely to churn and such early warnings can help to take corrective measures to retain them. Here, we evaluated and analyzed the performance of various tree-based machine learning approaches and algorithms and identified the Extreme Gradient Boosting XGBOOST Classifier as the most optimal solution to Customer churn problem. To deal with such real-world problems, Paper emphasize the Model interpretability which is an important metric to help customers to understand how Churn Prediction Model is making predictions. In order to improve Model explainability and transparency, paper proposed a novel approach to calculate Shapley values for possible combination of features to explain which features are the most important/relevant features for a model to become highly interpretable, transparent and explainable to potential customers.

artificial intelligence, customer, machine learning, (12 more...)

2303.0096

Country: Asia > India (0.05)

Genre: Research Report > Promising Solution (0.66)

Industry: Telecommunications (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.73)