AITopics

2210.14396

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.45)
Information Technology (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Sokol, Kacper, Kull, Meelis, Chan, Jeffrey, Salim, Flora Dilys

Cross-model Fairness: Empirical Study of Fairness and Ethics Under Model Multiplicity

arXiv.org Artificial Intelligence, Aug-17-2023

While data-driven predictive models are a strictly technological construct, they may operate within a social context in which benign engineering choices entail implicit, indirect and unexpected real-life consequences. Fairness of such systems -- pertaining both to individuals and groups -- is one relevant consideration in this space; it arises when data capture protected characteristics on the basis of which people may be discriminated against. To date, this notion has predominantly been studied for a fixed model, often under different classification thresholds, striving to identify and eradicate undesirable, discriminatory and possibly unlawful aspects of its operation. Here, we backtrack on this fixed model assumption to propose and explore a novel definition of cross-model fairness where individuals can be harmed when one predictor is chosen ad hoc from a group of equally well-performing models, i.e., in view of utility-based model multiplicity. Since a person may be classified differently across models that are otherwise considered equivalent, this individual could argue for a predictor granting them the most favourable outcome, employing which may have adverse effects on others. We introduce this scenario with a two-dimensional example and linear classification; then, we present a comprehensive empirical study based on real-life predictive models and data sets that are popular with the algorithmic fairness community; finally, we investigate analytical properties of cross-model fairness and its ramifications in a broader context. Our findings suggest that such unfairness can be readily found in real life and that it may be difficult to mitigate by technical means alone, as doing so is likely to degrade predictive performance.

artificial intelligence, data mining, machine learning, (20 more...)

2203.07139

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Law (0.92)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Modeling & Simulation (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Kloos, Kevin, Karch, Julian D., Meertens, Quinten A., de Rooij, Mark

Continuous Sweep: an improved, binary quantifier

Quantification is a supervised machine learning task, focused on estimating the class prevalence of a dataset rather than labeling its individual observations. We introduce Continuous Sweep, a new parametric binary quantifier inspired by the well-performing Median Sweep. Median Sweep is currently one of the best binary quantifiers, but we have changed this quantifier on three points, namely 1) using parametric class distributions instead of empirical distributions, 2) optimizing decision boundaries instead of applying discrete decision rules, and 3) calculating the mean instead of the median. We derive analytic expressions for the bias and variance of Continuous Sweep under general model assumptions. This is one of the first theoretical contributions in the field of quantification learning. Moreover, these derivations enable us to find the optimal decision boundaries. Finally, our simulation study shows that Continuous Sweep outperforms Median Sweep in a wide range of situations.
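For orientation, the Median Sweep baseline mentioned above can be sketched in a few lines. This is a minimal illustration of the general idea (an adjusted count at each threshold, then the median), not the authors' Continuous Sweep and not a reference implementation. The function names, the `min_gap` cutoff and its default value are assumptions made for this sketch.

```python
from statistics import median


def rates_at_threshold(scores, labels, threshold):
    """True and false positive rates of the rule `score >= threshold`."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    tpr = sum(s >= threshold for s in pos) / len(pos)
    fpr = sum(s >= threshold for s in neg) / len(neg)
    return tpr, fpr


def median_sweep(train_scores, train_labels, test_scores, thresholds, min_gap=0.25):
    """Estimate the positive-class prevalence of `test_scores`.

    For each threshold, the observed positive rate on the test set is
    corrected with the training tpr/fpr (adjusted count); the median of
    the corrected estimates is returned. `min_gap` (an illustrative
    choice) skips thresholds where tpr - fpr is too small to be stable.
    """
    estimates = []
    for t in thresholds:
        tpr, fpr = rates_at_threshold(train_scores, train_labels, t)
        if tpr - fpr <= min_gap:
            continue
        observed = sum(s >= t for s in test_scores) / len(test_scores)
        adjusted = (observed - fpr) / (tpr - fpr)
        estimates.append(min(1.0, max(0.0, adjusted)))  # clip to [0, 1]
    if not estimates:
        raise ValueError("no threshold separates the classes well enough")
    return median(estimates)
```

The three changes listed in the abstract would replace the empirical rates, the discrete threshold grid and the median, respectively; none of them is attempted here.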

artificial intelligence, continuous sweep, machine learning, (16 more...)

2308.08387

Country:

Europe > Netherlands (0.28)
North America > United States > New York (0.14)
Europe > Italy (0.14)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.68)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.68)
Energy > Oil & Gas > Midstream (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Label Propagation Techniques for Artifact Detection in Imbalanced Classes using Photoplethysmogram Signals

Macabiau, Clara, Le, Thanh-Dung, Albert, Kevin, Jouvet, Philippe, Noumeir, Rita

Photoplethysmogram (PPG) signals are widely used in healthcare for monitoring vital signs, but they are susceptible to motion artifacts that can lead to inaccurate interpretations. In this study, the use of label propagation techniques to propagate labels among PPG samples is explored, particularly in imbalanced class scenarios where clean PPG samples are significantly outnumbered by artifact-contaminated samples. With a precision of 91%, a recall of 90% and an F1 score of 90% for the class without artifacts, the results demonstrate its effectiveness in labeling a medical dataset, even when clean samples are rare. For the classification of artifacts, our study compares supervised classifiers, such as conventional classifiers and neural networks (MLP, Transformers, FCN), with the semi-supervised label propagation algorithm. With a precision of 89%, a recall of 95% and an F1 score of 92%, the KNN supervised model gives good results, but the semi-supervised algorithm performs better in detecting artifacts. The findings suggest that the semi-supervised label propagation algorithm holds promise for artifact detection in PPG signals, which can enhance the reliability of PPG-based health monitoring systems in real-world applications.
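The reported F1 scores can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does only that arithmetic; the function and variable names are chosen for illustration, and the inputs are the rounded percentages quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures quoted in the abstract.
clean_class_f1 = f1_score(0.91, 0.90)  # label propagation, class without artifacts
knn_f1 = f1_score(0.89, 0.95)          # supervised KNN baseline

print(f"{clean_class_f1:.0%} {knn_f1:.0%}")
```

Both values round to the F1 scores stated above (90% and 92%), so the three figures in each sentence are mutually consistent.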

data mining, machine learning, natural language, (19 more...)

2308.0848

Country: North America > Canada > Quebec > Montreal (0.05)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

ChinaTelecom System Description to VoxCeleb Speaker Recognition Challenge 2023

Du, Mengjie, Fang, Xiang, Li, Jie

This technical report describes the ChinaTelecom system for Track 1 (closed) of the VoxCeleb2023 Speaker Recognition Challenge (VoxSRC 2023). Our system consists of several ResNet variants trained only on VoxCeleb2, which were later fused for better performance. Score calibration was also applied to each variant and to the fused system. The final submission achieved a minDCF of 0.1066 and an EER of 1.980%.

ieee international conference, recognition, speaker verification, (9 more...)

2308.08181

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (0.63)

Sotubadi, Saleh Valizadeh, Liu, Rui, Neguyen, Vinh

Explainable AI for tool wear prediction in turning

This research aims to develop an Explainable Artificial Intelligence (XAI) framework to facilitate human-understandable solutions for tool wear prediction during turning. A random forest algorithm was used as the supervised Machine Learning (ML) classifier for training and binary classification, using acceleration, acoustics, temperature, and spindle speed during the orthogonal tube turning process as input features. The ML classifier was used to predict the condition of the tool after the cutting process, which was expressed as a binary class indicating whether the cutting tool was available or failed. After the training process, the Shapley criterion was used to explain the predictions of the trained ML classifier. Specifically, the significance of each input feature in the decision-making and classification was identified to explain the reasoning behind the ML classifier's predictions. After implementing the Shapley criterion on all testing datasets, the tool temperature was identified as the most significant feature in determining the classification of available versus failed cutting tools. Hence, this research demonstrates the capability of XAI to give machining operators the ability to diagnose and understand complex ML classifiers in the prediction of tool wear.
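The Shapley criterion referred to above attributes a prediction to features by averaging each feature's marginal contribution over all orderings in which features can be added. The sketch below computes exact Shapley values for a tiny, made-up value function. The feature names follow the abstract, but `toy_value` and every number in it are invented for illustration and are not results from the paper, which applies the criterion to a trained random forest.

```python
from itertools import permutations


def shapley_values(value, features):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of feature names to a number. The cost grows
    factorially with the number of features, so this only suits toy cases.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_value(coalition):
    """Hypothetical model output when only `coalition` features are known."""
    score = 0.0
    if "temperature" in coalition:
        score += 0.5
    if "acoustics" in coalition:
        score += 0.2
    if {"temperature", "acceleration"} <= coalition:
        score += 0.1  # interaction term, split evenly between the two features
    return score


features = ["acceleration", "acoustics", "temperature", "spindle_speed"]
print(shapley_values(toy_value, features))
```

The attributions always sum to the difference between the value of the full feature set and that of the empty set, which is what makes them usable as a per-feature breakdown of a single prediction.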

machine learning, natural language, prediction, (18 more...)

2308.08765

Country:

North America > United States > Michigan (0.05)
North America > United States > New York > Monroe County > Rochester (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.91)

Efficient Commercial Bank Customer Credit Risk Assessment Based on LightGBM and Feature Engineering

Sun, Yanjie, Gong, Zhike, Shi, Quan, Chen, Lin

Effective control of credit risk is a key link in the steady operation of commercial banks. This paper is mainly based on the customer information dataset of a foreign commercial bank on Kaggle, and we use the LightGBM algorithm to build a classifier that classifies customers, helping the bank judge the likelihood of customer credit default. The paper mainly deals with feature engineering, such as missing value processing, encoding and handling imbalanced samples, which greatly improves the machine learning results. The main innovation of this paper is to construct new feature attributes on the basis of the original dataset so that the accuracy of the classifier reaches 0.734 and the AUC reaches 0.772, which is higher than that of many classifiers based on the same dataset. The model can provide a reference for commercial banks' credit granting, and also provide feature processing ideas for other similar studies.

algorithm, artificial intelligence, machine learning, (16 more...)

2308.08762

Country:

Asia > China > Sichuan Province > Chengdu (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance > Credit (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Singh, Anant, Gupta, Akshat

Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition

Recent advancements in transformer-based speech representation models have greatly transformed speech processing. However, there has been limited research on evaluating these models for speech emotion recognition (SER) across multiple languages and on examining their internal representations. This article addresses these gaps by presenting a comprehensive benchmark for SER with eight speech representation models and six different languages. We conducted probing experiments to gain insights into the inner workings of these models for SER. We find that using features from a single optimal layer of a speech model reduces the error rate by 32% on average across seven datasets when compared to systems where features from all layers of speech models are used. We also achieve state-of-the-art results for the German and Persian languages. Our probing results indicate that the middle layers of speech models capture the most important emotional information for speech emotion recognition.

machine learning, natural language, recognition, (18 more...)

2308.08713

Country:

North America > United States > New York (0.05)
Europe > Latvia > Riga Municipality > Riga (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.85)

Khadka, Puskal, Lamichhane, Prabhav

Content-based Recommendation Engine for Video Streaming Platform

Recommendation engines suggest content, products or services to the user by using machine learning algorithms. This paper proposes a content-based recommendation engine that provides video suggestions to users based on their previous interests and choices. We will use the TF-IDF text vectorization method to determine the relevance of words in a document. Then we will determine the similarity between items by calculating the cosine similarity between them. Finally, the engine will recommend videos to users based on the obtained similarity scores. In addition, we will measure the engine's performance by computing the precision, recall, and F1 score of the proposed system.
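The pipeline described above (TF-IDF vectors, then cosine similarity, then ranking) can be sketched with the standard library alone. This is a minimal illustration under simplifying assumptions: whitespace tokenization, unsmoothed IDF, and hypothetical video descriptions. The function names are chosen for this sketch and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Sparse TF-IDF vector (term -> weight) for each document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is zero)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(watched_index, documents, top_k=2):
    """Indices of the `top_k` documents most similar to the watched one."""
    vectors = tfidf_vectors(documents)
    scored = [
        (cosine_similarity(vectors[watched_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != watched_index
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]


# Hypothetical video descriptions, invented for this example.
videos = [
    "space robots explore mars",
    "robots fight in space",
    "romantic comedy in paris",
    "cooking pasta in italy",
]
print(recommend(0, videos, top_k=1))
```

Words that appear in every description receive an IDF of zero and so contribute nothing to the similarity; in this sketch only rarer, shared terms such as "space" and "robots" drive the ranking.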

artificial intelligence, machine learning, natural language, (18 more...)

2308.08406

Country:

North America > United States > Texas (0.05)
Asia > Nepal > Bagmati Province > Kathmandu District > Kathmandu (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment (0.70)
Media > Film (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.32)

DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models

Gao, Ruiyuan, Zhao, Chenchen, Hong, Lanqing, Xu, Qiang

Given a classifier, the inherent property of semantic Out-of-Distribution (OOD) samples is that their contents differ from all legal classes in terms of semantics, namely semantic mismatch. A recent work applies this property directly to OOD detection, employing a conditional Generative Adversarial Network (cGAN) to enlarge semantic mismatch in the image space. While achieving remarkable OOD detection performance on small datasets, it is not applicable to ImageNet-scale datasets due to the difficulty of training cGANs with both input images and labels as conditions. As diffusion models are much easier to train and amenable to various conditions compared to cGANs, in this work, we propose to directly use pre-trained diffusion models for semantic mismatch-guided OOD detection, named DiffGuard. Specifically, given an OOD input image and the predicted label from the classifier, we try to enlarge the semantic difference between the reconstructed OOD image under these conditions and the original input image. We also present several test-time techniques to further strengthen such differences. Experimental results show that DiffGuard is effective on both CIFAR-10 and hard cases of the large-scale ImageNet, and it can be easily combined with existing OOD detection techniques to achieve state-of-the-art OOD detection results.

diffusion model, machine learning, natural language, (18 more...)

2308.07687

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)