AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

#artificialintelligence Feb-26-2022, 08:40:06 GMT

Hyperspectral Image Segmentation

A simple image captured by a camera consists of colors of different wavelengths (the visible spectrum), which can be represented as a combination of three colors: red, green, and blue (RGB). Thus digital…

classification, information, prediction, (16 more...)

Country:

North America > United States > Indiana (0.04)
North America > United States > Florida > Brevard County (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)

#artificialintelligence Feb-26-2022, 06:56:25 GMT

Fuzzy Bootstrap Matching - DataScienceCentral.com

This paper discusses techniques for merging data files where no key field exists between the files. The paper will illustrate an approach to resolve two issues that are common to most fuzzy matching techniques: 1) how to weight proxy identifier fields, and 2) how to measure the Type One and Type Two errors of the merge estimation algorithm. A common requirement in analytics is to merge records in two or more large sets of information (i.e., thousands if not millions of records) where no exact key exists to match records between the information sets. When no exact key between the two data sets exists, a common merging solution is to use "fuzzy" matching. "Fuzzy" matching uses proxy keys as substitute keys to match records between the two data files.
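As a rough illustration of matching on weighted proxy keys, the sketch below scores record pairs with Python's standard `difflib.SequenceMatcher`. The field names, weights, and threshold are hypothetical and are not taken from the paper, whose own weighting scheme is not shown in this summary.

```python
from difflib import SequenceMatcher

# Hypothetical proxy-key weights (assumption; not the paper's weighting scheme).
WEIGHTS = {"name": 0.5, "birth_date": 0.3, "zip": 0.2}

def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two field values, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def match_score(rec_a: dict, rec_b: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of per-field similarities over the proxy keys."""
    total = sum(weights.values())
    return sum(w * field_similarity(rec_a[f], rec_b[f]) for f, w in weights.items()) / total

def best_match(record: dict, candidates: list, threshold: float = 0.85):
    """Return (candidate, score) for the top-scoring candidate, or (None, score) below threshold."""
    score, idx = max((match_score(record, c), i) for i, c in enumerate(candidates))
    return (candidates[idx], score) if score >= threshold else (None, score)

# Two toy files with no shared key field.
file_a = {"name": "Jon Smith", "birth_date": "1980-02-14", "zip": "46202"}
file_b = [
    {"name": "John Smith", "birth_date": "1980-02-14", "zip": "46202"},
    {"name": "Joan Smythe", "birth_date": "1975-11-02", "zip": "32901"},
]
match, score = best_match(file_a, file_b)
print(match, round(score, 3))
```

Raising the threshold trades false matches for missed matches, which is where the two error types mentioned above come in.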

accuracy, holdout sample record, proxy key, (13 more...)

Country: North America > United States (0.30)

Genre: Research Report (0.51)

Industry: Health & Medicine (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.98)

#artificialintelligence Feb-26-2022, 05:10:42 GMT

Computer Vision and Machine Learning for Tuna and Salmon Meat Classification

Aquatic products are popular among consumers, and their visual quality has traditionally been assessed manually to judge freshness. This paper presents a solution to inspect tuna and salmon meat from digital images. The solution proposes hardware and a protocol for preprocessing images and extracting parameters from the RGB, HSV, HSI, and L*a*b* spaces of the collected images to generate the datasets. Experiments are performed using machine learning classification methods. We evaluated the AutoML models to classify the freshness levels of tuna and salmon samples using the following metrics: accuracy, receiver operating characteristic curve, precision, recall, F1-score, and confusion matrix (CM). The ensembles generated by AutoML, for both tuna and salmon, reached 100% in all metrics, indicating that the fish freshness inspection method, from image collection through preprocessing and feature extraction/fitting, showed exceptional results when the datasets were subjected to the machine learning models. We emphasize how easy it is to use the proposed solution in different contexts. Computer vision and machine learning, as a nondestructive method, were viable for external quality detection of tuna and salmon meat products through their efficiency, objectiveness, consistency, and reliability, as reflected in the experiments' high accuracy.
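To make the feature-extraction step more concrete, here is a minimal sketch that derives a few RGB and HSV parameters from a patch of pixels using the standard `colorsys` module. The choice of features (channel means, and HSV of the mean color) and the sample pixel values are assumptions for illustration; the paper's actual protocol is not described in this summary.

```python
import colorsys

def color_features(pixels):
    """pixels: list of (r, g, b) tuples with 0-255 channels. Returns a feature dict."""
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    # colorsys works on channels scaled to [0, 1]; HSV is taken from the mean color.
    hue, sat, val = colorsys.rgb_to_hsv(mean_r / 255, mean_g / 255, mean_b / 255)
    return {
        "mean_r": mean_r, "mean_g": mean_g, "mean_b": mean_b,
        "hue": hue, "sat": sat, "val": val,
    }

# Hypothetical reddish patch, standing in for a region of a meat sample image.
patch = [(200, 60, 50), (210, 70, 60), (190, 50, 40)]
features = color_features(patch)
print(features)
```

Each image would contribute one such feature row to the dataset passed to the classifiers.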

accuracy, computer vision and machine learning, tuna and salmon meat classification, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.62)

Yong, Bang Xiang, Brintrup, Alexandra

Bayesian autoencoders with uncertainty quantification: Towards trustworthy anomaly detection

arXiv.org Machine Learning, Feb-25-2022

Despite numerous studies of deep autoencoders (AEs) for unsupervised anomaly detection, AEs still lack a way to express uncertainty in their predictions, which is crucial for ensuring safe and trustworthy machine learning systems in high-stakes applications. Therefore, in this work, the formulation of Bayesian autoencoders (BAEs) is adopted to quantify the total anomaly uncertainty, comprising epistemic and aleatoric uncertainties. To evaluate the quality of uncertainty, we consider the task of classifying anomalies with the additional option of rejecting predictions of high uncertainty. In addition, we use the accuracy-rejection curve and propose the weighted average accuracy as a performance metric. Our experiments demonstrate the effectiveness of the BAE and total anomaly uncertainty on a set of benchmark datasets and two real datasets for manufacturing: one for condition monitoring, the other for quality inspection.
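The accuracy-rejection curve mentioned above can be sketched in a few lines: reject the most uncertain predictions first and score whatever remains. The labels, predictions, and uncertainty values below are made up for illustration, and the paper's proposed weighted average accuracy is not reproduced here because its definition is not given in this summary.

```python
def accuracy_rejection_curve(y_true, y_pred, uncertainty, rejection_rates):
    """For each rejection rate, drop the most uncertain fraction and score the rest."""
    order = sorted(range(len(y_true)), key=lambda i: uncertainty[i])  # most certain first
    curve = []
    for rate in rejection_rates:
        keep = order[: max(1, round(len(order) * (1 - rate)))]
        acc = sum(y_true[i] == y_pred[i] for i in keep) / len(keep)
        curve.append((rate, acc))
    return curve

# Toy data (assumption): the two wrong predictions carry the highest uncertainty.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1, 1, 1]
uncertainty = [0.1, 0.2, 0.1, 0.9, 0.3, 0.2, 0.8, 0.4]
curve = accuracy_rejection_curve(y_true, y_pred, uncertainty, [0.0, 0.25, 0.5])
print(curve)
```

When uncertainty is informative, accuracy on the retained predictions rises as the rejection rate grows; a flat curve suggests the uncertainty estimate carries little signal.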

anomaly uncertainty, data mining, machine learning, (20 more...)

2202.12653

Country: Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (0.46)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

arXiv.org Machine Learning, Feb-25-2022

Trying to Outrun Causality with Machine Learning: Limitations of Model Explainability Techniques for Identifying Predictive Variables

Vowels, Matthew J.

Machine learning explainability techniques have been proposed as a means of 'explaining' or interrogating a model in order to understand why a particular decision or prediction has been made. Such an ability is especially important at a time when machine learning is being used to automate decision processes which concern sensitive factors and legal outcomes. Indeed, it is even a requirement according to EU law. Furthermore, researchers concerned with imposing an overly restrictive functional form (e.g., as would be the case in a linear regression) may be motivated to use machine learning algorithms in conjunction with explainability techniques, as part of exploratory research, with the goal of identifying important variables which are associated with an outcome of interest. For example, epidemiologists might be interested in identifying 'risk factors' (i.e., factors which affect recovery from disease) by using random forests and assessing variable relevance using importance measures. However, as we demonstrate, machine learning algorithms are not as flexible as they might seem, and are instead incredibly sensitive to the underlying causal structure in the data. The consequence is that predictors which are, in fact, critical to a causal system and highly correlated with the outcome may nonetheless be deemed by explainability techniques to be unrelated to, unimportant for, or unpredictive of the outcome. Rather than being a limitation of explainability techniques per se, we show that this is a consequence of the mathematical implications of regression, and the interaction of these implications with the associated conditional independencies of the underlying causal structure. We provide some alternative recommendations for researchers wanting to explore the data for important variables.
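One way to see the effect described above is a small simulation with a mediator. In the sketch below, X causes M and M causes Y, so X is strongly correlated with Y, yet once M is in the regression the coefficient on X is close to zero. The chain structure, coefficients, and noise levels are assumptions chosen for illustration, not an example taken from the paper.

```python
import random

random.seed(0)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
m = [2.0 * xi + random.gauss(0, 0.5) for xi in x]  # X causes M
y = [3.0 * mi + random.gauss(0, 0.5) for mi in m]  # M causes Y; X acts only through M

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def ols_two_predictors(x1, x2, target):
    """Slopes of target ~ x1 + x2 (with intercept), from the 2x2 normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, target), cov(x2, target)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

corr_xy = cov(x, y) / (cov(x, x) * cov(y, y)) ** 0.5
b_x, b_m = ols_two_predictors(x, m, y)
print(f"corr(X, Y) = {corr_xy:.3f}, slope on X = {b_x:.3f}, slope on M = {b_m:.3f}")
```

Y is conditionally independent of X given M in this setup, so the regression assigns X almost no weight even though removing X from the system would remove all variation in Y that it drives.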

algorithm, graph, random forest, (14 more...)

2202.09875

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Government > Regional Government > Europe Government (0.48)
Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Servizi, Valentino, Persson, Dan R., Pereira, Francisco C., Villadsen, Hannah, Bækgaard, Per, Peled, Inon, Nielsen, Otto A.

"Is not the truth the truth?": Analyzing the Impact of User Validations for Bus In/Out Detection in Smartphone-based Surveys

arXiv.org Artificial Intelligence, Feb-24-2022

Passenger flow allows the study of users' behavior through the public network and assists in designing new facilities and services. This flow is observed through interactions between passengers and infrastructure. For this task, Bluetooth technology and smartphones represent the ideal solution. The latter component allows users' identification, authentication, and billing, while the former allows short-range implicit interactions, device-to-device. To assess the potential of such a use case, we need to verify how robust the Bluetooth signal and related machine learning (ML) classifiers are against the noise of realistic contexts. Therefore, we model binary passenger states with respect to a public vehicle, where one can either be-in or be-out (BIBO). The BIBO label identifies a fundamental building block of continuously-valued passenger flow. This paper describes the human-computer interaction experimental setting in a semi-controlled environment, which involves two autonomous vehicles operating on two routes, serving three bus stops and eighteen users, as well as a proprietary smartphone-Bluetooth sensing platform. The resulting dataset includes multiple sensors' measurements of the same event and two ground-truth levels, the first being validation by participants, the second by three video cameras surveilling buses and track. We performed a Monte Carlo simulation of label flips to emulate human errors in the labeling process, as is known to happen in smartphone surveys; next we used the flipped labels for supervised training of ML classifiers. The impact of errors on model performance bias can be large. Results show ML tolerance to label flips caused by human or machine errors up to 30%.
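The label-flip experiment can be illustrated with a toy Monte Carlo loop: flip a fraction of binary labels at random, train on the flipped labels, and score against the clean ones. Everything below is an assumption made for illustration (a synthetic one-dimensional signal, a nearest-centroid threshold classifier, scoring on the training points), so its numbers say nothing about the paper's dataset or its 30% figure.

```python
import random

def flip_labels(labels, rate, rng):
    """Flip each binary label independently with probability `rate`."""
    return [1 - y if rng.random() < rate else y for y in labels]

def fit_threshold(xs, ys):
    """1-D nearest-centroid classifier: threshold halfway between the class means."""
    ones = [x for x, y in zip(xs, ys) if y == 1]
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    m1, m0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
    return (m0 + m1) / 2, m1 > m0

def accuracy(xs, ys, threshold, one_is_high):
    """Share of points whose thresholded prediction equals the given label."""
    preds = [int((x > threshold) == one_is_high) for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

def simulate(rate, trials=100, n=400, seed=0):
    """Mean accuracy against clean labels after training on flipped labels."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        ys = [rng.randint(0, 1) for _ in range(n)]
        # Synthetic signal strength: be-in (1) readings are stronger than be-out (0).
        xs = [rng.gauss(-70 + 15 * y, 5) for y in ys]
        thr, one_is_high = fit_threshold(xs, flip_labels(ys, rate, rng))
        scores.append(accuracy(xs, ys, thr, one_is_high))
    return sum(scores) / trials

for rate in (0.0, 0.1, 0.3, 0.45):
    print(f"flip rate {rate:.2f}: accuracy {simulate(rate):.3f}")
```

With symmetric flips the two class means move toward each other but their midpoint barely shifts, which is one simple mechanism by which a classifier can tolerate substantial label noise.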

artificial intelligence, classifier, machine learning, (14 more...)


doi: 10.1109/TITS.2023.3291493

2202.11961

Country:

Europe > Denmark (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > Newfoundland and Labrador > Labrador (0.04)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (0.49)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning, Feb-24-2022

Stacked Residuals of Dynamic Layers for Time Series Anomaly Detection

Zancato, L., Achille, A., Paolini, G., Chiuso, A., Soatto, S.

We present an end-to-end differentiable neural network architecture to perform anomaly detection in multivariate time series by incorporating a Sequential Probability Ratio Test on the prediction residual. The architecture is a cascade of dynamical systems designed to separate linearly predictable components of the signal, such as trends and seasonality, from the non-linear ones. The former are modeled by local Linear Dynamic Layers, and their residual is fed to a generic Temporal Convolutional Network that also aggregates global statistics from different time series as context for the local predictions of each one. The last layer implements the anomaly detector, which exploits the temporal structure of the prediction residuals to detect both isolated point anomalies and set-point changes. It is based on a novel application of the classic CUSUM algorithm, adapted through the use of a variational approximation of f-divergences. The model automatically adapts to the time scales of the observed signals. It approximates a SARIMA model at the get-go, and auto-tunes to the statistics of the signal and its covariates, without the need for supervision, as more data is observed. The resulting system, which we call STRIC, outperforms both state-of-the-art robust statistical methods and deep neural network architectures on multiple anomaly detection benchmarks.
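For readers unfamiliar with the classic cumulative-sum change detector that the final layer builds on, here is a minimal textbook-style two-sided version applied to prediction residuals. It is not STRIC's variational variant; the `drift` and `threshold` values and the residual series are assumptions for illustration.

```python
def cusum(residuals, drift=0.5, threshold=5.0):
    """Two-sided cumulative-sum detector: indices where accumulated deviation exceeds threshold."""
    pos = neg = 0.0
    alarms = []
    for t, r in enumerate(residuals):
        pos = max(0.0, pos + r - drift)  # evidence for an upward shift
        neg = max(0.0, neg - r - drift)  # evidence for a downward shift
        if pos > threshold or neg > threshold:
            alarms.append(t)
            pos = neg = 0.0  # restart accumulation after an alarm
    return alarms

# Small residuals around zero, then a sustained shift starting at index 20.
residuals = [0.1, -0.2, 0.0, 0.3, -0.1] * 4 + [2.0] * 6
alarms = cusum(residuals)
print(alarms)
```

Because evidence accumulates over time, a sustained set-point change triggers an alarm after a few steps, while small fluctuations below `drift` never accumulate.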

likelihood ratio, stric, time series, (15 more...)

2202.12457

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Indian Ocean (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry:

Government (0.46)
Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine Learning, Feb-24-2022

A general framework for adaptive two-index fusion attribute weighted naive Bayes

Zhou, Xiaoliang, Wu, Dongyang, You, Zitong, Zhang, Li, Ye, Ning

Naive Bayes (NB) is one of the essential algorithms in data mining. However, it is rarely used in practice because of its attribute independence assumption. Researchers have proposed many improved NB methods to relax this assumption. Among these methods, filter attribute weighted NB methods have received great attention because of their high efficiency and easy implementation. However, several challenges remain, such as the poor representation ability of a single index and the problem of fusing two indexes. To overcome these challenges, we propose a general framework for Adaptive Two-index Fusion attribute weighted NB (ATFNB). Two categories of data description are used to represent the correlation between classes and attributes and the intercorrelation between attributes, respectively. ATFNB can select any one index from each category. Then, we introduce a switching factor β to fuse the two indexes, which adaptively adjusts the optimal ratio of the two indexes on various datasets. A quick algorithm is proposed to infer the optimal interval of the switching factor β. Finally, the weight of each attribute is calculated using the optimal value of β and is integrated into the NB classifier to improve accuracy. The experimental results on 50 benchmark datasets and a Flavia dataset show that ATFNB outperforms the basic NB and state-of-the-art filter weighted NB models. In addition, the ATFNB framework can improve the existing two-index NB model by introducing the adaptive switching factor β. Auxiliary experimental results demonstrate that the improved model significantly increases accuracy compared to the original model without the adaptive switching factor β.
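As a sketch of how attribute weights enter a naive Bayes classifier, the code below scores each class as log P(c) plus a weighted sum of log P(x_j | c). The `fuse` rule (a convex combination controlled by β) is purely a placeholder assumption, since the abstract does not state the actual fusion rule; the index values and the toy dataset are also hypothetical.

```python
import math
from collections import Counter, defaultdict

def fuse(index_a, index_b, beta):
    """Placeholder fusion (assumption): convex combination of two per-attribute indexes."""
    return [beta * a + (1 - beta) * b for a, b in zip(index_a, index_b)]

def train(rows, labels):
    """Collect class counts, per-(attribute, class) value counts, and attribute domains."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> distinct values seen
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, c)][v] += 1
            domains[j].add(v)
    return class_counts, value_counts, domains

def predict(row, model, weights):
    """Attribute weighted NB: argmax over log P(c) + sum_j w_j * log P(x_j | c)."""
    class_counts, value_counts, domains = model
    n = sum(class_counts.values())
    scores = {}
    for c, nc in class_counts.items():
        score = math.log(nc / n)
        for j, v in enumerate(row):
            # Laplace-smoothed conditional probability.
            p = (value_counts[(j, c)][v] + 1) / (nc + len(domains[j]))
            score += weights[j] * math.log(p)
        scores[c] = score
    return max(scores, key=scores.get)

# Toy data: attribute 0 separates the classes, attribute 1 is uninformative.
rows = [("sunny", "a"), ("sunny", "b"), ("rainy", "a"), ("rainy", "b")]
labels = ["play", "play", "stay", "stay"]
weights = fuse([0.9, 0.1], [0.7, 0.3], beta=0.5)
prediction = predict(("sunny", "b"), train(rows, labels), weights)
print(weights, prediction)
```

Setting every weight to 1 recovers standard naive Bayes, so the weights can be read as how much each attribute's evidence is trusted.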

algorithm, atfnb, dataset, (16 more...)

2202.11963

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.48)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, Feb-23-2022

Attainability and Optimality: The Equalized Odds Fairness Revisited

Tang, Zeyu, Zhang, Kun

Fairness of machine learning algorithms has been of increasing interest. In order to suppress or eliminate discrimination in prediction, various notions as well as approaches have been proposed to impose fairness. Given a notion of fairness, an essential problem is then whether or not it can always be attained, even with an unlimited amount of data. This issue, however, has not yet been well addressed. In this paper, focusing on the Equalized Odds notion of fairness, we consider the attainability of this criterion and, furthermore, if it is attainable, the optimality of the prediction performance under various settings. In particular, for prediction performed by a deterministic function of input features, we give conditions under which Equalized Odds can hold true; if stochastic prediction is acceptable, we show that under mild assumptions, fair predictors can always be derived. For classification, we further prove that compared to enforcing fairness by post-processing, one can always benefit from exploiting all available features during training and get potentially better prediction performance while remaining fair. Moreover, while stochastic prediction can attain Equalized Odds with theoretical guarantees, we also discuss its limitations and potential negative social impacts.
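For a binary classifier, Equalized Odds requires the true positive rate and the false positive rate to be equal across protected groups. The sketch below measures how far a set of predictions is from that condition; the group labels and predictions are made-up toy data, and the function names are hypothetical.

```python
def rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), fp / (fp + tn)

def equalized_odds_gaps(y_true, y_pred, group):
    """Largest between-group gaps in TPR and FPR; both are 0 under Equalized Odds."""
    per_group = {}
    for g in set(group):
        idx = [i for i, gi in enumerate(group) if gi == g]
        per_group[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    tprs = [r[0] for r in per_group.values()]
    fprs = [r[1] for r in per_group.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

# Toy data (assumption): group B receives positive predictions more readily than group A.
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
tpr_gap, fpr_gap = equalized_odds_gaps(y_true, y_pred, group)
print(tpr_gap, fpr_gap)
```

Each group needs at least one positive and one negative example for both rates to be defined; the sketch does not guard against empty cells.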

equalized odds, prediction, predictor, (17 more...)

2202.11853

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Florida > Broward County (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)