AITopics | Accuracy

Accuracy

Online Change Point Detection for Weighted and Directed Random Dot Product Graphs

Marenco, Bernardo, Bermolen, Paola, Fiori, Marcelo, Larroca, Federico, Mateos, Gonzalo

arXiv.org Machine Learning, Jan-26-2022

Given a sequence of random (directed and weighted) graphs, we address the problem of online monitoring and detection of changes in the underlying data distribution. Our idea is to endow sequential change-point detection (CPD) techniques with a graph representation learning substrate based on the versatile Random Dot Product Graph (RDPG) model. We consider efficient, online updates of a judicious monitoring function, which quantifies the discrepancy between the streaming graph observations and the nominal RDPG. This reference distribution is inferred via spectral embeddings of the first few graphs in the sequence. We characterize the distribution of this running statistic to select thresholds that guarantee error-rate control, and under simplifying approximations we offer insights on the algorithm's detection resolution and delay. The end result is a lightweight online CPD algorithm that is also explainable by virtue of the well-appreciated interpretability of RDPG embeddings. This is in stark contrast with most existing graph CPD approaches, which either rely on extensive computation or store and process the entire observed time series. An apparent limitation of the RDPG model is its suitability for undirected and unweighted graphs only, a gap we aim to close here to broaden the scope of the CPD framework. Unlike previous proposals, our non-parametric RDPG model for weighted graphs does not require a priori specification of the weights' distribution to perform inference and estimation. This network modeling contribution is of independent interest beyond CPD. We offer an open-source implementation of the novel online CPD algorithm for weighted and directed graphs, whose effectiveness and efficiency are demonstrated via (reproducible) synthetic and real network data experiments.
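To make the embedding step in the abstract above more concrete, here is a minimal sketch of an adjacency spectral embedding for the simplest (undirected, unweighted) RDPG case, together with a toy discrepancy score. It is not the authors' implementation or their monitoring statistic: the function names `adjacency_spectral_embedding` and `residual_energy`, the squared Frobenius residual, and the use of noise-free probability matrices in place of sampled graphs are all assumptions made for illustration.

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """Return an n x d matrix X whose product X @ X.T is a rank-d fit of A."""
    eigvals, eigvecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]        # indices of the d largest eigenvalues
    # Clip tiny negative eigenvalues caused by floating-point error.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

def residual_energy(A, X):
    """Illustrative discrepancy: squared Frobenius norm of A - X X^T."""
    return float(np.linalg.norm(A - X @ X.T, ord="fro") ** 2)

# Hypothetical latent positions; P[i, j] = x[i] * x[j] is the edge probability.
x = np.array([0.5, 0.6, 0.7, 0.8])
P_nominal = np.outer(x, x)
X_hat = adjacency_spectral_embedding(P_nominal, d=1)

# A different set of latent positions stands in for a post-change graph.
y = np.array([0.8, 0.7, 0.2, 0.1])
P_changed = np.outer(y, y)

print(residual_energy(P_nominal, X_hat))   # close to zero
print(residual_energy(P_changed, X_hat))   # clearly larger
```

In an online setting one would presumably update such a score as each new graph arrives and compare it with a threshold; how the paper does this is described in the paper itself, not here.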

graph, matrix, statistic, (15 more...)

arXiv.org Machine Learning

doi: 10.1109/TSIPN.2022.3149098

2201.11222

Country:

South America > Uruguay (0.04)
South America > Venezuela (0.04)
South America > Argentina (0.04)
(10 more...)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Sports (0.46)
Information Technology (0.34)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Designing UIs for Static-Analysis Tools

Communications of the ACM, Jan-25-2022, 12:07:40 GMT

Past research has shown that static-analysis tools suffer from common usability issues such as a high rate of false positives, lack of responsiveness, and unclear warning descriptions and classifications. To address these usability issues, Lisa Nguyen Quang Do et al. [20] proposed a user-centered approach to designing such tools during the development of the analysis, as opposed to keeping the development of the analysis and its user interface (UI) separate. To this end, they defined 10 guidelines for designing the UI of an analysis tool. The authors extracted those guidelines from existing literature and from a study they conducted across 17 static-analysis tools and 87 software developers at Software AG. The guidelines consider analysis engine requirements, user behavior, reporting platforms, and the effects of company policies on the usage and adoption of static-analysis tools. [18] This article explores the effect of applying this user-centered approach and the design guidelines to SWAN [26], a security-focused static-analysis tool for the Swift programming language. SWAN is being actively developed to feature better integration into the Swift development workflow, a faster and more precise analysis engine, and a new UI. Our goal is to evaluate the effectiveness of the approach and guidelines for improving the usability of the next version of SWAN. SWAN is being created to address the lack of openly available static-analysis tools for Swift.

developer, nguyen quang, swan, (16 more...)

Communications of the ACM

Country:

North America > Canada > Alberta (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)

Industry: Information Technology > Software (0.48)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Software Engineering (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Image Classification using Machine Learning - Analytics Vidhya

#artificialintelligence, Jan-25-2022, 06:45:50 GMT

This article was published as a part of the Data Science Blogathon. In this blog, we discuss how to perform image classification using four popular machine learning algorithms: Random Forest Classifier, KNN, Decision Tree Classifier, and Naive Bayes classifier. We jump directly into a step-by-step implementation. By the end of the article, you will understand why Deep Learning is preferred for image classification. However, the work demonstrated here can still serve research purposes if one wants to compare a CNN image classifier against some classical machine learning algorithms.
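As a rough illustration of how a classical algorithm can be applied to images, the sketch below implements one of the four methods named above, k-nearest neighbours, from scratch on flattened pixel arrays. It is not the article's code: the function `knn_predict`, the synthetic 4x4 "dark" and "bright" images, and the choice of `k=3` are all assumptions made for this example.

```python
import numpy as np

def knn_predict(train_images, train_labels, query_images, k=3):
    """Classify each query image by majority vote among its k nearest training images."""
    # Flatten every image into one feature vector of raw pixel values.
    X = train_images.reshape(len(train_images), -1).astype(float)
    Q = query_images.reshape(len(query_images), -1).astype(float)
    # Squared Euclidean distance between every query and every training image.
    dists = ((Q[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic data: class 0 is dark images, class 1 is bright images.
rng = np.random.default_rng(0)
dark = rng.uniform(0.0, 0.3, size=(10, 4, 4))
bright = rng.uniform(0.7, 1.0, size=(10, 4, 4))
train_images = np.concatenate([dark, bright])
train_labels = np.array([0] * 10 + [1] * 10)

queries = np.stack([np.full((4, 4), 0.1), np.full((4, 4), 0.9)])
predictions = knn_predict(train_images, train_labels, queries)
print(predictions)
```

Because the classifier sees only raw pixel vectors, it has no notion of spatial structure, which is one commonly cited reason convolutional networks tend to do better on real images.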

accuracy, algorithm, dataset, (10 more...)

#artificialintelligence

Industry: Transportation (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.84)
(2 more...)

Prediction of Neonatal Respiratory Distress in Term Babies at Birth from Digital Stethoscope Recorded Chest Sounds

Grooby, Ethan, Sitaula, Chiranjibi, Tan, Kenneth, Zhou, Lindsay, King, Arrabella, Ramanathan, Ashwin, Malhotra, Atul, Dumont, Guy A., Marzbanrad, Faezeh

arXiv.org Artificial Intelligence, Jan-25-2022

Neonatal respiratory distress is a common condition that, if left untreated, can lead to short- and long-term complications. This paper investigates the use of digital stethoscope recorded chest sounds, taken within 1 min post-delivery, to enable early detection and prediction of neonatal respiratory distress. Fifty-one term newborns were included in this study, 9 of whom developed respiratory distress. For each newborn, 1 min anterior and posterior recordings were taken. These recordings were pre-processed to remove noisy segments and obtain high-quality heart and lung sounds. The random undersampling boosting (RUSBoost) classifier was then trained on a variety of features, such as power and vital sign features extracted from the heart and lung sounds. The RUSBoost algorithm produced specificity, sensitivity, and accuracy results of 85.0%, 66.7% and 81.8%, respectively.
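For readers unfamiliar with the three metrics reported above, the following sketch shows how specificity, sensitivity, and accuracy are derived from a confusion matrix. The counts passed in are hypothetical and chosen only to demonstrate the arithmetic; they are not the study's confusion matrix, and `classification_metrics` is a name introduced here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of positive cases detected
    specificity = tn / (tn + fp)                 # share of negative cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy

# Hypothetical counts for an imbalanced problem (few positive cases).
sensitivity, specificity, accuracy = classification_metrics(tp=6, fn=3, tn=34, fp=6)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} accuracy={accuracy:.3f}")
```

With few positive cases, accuracy is dominated by the majority class, which is why sensitivity and specificity are usually reported alongside it.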

artificial intelligence, detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/EMBC48229.2022.9871449

2201.10105

Country:

Oceania > Australia > Victoria > Melbourne (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
Africa > Cameroon (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
Health & Medicine > Therapeutic Area > Internal Medicine (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.91)

DebtFree: Minimizing Labeling Cost in Self-Admitted Technical Debt Identification using Semi-Supervised Learning

Tu, Huy, Menzies, Tim

arXiv.org Artificial Intelligence, Jan-25-2022

Keeping track of and managing Self-Admitted Technical Debts (SATDs) is important for maintaining a healthy software project. The current active-learning SATD recognition tool requires manual inspection of 24% of the test comments on average to reach 90% recall. Among all the test comments, about 5% are SATDs. Human experts are therefore required to read almost five times as many comments as there are SATDs, which indicates the inefficiency of the tool. Moreover, human experts are still prone to error: 95% of the false-positive labels from previous work were actually true positives. To solve the above problems, we propose DebtFree, a two-mode framework based on unsupervised learning for identifying SATDs. In mode1, when the existing training data is unlabeled, DebtFree starts with an unsupervised learner to automatically pseudo-label the programming comments in the training data. In contrast, in mode2, where labels are available with the corresponding training data, DebtFree starts with a pre-processor that identifies the highly prone SATDs from the test dataset. Then, our machine learning model is employed to assist human experts in manually identifying the remaining SATDs. Our experiments on 10 software projects show that both models yield a statistically significant improvement in effectiveness over the state-of-the-art automated and semi-automated models. Specifically, DebtFree can reduce the labeling effort by 99% in mode1 (unlabeled training data) and by up to 63% in mode2 (labeled training data), while improving the current active learner's F1 by almost 100% in relative terms.
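The reading overhead quoted in the abstract above follows from two of its figures: 24% of comments inspected and about 5% of comments being SATDs. A short worked calculation, with variable and function names chosen here for illustration, might look like this:

```python
def inspection_ratio(inspected_fraction, satd_fraction):
    """Comments a human reads per SATD comment present in the test set."""
    return inspected_fraction / satd_fraction

# Figures taken from the abstract: 24% inspected, about 5% are SATDs.
ratio = inspection_ratio(inspected_fraction=0.24, satd_fraction=0.05)
print(f"{ratio:.1f} comments read per SATD comment")
```

The result, 4.8, is the "almost fivefold" overhead; since only 90% of the SATDs are actually found at that point, the cost per SATD found is somewhat higher still.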

debtfree, satd, training data, (16 more...)

arXiv.org Artificial Intelligence

2201.10592

Country:

South America > Uruguay > Maldonado > Maldonado (0.05)
North America > United States > North Carolina (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.87)

The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization

Pilán, Ildikó, Lison, Pierre, Øvrelid, Lilja, Papadopoulou, Anthi, Sánchez, David, Batet, Montserrat

arXiv.org Artificial Intelligence, Jan-25-2022

We present a novel benchmark and associated evaluation metrics for assessing the performance of text anonymization methods. Text anonymization, defined as the task of editing a text document to prevent the disclosure of personal information, currently suffers from a shortage of privacy-oriented annotated text resources, making it difficult to properly evaluate the level of privacy protection offered by various anonymization methods. This paper presents TAB (Text Anonymization Benchmark), a new, open-source annotated corpus developed to address this shortage. The corpus comprises 1,268 English-language court cases from the European Court of Human Rights (ECHR) enriched with comprehensive annotations about the personal information appearing in each document, including their semantic category, identifier type, confidential attributes, and co-reference relations. Compared to previous work, the TAB corpus is designed to go beyond traditional de-identification (which is limited to the detection of predefined semantic categories), and explicitly marks which text spans ought to be masked in order to conceal the identity of the person to be protected. Along with presenting the corpus and its annotation layers, we also propose a set of evaluation metrics that are specifically tailored towards measuring the performance of text anonymization, both in terms of privacy protection and utility preservation. We illustrate the use of the benchmark and the proposed metrics by assessing the empirical performance of several baseline text anonymization models. The full corpus, along with its privacy-oriented annotation guidelines, evaluation scripts and baseline models, is available at: https://github.com/NorskRegnesentral/text-anonymisation-benchmark

annotator, anonymization, information, (13 more...)

arXiv.org Artificial Intelligence

2202.00443

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Norway > Eastern Norway > Oslo (0.04)
North America > Montserrat (0.04)
(27 more...)

Genre: Overview (0.92)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

MeltpoolNet: Melt pool Characteristic Prediction in Metal Additive Manufacturing Using Machine Learning

Akbari, Parand, Ogoke, Francis, Kao, Ning-Yu, Meidani, Kazem, Yeh, Chun-Yu, Lee, William, Farimani, Amir Barati

arXiv.org Artificial Intelligence, Jan-25-2022

Characterizing meltpool shape and geometry is essential in metal Additive Manufacturing (MAM) to control the printing process and avoid defects. Predicting meltpool flaws based on process parameters and powder material is difficult due to the complex nature of the MAM process. Machine learning (ML) techniques can be useful in connecting process parameters to the type of flaws in the meltpool. In this work, we introduce a comprehensive framework for benchmarking ML for meltpool characterization. An extensive experimental dataset has been collected from more than 80 MAM articles containing MAM processing conditions, materials, meltpool dimensions, meltpool modes and flaw types. We introduce physics-aware MAM featurization, versatile ML models, and evaluation metrics to create a comprehensive learning framework for meltpool defect and geometry prediction. This benchmark can serve as a basis for meltpool control and process optimization. In addition, data-driven explicit models have been identified to estimate meltpool geometry from process parameters and material properties; these outperform Rosenthal estimation for meltpool geometry while maintaining interpretability.

additive manufacturing, dataset, featurization, (15 more...)

arXiv.org Artificial Intelligence

2201.11662

Country:

Europe > United Kingdom (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.83)

Industry:

Materials > Metals & Mining (1.00)
Machinery > Industrial Machinery (0.88)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.71)

Beyond Visual Image: Automated Diagnosis of Pigmented Skin Lesions Combining Clinical Image Features with Patient Data

Esgario, José G. M., Krohling, Renato A.

arXiv.org Artificial Intelligence, Jan-25-2022

Among the most common types of skin cancer are basal cell carcinoma, squamous cell carcinoma and melanoma. According to the WHO (2018), between 2 and 3 million non-melanoma skin cancers and 132,000 melanoma skin cancers currently occur worldwide every year. Melanoma is by far the most dangerous form of skin cancer, causing more than 75% of all skin cancer deaths (Allen, 2016). Early diagnosis of the disease plays an important role in reducing the mortality rate, with a chance of cure greater than 90% (SBD, 2018). The diagnosis of pigmented skin lesions (PSLs) can be made by invasive and non-invasive methods. One of the most common non-invasive methods was presented by Soyer et al. (1987). The method allows the visualization of morphological structures not visible to the naked eye with the use of an instrument called a dermatoscope. Compared with clinical diagnosis, the use of a dermatoscope by experts makes the diagnosis of PSLs easier, increasing diagnostic sensitivity by 10-27% (Mayer et al., 1997).

dataset, diagnosis, information, (15 more...)

arXiv.org Artificial Intelligence

2201.1065

Country:

South America > Brazil > Espírito Santo > Vitória (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (1.00)
Health & Medicine > Therapeutic Area > Dermatology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

Pre-Trained Language Transformers are Universal Image Classifiers

Goel, Rahul, Sulaiman, Modar, Noorbakhsh, Kimia, Sharifi, Mahdi, Sharma, Rajesh, Jamshidi, Pooyan, Roy, Kallol

arXiv.org Artificial Intelligence, Jan-25-2022

Facial images disclose many hidden personal traits such as age, gender, race, health, emotion, and psychology. Understanding these traits will help to classify people by different attributes. In this paper, we present a novel method for classifying images using a pretrained transformer model. We apply the pretrained transformer to the binary classification of facial images into criminal and non-criminal classes. The pretrained GPT-2 transformer is trained to generate text and then fine-tuned to classify facial images. During fine-tuning with images, most of the layers of GPT-2 are frozen during backpropagation, and the resulting model is a frozen pretrained transformer (FPT). The FPT acts as a universal image classifier, and this paper shows the application of the FPT to facial images. We also use our FPT on encrypted images for classification. Our FPT shows high accuracy on both raw facial images and encrypted images. We hypothesize, with theory and experiments, that the FPT gained its meta-learning capacity because of its large size and its training at a large scale. GPT-2, trained to generate a single word token at a time through an autoregressive process, is forced to a heavy-tail distribution. The FPT then uses the heavy-tail property as its meta-learning capacity for classifying images. Our work shows one way to avoid bias during the machine classification of images. The FPT encodes worldly knowledge because of its pretraining on text, which it uses during classification. The statistical error of classification is reduced because of the added context gained from the text. Our paper shows the ethical dimension of using encrypted data for classification. Criminal images are sensitive to share across boundaries, but encrypting them largely evades ethical concerns. The FPT's good classification accuracy on encrypted images shows promise for further research on privacy-preserving machine learning.

classification, encrypted image, facial image, (15 more...)

arXiv.org Artificial Intelligence

2201.10182

Country:

North America > United States > Tennessee (0.15)
North America > United States > South Carolina (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Introduction To Machine Learning

#artificialintelligence, Jan-24-2022, 09:02:02 GMT

The Wikipedia definition of ML is: "Machine learning is the study of computer algorithms that can improve automatically through experience and by the use of data." But what does it really mean? Machine Learning is using data to make predictions, or more generally using data in any way to extract knowledge. I'll give a brief intro of these steps right now and go into more detail in the upcoming articles. There are 4 major types in which data will be available for use i.e.

dataset, machine learning, prediction, (10 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)
