 Accuracy


Automatic Identification of Self-Admitted Technical Debt from Different Sources

arXiv.org Artificial Intelligence

Technical debt is a metaphor describing the situation in which long-term benefits (e.g., maintainability and evolvability of software) are traded for short-term goals. When technical debt is admitted explicitly by developers in software artifacts (e.g., code comments or issue tracking systems), it is termed Self-Admitted Technical Debt or SATD. Technical debt can be admitted in different sources, such as source code comments, issue tracking systems, pull requests, and commit messages. However, no approach has been proposed for identifying SATD from different sources. Thus, in this paper, we propose an approach for automatically identifying SATD from different sources (i.e., source code comments, issue trackers, commit messages, and pull requests).


Identifying Self-Admitted Technical Debt in Issue Tracking Systems using Machine Learning

arXiv.org Artificial Intelligence

Technical debt is a metaphor indicating sub-optimal solutions implemented for short-term benefits by sacrificing the long-term maintainability and evolvability of software. A special type of technical debt is explicitly admitted by software engineers (e.g. using a TODO comment); this is called Self-Admitted Technical Debt or SATD. Most work on automatically identifying SATD focuses on source code comments. In addition to source code comments, issue tracking systems have been shown to be another rich source of SATD, but there are no approaches specifically for automatically identifying SATD in issues. In this paper, we first create a training dataset by collecting and manually analyzing 4,200 issues (that break down to 23,180 sections of issues) from seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase, Impala, and Thrift) using two popular issue tracking systems (i.e., Jira and Google Monorail). We then propose and optimize an approach for automatically identifying SATD in issue tracking systems using machine learning. Our findings indicate that: 1) our approach outperforms baseline approaches by a wide margin with regard to the F1-score; 2) transferring knowledge from suitable datasets can improve the predictive performance of our approach; 3) extracted SATD keywords are intuitive and potentially indicate types and indicators of SATD; 4) projects using different issue tracking systems share fewer SATD keywords than projects using the same issue tracking system; 5) a small amount of training data is needed to achieve good accuracy.


Net benefit, calibration, threshold selection, and training objectives for algorithmic fairness in healthcare

arXiv.org Machine Learning

A growing body of work uses the paradigm of algorithmic fairness to frame the development of techniques to anticipate and proactively mitigate the introduction or exacerbation of health inequities that may follow from the use of model-guided decision-making. We evaluate the interplay between measures of model performance, fairness, and the expected utility of decision-making to offer practical recommendations for the operationalization of algorithmic fairness principles for the development and evaluation of predictive models in healthcare. We conduct an empirical case-study via development of models to estimate the ten-year risk of atherosclerotic cardiovascular disease to inform statin initiation in accordance with clinical practice guidelines. We demonstrate that approaches that incorporate fairness considerations into the model training objective typically do not improve model performance or confer greater net benefit for any of the studied patient populations compared to the use of standard learning paradigms followed by threshold selection concordant with patient preferences, evidence of intervention effectiveness, and model calibration. These results hold when the measured outcomes are not subject to differential measurement error across patient populations and threshold selection is unconstrained, regardless of whether differences in model performance metrics, such as in true and false positive error rates, are present. In closing, we argue for focusing model development efforts on developing calibrated models that predict outcomes well for all patient populations while emphasizing that such efforts are complementary to transparent reporting, participatory design, and reasoning about the impact of model-informed interventions in context.


Here's Why Your Rapid Test Is Negative Even If You Have COVID-19

International Business Times

Rapid COVID-19 tests can generate false-negative results because they aren't that sensitive, according to a medical expert. Rapid COVID-19 tests, or antigen tests, appear positive if they detect a certain amount of coronavirus -- also known as viral load -- from a sample taken from a person's body, according to BuzzFeed News. Dr. Emily Landon, an infectious disease expert, said that the window when viral load is at its peak can vary from person to person and can range from three days to more than a week as people's systems clear the virus at their own pace. Due to this, it may either take some time for an infected person's result to turn positive or never appear positive if they miss this window or collect their test sample incorrectly, among other things, according to Landon, who is also an associate professor of medicine at the University of Chicago Medicine. "Rapid tests are definitely not like a pregnancy test where it's going to be positive as long as it's been a few weeks after someone missed a period. It's only going to pick it up when you're at peak infectiousness, and they're almost never false positive," the doctor explained.


Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference

arXiv.org Machine Learning

A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (cf. deep learning). In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models are more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. We significantly extend the relatively few empirical results on this problem by theoretically proving for an overparameterized linear regression model with Gaussian data that the membership inference vulnerability increases with the number of parameters. Moreover, a range of empirical studies indicates that more complex, nonlinear models exhibit the same behavior. Finally, we study different methods for mitigating such attacks in the overparameterized regime, such as noise addition and regularization, and conclude that simply reducing the parameters of an overparameterized model is an effective strategy to protect it from membership inference without greatly decreasing its generalization error.


VC-PCR: A Prediction Method based on Supervised Variable Selection and Clustering

arXiv.org Machine Learning

Sparse linear prediction methods suffer from decreased prediction accuracy when the predictor variables have cluster structure (e.g. there are highly correlated groups of variables). To improve prediction accuracy, various methods have been proposed to identify variable clusters from the data and integrate cluster information into a sparse modeling process. But none of these methods achieve satisfactory performance for prediction, variable selection and variable clustering simultaneously. This paper presents Variable Cluster Principal Component Regression (VC-PCR), a prediction method that supervises variable selection and variable clustering in order to solve this problem. Experiments with real and simulated data demonstrate that, compared to competitor methods, VC-PCR achieves better prediction, variable selection and clustering performance when cluster structure is present.


COVID-19 Rapid Test Recall: These Brands Give False Positives

International Business Times

The Food and Drug Administration issued a warning Friday to stop using the COVID-19 rapid antigen test Empowered Diagnostics CovClear and the neutralizing antibody rapid test ImmunoPass. These tests are being distributed in the U.S. with labels that show the FDA authorized them, which it did not. The FDA has concerns about the potential risk of false positives when using COVID-19 tests that are not approved. It has classified the two tests as a Class I recall, which is the most serious type of recall. "These tests were distributed with labeling showing the FDA authorized them, but neither test has been authorized, cleared, or approved by the FDA for distribution or use in the United States. The FDA is concerned about the potentially higher risk of false results when using unauthorized tests," read an FDA press release.


Giving an AI control of nuclear weapons: What could possibly go wrong? - Bulletin of the Atomic Scientists

#artificialintelligence

If artificial intelligences controlled nuclear weapons, all of us could be dead. In 1983, Soviet Air Defense Forces Lieutenant Colonel Stanislav Petrov was monitoring nuclear early warning systems, when the computer concluded with the highest confidence that the United States had launched a nuclear war. But Petrov was doubtful: The computer estimated only a handful of nuclear weapons were incoming, when such a surprise attack would more plausibly entail an overwhelming first strike. He also didn't trust the new launch detection system, and the radar system didn't have corroborative evidence. Petrov decided the message was a false positive and did nothing. The computer was wrong; Petrov was right.


Evaluation Metrics for Classification Machine Learning Models

#artificialintelligence

Having built the machine learning model, we need to evaluate it. Evaluation metrics are used to measure the quality of a statistical or machine learning model, and evaluating models or algorithms is essential for any project. We can set targets for these metrics at the beginning of the data science or machine learning project; achieving those targets can be treated as one of the project's success criteria. There are many different types of evaluation metrics available to test a model.
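As a rough illustration of what such metrics look like for a binary classifier, the sketch below computes accuracy, precision, recall, and F1-score from true and predicted labels. The function names (`confusion_counts`, `classification_metrics`) and the sample labels are hypothetical and are not taken from the article; labels are assumed to be 0 (negative) or 1 (positive).

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # true positive
        elif truth == 0 and pred == 1:
            fp += 1  # false positive
        elif truth == 1 and pred == 0:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    return tp, fp, fn, tn


def _safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1-score for binary labels."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": _safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up labels for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 1, 0, 0, 1]
    for name, value in classification_metrics(truth, predicted).items():
        print(f"{name}: {value:.3f}")
```

Accuracy alone can mislead when classes are imbalanced, which is one reason precision, recall, and F1-score are commonly reported alongside it.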


Ulixes: Facial Recognition Privacy with Adversarial Machine Learning

arXiv.org Artificial Intelligence

Facial recognition tools are becoming exceptionally accurate in identifying people from images. However, this comes at the cost of privacy for users of online services with photo management (e.g. social media platforms). Particularly troubling is the ability to leverage unsupervised learning to recognize faces even when the user has not labeled their images. In this paper we propose Ulixes, a strategy to generate visually non-invasive facial noise masks that yield adversarial examples, preventing the formation of identifiable user clusters in the embedding space of facial encoders. This is applicable even when a user is unmasked and labeled images are available online. We demonstrate the effectiveness of Ulixes by showing that various classification and clustering methods cannot reliably label the adversarial examples we generate. We also study the effects of Ulixes in various black-box settings and compare it to the current state of the art in adversarial machine learning. Finally, we challenge the effectiveness of Ulixes against adversarially trained models and show that it is robust to countermeasures.