AITopics | Performance Analysis

Performance Analysis

Provable Guarantees on the Robustness of Decision Rules to Causal Interventions

Wang, Benjie, Lyle, Clare, Kwiatkowska, Marta

arXiv.org Artificial Intelligence, May-19-2021

Robustness of decision rules to shifts in the data-generating process is crucial to the successful deployment of decision-making systems. Such shifts can be viewed as interventions on a causal graph, which capture (possibly hypothetical) changes in the data-generating process, whether due to natural reasons or by the action of an adversary. We consider causal Bayesian networks and formally define the interventional robustness problem, a novel model-based notion of robustness for decision functions that measures worst-case performance with respect to a set of interventions that denote changes to parameters and/or causal influences. By relying on a tractable representation of Bayesian networks as arithmetic circuits, we provide efficient algorithms for computing guaranteed upper and lower bounds on the interventional robustness probabilities. Experimental results demonstrate that the methods yield useful and interpretable bounds for a range of practical networks, paving the way towards provably causally robust decision-making systems.

classifier, intervention, subcircuit, (17 more...)

arXiv.org Artificial Intelligence

2105.09108

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine (0.68)
Banking & Finance (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)

Implementation and Evaluation of a Multivariate Abstraction-Based, Interval-Based Dynamic Time-Warping Method as a Similarity Measure for Longitudinal Medical Records

Shahar, Yuval, Lion, Matan

arXiv.org Artificial Intelligence, May-18-2021

We extended dynamic time warping (DTW) into interval-based dynamic time warping (iDTW), including (A) interval-based representation (iRep): [1] abstracting raw, time-stamped data into interval-based abstractions, [2] comparison-period scoping, [3] partitioning abstract intervals into a given temporal granularity; (B) interval-based matching (iMatch): matching partitioned, abstract-concept records, using a modified DTW. Using domain knowledge, we abstracted the raw data of medical records, for up to three concepts out of four or five relevant concepts, into two interval types: State abstractions (e.g. LOW, HIGH) and Gradient abstractions (e.g. INCREASING, DECREASING). We created all uni-dimensional (State or Gradient) or multi-dimensional (State and Gradient) abstraction combinations. Tasks: Classifying 161 oncology patients' records as autologous or allogenic bone-marrow transplantation; classifying 125 hepatitis patients' records as B or C hepatitis; predicting micro- or macro-albuminuria in the next year for 151 Type 2 diabetes patients. We used a k-Nearest-Neighbors majority vote, with k ranging from 1 to SQRT(N), where N is the set size. A total of 50,328 10-fold cross-validation experiments were performed: 23,400 (Oncology), 19,800 (Hepatitis), 7,128 (Diabetes). Measures: Area Under the Curve (AUC), optimal Youden's Index. Paired t-tests compared result vectors for equivalent configurations other than a tested variable, to determine a significant mean accuracy difference (P<0.05). Mean classification and prediction using abstractions was significantly better than using only raw time-stamped data. In each domain, at least one abstraction combination led to a significantly better performance than using raw data. Increasing feature number, and using multi-dimensional abstractions, enhanced performance. Unlike with raw data, optimal performance with abstractions was often reached at k=5.
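For orientation, the classic DTW recurrence that the abstract builds on can be sketched as follows. This is a minimal illustration of standard DTW only, not the authors' iDTW: the 0/1 mismatch cost between abstraction symbols, the function names, and the three toy patient sequences are all assumptions made for the example.

```python
def mismatch(x, y):
    """Hypothetical local cost: 0 if two abstraction symbols agree, else 1."""
    return 0.0 if x == y else 1.0


def dtw_distance(a, b, cost=mismatch):
    """Classic dynamic time warping distance between two sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Toy State-abstraction sequences, one symbol per time bin (made-up data).
patient_a = ["LOW", "LOW", "HIGH", "HIGH"]
patient_b = ["LOW", "HIGH", "HIGH", "HIGH"]
patient_c = ["HIGH", "HIGH", "LOW", "LOW"]

print(dtw_distance(patient_a, patient_b))  # 0.0: same shape, shifted in time
print(dtw_distance(patient_a, patient_c))  # 4.0: opposite trend
```

Because warping absorbs the one-bin shift between the first two sequences, they are treated as identical, which is the property that makes DTW-style distances attractive as a similarity measure for nearest-neighbor classification of longitudinal records.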

abstraction, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2105.0845

Country:

Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)
South America > Brazil > São Paulo (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Confronting Structural Inequities in AI for Education

Madaio, Michael, Blodgett, Su Lin, Mayfield, Elijah, Dixon-Román, Ezekiel

arXiv.org Artificial Intelligence, May-18-2021

Educational technologies, and the systems of schooling in which they are deployed, enact particular ideologies about what is important to know and how learners should learn. As artificial intelligence technologies -- in education and beyond -- have led to inequitable outcomes for marginalized communities, various approaches have been developed to evaluate and mitigate AI systems' disparate impact. However, we argue in this paper that the dominant paradigm of evaluating fairness on the basis of performance disparities in AI models is inadequate for confronting the structural inequities that educational AI systems (re)produce. We draw on a lens of structural injustice informed by critical theory and Black feminist scholarship to critically interrogate several widely-studied and widely-adopted categories of educational AI systems and demonstrate how educational AI technologies are bound up in and reproduce historical legacies of structural injustice and inequity, regardless of the parity of their models' performance. We close with alternative visions for a more equitable future for educational AI research.

confronting structural inequity, educational ai, student, (12 more...)

arXiv.org Artificial Intelligence

2105.08847

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(13 more...)

Genre: Instructional Material (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.93)
(4 more...)

Classifying variety of customer's online engagement for churn prediction with mixed-penalty logistic regression

Šimović, Petra Posedel, Horvatic, Davor, Sun, Edward W.

arXiv.org Machine Learning, May-17-2021

Using big data to analyze consumer behavior can provide effective decision-making tools for preventing customer attrition (churn) in customer relationship management (CRM). Focusing on a CRM dataset with several different categories of factors that impact customer heterogeneity (i.e., usage of self-care service channels, duration of service, and responsiveness to marketing actions), we provide new predictive analytics of customer churn rate based on a machine learning method that enhances the classification of logistic regression by adding a mixed penalty term. The proposed penalized logistic regression can prevent overfitting when dealing with big data and minimize the loss function when balancing the cost from the median (absolute value) and mean (squared value) regularization. We show the analytical properties of the proposed method and its computational advantage in this research. In addition, we investigate the performance of the proposed method with a CRM data set (that has a large number of features) under different settings by efficiently eliminating the disturbance of (1) least important features and (2) sensitivity from the minority (churn) class. Our empirical results confirm the expected performance of the proposed method in full compliance with the common classification criteria (i.e., accuracy, precision, and recall) for evaluating machine learning methods.
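A penalty that mixes an absolute-value term with a squared term can be illustrated with the following sketch. It assumes the familiar elastic-net form, lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2); the abstract does not give the paper's exact formulation, so the function name, the parameters `lam` and `alpha`, and the toy data are all hypothetical.

```python
import math


def penalized_log_loss(weights, bias, X, y, lam=0.1, alpha=0.5):
    """Mean logistic loss plus an assumed elastic-net style mixed penalty."""
    total = 0.0
    for row, label in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, row))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted churn probability
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    data_term = total / len(y)
    l1 = sum(abs(w) for w in weights)  # absolute-value part (encourages sparsity)
    l2 = sum(w * w for w in weights)   # squared part (shrinks large weights)
    return data_term + lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


# Toy data: two features per customer, label 1 = churned (made-up values).
X = [[0.5, 1.0], [1.5, -0.5], [-1.0, 0.3], [0.2, -1.2]]
y = [1, 0, 1, 0]

unpenalized = penalized_log_loss([1.0, -2.0], 0.0, X, y, lam=0.0)
penalized = penalized_log_loss([1.0, -2.0], 0.0, X, y, lam=0.1, alpha=0.5)
print(penalized - unpenalized)  # the penalty's contribution, about 0.275
```

With `alpha=1.0` the penalty reduces to a pure L1 (lasso) term and with `alpha=0.0` to a pure L2 (ridge) term, so the single parameter controls the balance between feature elimination and weight shrinkage.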

artificial intelligence, customer, machine learning, (18 more...)

arXiv.org Machine Learning

2105.07671

Country:

Europe > Croatia > Zagreb County > Zagreb (0.04)
North America > United States > Massachusetts (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Gambling (0.68)
Leisure & Entertainment > Sports (0.68)
Banking & Finance (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Deep Multistage Multi-Task Learning for Quality Prediction of Multistage Manufacturing Systems

Yan, Hao, Sergin, Nurretin Dorukhan, Brenneman, William A., Lange, Stephen Joseph, Ba, Shan

arXiv.org Artificial Intelligence, May-17-2021

In multistage manufacturing systems (MMS), modeling multiple quality indices based on the process sensing variables is important. However, the classic modeling technique predicts each quality variable one at a time, which fails to consider the correlation within or between stages. We propose a deep multistage multi-task learning framework to jointly predict all output sensing variables in a unified end-to-end learning framework according to the sequential system architecture of the MMS. Our numerical studies and a real case study show that the new model outperforms many benchmark methods and offers good interpretability through the variable selection techniques we developed.

input variable, neural network, output variable, (17 more...)

arXiv.org Artificial Intelligence

2105.0818

Country:

North America > United States > Ohio > Hamilton County > Cincinnati (0.14)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Doc2Dict: Information Extraction as Text Generation

Townsend, Benjamin, Ito-Fisher, Eamon, Zhang, Lily, May, Madison

arXiv.org Artificial Intelligence, May-16-2021

Typically, information extraction (IE) requires a pipeline approach: first, a sequence labeling model is trained on manually annotated documents to extract relevant spans; then, when a new document arrives, a model predicts spans which are then post-processed and standardized to convert the information into a database entry. We replace this labor-intensive workflow with a transformer language model trained on existing database records to directly generate structured JSON. Our solution removes the workload associated with producing token-level annotations and takes advantage of a data source which is generally quite plentiful (e.g. database records). As long documents are common in information extraction tasks, we use gradient checkpointing and chunked encoding to apply our method to sequences of up to 32,000 tokens on a single GPU. Our Doc2Dict approach is competitive with more complex, hand-engineered pipelines and offers a simple but effective baseline for document-level information extraction. We release our Doc2Dict model and code to reproduce our experiments and facilitate future work.

dataset, extraction, information extraction, (13 more...)

arXiv.org Artificial Intelligence

2105.0751

Country:

Europe > Bulgaria > Sofia City Province > Sofia (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Deep learning for detecting pulmonary tuberculosis via chest radiography: an international study across 10 countries

Kazemzadeh, Sahar, Yu, Jin, Jamshy, Shahar, Pilgrim, Rory, Nabulsi, Zaid, Chen, Christina, Beladia, Neeral, Lau, Charles, McKinney, Scott Mayer, Hughes, Thad, Kiraly, Atilla, Kalidindi, Sreenivasa Raju, Muyoyeta, Monde, Malemela, Jameson, Shih, Ting, Corrado, Greg S., Peng, Lily, Chou, Katherine, Chen, Po-Hsuan Cameron, Liu, Yun, Eswaran, Krish, Tse, Daniel, Shetty, Shravya, Prabhakara, Shruthi

arXiv.org Artificial Intelligence, May-16-2021

Tuberculosis (TB) is a top-10 cause of death worldwide. Though the WHO recommends chest radiographs (CXRs) for TB screening, the limited availability of CXR interpretation is a barrier. We trained a deep learning system (DLS) to detect active pulmonary TB using CXRs from 9 countries across Africa, Asia, and Europe, and utilized large-scale CXR pretraining, attention pooling, and noisy student semi-supervised learning. Evaluation was on (1) a combined test set spanning China, India, US, and Zambia, and (2) an independent mining population in South Africa. Given WHO targets of 90% sensitivity and 70% specificity, the DLS's operating point was prespecified to favor sensitivity over specificity. On the combined test set, the DLS's ROC curve was above all 9 India-based radiologists, with an AUC of 0.90 (95%CI 0.87-0.92). The DLS's sensitivity (88%) was higher than the India-based radiologists (75% mean sensitivity), p<0.001 for superiority; and its specificity (79%) was non-inferior to the radiologists (84% mean specificity), p=0.004. Similar trends were observed within HIV positive and sputum smear positive sub-groups, and in the South Africa test set. We found that 5 US-based radiologists (where TB isn't endemic) were more sensitive and less specific than the India-based radiologists (where TB is endemic). The DLS also remained non-inferior to the US-based radiologists. In simulations, using the DLS as a prioritization tool for confirmatory testing reduced the cost per positive case detected by 40-80% compared to using confirmatory testing alone. To conclude, our DLS generalized to 5 countries, and merits prospective evaluation to assist cost-effective screening efforts in radiologist-limited settings. Operating point flexibility may permit customization of the DLS to account for site-specific factors such as TB prevalence, demographics, clinical resources, and customary practice patterns.
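The trade-off behind choosing an operating point can be sketched with a few lines of code. The example below computes sensitivity and specificity at a score threshold and shows how lowering the threshold favors sensitivity; the labels, scores, thresholds, and function name are made-up illustrations and have no connection to the study's data or model.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive.

    Assumes both classes are present in `labels` (1 = disease, 0 = no disease).
    """
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model scores for 4 positive and 6 negative cases.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

for threshold in (0.5, 0.25):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    # Youden's index J = sensitivity + specificity - 1 summarizes both in one number.
    print(threshold, round(sens, 2), round(spec, 2), round(sens + spec - 1, 2))
```

On this toy data the lower threshold raises sensitivity from 0.75 to 1.0 while specificity drops from about 0.83 to about 0.67, which mirrors the kind of site-specific tuning the abstract describes as operating point flexibility.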

dataset, radiologist, specificity, (15 more...)

arXiv.org Artificial Intelligence

2105.0754

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Africa > Zambia > Lusaka Province > Lusaka (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

20x times faster Grid Search Cross-Validation

#artificialintelligence, May-15-2021, 19:25:06 GMT

To train a robust machine learning model, one must select a suitable machine learning algorithm and a suitable combination of hyperparameters. The process of choosing the best set of hyperparameters is known as hyperparameter tuning. In practice, this means training candidate algorithms on the dataset under different hyperparameter combinations and comparing them on a performance metric. Cross-validation can be used to evaluate each candidate and choose the best among them. Cross-validation is a resampling technique for evaluating and selecting machine learning algorithms on a limited dataset.
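The basic procedure can be sketched without any library, which also makes the cost visible: every hyperparameter value is evaluated once per fold. This is a plain, exhaustive grid search, not the accelerated method the article's title refers to; the one-dimensional k-nearest-neighbors model, the interleaved fold assignment, and the toy data are assumptions chosen to keep the example self-contained.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists using deterministic interleaved folds."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x (1-D features)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def cv_accuracy(xs, ys, k, n_folds=5):
    """Cross-validated accuracy of k-NN for one hyperparameter value."""
    correct = 0
    for train, test in k_fold_indices(len(xs), n_folds):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        correct += sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test)
    return correct / len(xs)


def grid_search(xs, ys, grid, n_folds=5):
    """Evaluate every value in the grid and return the best one with all scores."""
    scores = {k: cv_accuracy(xs, ys, k, n_folds) for k in grid}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: class 0 for small feature values, class 1 for large ones.
xs = list(range(20))
ys = [0] * 10 + [1] * 10

best_k, scores = grid_search(xs, ys, grid=[1, 3, 5])
print(best_k, scores)
```

The number of model fits grows as (grid size) x (number of folds), which is why exhaustive grid search becomes slow on larger grids and datasets.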

grid search cv, search cv, time complexity, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.90)

Cohort Shapley value for algorithmic fairness

Mase, Masayoshi, Owen, Art B., Seiler, Benjamin B.

arXiv.org Artificial Intelligence, May-15-2021

Cohort Shapley value is a model-free method of variable importance grounded in game theory that does not use any unobserved and potentially impossible feature combinations. We use it to evaluate algorithmic fairness, using the well-known COMPAS recidivism data as our example. This approach allows one to identify, for each individual in a data set, the extent to which they were adversely or beneficially affected by their value of a protected attribute such as their race. The method can do this even if race was not one of the original predictors and even if it does not have access to a proprietary algorithm that has made the predictions. The grounding in game theory lets us define aggregate variable importance for a data set consistently with its per-subject definitions. We can investigate variable importance for multiple quantities of interest in the fairness literature, including false positive predictions.

artificial intelligence, factor 0, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2105.07168

Country:

North America > United States > Florida > Broward County (0.04)
North America > United States > Tennessee (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Law (1.00)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Understanding the Effect of Bias in Deep Anomaly Detection

Ye, Ziyu, Chen, Yuxin, Zheng, Haitao

arXiv.org Artificial Intelligence, May-15-2021

Anomaly detection presents a unique challenge in machine learning, due to the scarcity of labeled anomaly data. Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples. However, the labeled data often does not align with the target distribution and introduces harmful bias to the trained model. In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection. Concretely, we view anomaly detection as a supervised learning task where the objective is to optimize the recall at a given false positive rate. We formally study the relative scoring bias of an anomaly detector, defined as the difference in performance with respect to a baseline anomaly detector. We establish the first finite sample rates for estimating the relative scoring bias for deep anomaly detection, and empirically validate our theoretical results on both synthetic and real-world datasets. We also provide an extensive empirical study on how a biased training anomaly set affects the anomaly score function and therefore the detection performance on different anomaly classes. Our study demonstrates scenarios in which the biased anomaly set can be useful or problematic, and provides a solid benchmark for future research.
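The evaluation objective mentioned above, recall at a given false positive rate, can be sketched as follows. This is an empirical illustration under stated assumptions, not the paper's estimator: the function name, the choice of threshold from the ranked normal scores, and the toy score lists are hypothetical.

```python
def recall_at_fpr(normal_scores, anomaly_scores, target_fpr):
    """Empirical recall on anomalies at a threshold whose FPR does not exceed target_fpr.

    Higher scores mean "more anomalous"; a point is flagged if score > threshold.
    """
    ranked = sorted(normal_scores, reverse=True)
    # Number of normal points we may flag without exceeding the target FPR.
    allowed = int(target_fpr * len(ranked))
    # Using the (allowed + 1)-th largest normal score as the threshold means
    # at most `allowed` normal points lie strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(score > threshold for score in anomaly_scores)
    return flagged / len(anomaly_scores)


# Made-up anomaly scores from a hypothetical detector.
normal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
anomaly_scores = [0.95, 0.85, 0.5, 1.2]

print(recall_at_fpr(normal_scores, anomaly_scores, 0.1))  # 0.5
print(recall_at_fpr(normal_scores, anomaly_scores, 0.2))  # 0.75
```

Computing this quantity for a detector trained with a biased anomaly set and for a baseline detector, then taking the difference, gives an empirical counterpart of the relative scoring bias described in the abstract; the sign convention for that difference is left open here because the abstract does not state it.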

anomaly, anomaly detection, detection, (13 more...)

arXiv.org Artificial Intelligence

2105.07346

Country:

North America > United States > Texas (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Yemen > Amran Governorate > Amran (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
