AITopics | Accuracy

Collaborating Authors

Accuracy

News Overviews Instructional Materials AI-Alerts Classics

Adversarial Multiscale Feature Learning for Overlapping Chromosome Segmentation

Mei, Liye, Yu, Yalan, Weng, Yueyun, Guo, Xiaopeng, Liu, Yan, Wang, Du, Liu, Sheng, Zhou, Fuling, Lei, Cheng

arXiv.org Artificial IntelligenceDec-22-2020

Chromosome karyotype analysis is of great clinical importance in the diagnosis and treatment of diseases, especially for genetic diseases. Since manual analysis is highly time and effort consuming, computer-assisted automatic chromosome karyotype analysis based on images is routinely used to improve the efficiency and accuracy of the analysis. Due to the strip shape of the chromosomes, they easily get overlapped with each other when imaged, significantly affecting the accuracy of the analysis afterward. Conventional overlapping chromosome segmentation methods are usually based on manually tagged features, hence, the performance of which is easily affected by the quality, such as resolution and brightness, of the images. To address the problem, in this paper, we present an adversarial multiscale feature learning framework to improve the accuracy and adaptability of overlapping chromosome segmentation. Specifically, we first adopt the nested U-shape network with dense skip connections as the generator to explore the optimal representation of the chromosome images by exploiting multiscale features. Then we use the conditional generative adversarial network (cGAN) to generate images similar to the original ones, the training stability of which is enhanced by applying the least-square GAN objective. Finally, we employ Lovasz-Softmax to help the model converge in a continuous optimization setting. Comparing with the established algorithms, the performance of our framework is proven superior by using public datasets in eight evaluation criteria, showing its great potential in overlapping chromosome segmentation

chromosome, chromosome segmentation, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2012.11847

Country:

Asia > China > Hubei Province > Wuhan (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(9 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Genetic Disease (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Add feedback

The COVID-19 pandemic: socioeconomic and health disparities

Javaheri, Behzad

arXiv.org Artificial IntelligenceDec-22-2020

Disadvantaged groups around the world have suffered and endured higher mortality during the current COVID-19 pandemic. This contrast disparity suggests that socioeconomic and health-related factors may drive inequality in disease outcome. To identify these factors correlated with COVID-19 outcome, country aggregate data provided by the Lancet COVID-19 Commission subjected to correlation analysis. Socioeconomic and health-related variables were used to predict mortality in the top 5 most affected countries using ridge regression and extreme gradient boosting (XGBoost) models. Our data reveal that predictors related to demographics and social disadvantage correlate with COVID-19 mortality per million and that XGBoost performed better than ridge regression. Taken together, our findings suggest that the health consequence of the current pandemic is not just confined to indiscriminate impact of a viral infection but that these preventable effects are amplified based on pre-existing health and socioeconomic inequalities.

covid-19 mortality, mortality, predictor, (15 more...)

arXiv.org Artificial Intelligence

2012.11399

Country:

Asia > India (0.06)
South America > Brazil (0.06)
North America > Mexico (0.06)
(4 more...)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.56)

Add feedback

The Importance of Modeling Data Missingness in Algorithmic Fairness: A Causal Perspective

Goel, Naman, Amayuelas, Alfonso, Deshpande, Amit, Sharma, Amit

arXiv.org Artificial IntelligenceDec-21-2020

Training datasets for machine learning often have some form of missingness. For example, to learn a model for deciding whom to give a loan, the available training data includes individuals who were given a loan in the past, but not those who were not. This missingness, if ignored, nullifies any fairness guarantee of the training procedure when the model is deployed. Using causal graphs, we characterize the missingness mechanisms in different real-world scenarios. We show conditions under which various distributions, used in popular fairness algorithms, can or can not be recovered from the training data. Our theoretical results imply that many of these algorithms can not guarantee fairness in practice. Modeling missingness also helps to identify correct design principles for fair algorithms. For example, in multi-stage settings where decisions are made in multiple screening rounds, we use our framework to derive the minimal distributions required to design a fair algorithm. Our proposed algorithm decentralizes the decision-making process and still achieves similar performance to the optimal algorithm that requires centralization and non-recoverable distributions.

algorithm, constraint, missingness, (14 more...)

arXiv.org Artificial Intelligence

2012.11448

Country:

North America > United States > New York (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Law (0.93)
Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Residual Energy-Based Models for Text

Bakhtin, Anton, Deng, Yuntian, Gross, Sam, Ott, Myle, Ranzato, Marc'Aurelio, Szlam, Arthur

arXiv.org Machine LearningDec-21-2020

Current large-scale auto-regressive language models (Radford et al., 2019; Liu et al., 2018; Graves, 2013) display impressive fluency and can generate convincing text. In this work we start by asking the question: Can the generations of these models be reliably distinguished from real text by statistical discriminators? We find experimentally that the answer is affirmative when we have access to the training data for the model, and guardedly affirmative even if we do not. This suggests that the auto-regressive models can be improved by incorporating the (globally normalized) discriminators into the generative process. We give a formalism for this using the Energy-Based Model framework, and show that it indeed improves the results of the generative models, measured both in terms of perplexity and in terms of human evaluation.

discriminator, language model, residual energy-based model, (16 more...)

arXiv.org Machine Learning

2004.10188

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Ohio > Cuyahoga County > Lakewood (0.04)
(6 more...)

Genre:

Research Report (0.64)
Instructional Material (0.45)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Biased Models Have Biased Explanations

Jain, Aditya, Ravula, Manish, Ghosh, Joydeep

arXiv.org Artificial IntelligenceDec-20-2020

We study fairness in Machine Learning (FairML) through the lens of attribute-based explanations generated for machine learning models. Our hypothesis is: Biased Models have Biased Explanations. To establish that, we first translate existing statistical notions of group fairness and define these notions in terms of explanations given by the model. Then, we propose a novel way of detecting (un)fairness for any black box model. We further look at post-processing techniques for fairness and reason how explanations can be used to make a bias mitigation technique more individually fair. We also introduce a novel post-processing mitigation technique which increases individual fairness in recourse while maintaining group level fairness.

explanation, prediction, shap value, (16 more...)

arXiv.org Artificial Intelligence

2012.10986

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Add feedback

(Decision and regression) tree ensemble based kernels for regression and classification

Feng, Dai, Baumgartner, Richard

arXiv.org Machine LearningDec-19-2020

Tree based ensembles such as Breiman's random forest (RF) and Gradient Boosted Trees (GBT) can be interpreted as implicit kernel generators, where the ensuing proximity matrix represents the data-driven tree ensemble kernel. Kernel perspective on the RF has been used to develop a principled framework for theoretical investigation of its statistical properties. Recently, it has been shown that the kernel interpretation is germane to other tree-based ensembles e.g. GBTs. However, practical utility of the links between kernels and the tree ensembles has not been widely explored and systematically evaluated. Focus of our work is investigation of the interplay between kernel methods and the tree based ensembles including the RF and GBT. We elucidate the performance and properties of the RF and GBT based kernels in a comprehensive simulation study comprising of continuous and binary targets. We show that for continuous targets, the RF/GBT kernels are competitive to their respective ensembles in higher dimensional scenarios, particularly in cases with larger number of noisy features. For the binary target, the RF/GBT kernels and their respective ensembles exhibit comparable performance. We provide the results from real life data sets for regression and classification to show how these insights may be leveraged in practice. Overall, our results support the tree ensemble based kernels as a valuable addition to the practitioner's toolbox. Finally, we discuss extensions of the tree ensemble based kernels for survival targets, interpretable prototype and landmarking classification and regression. We outline future line of research for kernels furnished by Bayesian counterparts of the frequentist tree ensembles.

classification, kernel, rf kernel, (12 more...)

arXiv.org Machine Learning

2012.10737

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California (0.05)
North America > United States > Iowa (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)

Add feedback

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

Mathew, Binny, Saha, Punyajoy, Yimam, Seid Muhie, Biemann, Chris, Goyal, Pawan, Mukherjee, Animesh

arXiv.org Artificial IntelligenceDec-18-2020

Hate speech is a challenging issue plaguing the online social media. While better models for hate speech detection are continuously being developed, there is little research on the bias and interpretability aspects of hate speech. In this paper, we introduce HateXplain, the first benchmark hate speech dataset covering multiple aspects of the issue. Each post in our dataset is annotated from three different perspectives: the basic, commonly used 3-class classification (i.e., hate, offensive or normal), the target community (i.e., the community that has been the victim of hate speech/offensive speech in the post), and the rationales, i.e., the portions of the post on which their labelling decision (as hate, offensive or normal) is based. We utilize existing state-of-the-art models and observe that even models that perform very well in classification do not score high on explainability metrics like model plausibility and faithfulness. We also observe that models, which utilize the human rationales for training, perform better in reducing unintended bias towards target communities. We have made our code and dataset public at https://github.com/punyajoy/HateXplain

computational linguistic, proceedings, rationale, (14 more...)

arXiv.org Artificial Intelligence

2012.10289

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
(24 more...)

Genre: Research Report (0.84)

Industry:

Law (0.68)
Information Technology > Security & Privacy (0.68)
Information Technology > Services (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Add feedback

EVA: Generating Longitudinal Electronic Health Records Using Conditional Variational Autoencoders

Biswal, Siddharth, Ghosh, Soumya, Duke, Jon, Malin, Bradley, Stewart, Walter, Sun, Jimeng

arXiv.org Artificial IntelligenceDec-17-2020

Researchers require timely access to real-world longitudinal electronic health records (EHR) to develop, test, validate, and implement machine learning solutions that improve the quality and efficiency of healthcare. In contrast, health systems value deeply patient privacy and data security. De-identified EHRs do not adequately address the needs of health systems, as de-identified data are susceptible to re-identification and its volume is also limited. Synthetic EHRs offer a potential solution. In this paper, we propose EHR Variational Autoencoder (EVA) for synthesizing sequences of discrete EHR encounters (e.g., clinical visits) and encounter features (e.g., diagnoses, medications, procedures). We illustrate that EVA can produce realistic EHR sequences, account for individual differences among patients, and can be conditioned on specific disease conditions, thus enabling disease-specific studies. We design efficient, accurate inference algorithms by combining stochastic gradient Markov Chain Monte Carlo with amortized variational inference. We assess the utility of the methods on large real-world EHR repositories containing over 250, 000 patients. Our experiments, which include user studies with knowledgeable clinicians, indicate the generated EHR sequences are realistic. We confirmed the performance of predictive models trained on the synthetic data are similar with those trained on real EHRs. Additionally, our findings indicate that augmenting real data with synthetic EHRs results in the best predictive performance - improving the best baseline by as much as 8% in top-20 recall.

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Artificial Intelligence

2012.1002

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Automatic detection of abnormal EEG signals using wavelet feature extraction and gradient boosting decision tree

Albaqami, Hezam, Hassan, Ghulam Mubashar, Subasi, Abdulhamit, Datta, Amitava

arXiv.org Artificial IntelligenceDec-17-2020

Electroencephalography is frequently used for diagnostic evaluation of various brain-related disorders due to its excellent resolution, non-invasive nature and low cost. However, manual analysis of EEG signals could be strenuous and a time-consuming process for experts. It requires long training time for physicians to develop expertise in it and additionally experts have low inter-rater agreement (IRA) among themselves. Therefore, many Computer Aided Diagnostic (CAD) based studies have considered the automation of interpreting EEG signals to alleviate the workload and support the final diagnosis. In this paper, we present an automatic binary classification framework for brain signals in multichannel EEG recordings. We propose to use Wavelet Packet Decomposition (WPD) techniques to decompose the EEG signals into frequency sub-bands and extract a set of statistical features from each of the selected coefficients. Moreover, we propose a novel method to reduce the dimension of the feature space without compromising the quality of the extracted features. The extracted features are classified using different Gradient Boosting Decision Tree (GBDT) based classification frameworks, which are CatBoost, XGBoost and LightGBM. We used Temple University Hospital EEG Abnormal Corpus V2.0.0 to test our proposed technique. We found that CatBoost classifier achieves the binary classification accuracy of 87.68%, and outperforms state-of-the-art techniques on the same dataset by more than 1% in accuracy and more than 3% in sensitivity. The obtained results in this research provide important insights into the usefulness of WPD feature extraction and GBDT classifiers for EEG classification.

classifier, coefficient, detection, (14 more...)

arXiv.org Artificial Intelligence

2012.10034

Country:

North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > Western Australia (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Promising Solution (0.69)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score

Schuler, Alejandro, Walsh, David, Hall, Diana, Walsh, Jon, Fisher, Charles

arXiv.org Machine LearningDec-17-2020

Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a growing compendium of historical data on patients under standard-of-care conditions that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects' predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance possible among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model. We demonstrate the approach using simulations and a reanalysis of an Alzheimer's Disease clinical trial and observe meaningful reductions in mean-squared error and the estimated variance. Lastly, we provide a simplified formula for asymptotic variance that enables power and sample size calculations that account for the gains from the prognostic model for clinical trial design.

adjustment, estimator, prognostic model, (16 more...)

arXiv.org Machine Learning

2012.09935

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada (0.04)
(3 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback