AITopics | Accuracy

Accuracy

UFC 213 Betting Odds, PPV Info For Amanda Nunes vs. Valentina Shevchenko And Entire Fight Card

International Business Times, Jul-8-2017, 11:45:47 GMT

For the first time in 2017, Amanda Nunes will defend her UFC women's bantamweight championship. She'll put the belt on the line Saturday night in the main event of UFC 213 in Las Vegas. Nunes hasn't stepped inside the octagon since she needed 48 seconds to defeat Ronda Rousey on Dec. 30. She headlined the pay-per-view in her previous two title fights, and it'll cost fans $69.99 to order Saturday's fight. The PPV starts at 10 p.m. EDT, though fans can watch the event with a live stream online at ufc.tv for $59.99.

artificial intelligence, machine learning, valentina shevchenko, (13 more...)

Country: North America > United States > Nevada > Clark County > Las Vegas (0.25)

Industry: Leisure & Entertainment > Sports > Boxing (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.61)

Estimating network edge probabilities by neighborhood smoothing

Zhang, Yuan, Levina, Elizaveta, Zhu, Ji

arXiv.org Machine Learning, Jul-8-2017

The estimation of probabilities of network edges from the observed adjacency matrix has important applications to predicting missing links and network denoising. It has usually been addressed by estimating the graphon, a function that determines the matrix of edge probabilities, but this is ill-defined without strong assumptions on the network structure. Here we propose a novel, computationally efficient method, based on neighborhood smoothing, to estimate the expectation of the adjacency matrix directly, without making the structural assumptions that graphon estimation requires. The neighborhood smoothing method requires little tuning, has a competitive mean-squared error rate, and outperforms many benchmark methods on link prediction in simulated and real networks.

data mining, machine learning, neighborhood, (18 more...)

1509.08588

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science > Data Mining (0.90)

Dr.VAE: Drug Response Variational Autoencoder

Rampasek, Ladislav, Hidru, Daniel, Smirnov, Petr, Haibe-Kains, Benjamin, Goldenberg, Anna

arXiv.org Machine Learning, Jul-6-2017

We present two deep generative models based on Variational Autoencoders to improve the accuracy of drug response prediction. Our models, Perturbation Variational Autoencoder and its semi-supervised extension, Drug Response Variational Autoencoder (Dr.VAE), learn latent representations of the underlying gene states before and after drug application that depend on: (i) the drug-induced biological change of each gene and (ii) the overall treatment response outcome. Our VAE-based models outperform the current published benchmarks in the field by anywhere from 3 to 11% AUROC and 2 to 30% AUPR. In addition, we found that better reconstruction accuracy does not necessarily lead to improvement in classification accuracy and that jointly trained models perform better than models that minimize reconstruction error independently.

artificial intelligence, gene expression, machine learning, (15 more...)

1706.08203

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Online Rules for Control of False Discovery Rate and False Discovery Exceedance

Javanmard, Adel, Montanari, Andrea

arXiv.org Machine Learning, Jul-6-2017

Multiple hypothesis testing is a core problem in statistical inference and arises in almost every scientific field. Given a set of null hypotheses $\mathcal{H}(n) = (H_1,\dotsc, H_n)$, Benjamini and Hochberg introduced the false discovery rate (FDR), which is the expected proportion of false positives among rejected null hypotheses, and proposed a testing procedure that controls FDR below a pre-assigned significance level. Nowadays FDR is the criterion of choice for large-scale multiple hypothesis testing. In this paper we consider the problem of controlling FDR in an "online manner". Concretely, we consider an ordered (possibly infinite) sequence of null hypotheses $\mathcal{H} = (H_1,H_2,H_3,\dots )$ where, at each step $i$, the statistician must decide whether to reject hypothesis $H_i$ having access only to the previous decisions. This model was introduced by Foster and Stine. We study a class of "generalized alpha-investing" procedures and prove that any rule in this class controls online FDR, provided $p$-values corresponding to true nulls are independent from the other $p$-values. (Earlier work only established mFDR control.) Next, we obtain conditions under which generalized alpha-investing controls FDR in the presence of general $p$-value dependencies. Finally, we develop a modified set of procedures that also control the false discovery exceedance (the tail of the proportion of false discoveries). Numerical simulations and analytical results indicate that online procedures do not incur a large loss in statistical power with respect to offline approaches, such as Benjamini-Hochberg.
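
For orientation, the offline Benjamini-Hochberg step-up procedure that the abstract uses as its baseline can be sketched as below. This is the standard textbook procedure, not the online generalized alpha-investing rules studied in the paper; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Standard offline step-up rule: sort the p-values, find the largest
    rank k with p_(k) <= k * alpha / n, and reject the k smallest.
    """
    n = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(n), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / n:
            k_max = rank

    rejected = [False] * n
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Illustrative p-values (made up for this sketch).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values, alpha=0.05))
# -> [True, True, False, False, False, False, False, False]
```

The rule needs all n p-values before it can set any threshold, which is exactly what is unavailable in the online setting the abstract describes, where each decision on $H_i$ must be made using only earlier decisions.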

artificial intelligence, hypothesis, machine learning, (18 more...)

1603.09

Country: North America > United States > California (0.27)

Genre:

Research Report > Experimental Study (0.96)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (0.67)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

WWE Great Balls Of Fire 2017: Predictions, Match Card For 'Monday Night Raw' PPV

International Business Times, Jul-5-2017, 14:20:52 GMT

SummerSlam is still more than a month away, but WWE has its biggest pay-per-view since WrestleMania 33 set for Sunday night. Great Balls of Fire 2017 features a stacked card with several intriguing matches. It'd be stunning to see Joe beat Lesnar and win his first title on the main roster. That doesn't mean it's not a highly anticipated match, and the build towards Sunday's main event has been better than any of Lesnar's recent feuds. Even if Joe does get pinned to end the PPV, he should find himself near the top of the card at SummerSlam because of what he's done in the weeks leading up to this WWE Universal Championship Match.

artificial intelligence, machine learning, prediction, (16 more...)

Country:

Oceania > Samoa (0.07)
North America > United States > Nevada > Clark County > Las Vegas (0.05)

Industry: Leisure & Entertainment > Sports > Martial Arts (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.63)

A Data Science Approach to Understanding Residential Water Contamination in Flint

Chojnacki, Alex, Dai, Chengyu, Farahi, Arya, Shi, Guangsha, Webb, Jared, Zhang, Daniel T., Abernethy, Jacob, Schwartz, Eric

arXiv.org Machine Learning, Jul-5-2017

When the residents of Flint learned that lead had contaminated their water system, the local government made water-testing kits available to them free of charge. The city government published the results of these tests, creating a valuable dataset that is key to understanding the causes and extent of the lead contamination event in Flint. This is the nation's largest dataset on lead in a municipal water system. In this paper, we predict the lead contamination for each household's water supply, and we study several related aspects of Flint's water troubles, many of which generalize well beyond this one city. For example, we show that elevated lead risks can be (weakly) predicted from observable home attributes. Then we explore the factors associated with elevated lead. These risk assessments were developed in part via a crowdsourced prediction challenge at the University of Michigan. To inform Flint residents of these assessments, they have been incorporated into a web and mobile application funded by Google.org. We also explore questions of self-selection in the residential testing program, examining which factors are linked to when and how frequently residents voluntarily sample their water.

data mining, flint, machine learning, (20 more...)

doi: 10.1145/3097983.3098078

1707.01591

Country: North America > United States > Michigan > Genesee County > Flint (0.14)

Genre: Research Report (0.64)

Industry:

Water & Waste Management > Water Management > Water Supplies & Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Employee turnover prediction and retention policies design: a case study

Ribes, Edouard, Touahri, Karim, Perthame, Benoît

arXiv.org Machine Learning, Jul-5-2017

Machine learning algorithms are often showcased in customer churn studies. Applications abound in fields such as telecommunications and product marketing (gaming, insurance, etc.; see [1], [2] for a recent review). The implementation of these methods in Customer Relationship Management is becoming the new norm, as improving customer retention yields superior business results. We argue that techniques of this type can easily be applied to employee turnover. Note that employee turnover can be subdivided into three buckets: involuntary turnover (induced by the company), voluntary turnover (employee resignation), and retirement. Retirement and involuntary turnover are outside the scope of this paper.

artificial intelligence, machine learning, turnover, (17 more...)

1707.01377

Country: Europe > France (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Machine Learning Tests for Effects on Multiple Outcomes

Ludwig, Jens, Mullainathan, Sendhil, Spiess, Jann

arXiv.org Machine Learning, Jul-5-2017

A core challenge in the analysis of experimental data is that the impact of some intervention is often not entirely captured by a single, well-defined outcome. Instead there may be a large number of outcome variables that are potentially affected and of interest. In this paper, we propose a data-driven approach rooted in machine learning to the problem of testing effects on such groups of outcome variables. It is based on two simple observations. First, the 'false-positive' problem that arises with a group of outcomes is similar to the concern of 'over-fitting,' which has been the focus of a large literature in statistics and computer science. We can thus leverage sample-splitting methods from the machine-learning playbook that are designed to control over-fitting to ensure that statistical models express generalizable insights about treatment effects. The second simple observation is that the question of whether treatment affects a group of variables is equivalent to the question of whether treatment is predictable from these variables better than some trivial benchmark (provided treatment is assigned randomly). This formulation allows us to leverage data-driven predictors from the machine-learning literature to flexibly mine for effects, rather than rely on more rigid approaches like multiple-testing corrections and pre-analysis plans. We formulate a specific methodology and present three kinds of results: first, our test is exactly sized for the null hypothesis of no effect; second, a specific version is asymptotically equivalent to a benchmark joint Wald test in a linear regression; and third, this methodology can guide inference on where an intervention has effects. Finally, we argue that our approach can naturally deal with typical features of real-world experiments, and be adapted to baseline balance checks.
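
The second observation, that an effect on a group of outcomes shows up as treatment being predictable from those outcomes, can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the paper's actual methodology: the nearest-centroid predictor, the single 50/50 sample split, the label-permutation reference distribution, the function names, and the synthetic data.

```python
import random


def holdout_accuracy(outcomes, treatment, train_idx, test_idx):
    """Fit a nearest-centroid rule on the training half, score it on the holdout half."""
    k = len(outcomes[0])
    centroids = {}
    for arm in (0, 1):
        rows = [outcomes[i] for i in train_idx if treatment[i] == arm]
        if not rows:  # degenerate split: fall back to chance-level accuracy
            return 0.5
        centroids[arm] = [sum(r[j] for r in rows) / len(rows) for j in range(k)]

    def predict(row):
        dist = {arm: sum((row[j] - c[j]) ** 2 for j in range(k))
                for arm, c in centroids.items()}
        return 0 if dist[0] <= dist[1] else 1

    hits = sum(predict(outcomes[i]) == treatment[i] for i in test_idx)
    return hits / len(test_idx)


def prediction_test(outcomes, treatment, n_perm=200, seed=0):
    """Permutation p-value for 'treatment is predictable from the outcomes'."""
    rng = random.Random(seed)
    n = len(treatment)
    idx = list(range(n))
    rng.shuffle(idx)
    train_idx, test_idx = idx[: n // 2], idx[n // 2:]

    observed = holdout_accuracy(outcomes, treatment, train_idx, test_idx)

    # Re-run the whole fit-and-score pipeline on permuted treatment labels;
    # under random assignment and no effect, these are exchangeable with the data.
    exceed = 0
    for _ in range(n_perm):
        permuted = treatment[:]
        rng.shuffle(permuted)
        if holdout_accuracy(outcomes, permuted, train_idx, test_idx) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Synthetic experiment (hypothetical): 60 units, 3 outcomes, treatment shifts every outcome.
data_rng = random.Random(1)
treatment = [i % 2 for i in range(60)]
outcomes = [[data_rng.gauss(4.0 * t, 1.0) for _ in range(3)] for t in treatment]
p_value = prediction_test(outcomes, treatment)
print(p_value)
```

Scoring the predictor only on the holdout half is what guards against over-fitting, and comparing against permuted labels supplies the "trivial benchmark": if the outcomes carry no information about treatment, the real labels should predict no better than shuffled ones.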

artificial intelligence, hypothesis, machine learning, (17 more...)

1707.01473

Genre: Research Report > Experimental Study (0.71)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Identifying Significant Predictive Bias in Classifiers

Zhang, Zhe, Neill, Daniel B.

arXiv.org Machine Learning, Jul-4-2017

We present a novel subset scan method to detect whether a probabilistic binary classifier has statistically significant bias (over- or under-predicting the risk) for some subgroup, and to identify the characteristics of this subgroup. This form of model checking and goodness-of-fit test provides a way to interpretably detect the presence of classifier bias or regions of poor classifier fit. This allows consideration of not just subgroups of a priori interest or small dimensions, but the space of all possible subgroups of features. To address the difficulty of considering these exponentially many possible subgroups, we use subset scan and parametric bootstrap-based methods. Extending this method, we can penalize the complexity of the detected subgroup and also identify subgroups with high classification errors. We demonstrate these methods and find interesting results on the COMPAS crime recidivism and credit delinquency data.

artificial intelligence, data mining, machine learning, (16 more...)

1611.08292

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > Experimental Study (0.70)

Industry:

Health & Medicine (0.47)
Law (0.47)

Technology:

Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Weakly Supervised Classification in High Energy Physics

Dery, Lucio Mwinmaarong, Nachman, Benjamin, Rubbo, Francesco, Schwartzman, Ariel

arXiv.org Machine Learning, Jul-2-2017

As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics (quark versus gluon tagging), we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.

artificial intelligence, machine learning, supervised classifier, (15 more...)

doi: 10.1007/JHEP05(2017)145

1702.00414

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.40)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)