Accuracy
Looking For AI Exposure? Cyber Security May Have You Covered
One of the complications, and opportunities, confronting the cyber security industry is that the cyber threat may be escalating beyond the capacity of a human-centric response. Consider, for instance, the remark by the outgoing chief of the Department of Defense that "given the volume [of attacks] and where I see the threat moving it will be impossible for humans by themselves to keep pace." The DoD is currently in the middle of a $1.6 billion project to centralize its hundreds of separate firewalls into a unified system, the end purpose being to enable effective integration of artificial intelligence capabilities. While this DoD example is an isolated one, it nonetheless epitomizes the human limitation in countering the cyber threat, which is primarily a digitalized, computer-driven hazard. As Benedict Cumberbatch, playing Alan Turing in The Imitation Game, quipped, "our problem is that we're trying to beat [Enigma] with men. What if only a machine can defeat another machine?"
Estimating and Controlling the False Discovery Rate for the PC Algorithm Using Edge-Specific P-Values
Strobl, Eric V., Spirtes, Peter L., Visweswaran, Shyam
The PC algorithm allows investigators to estimate a complete partially directed acyclic graph (CPDAG) from a finite dataset, but few groups have investigated strategies for estimating and controlling the false discovery rate (FDR) of the edges in the CPDAG. In this paper, we introduce PC with p-values (PC-p), a fast algorithm which robustly computes edge-specific p-values and then estimates and controls the FDR across the edges. PC-p specifically uses the p-values returned by many conditional independence tests to upper bound the p-values of more complex edge-specific hypothesis tests. The algorithm then estimates and controls the FDR using the bounded p-values and the Benjamini-Yekutieli FDR procedure. Modifications to the original PC algorithm also help PC-p accurately compute the upper bounds despite non-zero Type II error rates. Experiments show that PC-p yields more accurate FDR estimation and control across the edges in a variety of CPDAGs compared to alternative methods.
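The Benjamini-Yekutieli step named in the abstract can be sketched as follows. This is a minimal illustration of the generic procedure, not the PC-p implementation: the function name `benjamini_yekutieli` and the example p-values are hypothetical, and the inputs merely stand in for the upper-bounded edge-specific p-values that PC-p would produce.

```python
from typing import List, Sequence


def benjamini_yekutieli(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return a reject/keep flag per hypothesis, controlling the FDR at level alpha.

    Hypothetical helper for illustration; it implements the standard
    Benjamini-Yekutieli step-up rule, which stays valid under arbitrary
    dependence between the tests.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Harmonic correction c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / i for i in range(1, m + 1))

    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= k * alpha / (m * c(m)).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank

    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p-values, one per candidate edge.
edge_p_values = [0.001, 0.8, 0.01, 0.5]
print(benjamini_yekutieli(edge_p_values, alpha=0.05))
```

Because the harmonic factor shrinks every threshold, the rule is more conservative than Benjamini-Hochberg, which is the price paid for not assuming independent tests.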
Canelo Alvarez vs. Julio Cesar Chavez: LIVE Round By Round Scorecard, Actual Start Time For PPV Fight
Preview: Canelo Alvarez (48–1–1, 34 KOs) and Julio Cesar Chavez Jr. (50–2–1, 32 KOs) meet in a non-title fight on Saturday night at T-Mobile Arena in Las Vegas. Alvarez enters the fight as the favorite at -600, compared to Chavez's +450. Both Mexican boxers weighed in at 164 pounds for the catchweight of 164.5 pounds. Alvarez is listed at 5-foot-9, with a reach of 70.5 inches. Chavez is 6-foot-1 and has a 73-inch reach.
Finding Bottlenecks: Predicting Student Attrition with Unsupervised Classifier
Sajjadi, Seyed, Shapiro, Bruce, McKinlay, Christopher, Sarkisyan, Allen, Shubin, Carol, Osoba, Efunwande
Policy makers, the public, university administrators, students and their families are concerned about low graduation rates and lengthy times to degree in higher education. The median time to graduation is six years at CSUN (1). The four-year and the six-year graduation rates are 13% and 50%, respectively (2). With an enrollment of over 6000 undergraduate students, CoBaE is one of the largest business schools in the nation. CoBaE confers the second most undergraduate degrees at CSUN (behind the College of Social and Behavioral Science), and it has three of the top ten most popular majors (Management, Finance, and Marketing) at CSUN.
Learning Local Dependence In Ordered Data
In many applications, data come with a natural ordering. This ordering can often induce local dependence among nearby variables. However, in complex data, the width of this dependence may vary, making simple assumptions such as a constant neighborhood size unrealistic. We propose a framework for learning this local dependence based on estimating the inverse of the Cholesky factor of the covariance matrix. Penalized maximum likelihood estimation of this matrix yields a simple regression interpretation for local dependence in which variables are predicted by their neighbors. Our proposed method involves solving a convex, penalized Gaussian likelihood problem with a hierarchical group lasso penalty. The problem decomposes into independent subproblems which can be solved efficiently in parallel using first-order methods. Our method yields a sparse, symmetric, positive definite estimator of the precision matrix, encoding a Gaussian graphical model. We derive theoretical results not found in existing methods attaining this structure. In particular, our conditions for signed support recovery and estimation consistency rates in multiple norms are as mild as those in a regression problem. Empirical results show our method performing favorably compared to existing methods. We apply our method to genomic data to flexibly model linkage disequilibrium. Our method is also applied to improve the performance of discriminant analysis in sound recording classification.
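The regression interpretation mentioned in the abstract can be illustrated numerically. The sketch below is not the proposed estimator: it uses a known population covariance, involves no penalty, and assumes the convention that the precision matrix factors as L^T L with L lower triangular. The function name and the AR(1) example are hypothetical choices made for illustration.

```python
import numpy as np


def cholesky_regression_coefficients(sigma: np.ndarray):
    """Return (L, beta) for a positive definite covariance matrix sigma.

    L is lower triangular with inv(sigma) = L.T @ L (assumed convention).
    beta[r, k] is the coefficient of variable k when variable r is regressed
    on all variables that precede it in the ordering.
    """
    # Lower-triangular Cholesky factor C with sigma = C @ C.T.
    c = np.linalg.cholesky(sigma)
    # The inverse of C is again lower triangular and factors the precision matrix.
    l_factor = np.linalg.inv(c)
    # Rescale each row by its diagonal entry and flip the sign.
    beta = -l_factor / np.diag(l_factor)[:, None]
    np.fill_diagonal(beta, 0.0)
    return l_factor, beta


# AR(1)-style covariance: each variable depends only on its immediate predecessor.
p, rho = 5, 0.6
idx = np.arange(p)
sigma = rho ** np.abs(idx[:, None] - idx[None, :])

L, beta = cholesky_regression_coefficients(sigma)
print(np.round(beta, 3))
```

In this example each row of `beta` has a single non-zero entry, equal to `rho`, directly to the left of the diagonal. That is the kind of narrow, row-specific band of dependence the abstract describes; with data whose dependence width varies, the band would differ from row to row.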
Canelo Alvarez - Julio Cesar Chavez Jr Fight: TV Channel, Start Time, PPV Price, Undercard For Boxing Event
In a Cinco de Mayo weekend battle between high-profile Mexican boxers, Julio Cesar Chavez Jr. seeks an upset over Canelo Alvarez. Alvarez, one of the top pound-for-pound boxers, enters Saturday's fight at T-Mobile Arena in Las Vegas as a 1/6 favorite. The fight is expected to draw heavy attention, considering Alvarez's reputation and Chavez's famous father. The two boxers, fighting at 164 pounds, are arguably the most well-known active Mexican boxers, and they are set to tangle on Mexico's most famous holiday. While Chavez has never fought on a Cinco de Mayo weekend, Alvarez defeated Amir Khan at about the same time last year, knocked out James Kirkland on May 9, 2015, and held off Shane Mosley on Cinco de Mayo in 2012.
How machine learning could prevent money laundering
Machine learning is being put to use in all sorts of areas today. From smart cars to homes and beyond, the use of artificial intelligence (AI) and machine learning (ML) is becoming a larger part of how many companies conduct business. As more and more businesses are hit with cyber crime rather than physical crimes, there has been a needed shift from commercial surveillance systems towards cyber security systems to protect confidential data. More recently, we've seen ML sink its teeth into anti-money laundering (AML), with big potential impacts there. Most current AML systems are founded on an extensive list of rules.
People on Drugs: Credibility of User Statements in Health Communities
Mukherjee, Subhabrata, Weikum, Gerhard, Danescu-Niculescu-Mizil, Cristian
Online health communities are a valuable source of information for patients and physicians. However, such user-generated resources are often plagued by inaccuracies and misinformation. In this work we propose a method for automatically establishing the credibility of user-generated medical statements and the trustworthiness of their authors by exploiting linguistic cues and distant supervision from expert sources. To this end we introduce a probabilistic graphical model that jointly learns user trustworthiness, statement credibility, and language objectivity. We apply this methodology to the task of extracting rare or unknown side-effects of medical drugs --- this being one of the problems where large scale non-expert data has the potential to complement expert medical knowledge. We show that our method can reliably extract side-effects and filter out false statements, while identifying trustworthy users that are likely to contribute valuable medical information.
20 Questions to Detect Fake Data Scientists
Check the answers from KDnuggets Editors to these questions (and one more): 21 Must-Know Data Science Interview Questions and Answers. Now that the Data Scientist is officially the sexiest job of the 21st century, everyone wants a piece of the pie. That means there are a few data posers out there: people who call themselves Data Scientists, but who don't actually have the right skill set. This isn't always done out of a desire to deceive. The newness of data science and the lack of a widely understood job description mean that many people may think they are data scientists purely because they deal with data.
Canelo Alvarez vs. Julio Cesar Chavez Jr: Prediction, Betting Odds, Preview For PPV Fight
Two high-profile Mexican fighters meet on Cinco de Mayo weekend, with Canelo Alvarez (48–1–1, 34 KOs) the heavy favorite over Julio Cesar Chavez Jr. (50–2–1, 32 KOs) at T-Mobile Arena in Las Vegas. Oddsmakers list Alvarez at -600, while Chavez Jr. is at +450 in their matchup Saturday night at a 164.5-pound catchweight. Alvarez, among the best pound-for-pound boxers in the world, has won six consecutive fights since losing a majority decision to Floyd Mayweather in September 2013. Chavez has lost two of his last six fights and hasn't won a fight by knockout since defeating Andy Lee in June 2012. Both boxers are coming off wins at a much different weight.