Accuracy
What Time Does WrestleMania 33 Start? 2017 WWE PPV Includes 13 Matches, Kickoff Show
WrestleMania 33 is easily WWE's biggest event of 2017, and that will be reflected in the length of Sunday's show. Between the kickoff show and the actual pay-per-view, the 13 matches on the card could span more than six hours. In order to catch every second of the action, fans should tune in to WWE Network at 5 p.m. EDT when the kickoff show begins. The second hour of the kickoff show moves over to TV on USA at 6 p.m. EDT, and the PPV has a start time of 7 p.m. EDT. WWE PPVs usually end before 11 p.m. EDT (most start at 8 p.m. EDT), but that might not be the case Sunday night at Camping World Stadium in Orlando.
WWE WrestleMania 33: Predictions, Match Card, Preview For 2017 PPV
After months of rumors regarding WWE's biggest event of 2017, the WrestleMania 33 card is finally set. Thirteen matches are scheduled for Sunday's pay-per-view at Camping World Stadium in Orlando, and eight titles will be on the line. The WWE Universal Championship Match between Brock Lesnar and Goldberg is expected to go on last, putting an end to their feud that began with Goldberg's return in November. It seems to be clear which WWE superstar is winning the main event, but predictions for other matches aren't as easy to make. The SmackDown Tag Team Championships are the only belts that won't be defended at WrestleMania 33.
Cybersecurity can't rely on artificial intelligence too much, report says
AI is helping cybersecurity, but researchers are warning about over-promising. Cybersecurity pros shouldn't rely on artificial intelligence and machine learning just yet, according to a new report. For the report, security firm Carbon Black surveyed 410 cybersecurity researchers: 74 percent said that AI-driven security solutions are flawed, citing "high false-positive rates", while 70 percent claimed attackers can bypass machine learning techniques. The respondents did not write off AI or machine learning as unhelpful but rather said that the technologies just aren't there yet and cannot be solely relied on to make big decisions when it comes to security. AI and machine learning should be used "primarily to assist and augment human decision making," said the report.
New IBM security tool uses machine learning to help businesses detect phishing - TechRepublic
A new machine-learning-based security solution from IBM could help businesses detect phishing sites up to 250% faster than other methods. Announced via a blog post on Monday, the cognitive phishing detection feature is part of the IBM Security Trusteer platform. When it comes to hacking, phishing is one of the oldest tricks in the book. It has stayed around for so long, in part, because it still works. According to IBM Security research cited in the post, some 30% of phishing emails are opened by targeted recipients. Phishing works well because it capitalizes on the fact that humans are typically the weakest link in an organization's cybersecurity.
Gradient-based Regularization Parameter Selection for Problems with Non-smooth Penalty Functions
In high-dimensional and/or non-parametric regression problems, regularization (or penalization) is used to control model complexity and induce desired structure. Each penalty has a weight parameter that indicates how strongly the structure corresponding to that penalty should be enforced. Typically the parameters are chosen to minimize the error on a separate validation set using a simple grid search or a gradient-free optimization method. It is more efficient to tune parameters if the gradient can be determined, but this is often difficult for problems with non-smooth penalty functions. Here we show that for many penalized regression problems, the validation loss is actually smooth almost-everywhere with respect to the penalty parameters. We can therefore apply a modified gradient descent algorithm to tune parameters. Through simulation studies on example regression problems, we find that increasing the number of penalty parameters and tuning them using our method can decrease the generalization error.
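As a minimal sketch of the idea, assume a one-feature lasso problem (a toy setup chosen here for illustration, not the paper's algorithm or experiments). The fitted coefficient is then a soft-thresholded quantity, so the validation loss is differentiable in the penalty weight everywhere except at the single point where the coefficient reaches zero. All function names and data below are hypothetical.

```python
import math


def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the lasso proximal step)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso_1d(x, y, lam):
    """Minimize 0.5 * sum((y - x*b)^2) + lam * |b| over a scalar b."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return soft_threshold(sxy, lam) / sxx


def validation_loss(x, y, xv, yv, lam):
    """Squared-error loss on the validation set for the model fit at lam."""
    beta = fit_lasso_1d(x, y, lam)
    return 0.5 * sum((yi - xi * beta) ** 2 for xi, yi in zip(xv, yv))


def validation_grad(x, y, xv, yv, lam):
    """Derivative of the validation loss with respect to lam (chain rule)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    if abs(sxy) <= lam:
        # Coefficient is pinned at zero, so the loss is flat in lam here.
        return 0.0
    beta = soft_threshold(sxy, lam) / sxx
    dbeta_dlam = -math.copysign(1.0, sxy) / sxx
    dloss_dbeta = -sum((yi - xi * beta) * xi for xi, yi in zip(xv, yv))
    return dloss_dbeta * dbeta_dlam


def tune_lambda(x, y, xv, yv, lam0, step=10.0, iters=50):
    """Projected gradient descent on lam, keeping lam non-negative."""
    lam = lam0
    for _ in range(iters):
        lam = max(0.0, lam - step * validation_grad(x, y, xv, yv, lam))
    return lam


# Made-up training and validation data for the demonstration.
x_train, y_train = [1.0, 2.0, 3.0], [1.1, 1.9, 3.2]
x_val, y_val = [1.5, 2.5], [1.4, 2.6]
lam_tuned = tune_lambda(x_train, y_train, x_val, y_val, lam0=2.0)
```

The step size and iteration count are arbitrary choices that happen to suit this toy data; a practical implementation would need a line search or similar safeguard.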
Deep scattering transform applied to note onset detection and instrument recognition
Cazau, D., Revillon, G., Adam, O.
Automatic Music Transcription (AMT) is one of the oldest and most well-studied problems in the field of music information retrieval. Within this challenging research field, onset detection and instrument recognition play important roles in transcription systems, as they respectively help to determine the exact onset times of notes and to recognize the corresponding instrument sources. The aim of this study is to explore the usefulness of multiscale scattering operators for these two tasks on plucked string instrument and piano music. After summarizing the theoretical background and illustrating the key features of this sound representation method, we evaluate its performance against other classical sound representations. Using both MIDI-driven datasets with real instrument samples and real musical pieces, scattering is shown to outperform other sound representations for these AMT subtasks, highlighting its richer sound representation and invariance properties.
Detecting Dependencies in Sparse, Multivariate Databases Using Probabilistic Programming and Non-parametric Bayes
Saad, Feras, Mansinghka, Vikash
Datasets with hundreds of variables and many missing values are commonplace. In this setting, it is both statistically and computationally challenging to detect true predictive relationships between variables and also to suppress false positives. This paper proposes an approach that combines probabilistic programming, information theory, and non-parametric Bayes. It shows how to use Bayesian non-parametric modeling to (i) build an ensemble of joint probability models for all the variables; (ii) efficiently detect marginal independencies; and (iii) estimate the conditional mutual information between arbitrary subsets of variables, subject to a broad class of constraints. Users can access these capabilities using BayesDB, a probabilistic programming platform for probabilistic data analysis, by writing queries in a simple, SQL-like language. This paper demonstrates empirically that the method can (i) detect context-specific (in)dependencies on challenging synthetic problems and (ii) yield improved sensitivity and specificity over baselines from statistics and machine learning, on a real-world database of over 300 sparsely observed indicators of macroeconomic development and public health.
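To make the quantity in (iii) concrete, here is a minimal sketch of conditional mutual information for fully observed discrete data, using a plain plug-in (empirical frequency) estimate. This is an assumption made for illustration: it is not the paper's Bayesian non-parametric estimator and does not use BayesDB, and the function name is hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(samples):
    """Plug-in estimate of I(X; Y | Z) in nats from (x, y, z) tuples.

    Uses I(X;Y|Z) = sum p(x,y,z) * log( p(x,y,z) p(z) / (p(x,z) p(y,z)) ),
    with every probability replaced by its empirical frequency.
    """
    n = len(samples)
    count_xyz = Counter(samples)
    count_xz = Counter((x, z) for x, _, z in samples)
    count_yz = Counter((y, z) for _, y, z in samples)
    count_z = Counter(z for _, _, z in samples)

    total = 0.0
    for (x, y, z), c in count_xyz.items():
        # The 1/n factors cancel inside the ratio, leaving raw counts.
        ratio = (c * count_z[z]) / (count_xz[(x, z)] * count_yz[(y, z)])
        total += (c / n) * math.log(ratio)
    return total


# X copies Y while Z is constant: the estimate equals log(2).
dependent = [(0, 0, 0), (1, 1, 0)] * 5

# X and Y vary independently within each value of Z: the estimate is 0.
independent = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

cmi_dependent = conditional_mutual_information(dependent)
cmi_independent = conditional_mutual_information(independent)
```

A plug-in estimate like this is biased on small or sparse samples and cannot handle missing values, which is part of what motivates a model-based approach.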
Additive Models with Trend Filtering
Sadhanala, Veeranjaneyulu, Tibshirani, Ryan J.
We consider additive models built with trend filtering, i.e., additive models whose components are each regularized by the (discrete) total variation of their $k$th (discrete) derivative, for a chosen integer $k \geq 0$. This results in $k$th degree piecewise polynomial components (e.g., $k=0$ gives piecewise constant components, $k=1$ gives piecewise linear, $k=2$ gives piecewise quadratic, etc.). In univariate nonparametric regression, the localized nature of the total variation regularizer used by trend filtering has been shown to produce estimates with superior local adaptivity to those from smoothing splines (and linear smoothers, more generally) (Tibshirani [2014]). Further, the structured nature of this regularizer has been shown to lead to highly efficient computational routines for trend filtering (Kim et al. [2009], Ramdas and Tibshirani [2016]). In this paper, we argue that both of these properties carry over to the additive models setting. We derive fast error rates for additive trend filtering estimates, and prove that these rates are minimax optimal when the underlying function is itself additive and has component functions whose derivatives are of bounded variation. We show that such rates are unattainable by additive smoothing splines (and by additive models built from linear smoothers, in general). We argue that backfitting provides an efficient algorithm for additive trend filtering, as it is built around the fast univariate trend filtering solvers; moreover, we describe a modified backfitting procedure whose iterations can be run in parallel. Finally, we conduct experiments to examine the empirical properties of additive trend filtering, and outline some possible extensions.
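As a rough illustration of the regularizer, assume evenly spaced inputs and ignore scaling constants. The penalty on one component then reduces to the sum of absolute differences of order k+1 of its fitted values, which is zero exactly when the values lie on a single polynomial of degree k. The function names below are hypothetical, and the sketch only evaluates the penalty; it does not fit a model.

```python
def discrete_differences(values, order):
    """Apply the first-difference operator `order` times."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def trend_filtering_penalty(values, k):
    """Sum of absolute differences of order k+1 (evenly spaced inputs)."""
    return sum(abs(d) for d in discrete_differences(values, k + 1))


def additive_penalty(components, k, lam):
    """Total penalty for an additive fit: lam times the per-component sum."""
    return lam * sum(trend_filtering_penalty(c, k) for c in components)


# A step function has one jump of size 2, so its k=0 penalty is 2.
step = [1, 1, 3, 3]

# A straight line has no change in slope, so its k=1 penalty is 0.
line = [0, 1, 2, 3, 4]

# A line that flattens out changes slope once (from 1 to 0): k=1 penalty 1.
kinked = [0, 1, 2, 2, 2]
```

Because the penalty is an absolute-value (L1-type) sum, minimizing it tends to set most higher-order differences exactly to zero, which is what yields piecewise polynomial components with a small number of knots.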
How Big Data is Redefining the Banking and Financial Industry
In the face of unmetered innovation across multiple industries, the banking industry has been rather quiet. For centuries, the banking industry has gone unscathed by the unrelenting tides of change. People still queue in banks to perform the simplest transactions. Appending the wrong signature on a check or form can lock you out of your bank account - or at the very least, turn your day into a nightmare. Thankfully, there are positive signs that the industry is slowly undergoing a transformation.
"Influence Sketching": Finding Influential Samples In Large-Scale Regressions
Wojnowicz, Mike, Cruz, Ben, Zhao, Xuan, Wallace, Brian, Wolff, Matt, Luan, Jay, Crable, Caleb
There is an especially strong need in modern large-scale data analysis to prioritize samples for manual inspection. For example, the inspection could target important mislabeled samples or key vulnerabilities exploitable by an adversarial attack. In order to solve the "needle in the haystack" problem of which samples to inspect, we develop a new scalable version of Cook's distance, a classical statistical technique for identifying samples which unusually strongly impact the fit of a regression model (and its downstream predictions). In order to scale this technique up to very large and high-dimensional datasets, we introduce a new algorithm which we call "influence sketching." Influence sketching embeds random projections within the influence computation; in particular, the influence score is calculated using the randomly projected pseudo-dataset from the post-convergence Generalized Linear Model (GLM). We validate that influence sketching can reliably and successfully discover influential samples by applying the technique to a malware detection dataset of over 2 million executable files, each represented with almost 100,000 features. For example, we find that randomly deleting approximately 10% of training samples reduces predictive accuracy only slightly from 99.47% to 99.45%, whereas deleting the same number of samples with high influence sketch scores reduces predictive accuracy all the way down to 90.24%. Moreover, we find that influential samples are especially likely to be mislabeled. In the case study, we manually inspect the most influential samples, and find that influence sketching pointed us to new, previously unidentified pieces of malware.
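For reference, the classical quantity that influence sketching builds on can be sketched for ordinary simple linear regression. This is plain Cook's distance computed from residuals and leverages, with no random projections and no GLM, so it illustrates only the starting point and not the paper's algorithm. Function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def cooks_distances(x, y):
    """Cook's distance of every sample in a simple linear regression.

    D_i = (e_i^2 / (p * s^2)) * h_i / (1 - h_i)^2, where e_i is the
    residual, h_i the leverage, p = 2 parameters, and s^2 the residual
    variance estimate.
    """
    n, p = len(x), 2
    intercept, slope = fit_line(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - p)

    distances = []
    for xi, ri in zip(x, residuals):
        leverage = 1.0 / n + (xi - xbar) ** 2 / sxx
        distances.append((ri * ri / (p * s2)) * leverage / (1.0 - leverage) ** 2)
    return distances


# Made-up data: the last point sits far off the trend of the others.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [1.0, 2.1, 2.9, 4.2, 5.1, 12.0]
scores = cooks_distances(x_demo, y_demo)
most_influential = max(range(len(scores)), key=lambda i: scores[i])
```

Ranking samples by such a score and inspecting the top of the list mirrors the prioritization workflow described above, at a scale where the exact computation is still affordable.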