Scientific Discovery


Bridging the gap between big banks and challengers

#artificialintelligence

Since leaving Barclays in 2015, Mr Jenkins has spent time speaking to fintech startups and banking chief executives to get a sense of the gulf between the two parties and how to bridge it. Mr Jenkins founded 10x Technologies, a startup that offers a cloud-based core banking platform – a modern operating system for finance. However, it's the last challenge, cultural resistance, that Mr Jenkins says is "the most difficult and the most powerful" obstacle. "Bank management is largely technologically illiterate," says Mr Derhalli.


Data Science Simplified Part 3: Hypothesis Testing

#artificialintelligence

Like a crime-fiction story, hypothesis testing, based on data, leads us from a novel suggestion to an effective proposition. If there is statistically significant evidence that the alternative hypothesis is valid, then the null hypothesis is rejected. Like all statistical testing, hypothesis testing has to deal with uncertainty. The p-value is the probability of observing, by chance alone, a t-statistic at least as extreme as the one obtained, under the assumption that the null hypothesis is true.
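As a rough illustration of the p-value definition in this excerpt, the sketch below computes a one-sample t-statistic and its two-sided p-value using only the Python standard library. The function names, the sample values and the null mean of 5.0 are assumptions made for illustration and do not come from the article. The tail probability is obtained by numerically integrating the Student's t density with Simpson's rule, which is one of several ways to do it.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) under the null hypothesis.

    Integrates the t density from 0 to |t_stat| with Simpson's rule
    (steps must be even) and subtracts twice that area from 1.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1 - 2 * area)


def one_sample_t_test(sample, null_mean):
    """Return (t-statistic, two-sided p-value) for H0: population mean == null_mean."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - null_mean) / standard_error
    return t_stat, two_sided_p_value(t_stat, n - 1)


# Hypothetical measurements, tested against a hypothetical null mean of 5.0.
t_stat, p_value = one_sample_t_test([5.1, 4.9, 5.3, 5.2, 5.0], null_mean=5.0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

A small p-value means a t-statistic this extreme would rarely arise by chance if the null hypothesis were true. The cut-off used to reject it (often 0.05) is a convention chosen by the analyst.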


Adam Schiff, President Trump and the serendipity of slander

Los Angeles Times

Schiff is the nine-term Democratic congressman from Southern California, representing parts of Los Angeles and several communities, including Burbank, Glendale and Pasadena, skirting the nearby mountains. With well over a dozen members of Congress crowding the Southern California media market and TV stations having close to zero interest in politics, about the only way for a Washington lawmaker to get attention is participating in a high-speed car chase, ideally in prime time with a Kardashian riding shotgun. It is one reason the area's congressional lawmakers have a decades-long, unblemished record of futility when it comes to seeking prominent state office. Schiff would face all the hurdles confronting any Southern California member of Congress trying to make the broad leap to the Senate: the mountainous fundraising requirement, the difficulty of launching from a small geographic base and, not least, the built-in bias many Northern Californians have against voting for any politician from (ugh!)


Importance of Hypothesis Testing in Quality Management

@machinelearnbot

When you need to make decisions such as how much you should spend on advertising or what effect a price increase will have on your customer base, it's easy to make wild assumptions or get lost in analysis paralysis. Hypothesis tests are categorized as parametric or nonparametric. Parametric tests include the z-test, t-test and F-test. Nonparametric tests include the sign test, Wilcoxon rank-sum test, Kruskal-Wallis test and permutation test.
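To make one of the nonparametric tests named here concrete, the following is a minimal sketch of an exact two-sided sign test on paired measurements, written with the Python standard library only. The function name and the before/after figures are hypothetical and are not taken from the article.

```python
import math


def sign_test(before, after):
    """Exact two-sided sign test for paired samples.

    Ties (pairs with no change) are dropped. Under the null hypothesis,
    each remaining pair is equally likely to go up or down, so the number
    of increases follows a Binomial(n, 0.5) distribution.
    Returns (number of increases, number of non-tied pairs, p-value).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    increases = sum(d > 0 for d in diffs)
    k = min(increases, n - increases)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return increases, n, min(1.0, 2 * tail)


# Hypothetical units produced per shift before and after a process change.
before = [10, 12, 9, 11, 13, 8, 10, 12]
after = [12, 13, 11, 12, 15, 9, 13, 14]
increases, n, p_value = sign_test(before, after)
print(f"{increases} of {n} pairs increased, p = {p_value:.4f}")
```

The sign test uses only the direction of each change, not its size, so it makes no assumption about the distribution of the data. That is the trade-off against a parametric test such as the paired t-test.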


Artificial Intelligence Will Create a Paradigm Shift Within the Next Decade

#artificialintelligence

Today, enterprise software is largely at the "power steering" phase, where workflow-based software helps you "steer" more easily. Over the next decade, I believe enterprise software will get to level 4/5, where software will be self-driving, and we'll see a paradigm shift in the coming years when we move from a mindset of "machines are assisting humans" to "humans are assisting machines". Salesforce has been a largely workflow-driven solution to push sales reps to input their activities (so they get paid) and thus allow sales managers to view the activities of their direct reports and manage more efficiently.


DuPont Pioneer: Data Engineer

@machinelearnbot

DuPont has a rich history of scientific discovery that has enabled countless innovations, and today we're looking for more people, in more places, to collaborate with us to make life the best that it can be. Seeking a Data Engineer/Software Developer to design, develop, and implement high-quality data solutions and applications for our data science and analytics platform in AWS. Education & Experience: BS degree in Computer Science, Physics, Electrical Engineering, or a related field.


Machine Learning for Everyone - Part 2: Spotting anomalous data

#artificialintelligence

Next, we create the predictive model using Random Forest, tuning the model parameters with the caret library using 4-fold cross-validation optimized for the ROC metric. We ran a proof of concept to automatically spot the most suspicious login cases in order to boost the current anomaly detection feature, and the ROC curve was a good option for testing the predictive model's sensitivity. We've built a machine learning model, using random forest, to identify the abnormal cases. The cases flagged as abnormal, plus the top 2 percent of suspicious ones detected by the random forest, are mapped closer together, away from the normal cases, because they behave differently.
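The excerpt relies on two ideas that can be sketched without the model itself: the ROC metric used for tuning, and the selection of the top 2 percent most suspicious cases. The Python sketch below assumes the suspicion scores already exist (in the article they would come from the random forest trained with caret in R, which is not reproduced here). The function names and the toy scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen abnormal case
    (label 1) scores higher than a randomly chosen normal case (label 0),
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def top_fraction(scores, fraction=0.02):
    """Indices of the highest-scoring cases (at least one), most suspicious first."""
    k = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical labels (1 = abnormal login) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", roc_auc(labels, scores))
print("Most suspicious case indices:", top_fraction(scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of abnormal from normal cases, which is why it is a common tuning target when abnormal cases are rare.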



Opensource & Machine Learning for GDPR Data Discovery

#artificialintelligence

Basically, we focus our data discovery on three main areas: column discovery, data discovery and file discovery. We then use pre-trained machine learning (OpenNLP) models (a few public examples are available here, but only for the English language) and techniques like tokenization, sentence segmentation, named entity extraction and parsing to understand whether the data is sensitive or not. For example, if your column is called "X_DATA" but holds personal information like an address, column discovery will not help. From there, we explore a sample of the data inside "X_DATA" and apply our pre-trained models based on OpenNLP to understand whether that sample contains any address.
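The two-stage idea in this excerpt, checking the column name first and falling back to a sample of the values, can be sketched as follows in Python. This is not the authors' implementation: a crude regular expression stands in for the pre-trained OpenNLP named-entity model, and the keyword list, sample size, threshold and function names are all assumptions made for illustration.

```python
import re

# Hypothetical hints that a column name refers to personal data.
SENSITIVE_NAME_HINTS = ("address", "email", "phone", "surname", "birth")

# Stand-in for a trained address detector: a number, one or more words,
# then a common street suffix. A real system would use an NER model here.
ADDRESS_PATTERN = re.compile(
    r"\b\d{1,5}\s+(?:\w+\s+)+(?:street|st|avenue|ave|road|rd)\b",
    re.IGNORECASE,
)


def column_discovery(column_name):
    """Flag a column when its name alone suggests personal data."""
    lowered = column_name.lower()
    return any(hint in lowered for hint in SENSITIVE_NAME_HINTS)


def data_discovery(values, sample_size=100, threshold=0.5):
    """Flag a column when enough sampled values look like addresses."""
    sample = [v for v in values[:sample_size] if v]
    if not sample:
        return False
    hits = sum(1 for v in sample if ADDRESS_PATTERN.search(v))
    return hits / len(sample) >= threshold


def is_sensitive(column_name, values):
    """Column discovery first, then data discovery on a sample of values."""
    return column_discovery(column_name) or data_discovery(values)


# The name "X_DATA" gives nothing away, but the sampled values do.
print(is_sensitive("X_DATA", ["221 Baker Street", "10 Downing St", "n/a"]))
```

Checking the name first is cheap and catches well-labelled columns. Sampling the values catches opaque names like "X_DATA", at the cost of reading data and depending on the quality of the detector.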


On-Demand Serendipity

#artificialintelligence

As I explained in my previous post, allowing services to make such decisions requires them to have a better model of their user and a better understanding of their current mental state. Here's a crude mock-up of a possible persona selection screen: Notifications are sorted by personas. The user has also chosen to always receive important notifications from Elsa and the system has determined that a time-sensitive persona (handling the arrangements for tonight's movie) should be activated. This system works the same way for information and media consumption: if the user wanted to hear about the latest in wine tasting, they would just have to activate their wine tasting persona.