Collaborating Authors

Statistical Learning

Logistic Regression using SAS - Indepth Predictive Modeling


What is this course all about? This course is all about credit scoring / logistic regression model building using SAS. The course promises to explain concepts in a crystal clear manner. It goes through the practical issues faced by analysts. How do you clarify the objective and ensure data sufficiency?

Simultaneous clustering and representation learning


The success of deep learning over the last decade, particularly in computer vision, has depended greatly on large training data sets. Even though progress in this area has boosted performance on many tasks such as object detection, recognition, and segmentation, the main bottleneck for future improvement is the need for more labeled data. Self-supervised learning is among the best alternatives for learning useful representations from the data. In this article, we will briefly review the self-supervised learning methods in the literature and discuss the findings of a recent self-supervised learning paper from ICLR 2020 [14]. We may assume that most learning problems can be tackled by having clean labeling and more data obtained in an unsupervised way.

Real Time Anomaly Detection for Cognitive Intelligence - XenonStack


Classical Analytics – Around ten years ago, the tools and resources available for analytics were Excel, SQL databases, and similar tools that are relatively simple compared to the advanced ones available nowadays. Analytics also used to target things like reporting, customer classification, and whether sales were trending up or down. In this article we will discuss Real Time Anomaly Detection. Over the past five years, the amount of data has exploded, driven by sources such as social media data, transaction records, and sensor information. With the increase in data, how data is stored has also changed. Data used to be stored mostly in SQL databases, and analytics ran on them during idle time. The analytics were also serialized. Later, NoSQL databases started to replace traditional SQL databases as data sizes became huge, and analysis shifted from serial analytics to parallel processing and distributed systems for quick results.

New Oracle Machine Learning Features in 19c and 20c


Here are links to blog posts and articles I've written about the new features of Oracle Machine Learning in 19c (and previous releases) and 20c. I've given a presentation on these topics at the ACES@Home and Yatra online conferences. Each of the following links explains one of the algorithms and gives demo code for you to try.

Data Visualization Audiences and Scenarios


Whether you are doing multivariate analysis or building a deep learning neural network, data visualization is arguably the most important part of any data profession. I personally enjoy the analytics more than the visualization of data; however, if the data I am analyzing is not understood by the end user, then what's the point? Data visualization is an interesting part of data professions because it is one of the only parts, if not the only part, of the profession that can be left up to interpretation rather than pure fact. Sure, your bar graph that compares sales is correct, but maybe it would've made more sense to the end user if it was in a circle graph? Data visualization is a part of the profession that I continue to get extra practice on. Still, after all the mistakes and practice I've had, I'd like to offer some of what I've learned to my readers and fellow data professionals/students.

SWAP: Softmax-Weighted Average Pooling


Blake Elias is a Researcher at the New England Complex Systems Institute. Shawn Jain is an AI Resident at Microsoft Research. We present a pooling method for convolutional neural networks as an alternative to max-pooling or average pooling. Our method, softmax-weighted average pooling (SWAP), applies average-pooling, but re-weights the inputs by the softmax of each window.
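To make the SWAP idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes non-overlapping k x k windows, and it reads "re-weights the inputs by the softmax of each window" as a weighted sum whose weights are the softmax of the window's own values (so the weights sum to one). The function names `softmax` and `swap_pool2d` and the sample `feature_map` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax of a flat list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def swap_pool2d(image, k=2):
    """Softmax-weighted average pooling over non-overlapping k x k windows.

    `image` is a list of equal-length rows. Trailing rows or columns that
    do not fill a complete window are dropped (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - k + 1, k):
        out_row = []
        for c in range(0, cols - k + 1, k):
            # Flatten the current window into a list of k*k values.
            window = [image[r + i][c + j] for i in range(k) for j in range(k)]
            # Weight each input by the softmax of the window, then sum.
            weights = softmax(window)
            out_row.append(sum(w * x for w, x in zip(weights, window)))
        out.append(out_row)
    return out


# Made-up 4x4 feature map, for illustration only.
feature_map = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [5.0, 5.0, -1.0, 1.0],
    [5.0, 5.0, 1.0, -1.0],
]
pooled = swap_pool2d(feature_map, k=2)
print(pooled)
```

Under this reading, a window of identical values gets uniform weights and reduces to plain average pooling, while a window with one dominant value is pulled toward that value, so each output lies between the window's mean and its maximum.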

Introduction to Data Science with Python


If you want to learn more about exploratory analysis using Pandas, check out Simplilearn's Data Science with Python video, which can help. We can see that columns like LoanAmount and ApplicantIncome contain some extreme values. We need to process this data using data wrangling techniques to normalize and standardize the data. We will now take a look at data wrangling using Pandas as a part of our learning of Data Science with Python. Data wrangling refers to the process of cleaning and unifying messy and complicated data sets.
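As a rough illustration of what such wrangling might look like, the sketch below fills a missing value, log-transforms the two skewed columns, and standardizes one of them. The sample values are made up, and the derived column names (`LoanAmount_log`, `ApplicantIncome_log`, `LoanAmount_z`) are hypothetical; the choice of median imputation and a log transform is one common option, not necessarily the approach the tutorial takes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the values are made up for illustration only.
loans = pd.DataFrame({
    "ApplicantIncome": [2500, 3200, 4100, 5000, 81000],
    "LoanAmount": [100.0, 120.0, None, 150.0, 700.0],
})

# Fill the missing loan amount with the column median,
# which is less sensitive to extreme values than the mean.
loans["LoanAmount"] = loans["LoanAmount"].fillna(loans["LoanAmount"].median())

# A log transform compresses the long right tail created by extreme values.
loans["LoanAmount_log"] = np.log(loans["LoanAmount"])
loans["ApplicantIncome_log"] = np.log(loans["ApplicantIncome"])

# Standardize the transformed column to zero mean and unit (sample) variance.
col = loans["LoanAmount_log"]
loans["LoanAmount_z"] = (col - col.mean()) / col.std()

print(loans.round(2))
```

The log transform keeps the extreme rows in the data set while reducing their influence; dropping or clipping them would be an alternative when the extremes are known to be errors.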

3 facts about time series forecasting that surprise experienced machine learning practitioners.


Time series forecasting is something of a dark horse in the field of data science: It is one of the most applied data science techniques in business, used extensively in finance, in supply chain management and in production and inventory planning, and it has a well established theoretical grounding in statistics and dynamic systems theory. Yet it retains something of an outsider status compared to more recent and popular machine learning topics such as image recognition and natural language processing, and it gets little or no treatment at all in introductory courses to data science and machine learning. My original training is in neural networks and other machine learning methods, but I gravitated towards time series methods after my career led me to the role of demand forecasting specialist. In recent weeks, as part of my team's effort to expand beyond traditional time series forecasting capabilities and into a broader ML based approach to our business, I found myself having several discussions with experienced ML engineers, who were very good at ML in general, but didn't have much experience with time series methods. I realized from those discussions that there were several things specific to time series forecasting that the forecasting community takes for granted but that are very surprising to other ML practitioners and data scientists, especially when compared to the way standard ML problems are approached.

How to Build a Machine Learning Model


A Visual Guide to Learning Data Science (Jul 25). Learning data science may seem intimidating but it doesn't have to be that way. Let's make learning data science fun and easy. So the challenge is: how exactly do we make learning data science both fun and easy? Cartoons are fun, and since "a picture is worth a thousand words", why not make a cartoon about data science? With that goal in mind, I've set out to doodle on my iPad the elements that are required for building a machine learning model.

Machine Learning Algorithms For Beginners with Code Examples in Python


Machine learning (ML) is rapidly changing the world through diverse types of applications and research pursued in industry and academia. Machine learning is affecting every part of our daily lives, from voice assistants that use NLP and machine learning to make appointments, check our calendar, and play music, to programmatic advertisements that are so accurate that they can predict what we will need before we even think of it. More often than not, the complexity of the scientific field of machine learning can be overwhelming, making keeping up with "what is important" a very challenging task. However, we want to make sure that we provide a learning path for those who seek to learn machine learning but are new to these concepts.