### ML.NET - Get started in 10 minutes

Notepad), and save it as iris-data.txt. When you paste the data, it will look like the following. Each row represents a different sample of an iris flower. From left to right, the columns represent sepal length, sepal width, petal length, petal width, and type of iris flower. If you're following along in Visual Studio, you'll need to configure iris-data.txt

Simpson’s paradox is the phenomenon in which the trend of an association in the whole population reverses within the subpopulations defined by a categorical variable. Detecting Simpson’s paradox reveals surprising and interesting patterns in a data set. The paradox is generally discussed in terms of binary variables, and studies that explore it for continuous variables are relatively rare. This paper describes a method to discover Simpson’s paradox in the trend between a pair of continuous variables. The correlation coefficient indicates the association between the pair, and categorical variables partition the whole data set into groups. The algorithm’s goal is to find sign reversals between the correlation coefficient measured within a group and the one measured on the entire data set. We show that our approach detects cases in real as well as synthetic data sets, and demonstrate that it can uncover hidden, surprising patterns by detecting occurrences of Simpson’s paradox. This paper also proposes an approach that exploits sampled data for early detection of Simpson’s paradox. We report the running time of the algorithm under combinations of different conditions.
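The sign-reversal check described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names `pearson` and `find_sign_reversals`, the use of Pearson correlation, the minimum group size of 3, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation, or NaN when it is undefined (tiny or constant input)."""
    if len(x) < 3 or np.std(x) == 0 or np.std(y) == 0:
        return float("nan")
    return float(np.corrcoef(x, y)[0, 1])


def find_sign_reversals(x, y, groups):
    """Return the overall correlation and the groups whose correlation has the opposite sign."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)

    overall = pearson(x, y)
    reversals = {}
    for g in np.unique(groups):
        mask = groups == g
        r = pearson(x[mask], y[mask])
        # A negative product means the group trend opposes the overall trend.
        if not np.isnan(r) and r * overall < 0:
            reversals[str(g)] = r
    return overall, reversals


# Synthetic data (hypothetical): y falls with x inside every group,
# but the group centres rise together, so the pooled trend is positive.
t = np.linspace(-1.0, 1.0, 20)
x = np.concatenate([3 * k + t for k in range(3)])
y = np.concatenate([3 * k - t for k in range(3)])
groups = np.repeat(["a", "b", "c"], 20)

overall, reversals = find_sign_reversals(x, y, groups)
print("overall r:", overall)
print("groups with reversed sign:", reversals)
```

In this toy data set all three groups are reported, because each has a negative within-group correlation while the pooled correlation is positive. A real detector would likely also test whether each coefficient is statistically significant before reporting a reversal.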

### Clustering in Power BI using R

Here, I've used the famous Iris flower dataset to show clustering in Power BI using R. I've used the K-means clustering method to show the different species of Iris flower. About the dataset: the Iris dataset has 5 attributes (Sepal length, Sepal width, Petal length, Petal width, Species). The 3 species are Setosa, Versicolor and Virginica. It is observed that Petal Length and Petal Width are similar within each Species, hence I have used Petal Length for the x axis and Petal Width for the y axis of the plot. K-means Clustering: K-means is a non-hierarchical, iterative clustering technique. In this technique we start by randomly assigning the data points to clusters.
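To make the iteration concrete, here is a minimal sketch of K-means with a random initial assignment, written in Python with NumPy rather than the R script used in Power BI. The function names, the re-seeding of empty clusters, and the six (petal length, petal width) points are hypothetical and chosen only for illustration. After the random start, it follows the standard K-means loop: recompute each cluster's centroid, reassign every point to its nearest centroid, and stop when the assignment no longer changes.

```python
import numpy as np


def _centroids(points, labels, k, rng):
    """Mean of each cluster; an empty cluster is re-seeded from a random point (assumption)."""
    return np.array([
        points[labels == j].mean(axis=0) if np.any(labels == j)
        else points[rng.integers(len(points))]
        for j in range(k)
    ])


def kmeans_random_partition(points, k, n_iter=100, seed=0):
    """K-means that starts by randomly assigning the points to k clusters."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)

    # Random initial assignment, shuffled so every cluster starts non-empty.
    labels = rng.permutation(np.arange(len(points)) % k)

    for _ in range(n_iter):
        centroids = _centroids(points, labels, k, rng)
        # Distance from every point to every centroid, then pick the nearest.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels

    return labels, _centroids(points, labels, k, rng)


# Hypothetical (petal length, petal width) measurements: three small and three large flowers.
petals = [[1.4, 0.2], [1.3, 0.2], [1.5, 0.3],
          [4.7, 1.4], [4.5, 1.5], [5.0, 1.7]]

labels, centroids = kmeans_random_partition(petals, k=2)
print(labels)
print(centroids)
```

The cluster numbers themselves are arbitrary: a different seed may swap which group is labelled 0 and which is labelled 1, so only the grouping is meaningful.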