Scalable Select of Random Rows in SQL

@machinelearnbot 

If you're new to the big data world and also migrating from tools like Google Analytics or Mixpanel for your web analytics, you probably noticed performance differences. Google Analytics can show you predefined reports in seconds, while the same query for the same data in your data warehouse can take several minutes or even more. Such performance boosts are achieved by selecting random rows or the sampling technique. Let's learn how to select random rows in SQL. Before we start to work on sampling implementation, it is worth mentioning some sampling fundamentals. Sampling is based on a subset selection of individuals from some population to describe this population's properties.