Data science versus statistics, to solve problems: case study
In this article, I compare two approaches, with their advantages and drawbacks, to computing a simple metric: the number of unique visitors ("uniques") per year for a website. I use the words user and visitor interchangeably. The problem seems straightforward at first glance, but it is not. It is a complex big data problem because the naive approach involves sorting hundreds of billions of observations (called transactions or page views here). It is further complicated because there is no fully reliable way to identify and track a user over long time periods: cookies and IP address / browser combinations both have drawbacks.
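To make the naive approach concrete, here is a minimal sketch of sort-based counting on a toy data set. The function name `uniques_per_year`, the `(visitor_id, year)` record layout, and the sample data are all assumptions made for illustration, not taken from the article. The sketch also assumes each visitor ID is reliable, which is not guaranteed with cookies or IP address / browser combinations.

```python
from itertools import groupby


def uniques_per_year(page_views):
    """Count distinct visitor IDs per year by sorting, then collapsing duplicates.

    page_views: iterable of (visitor_id, year) pairs (hypothetical layout).
    """
    # Sorting by (year, visitor_id) places repeated visits next to each other.
    ordered = sorted((year, visitor_id) for visitor_id, year in page_views)
    counts = {}
    for year, group in groupby(ordered, key=lambda pair: pair[0]):
        # Within one year, each run of identical visitor IDs counts once.
        counts[year] = sum(1 for _ in groupby(group, key=lambda pair: pair[1]))
    return counts


# Toy data: six page views from three visitors across two years.
sample = [
    ("a", 2015), ("b", 2015), ("a", 2015),
    ("a", 2016), ("c", 2016), ("c", 2016),
]
print(uniques_per_year(sample))
```

The sort is the expensive step: it needs every page view in memory (or an external sort on disk), which is what makes this approach hard to apply to hundreds of billions of observations.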
Mar-28-2016, 19:51:18 GMT