Big Data: Planning for Peak Season -- Part 2: Proactive Data Pruning


Every big data application eventually has to archive or purge old or stale data. Big data storage tends to grow up (more records) and out (more tables, more data elements) over time, and the combined effect of these growth patterns complicates capacity planning and slows queries. Business analysts' reactions to query results drive growth as well. As the number of successful and profitable queries increases, two things happen: analysts want to re-run those queries against larger volumes of data and over longer time periods, and the results usually suggest additional queries to execute. The bottom line is that the universe of analytical queries grows along with data volume, and both put pressure on query performance.
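To make the "up and out" effect concrete, here is a minimal sketch of how the two growth patterns compound. The function name, the starting figures, and the growth rates are all hypothetical, chosen only for illustration; real capacity planning would use measured values.

```python
def projected_storage_gb(rows, avg_row_bytes, row_growth, width_growth, years):
    """Project table size per year when both row count ("up") and
    average row width ("out") grow by a fixed yearly rate.

    All inputs are illustrative assumptions, not measurements.
    Returns a list of sizes in GB for year 0 through `years`.
    """
    sizes = []
    for _ in range(years + 1):
        sizes.append(rows * avg_row_bytes / 1e9)
        rows *= 1 + row_growth             # more records
        avg_row_bytes *= 1 + width_growth  # more data elements per record
    return sizes


# Hypothetical table: 1 billion rows of 200 bytes each,
# rows growing 50% per year and row width growing 10% per year.
sizes = projected_storage_gb(1_000_000_000, 200, 0.5, 0.1, 2)
for year, size in enumerate(sizes):
    print(f"year {year}: {size:.1f} GB")
```

Under these assumed rates the table grows by a factor of 1.5 × 1.1 = 1.65 each year, or about 2.7 times its starting size after two years, which is more than either growth pattern would produce alone.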