According to Weisensee et al., data warehouse architecture follows these principles: the ETL (extract, transform, load) process is the foundation of BI, and the success or failure of BI projects depends on it. ETL plays a vital role in integrating data and enhancing its value. After the data are extracted, cleansed, and arranged, they are loaded into the data warehouse. In short, ETL is the process of transferring data from the data sources to the target data warehouse.
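The extract, cleanse, and load sequence described above can be illustrated with a minimal sketch. It assumes a small inline CSV string as the data source and an in-memory SQLite database standing in for the warehouse; the table name `orders`, the sample rows, and the function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from files,
# APIs, or operational databases instead of an inline string.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
3,carol,7.25
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, drop rows with no amount, remove duplicate ids."""
    seen = set()
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # cleansing: skip incomplete records
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # cleansing: skip duplicate records
        seen.add(order_id)
        cleaned.append((order_id, row["customer"].strip().lower(), float(amount)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order: extract, transform, load."""
    load(transform(extract(text)), conn)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    run_etl(RAW_CSV, warehouse)
    for row in warehouse.execute("SELECT * FROM orders ORDER BY order_id"):
        print(row)
```

Keeping the three stages as separate functions mirrors the order in the text: only data that survive cleansing reach the target table.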
Data mining is the process of extracting useful information from an accumulation of data, often from a data warehouse or collection of linked datasets. Data mining tools include powerful statistical, mathematical, and analytics capabilities whose primary purpose is to sift through large sets of data to identify trends, patterns, and relationships to support informed decision-making and planning. Often associated with marketing department inquiries, data mining is seen by many executives as a way to help them better understand demand and to see the effect that changes in products, pricing, or promotion have on sales. But data mining has considerable benefit for other business areas as well. Engineers and designers can analyze the effectiveness of product changes and look for possible causes of product success or failure related to how, when, and where products are used.
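As one simple illustration of identifying a relationship such as the effect of pricing on sales, the sketch below computes a Pearson correlation coefficient between price and units sold. The figures are invented for the example, and correlation is only one of many techniques a data mining tool might apply; it is not presented as the method any particular tool uses.

```python
import math

# Hypothetical observations: unit price and units sold in five periods.
PRICES = [10.0, 12.0, 14.0, 16.0, 18.0]
UNITS_SOLD = [100.0, 90.0, 85.0, 70.0, 60.0]


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    r = pearson(PRICES, UNITS_SOLD)
    print(f"price vs. units sold: r = {r:.3f}")
```

A coefficient close to -1 in this made-up data would suggest that higher prices coincide with lower sales, which is the kind of pattern an analyst would then investigate further before drawing conclusions.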
These minor data entries can be very important at times (crime investigations, product returns, etc.). Data mining has many definitions. One is the extraction of implicit, previously unknown, and potentially useful information from data. Another is the exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns.
The figure below shows the overall Big Data analytics architecture framework. MapReduce and Spark provide the large-scale data processing capabilities for different types of analytics. For example, descriptive analytics uses MapReduce to filter and summarize a large amount of data. Similarly, predictive analytics techniques employ MapReduce to process data from data warehouses. Before a data analytics process begins, the relevant data are collected from a variety of sources (stage 1).
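The filter-and-summarize pattern mentioned for descriptive analytics can be sketched in plain Python. This is a single-process imitation of the map, shuffle, and reduce steps, not Hadoop or Spark code; the sales records and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records; a real job would read these from distributed storage.
RECORDS = [
    {"region": "north", "product": "widget", "amount": 120.0},
    {"region": "south", "product": "widget", "amount": 80.0},
    {"region": "north", "product": "gadget", "amount": 40.0},
    {"region": "south", "product": "gadget", "amount": 0.0},
    {"region": "north", "product": "widget", "amount": 60.0},
]


def map_phase(record):
    """Map: filter out zero-amount records and emit (region, amount) pairs."""
    if record["amount"] > 0:
        yield record["region"], record["amount"]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: summarize the values of one key as count, total, and mean."""
    total = sum(values)
    return key, {"count": len(values), "total": total, "mean": total / len(values)}


def run_job(records):
    """Chain map, shuffle, and reduce over the input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    for region, stats in sorted(run_job(RECORDS).items()):
        print(region, stats)
```

In a real cluster the map and reduce functions would run in parallel on many nodes and the framework would perform the shuffle; the sketch only shows how filtering fits in the map step and summarizing in the reduce step.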