A Beginner's Guide to Data Engineering – Part II

In A Beginner's Guide to Data Engineering -- Part I, I explained that an organization's analytics capability is built layer upon layer. From collecting raw data and building data warehouses to applying Machine Learning, we saw why data engineering plays a critical role in each of these areas. One of the most sought-after skills for any data engineer is the ability to design, build, and maintain data warehouses. I defined what data warehousing is and discussed its three common building blocks: Extract, Transform, and Load, from which the name ETL comes. For those who are new to ETL processes, I introduced a few popular open source frameworks built by companies like LinkedIn, Pinterest, and Spotify, and highlighted Airbnb's own open-sourced tool, Airflow.