Migrating from AWS Glue to BigQuery for ETL
Our journey with AWS Glue became a struggle once we dug deeper into its streaming functionality. Orchestrating so many layers added a huge overhead that we weren't expecting, and whilst most of that is handled within the AWS suite of products, there are just too many benefits to switching our pipelines over to GCP and BigQuery to ignore. Our next steps are to finalise the deployment by using Cloud Composer (Airflow) to orchestrate the creation of each of the tables and to provide a monitoring dashboard that helps us detect failures and act on them. I will say that AWS got in touch with me after my previous article, and I got on a call with the AWS Glue product team. In their words, I had "hit pretty much every sharp edge possible" (this seems to be a running theme with me; perhaps I should switch careers to QA engineer?).
Sep-15-2021, 08:00:34 GMT