PyData Global 2022

Running Apache Airflow at Scale
12-01, 20:00–20:30 (UTC), Talk Track II

Apache Airflow is a foundational component of data platform orchestration at Shopify. In this talk, we'll dive into the many performance and reliability challenges we’ve encountered running Airflow at Shopify’s scale, our custom tooling, and the new multi-instance architecture we rolled out.


Along the way we'll share our tips and lessons learned so you can run Airflow at scale, too.


Prior Knowledge Expected

No previous knowledge expected

JM (Jean-Martin) is a Staff Data Engineer at Shopify and part of the Data Foundations team, which provides the primitives (compute, query, orchestration, etc.) leveraged by the data science and analytics teams. Before Shopify, he developed data-intensive applications in the supply chain, financial, and energy software industries. He is an avid cyclist living in Victoria, BC, Canada with his wife, relatively young son, and two cats who believe they are dogs.

Michael is a Data Engineer at Shopify. He currently works on the team that provides intuitive access to the computing and orchestration building blocks for applications and platforms that transform and query data across Shopify.
