PyData Global 2022

Elijah ben Izzy

Elijah has always enjoyed working at the intersection of math and engineering. More recently, he has focused his career on building tools to make data scientists more productive. At Two Sigma, he built infrastructure to help quantitative researchers efficiently turn ideas into production trading models. At Stitch Fix, he leads the Model Lifecycle team, which focuses on streamlining the experience for data scientists to create and ship machine learning models. In his spare time, he enjoys geeking out about fractals, poring over antique maps, and playing jazz piano.

Sessions

12-03
14:00
30min
Scalable Feature Engineering with Hamilton
Elijah ben Izzy, Stefan Krawczyk

In this talk, we present Hamilton, a novel open-source framework for developing and maintaining scalable feature engineering dataflows. Hamilton was initially built to solve the problem of managing a codebase of transforms on pandas dataframes, enabling a data science team to scale the capabilities they offer with the complexity of their business. Since then, it has grown into a general-purpose tool for writing and maintaining dataflows in Python. We introduce the framework, discuss its motivations and initial successes at Stitch Fix, and share recent extensions that seamlessly integrate it with distributed compute offerings such as Dask, Ray, and Spark.

Talk Track I