PyData Global 2022

HuggingFace + Ray AIR Integration: A Python developer’s guide to scaling Transformers
12-03, 17:30–18:00 (UTC), Talk Track II

Hugging Face Transformers is a popular open-source project that provides cutting-edge machine learning (ML) models, but meeting the computational requirements of its advanced models often requires scaling beyond a single machine. In this session, we explore the integration between Hugging Face and Ray AI Runtime (AIR), which lets users scale their model training and data loading seamlessly. We will dive deep into the implementation and API and explore how to use Ray AIR to create an end-to-end Hugging Face workflow, from data ingest through fine-tuning and hyperparameter optimization (HPO) to inference and serving.


Fine-tuning and training Transformers models can demand significant compute and memory. To address this, the Ray team has developed a Hugging Face integration for Ray AI Runtime (AIR) that lets Transformers model training be parallelized easily across multiple CPUs or GPUs in a Ray cluster. This saves time and money, and the common API lets users take advantage of the rich Ray ML ecosystem.

In this session, we walk through that integration's implementation and API and show how to use Ray AIR to build an end-to-end Hugging Face workflow, from data ingest through fine-tuning and HPO to inference and serving.

Key Takeaways:
Learn how Python developers and machine learning engineers can use Transformers and scale their language models.
Get exposed to Ray AIR’s Python APIs for end-to-end Hugging Face and ML workflows.
Understand how Ray AIR, built atop Ray, can scale your Python-based ML workloads.


Prior Knowledge Expected

No previous knowledge expected

Antoni Baum is a software engineer at Anyscale, working on Ray AIR, XGBoost-Ray, and other ML libraries. In his spare time, he contributes to various open source projects, trying to make machine learning more accessible and approachable.