PyData Global 2022

Machine Learning Frameworks Interoperability
12-02, 12:00–12:30 (UTC), Talk Track I

To build mature data science, machine learning, and deep learning applications, one must develop a large number of pipeline components, such as data loading, feature extraction, and, frequently, a multitude of machine learning models.


The complexity of those components frequently requires a broad range of software tools, creating challenges during pipeline integration. We'll discuss zero-copy functionality across several GPU-accelerated and non-GPU-accelerated data science frameworks, including, among others, PyTorch, TensorFlow, Pandas, scikit-learn, RAPIDS, CuPy, Numba, and JAX. Zero-copy avoids unnecessary data transfers, drastically reducing the execution time of your application. We'll also address the memory layouts of the associated data objects in various frameworks, the efficient conversion of data objects using zero-copy, and the use of a joint memory pool when mixing frameworks.
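As a taste of the techniques the talk covers, the sketch below passes a GPU array between CuPy and PyTorch via the DLPack protocol and routes CuPy allocations through a shared RAPIDS Memory Manager (RMM) pool. This is a minimal sketch, assuming a CUDA-capable GPU with recent versions of CuPy, PyTorch, and RMM installed; note that the RMM allocator entry point has moved between releases.

```python
import cupy as cp
import rmm
import torch

# Let CuPy allocate from a shared RMM memory pool, so CuPy and the
# RAPIDS libraries draw from the same pool instead of competing for
# raw CUDA allocations.
rmm.reinitialize(pool_allocator=True)
cp.cuda.set_allocator(rmm.rmm_cupy_allocator)

# Allocate an array on the GPU with CuPy.
x_cp = cp.arange(10, dtype=cp.float32)

# Hand the buffer to PyTorch via DLPack -- no copy is made; both
# objects now view the same GPU memory.
x_torch = torch.from_dlpack(x_cp)

# A write through one framework is visible through the other,
# confirming the conversion was zero-copy.
x_torch[0] = 42.0
assert float(x_cp[0]) == 42.0
```

The same handoff works in the other direction (e.g., `cp.from_dlpack` on a PyTorch tensor), and related protocols such as `__cuda_array_interface__` enable similar zero-copy exchange with Numba, RAPIDS, and JAX.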


Prior Knowledge Expected

No previous knowledge expected

Christian is a theoretical physicist by training and holds a PhD in computer science, with a passion for group theory, differential geometry, and massively parallel computing. In his current role as a manager for AI Developer Technology, he leads a team of dynamic and gifted engineers who optimize computational primitives for deep learning and scale out end-to-end pipelines for a broad variety of scientific domains.

Miguel Martínez is a senior deep learning data scientist at NVIDIA, concentrating on recommender systems, NLP, and data analytics. Previously, he mentored students in Udacity's Artificial Intelligence Nanodegree. He has a strong background in financial services, mainly focused on payments and channels. As a constant and steadfast learner, Miguel is always up for new challenges.
