Xinrong Meng
Xinrong Meng is a software engineer at Databricks and an Apache Spark committer, focusing on PySpark. She is one of the major contributors to Pandas API on Spark.
Sessions
12-02
20:00
30min
Scale Data Science by Pandas API on Spark
Xinrong Meng, Takuya Ueshin
With Python emerging as the primary language for data science, pandas has grown rapidly to become one of the standard data science libraries. One known limitation of pandas is that it does not scale linearly with data volume, because it processes data on a single machine.
Pandas API on Spark overcomes this limitation, enabling users to work with large datasets by leveraging Apache Spark. In this talk, we will introduce Pandas API on Spark and show how to use it to scale your existing data science workloads. We will also share the cutting-edge features in Pandas API on Spark.
Talk Track I