Takuya Ueshin
Takuya Ueshin is a software engineer at Databricks and an Apache Spark committer and PMC member. His main interests are Spark SQL internals and PySpark. He is one of the major contributors to pandas API on Spark, a.k.a. the Koalas project.
Sessions
With Python emerging as the primary language for data science, pandas has grown rapidly to become one of the standard data science libraries. One known limitation of pandas is that it does not scale linearly with data volume, because it processes data on a single machine.
Pandas API on Spark overcomes this limitation, enabling users to work with large datasets by leveraging Apache Spark. In this talk, we will introduce Pandas API on Spark and show how to use it to scale your existing data science workloads. We will also share the cutting-edge features in Pandas API on Spark.