PyData Global 2022

Lightning Talks
12-01, 20:30–22:00 (UTC), Talk Track II

Lightning Talks are short 5-10 minute sessions presented by community members on a variety of interesting topics.


Order of Presentations
1. OpenTeams Score: A Way to Assess Open Source Projects by Fatma Tarlaci, Brian Skinn, Dale Tovar
2. k-NN on steroids - an introduction to approximate nearest neighbours by Kacper Łukawski
3. Introducing Meadowrun: write code locally, run on the cloud by Kurt Schelfthout, Richard Lee
4. It might look normal but this distribution will ruin your stats by Allan Campopiano
5. Everything That You Wanted To Know About P-Values But Were Afraid To Ask by Eyal Kazin
6. Quokka: Rewriting SparkSQL in Python by Ziheng Wang
7. Lessons learned from using the C Foreign Function Interface to integrate neural networks into Earth system models by Caroline Arnold


Prior Knowledge Expected

No previous knowledge expected

Brian Skinn (@btskinn) has always been a programmer (TI-82, represent!), but took a long arc through chemical engineering (B.S., Ph.D., and ten years in industrial electrochemical engineering R&D) before joining OpenTeams Incubator as a technology marketer in May 2022. Along the way, he learned a lot of VBA and a little MATLAB, Maple, and Java, before discovering Python in 2014 and never looking back. He maintains a couple of open source Python libraries, occasionally posts on his blog (https://bskinn.github.io), and reads SF/F & makes enthusiastically amateur music in his remaining spare time.

Kacper Łukawski is a Developer Advocate at Qdrant - an open-source neural search engine. His broad experience is mostly related to data engineering, machine learning, and software design. He has been actively contributing to the discussion on Artificial Intelligence by conducting lectures and workshops locally and internationally.

Kurt has worked in a variety of application domains, from logistics to, most recently, finance. He's currently working on Meadowrun (https://meadowrun.io) to make cloud compute more easily accessible for data scientists and data engineers.

Kurt writes software engineering and computing science deep dives at Get Code (https://getcode.substack.com), and hot takes on Twitter as @kurt2001 (https://twitter.com/kurt2001) or Mastodon as @kurts (https://mastodon.online/@kurts).

As a data scientist at Deepnote, I have the privilege of partnering with developers all over the world in order to help them promote their tools to the broader scientific community. By demonstrating the leading data science tools in Deepnote, scientists and developers can easily onboard to new concepts and techniques.

My degree in cognitive and behavioural neuroscience helped me realize my dual passion for (1) developing scientific software and (2) communicating technical concepts in a straightforward manner. My main goal is to find creative ways to lower the barrier-to-entry for scientists who are learning new tools.

To this end, I've published two peer-reviewed statistical software libraries. The most notable is Hypothesize—a Python library for robust statistics based on Rand Wilcox's R package. I continue to deliver workshops on robust statistics, data visualization, and data science tooling in general.

Ex-cosmologist turned data scientist with over 15 years of experience in solving challenging problems. I am motivated by intellectual challenges, highly detail-oriented, and love visualising data results to communicate insights for better decisions within organisations.

My main drive as a data scientist is applying scientific approaches that result in practical and clear solutions. To accomplish this, I use whatever works, be it statistical/causal inference, machine/deep learning or optimisation algorithms. Being results-driven, I have a passion for quantifying and communicating the impact of interventions to non-specialist audiences in an accessible manner.

My claim to fame is living on four different continents within a single decade (2004-2014), including in three tennis Grand Slam cities (NYC, Melbourne, London).

Tony got his BS and MS from a cold dark place called MIT, where he spent years learning the archaic language Verilog. After graduating, he briefly ran a startup writing assembly to speed up machine learning model inference. But after discovering that most of his prospective customers spent more time loading inputs from a database than actually running the models, he decided to pursue a PhD at Stanford on big data processing.

Caroline is a research data scientist with the German Climate Computing Centre DKRZ.