PyData Global 2022

Lessons Learned Building Our Own Dashboard Solution Using Open-Source Technologies
12-02, 12:30–13:00 (UTC), Talk Track I

Most organisations have implemented some kind of dashboard to monitor their data, processes, or business. However, many dashboard solutions come with a caveat: licensing costs, a lack of transparency in the workflows, limited creativity, or an inability to connect to existing infrastructure.
This talk is aimed at Data Scientists, Data Engineers, Data Practitioners and Managers struggling to choose between a myriad of commercial dashboard solutions and DIY. We present how to create your own dashboard using open-source Python technologies like FastAPI, SQLAlchemy, and Celery, along with the challenges involved. We look back at the pitfalls we encountered and the solutions we have worked on over the past three years. The goal is not to present our unique solution, but to show how different Python libraries can be combined to implement custom solutions for different use cases. Attendees should be familiar with the basic concepts of web infrastructure. Previous knowledge of any of the libraries is not required. We hope to provide a starting point for building your custom dashboard solution using open-source tooling.


In this presentation, we showcase the difficulties and challenges in creating a dashboard solution, based on three real-world examples that offer an alternative to standard commercial dashboard products. We start our talk with an introduction to the problem that we tried to solve with the dashboard. We introduce custom dashboard solutions and offer attendees ways to avoid our pitfalls. We structure our presentation around three broader topics based on the use case.

First, we discuss web frameworks and why we decided to work with FastAPI. We provide a short overview of different web frameworks and raise some questions that should be considered when starting a new project. Second, we explain how to link possible data sources and what we learned about using different object-relational mapping (ORM) libraries, namely Orator ORM and SQLAlchemy. Third, we analyze various performance issues and the use of task queues to overcome them. Finally, we provide a small peek into our dashboard solution and hope to start a short discussion and answer questions.

In minutes 0-5, we introduce the problem that we tried to solve. In minutes 5-10, we present how to find a web framework that matches your requirements. In minutes 10-15, we discuss how to link data sources (relational databases) to your dashboard. In minutes 15-20, we demonstrate how to use task queues to parallelize calculation tasks. In the next 2 minutes, we wrap up the topics discussed and provide a short overview of the resulting system. The remaining time is used for Q&A and, hopefully, a discussion on tools to build your own custom, interactive, open-source reporting solution.


Prior Knowledge Expected

No previous knowledge expected

Jan Dix is co-founder and Head of Software Development at &effect. His technical interests cut across Software Engineering, Data Science, and Visualization. At &effect, he helps organizations in the public and social sector to make effective decisions. In his daily work, Jan Dix overwhelmingly uses and enjoys open-source technologies.

He holds a Master in Social and Economic Data Analysis (University of Konstanz) and a Master in Global Studies (Göteborgs Universitet). Since 2015, he has been volunteering at CorrelAid and has supported numerous non-profit organizations in implementing Data Science projects, given talks and workshops, and has been involved in the mentoring program.

Zornitsa Manolova leads the Data Quality Management and Data Science team at the Global Legal Entity Identifier Foundation (GLEIF). Since April 2018, she has been responsible for enhancing and improving the established data quality and data governance framework by introducing innovative data analytics approaches. Previously, Zornitsa managed forensic data analytics projects on international financial investigations at PwC Forensics. She holds a German Diploma in Computer Sciences with a focus on Machine Learning from the Philipps University in Marburg.

Dominik is a Data Scientist at the Global Legal Entity Identifier Foundation (GLEIF). His professional focus is on achieving the highest possible data quality in the Global LEI System. In this context, he contributes to establishing best practices based on the most recent developments in the world of data standardization and analytics. Before joining GLEIF, Dominik gained experience in business development and forensic technology solutions and graduated from the Johannes Gutenberg University of Mainz with a master’s degree in Computational Sciences.

Camille Koenders is a Software Developer at &effect. Together with Jan Dix, she programmed the dashboard presented in this talk. She holds a degree in Molecular Biotechnology from the University of Heidelberg and a degree in Computer Science from the Technical University of Berlin. Before working at &effect, she was able to gather several years of programming experience in different companies, such as SAP. She volunteers at CorrelAid, where she is one of the hosts of the data science podcast CorrelTalk.