PyData Global 2022

Modern Analytics in the Cloud - A case for fraud detection
12-03, 17:00–17:30 (UTC), Talk Track II

Companies large and small are increasingly interested in moving their data and analytical pipelines into the Cloud, since doing so can bring substantial cost and operational benefits. Despite this, it is often unclear how cloud services can be used to replicate existing analytical solutions in the Cloud, or how those services fit together to build new ones.
The goal of this talk is to help answer these two questions: first by explaining what modern analytics looks like in cloud environments, and then by presenting a live use case for building an end-to-end analytical solution in the context of fraud detection for E-commerce businesses.

This talk assumes some knowledge of the Hadoop ecosystem and the main tools used, such as Airflow, Kafka, Spark, etc. (an overall idea will be more than sufficient), as well as some experience with building and deploying machine learning models (some MLOps experience). The target audience is therefore data scientists/engineers with 4-5 years of experience working in analytics, and/or architects who are looking to move their analytics solutions to the Cloud but are still unsure how the pieces fit together.

By the end of the talk, the audience will have a clear understanding of how modern analytics can be performed in the cloud and what a typical modern data architecture looks like. In the context of AWS, the audience will also understand the AWS analytics service offerings and which services can be used for, or tailored to, their needs. Finally, the audience will gain a clearer idea of how they can leverage ML capabilities to build a full pipeline in the cloud while cutting their development time by half.

The proposed outline for the talk is as follows:

The evolution of analytics from the 90s to current day (2-3 mins)
Modern analytics in the Cloud - what’s available (4-5 mins)
How analytics is done in the Cloud - tools to help manage the cloud solutions (5 mins)
Case study - Fraud Detection for E-commerce (2-3 mins)
Refresher concepts (3 mins)
Breaking down the architecture (6-7 mins)
Scaling and improving the solution (5-6 mins)


The goal of the first half of the talk is to give the audience a solid understanding of what analytics looks like in the Cloud (specifically AWS). We'll go over the analytical solutions available and the use case for each one, to give you a better idea of how each can be used. The goal of the second half of the talk is to walk through a live case study, based on previous work, of building a model to detect fraudulent transactions for an E-commerce business. The architecture that was built will be explained, and possible improvements to it will be discussed.


Prior Knowledge Expected

Previous knowledge expected

Hello, I'm a 3x AWS-certified data analytics/ML specialist focusing on building end-to-end analytical solutions in the AWS Cloud for small and medium-sized businesses. I primarily work with clients to help them build out data architectures that are scalable, reliable and efficient, and then help them explore and build the additional analytics capabilities and data-driven solutions they need to make better business decisions or better serve their customers.
I have worked with clients across multiple industries such as e-commerce, digital marketing, politics and NGOs.
Outside of work, I play sports twice a day, and I am a professional diver who loves travelling around the world chasing sharks and dolphins while learning a word or two in different languages.