PyData Global 2022

Mischief Managed: What hackers can do on your Jupyter instance
12-01, 18:00–18:30 (UTC), Talk Track I

Many Python data professionals work daily in JupyterLab or Notebook instances. What can a hacker do with access to that system? In this presentation, I will introduce the threat model and show why Jupyter instances are valuable targets. Next, I will demonstrate several post-exploitation activities that someone may try to perform on systems hosting Jupyter instances. We will conclude with some defensive strategies to minimize the likelihood and impact of these activities. This talk will help data scientists and information technology professionals better understand the perspective of potential attackers operating in Jupyter environments to improve defensive awareness and behavior.


As an offensive security researcher for artificial intelligence systems at NVIDIA, I regularly conduct offensive “red team” operations against data scientists and machine learning researchers. JupyterLab is a great development environment that enables rapid prototyping and collaboration with access to shared resources. However, it is also a tool whose security context is often poorly understood and managed. Instead of focusing on the initial “hack” or exploitation, I will demonstrate the mischief and damage that an attacker can cause after initial access (the “post-exploitation” phase). All of these demonstrations will focus on configurations and documented functionality. There will be no active exploitation of software vulnerabilities (nothing requiring responsible disclosure). This way, the presentation can use the “mischief” to propose defensive awareness strategies for attendees. Attendees should be familiar with Jupyter, Python, and client/server architecture, but will not need a security background. After the presentation, data scientists and information technology leaders should be more aware of the risks introduced by JupyterLab environments. The presentation is intended to increase security awareness and build good instincts, not to instill fear or motivate a shift away from JupyterLab.
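As a rough illustration of why configuration matters (this sketch is not taken from the talk), the snippet below flags a few Jupyter Server settings that widen exposure. The function name `audit_server_config` and the dictionary layout are assumptions made for this example; the option names `ServerApp.token`, `ServerApp.password`, `ServerApp.ip`, `ServerApp.allow_root`, and `ServerApp.allow_origin` are Jupyter Server settings, though newer releases may expose the authentication options under different classes.

```python
from typing import Any


def audit_server_config(config: dict[str, Any]) -> list[str]:
    """Return human-readable findings for a parsed Jupyter Server config.

    `config` is assumed to look like the contents of a
    jupyter_server_config.json file: {"ServerApp": {...}}.
    """
    server = config.get("ServerApp", {})
    findings: list[str] = []

    # An explicitly empty token with no password disables authentication.
    if server.get("token") == "" and not server.get("password"):
        findings.append("authentication disabled (empty token, no password)")

    # Binding to all interfaces exposes the server beyond localhost.
    if server.get("ip") in {"0.0.0.0", "*", "::"}:
        findings.append("server listens on all network interfaces")

    # Running as root means notebook code runs with root privileges.
    if server.get("allow_root"):
        findings.append("server is allowed to run as root")

    # A wildcard origin lets any website make cross-origin requests.
    if server.get("allow_origin") == "*":
        findings.append("cross-origin requests allowed from any origin")

    return findings
```

A check like this only covers a handful of settings; it is meant to show the kind of question a defender can ask of a configuration, not to serve as a complete audit.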

Some examples of demonstrated post-exploitation activities are documented in my blog here: https://josephtlucas.github.io/blog/content/jupyter.html and in this tweet: https://twitter.com/josephtlucas/status/1570158892163956737.

Timeline:
Minute 0-1: Introduction
Minute 2-4: Introduce JupyterLab and explore the various configurations (local, remote server, cloud offerings, etc.).
Minute 5-6: Brief introduction to attack phases (define “post-exploitation”). Since much of this presentation will be live demonstrations, orient the audience to the various terminal sessions (e.g., attacker’s screen vs. data scientist’s screen).
Minute 7-20: Demonstrations of various post-exploitation activities, including stealing user secrets, overwriting user variables, persistence mechanisms, tampering with history, poisoning imports, and others. Each demonstration will include references to relevant documentation. This functionality exists for benign purposes; attackers may simply use it in unintended ways.
Minute 20-25: Present defensive recommendations and mitigation strategies.
Minute 25-30: Questions.
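
To make the “poisoning imports” item in the timeline concrete, here is a minimal, harmless sketch (not the talk's own demonstration) of the documented behavior it relies on: Python imports the first matching module found on `sys.path`. The module name `helper_demo_mod` and the function `demo_import_shadowing` are hypothetical names invented for this example, and both modules live in temporary directories.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_import_shadowing() -> str:
    """Show that the earliest sys.path entry wins when module names collide."""
    name = "helper_demo_mod"  # hypothetical module name for illustration only
    with tempfile.TemporaryDirectory() as intended_dir, \
            tempfile.TemporaryDirectory() as shadow_dir:
        # Two modules with the same name but different contents.
        Path(intended_dir, f"{name}.py").write_text("ORIGIN = 'intended'\n")
        Path(shadow_dir, f"{name}.py").write_text("ORIGIN = 'shadowed'\n")

        original_path = list(sys.path)
        try:
            sys.path.insert(0, intended_dir)
            # Inserted last at index 0, so this directory is searched first.
            sys.path.insert(0, shadow_dir)
            importlib.invalidate_caches()
            module = importlib.import_module(name)
            return module.ORIGIN
        finally:
            # Restore the interpreter state so the demo leaves no trace.
            sys.path[:] = original_path
            sys.modules.pop(name, None)


if __name__ == "__main__":
    print(demo_import_shadowing())
```

The defensive takeaway is that anyone able to write to a directory early on `sys.path` (including the notebook's working directory) can change what an `import` statement loads, so checking a module's `__file__` attribute is a quick way to confirm where it actually came from.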


Prior Knowledge Expected

No previous knowledge expected

Joe loves the intersection of data science and security. He is currently an Offensive Security Researcher for Artificial Intelligence at NVIDIA and was previously a TPM on the AWS Red Team. He's a fan of the PyData ecosystem and has spoken at PyCon US and made small contributions to several open source libraries. He maintains a repository of machine learning security puzzles (HackThisAI) and recently worked with the AI Village to run a Capture-the-Flag competition at DEFCON30.