Running Jupyter Lab behind NGINX: Part 2

If you haven’t read part 1, you may want to start there.

In the last post, we left off with a working reverse proxy, but we couldn’t access Jupyter Lab because of its auth enforcement. Because of how we’re setting this up, we will handle authentication upstream of Jupyter Lab, and we don’t want to rely on Jupyter Lab itself for it. What we are going to do here is generally considered “unsafe.”
Again, if you’re looking to do this for your team, check out JupyterHub; it probably makes more sense for your use case.

To disable token auth, we will update our Jupyter Lab config.

There is an extensive config file for Jupyter Lab. In a production environment, I recommend using it (you can generate a sample file by running jupyter lab --generate-config). But for this toy example, we will pass our config as command-line arguments. To disable token auth and to accept requests from any origin, we’re going to update the ENTRYPOINT in our Jupyter Lab Dockerfile to include these arguments:

"--ServerApp.token=", "--ServerApp.password=", "--ServerApp.allow_origin", "*"

Our Dockerfile should now look like

FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
      python3-pip \
      python3-dev && \
      python3 -m pip install jupyterlab

RUN useradd -ms /bin/bash jupyter
EXPOSE 8888
ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0", "--port", "8888", "--allow-root", "--ServerApp.token=", "--ServerApp.password=", "--ServerApp.allow_origin", "*"]
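If we did want to move these flags into a config file instead, the same three settings would be written as plain Python assignments. Here is a minimal sketch that renders those lines from a dictionary. The names render_config and SETTINGS are made up for illustration, and I’m assuming the output would be saved as something like jupyter_server_config.py in Jupyter’s config directory.

```python
# Sketch: render the ENTRYPOINT flags as config-file lines.
# `render_config` and `SETTINGS` are hypothetical names, not part of Jupyter.

SETTINGS = {
    "ServerApp.token": "",          # disable token auth
    "ServerApp.password": "",       # disable password auth
    "ServerApp.allow_origin": "*",  # accept requests from any origin
}


def render_config(settings):
    """Return the text of a Jupyter config file for the given settings."""
    # Jupyter injects get_config() when it loads a config file.
    lines = ["c = get_config()  # provided by Jupyter when it loads this file"]
    for trait, value in settings.items():
        lines.append(f"c.{trait} = {value!r}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_config(SETTINGS), end="")
```

Running it prints the lines to paste into the config file, one assignment per flag.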

And if we rebuild and start our docker compose again

docker compose build && docker compose up

We now get through to Jupyter!

But if we try to open the Python kernel, we’ll notice it’s having trouble connecting.

Opening our browser dev tools shows that there is an issue with how our proxy is handling WebSockets.

We’ll have to update our Nginx config to address this.
We will add the WebSocket lines below to the / location in our server block so that the upgrade headers are set properly.

...
  server {
    listen       8000;
    server_name  localhost;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_hide_header "X-Frame-Options";
        proxy_pass http://upstream_jupyter;
        
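        # websocket support: HTTP/1.1 is required for the Upgrade handshake
        proxy_http_version 1.1;
        # pass the client's Upgrade header through to Jupyter
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        # keep idle kernel connections open for up to 24 hours
        proxy_read_timeout 86400;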
    }
  }
...

And now, if we restart our containers using the updated config, we’ll see our kernel connects!
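To see why those headers matter: a WebSocket connection starts out as an ordinary HTTP/1.1 GET request that carries Upgrade and Connection headers, and both are hop-by-hop headers, so a proxy does not forward them unless we tell it to. The sketch below uses only the Python standard library to build such an opening request and to compute the Sec-WebSocket-Accept value a server answers with, following RFC 6455. The function names are made up, and the kernel path is only an illustrative placeholder.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the accept-key computation.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key):
    """Value the server returns in Sec-WebSocket-Accept for a client key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_request(host, path, client_key):
    """Build the opening HTTP/1.1 request of a WebSocket connection."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",    # hop-by-hop: the proxy must re-send it
        "Connection: Upgrade",   # hop-by-hop: the proxy must re-send it
        f"Sec-WebSocket-Key: {client_key}",
        "Sec-WebSocket-Version: 13",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample key from RFC 6455
    # Placeholder path; a real kernel id would replace "example".
    print(handshake_request("localhost:8000", "/api/kernels/example/channels", key))
    print(accept_key(key))
```

If the Upgrade and Connection lines never reach Jupyter, it sees a plain GET and the kernel channel is never established, which matches the connection trouble we saw before the config change.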


If you’re wondering how we will handle security when we’re basically giving whoever uses this a terminal into our cloud, the answer is to use AWS to isolate the instance via IAM roles and policies. We aren’t going to get into that much in this post, but it is a valid concern. There isn’t much we can do to prevent privilege escalation or a container escape by a sophisticated user, but we can at least avoid giving them root access.

We’re going to update our Jupyter Dockerfile to have a new user, ‘jupyter’, and we’ll run Jupyter Lab as this user.

...
RUN useradd -ms /bin/bash jupyter
USER jupyter
...

We’re also going to update our ENTRYPOINT so that the Jupyter Lab root directory is set to the jupyter user’s home directory:

"--ServerApp.root_dir", "/home/jupyter", "--ServerApp.notebook_dir", "/home/jupyter"

Our Dockerfile should now look like

FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
      python3-pip \
      python3-dev && \
      python3 -m pip install jupyterlab

# add user and switch to them
RUN useradd -ms /bin/bash jupyter
USER jupyter

EXPOSE 8888
ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0", "--port", "8888", "--ServerApp.token=", "--ServerApp.password=", "--ServerApp.allow_origin", "*", "--ServerApp.root_dir", "/home/jupyter", "--ServerApp.notebook_dir", "/home/jupyter"]

If we reload our site, we’ll see that the working directory is now set to /home/jupyter, and if we try to write to /, we’ll get a permissions error. It’s important to note that while this makes it a little more difficult for a malicious user to take over this ‘instance’, we are still giving them access to the internet, the ability to download and install packages, the ability to execute code, and so on. It would not be too difficult for someone with malicious intent to get around this. Changing the user and working directory does more to keep an innocent user from accidentally breaking something.
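One quick way to confirm the permissions from a notebook cell is to try creating a file in each location. This is just a sketch: can_write is a made-up helper name, and the two paths assume the jupyter user and home directory from the Dockerfile above.

```python
import tempfile


def can_write(directory):
    """Return True if the current user can create a file in `directory`."""
    try:
        # Create a temporary file and discard it right away.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers PermissionError as well as a missing directory.
        return False


if __name__ == "__main__":
    for path in ("/home/jupyter", "/"):
        print(path, "writable" if can_write(path) else "not writable")
```

Because the check creates a real temporary file, it reflects what the running user can actually do rather than what the permission bits suggest.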

Great! Now we have disabled token authentication, added a system user (who is now running Jupyter), and changed our notebook directory to our user’s directory! In the next post, we’ll set up a task definition and deploy to ECS.
