April 27, 2020

Run Streamlit.io on Google Cloud Kubernetes

By Rui Costa

I’m a massive advocate of Streamlit. Streamlit is an open-source app framework for Machine Learning and Data Science teams. Streamlit allows you to develop beautiful apps without needing a tools team. But when your app becomes a standard within an organization, how do you scale it, secure it, and minimize the complexity of managing a deployment workflow?

Streamlit removes the need for a tools team, and Google Cloud, with Kubernetes, Identity-Aware Proxy (IAP), and an external HTTP(S) Load Balancer, will scale and secure the Streamlit application.

In this tutorial, I will create a container for the Streamlit application, push it to Google Container Registry, build a Kubernetes cluster, deploy the application to the newly created cluster, create a global HTTP(S) load balancer through an Ingress, and secure it with SSL. Wow.


Clone the repo

First, clone the repo that contains the Streamlit/spaCy application and the files required to deploy the container to Kubernetes.

git clone https://gitlab.com/ruicosta-blog/streamlit-k8s.git

Create a Docker container

To start, install the Python dependencies for the Streamlit/spaCy application. You can either use a virtual environment or install the packages directly without isolation. A virtual environment is an isolated Python environment with its own set of installed packages (the commands below create one with Python's built-in venv module), and I highly recommend using one.

Install the Python packages using your virtualenv [Recommended]

# Mac/Linux
python -m venv env
source env/bin/activate
pip install -r requirements.txt

# Windows
python -m venv env
env\Scripts\activate
pip install -r requirements.txt

Install Python packages locally [Optional]

pip install -r requirements.txt

Possible Errors

  • If you encounter the error “Cython needs to be installed in Python as a module”, install Cython locally or within your virtualenv (see command below).
    • Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself.
pip3 install Cython
  • If you encounter errors installing spaCy, make sure you are not using Python 3.8; unfortunately, it was not fully supported as of 04.16.2020.
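
The Python version caveat above can be checked before running pip. The following is a minimal sketch, not part of the repo: the function name is hypothetical, and the 3.8 cutoff simply mirrors the note above as of April 2020, so adjust it if spaCy support has changed since.

```python
import sys


def python_version_warning(version_info=sys.version_info):
    """Return a warning for Python 3.8 or newer, otherwise None.

    The 3.8 cutoff is an assumption taken from the note above and reflects
    spaCy support at the time of writing.
    """
    major, minor = version_info[0], version_info[1]
    if (major, minor) >= (3, 8):
        return (
            f"Python {major}.{minor} detected: spaCy may fail to install. "
            "Consider creating the virtualenv with Python 3.7."
        )
    return None


if __name__ == "__main__":
    print(python_version_warning() or "Python version looks fine for this tutorial.")
```

Run it with the same interpreter you plan to use for the virtualenv, since that is the version pip will install against.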

Build a Docker Image

The docker build command builds a Docker image from a Dockerfile:

docker build -f Dockerfile -t app:latest .

Run the Docker Image Locally [Optional]

Test the image locally before you move on to getting Kubernetes ready for a deployment. Docker runs the image in an isolated container.

docker run -p 8501:8501 app:latest

If all goes well, you will be able to access the Streamlit/spaCy application at http://localhost:8501/.

Try the application: enter a phrase and see which entities spaCy displays for you.
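
If you prefer a scripted check over opening the browser, the sketch below sends one HTTP request to the port published by docker run. It is only an illustration under stated assumptions: the function name is hypothetical, and it treats a 200 response on the root URL as "the app is up", which says nothing about whether spaCy loaded correctly.

```python
import http.client
import urllib.request


def is_app_up(url="http://localhost:8501/", timeout=5):
    """Return True if an HTTP GET to `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, DNS failures, and HTTP error
        # statuses (urllib raises those as subclasses of OSError).
        return False


if __name__ == "__main__":
    print("Streamlit is reachable" if is_app_up() else "Streamlit is not reachable yet")
```

The container can take a few seconds to start, so a False result right after docker run is not necessarily a failure.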

Google Cloud Kubernetes Deployment

Create a Google Cloud Project

To follow along, you will need a Google Cloud project. Head over to https://cloud.google.com and sign up for an account, or go to the getting started page for more information.

Later in this tutorial we will reserve a static IP address from Google Cloud. The commands that follow require gcloud; if you do not have it installed, follow this guide (https://cloud.google.com/sdk/install).

Google Cloud SDK runs on Linux, macOS, and Windows. Cloud SDK requires Python. Supported versions are 3.5 to 3.7, and 2.7.9 or higher. Some tools bundled with Cloud SDK have additional requirements. For example, Java tools for Google App Engine development require Java 1.7 or later.

Before executing the commands, please confirm you are set to the correct project. You might want to create a new configuration for the new project you created.

gcloud init

Deploy the Container to Google Container Registry

List the Docker images and note the image ID of the image we created.

docker images

In the commands below, replace [IMAGE_ID] and [PROJECT_ID] with your respective IDs.

docker tag [IMAGE_ID] gcr.io/[PROJECT_ID]/app
docker push gcr.io/[PROJECT_ID]/app
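
The two commands above follow a fixed pattern, so they can be generated from the image ID and project ID. This is a minimal sketch with hypothetical function names; it only builds and runs the same docker tag and docker push invocations shown above, and the IDs in the final comment are placeholders.

```python
import subprocess


def docker_push_commands(image_id, project_id, name="app"):
    """Build the `docker tag` and `docker push` argument lists for gcr.io."""
    target = f"gcr.io/{project_id}/{name}"
    return [
        ["docker", "tag", image_id, target],
        ["docker", "push", target],
    ]


def tag_and_push(image_id, project_id):
    """Run both commands in order, stopping at the first failure."""
    for command in docker_push_commands(image_id, project_id):
        subprocess.run(command, check=True)


# Example call (placeholder IDs): tag_and_push("0123456789ab", "my-project")
```

Passing the arguments as a list avoids shell quoting issues if a project ID or image name ever contains unexpected characters.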

If you encounter the error “You don’t have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in https://cloud.google.com/container-registry/docs/advanced-authentication”, follow the instructions at that URL to authorize Docker to push images to Google Container Registry.

Creating a GKE cluster

A cluster consists of at least one cluster master machine and multiple worker machines called nodes. Nodes are Compute Engine virtual machine (VM) instances that run the Kubernetes processes necessary to make them part of the cluster. You deploy applications to clusters, and the applications run on the nodes.

The following command creates a one-node cluster. Replace cluster-name with the name of your cluster:

gcloud container clusters create cluster-name --num-nodes=1

Please wait for cluster creation to complete before continuing.

Reserve an IP Address

The gcloud compute addresses create command reserves an IP address. Once an IP address is reserved, it stays associated with the project until it is released using ‘gcloud compute addresses delete’.

gcloud compute addresses create streamlitweb-ip --global
gcloud compute addresses describe streamlitweb-ip --global

After running the describe command, note the IP address assigned; we will need to create an A record that points to it.

Update your domain and add an A record to point to the IP address reserved above. For mine, I created an A record for stream.ruicosta.blog.
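
DNS changes can take a while to propagate, so it may be worth confirming that the A record resolves to the reserved address before continuing. The sketch below uses only the standard library; the function name is hypothetical, and the host name and IP address in the example are placeholders, not values from this deployment.

```python
import socket


def a_record_matches(hostname, expected_ip, resolve=socket.gethostbyname_ex):
    """Return True if `hostname` currently resolves to `expected_ip` (IPv4)."""
    try:
        _name, _aliases, addresses = resolve(hostname)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the name
        # does not resolve, for example before the record has propagated.
        return False
    return expected_ip in addresses


# Example with placeholder values:
# a_record_matches("stream.example.com", "203.0.113.10")
```

The resolve parameter exists only so the check can be exercised without real DNS; in normal use, leave it at its default.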

Using Google-managed SSL certificates

With Google-managed SSL certificates, certificates are provisioned, renewed, and managed for your domain names.

Before we apply the certificate, edit certificate.yaml from the cloned repo and change the host name to the one you chose. The host name must have an A record that points to the IP address reserved above.

apiVersion: networking.gke.io/v1beta1
kind: ManagedCertificate
metadata:
 name: streamlit-certificate
spec:
 domains:
   - stream.ruicosta.blog

Once you have edited certificate.yaml, run the command below. kubectl is a command-line tool for controlling Kubernetes clusters.

kubectl apply -f certificate.yaml

Creating a Kubernetes deployment

A Deployment provides declarative updates for Pods and ReplicaSets.

Let’s edit the provided deployment.yaml file to describe a desired state in our Deployment.

Update the deployment.yaml and replace [PROJECT_ID] with your project ID.

apiVersion: apps/v1
kind: Deployment
metadata:
 name: streamlitweb
 labels:
   app: streamlit
spec:
 selector:
   matchLabels:
     app: streamlit
     tier: web
 template:
   metadata:
     labels:
       app: streamlit
       tier: web
   spec:
     containers:
     - name: streamlit-app
       image: gcr.io/[PROJECT_ID]/app
       ports:
       - containerPort: 8501

Once you have updated the YAML file, execute the command below.

kubectl apply -f deployment.yaml
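
The [PROJECT_ID] replacement in deployment.yaml can also be scripted rather than done by hand. This is a minimal sketch with hypothetical function names; it does a plain text substitution and assumes the file still contains the literal [PROJECT_ID] placeholder from the repo.

```python
from pathlib import Path

PLACEHOLDER = "[PROJECT_ID]"


def fill_project_id(manifest_text, project_id):
    """Replace every [PROJECT_ID] placeholder with the given project ID."""
    if PLACEHOLDER not in manifest_text:
        raise ValueError(f"{PLACEHOLDER} not found; was the file already edited?")
    return manifest_text.replace(PLACEHOLDER, project_id)


def update_manifest(path, project_id):
    """Rewrite the manifest at `path` in place with the project ID filled in."""
    manifest = Path(path)
    manifest.write_text(fill_project_id(manifest.read_text(), project_id))


# Example call (placeholder project ID):
# update_manifest("deployment.yaml", "my-project")
```

Raising an error when the placeholder is missing is a deliberate choice here: it makes a second, accidental run fail loudly instead of silently doing nothing.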

Apply the Ingress Configuration

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.

    internet
        |
   [ Ingress ]
   --|-----|--
   [ Services ]

You use it to make internal services reachable from outside your cluster. Edit ingress.yaml and set the following two annotations to match the name of your reserved public IP address and the name of your certificate:

kubernetes.io/ingress.global-static-ip-name: streamlitweb-ip
networking.gke.io/managed-certificates: streamlit-certificate

With those values in place, the full ingress.yaml reads:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
 name: streamlitweb
 annotations:
   kubernetes.io/ingress.global-static-ip-name: streamlitweb-ip
   networking.gke.io/managed-certificates: streamlit-certificate
 labels:
   app: streamlit
spec:
 backend:
   serviceName: streamlitweb-backend
   servicePort: 8501
---
apiVersion: v1
kind: Service
metadata:
 name: streamlitweb-backend
 labels:
   app: streamlit
spec:
 type: NodePort
 selector:
   app: streamlit
   tier: web
 ports:
 - port: 8501
   targetPort: 8501

Once you have edited ingress.yaml, execute the command below.

kubectl apply -f ingress.yaml
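
The Service in ingress.yaml finds the Deployment's pods through labels: every key/value pair in the Service's selector must also appear in the labels of the pod template. The sketch below illustrates that matching rule with plain dictionaries copied from the two manifests; the function name is hypothetical, and this models the rule rather than calling Kubernetes.

```python
def selector_matches(selector, pod_labels):
    """Return True if every key/value pair in `selector` appears in `pod_labels`."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Selector from the streamlitweb-backend Service in ingress.yaml.
service_selector = {"app": "streamlit", "tier": "web"}

# Labels from spec.template.metadata.labels in deployment.yaml.
pod_template_labels = {"app": "streamlit", "tier": "web"}

print(selector_matches(service_selector, pod_template_labels))  # True

# A pod carrying only the `app` label would not be selected.
print(selector_matches(service_selector, {"app": "streamlit"}))  # False
```

If you rename a label in one file, change it in the other as well; otherwise the Service has no pods to route to and the Ingress backend stays unhealthy.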

You can now head over to the Google Cloud console and under Kubernetes Engine -> Services & Ingress you can see the Ingress being created.

Wait for the Ingress to be created before you continue.

Once it is created, you can visit your deployed application. For our demo, you can visit: https://stream.ruicosta.blog

Enabling IAP for GKE

You can now secure your application with Identity-Aware Proxy (IAP):

https://cloud.google.com/iap/docs/enabling-kubernetes-howto

To learn how to deploy IAP for Streamlit, please see my other post: https://ruicosta.blog/2020/04/29/secure-streamlit-io-running-on-google-cloud-kubernetes/