Run Streamlit.io on Google Cloud Kubernetes
I’m a massive advocate of Streamlit. Streamlit is an open-source app framework for Machine Learning and Data Science teams that lets you develop beautiful apps without needing a tools team. But when your app becomes a standard within an organization, how do you scale it, secure it, and minimize the complexity of managing a deployment workflow?
Streamlit removes the need for a tools team, and Google Cloud with Kubernetes, Identity-Aware Proxy (IAP), and an external HTTP(S) Load Balancer will scale and secure the Streamlit application.
In this tutorial, I will create a container for the Streamlit application, push it to Google Container Registry, build a Kubernetes cluster, deploy the application to the newly created cluster, create a global HTTP(S) load balancer, and secure it with SSL. Wow.
Clone the repo
First, clone the repo that contains the Streamlit/spaCy application and the files required to deploy the container to Kubernetes.
git clone https://gitlab.com/ruicosta-blog/streamlit-k8s.git
Create a Docker container
To start the deployment, install the Python dependencies for the Streamlit/spaCy application. You can either use a virtualenv or install the packages directly without isolation. virtualenv is a tool for creating isolated virtual Python environments, and I highly recommend using it.
Install the Python packages using your virtualenv [Recommended]
# Mac/Linux
python -m venv env
source env/bin/activate
pip install -r requirements.txt

# Windows
python -m venv env
env\Scripts\activate
pip install -r requirements.txt
Install Python packages locally [Optional]
pip install -r requirements.txt
- If you encounter the error “Cython needs to be installed in Python as a module”, install Cython locally or within your virtualenv (see the command below).
- Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself.
pip3 install Cython
- If you encounter errors installing spaCy, make sure you are not using Python 3.8 (unfortunately, it was not fully supported as of 04.16.2020).
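Before installing, it can help to check which interpreter your environment picked up. The sketch below is my own illustration rather than something from the repo: `python_support_status` and `current_python_version` are hypothetical helper names, and the only rule encoded is the one from the note above (treat any 3.8.x as unsupported).

```shell
# Hypothetical helper (not part of the repo): classify a Python version string.
# Prints "unsupported" for 3.8.x, following the spaCy note above, otherwise "ok".
python_support_status() {
  case "$1" in
    3.8|3.8.*) echo "unsupported" ;;
    *)         echo "ok" ;;
  esac
}

# Ask the active interpreter for its version, for example "3.7.6".
current_python_version() {
  python -c 'import platform; print(platform.python_version())'
}
```

With the virtualenv activated, running `python_support_status "$(current_python_version)"` tells you whether you are likely to hit the spaCy install problem.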
Build a Docker Image
The docker build command builds a Docker image from a Dockerfile.
docker build -f Dockerfile -t app:latest .
Run the Docker Image Locally [Optional]
Test the image locally before you move on to getting Kubernetes ready for a deployment. Docker runs the image in an isolated container.
docker run -p 8501:8501 app:latest
If all goes well, you will be able to access the Streamlit/spaCy application at http://localhost:8501/.
Try the application: enter a phrase and see which entities spaCy displays for you.
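If you prefer to check from a terminal, the local test above can be sketched as a small polling loop. This is a minimal sketch that assumes `curl` is installed; `wait_for_http` is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll a URL until it answers successfully.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Prints "up" and returns 0 on success, prints "down" and returns 1 otherwise.
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail turns HTTP error responses into a non-zero exit status.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "up"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "down"
  return 1
}
```

With the container running, `wait_for_http http://localhost:8501/` should print `up` once Streamlit is serving.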
Google Cloud Kubernetes Deployment
Create a Google Cloud Project
Google Cloud SDK runs on Linux, macOS, and Windows. Cloud SDK requires Python. Supported versions are 3.5 to 3.7, and 2.7.9 or higher. Some tools bundled with Cloud SDK have additional requirements. For example, Java tools for Google App Engine development require Java 1.7 or later.
Before executing the commands, please confirm you are set to the correct project. You might want to create a new configuration for the new project you created.
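One way to confirm the active project, and to create a separate configuration for it, is sketched below. The function names are hypothetical, and the configuration name and project ID are placeholders you supply; the gcloud subcommands themselves (`config get-value`, `config configurations create`, `config set project`) are standard.

```shell
# Print the project gcloud is currently pointed at.
active_project() {
  gcloud config get-value project 2>/dev/null
}

# Hypothetical helper: compare the active project with the one you expect.
# Prints "match" or "mismatch".
project_status() {
  local expected="$1"
  if [ "$(active_project)" = "$expected" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

# Create a new configuration (activated on creation by default) and point it
# at the project. Both arguments are placeholders, not values from this post.
use_project() {
  local config_name="$1" project_id="$2"
  gcloud config configurations create "$config_name"
  gcloud config set project "$project_id"
}
```

A new configuration starts out empty, so you may also need to set your account in it before running the remaining commands.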
Deploy the Container to Google Container Registry
List the Docker images and note the Image ID of the image we built.
In the commands below, replace [IMAGE_ID] and [PROJECT_ID] with your respective IDs.
docker tag [IMAGE_ID] gcr.io/[PROJECT_ID]/app
docker push gcr.io/[PROJECT_ID]/app
If you encounter the error “You don’t have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in https://cloud.google.com/container-registry/docs/advanced-authentication”, follow the instructions on that page to authorize Docker to push images to Google Container Registry.
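The tag, authenticate, and push steps can be sketched as one function. This is an illustration under a few assumptions: the helper names are hypothetical, the image is tagged by its local name `app:latest` instead of its Image ID, and authentication uses `gcloud auth configure-docker`, which is one of the methods described on the page linked above.

```shell
# Build the registry path for an image. NAME defaults to "app", as in this post.
# Usage: gcr_image PROJECT_ID [NAME]
gcr_image() {
  echo "gcr.io/$1/${2:-app}"
}

# Register gcloud as a Docker credential helper, then tag and push app:latest.
# Usage: push_app PROJECT_ID
push_app() {
  local target
  target="$(gcr_image "$1")"
  gcloud auth configure-docker      # may prompt before updating the Docker config
  docker tag app:latest "$target"   # tag by name instead of by Image ID
  docker push "$target"
}
```

Calling `push_app [PROJECT_ID]` with your own project ID performs the same two docker commands shown above.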
Creating a GKE cluster
A cluster consists of at least one cluster master machine and multiple worker machines called nodes. Nodes are Compute Engine virtual machine (VM) instances that run the Kubernetes processes necessary to make them part of the cluster. You deploy applications to clusters, and the applications run on the nodes.
The following command creates a one-node cluster. Replace cluster-name with the name of your cluster:
gcloud container clusters create cluster-name --num-nodes=1
Wait for the cluster creation to complete before continuing.
Reserve an IP Address
The gcloud compute addresses create command reserves an IP address. Once an IP address is reserved, it stays associated with the project until it is released with ‘gcloud compute addresses delete’.
gcloud compute addresses create streamlitweb-ip --global
gcloud compute addresses describe streamlitweb-ip --global
After running the describe command, note the IP address assigned, as we will need to create an A record that points to it.
Update your domain and add an A record to point to the IP address reserved above. For mine, I created an A record for stream.ruicosta.blog.
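To check that the A record has propagated, you can compare the reserved address with what DNS currently returns. The sketch below assumes `dig` is installed; the function names are hypothetical, while `--format='value(address)'` is a standard gcloud output format that prints only the address field.

```shell
# Print only the reserved address. The name defaults to the one used above.
reserved_ip() {
  gcloud compute addresses describe "${1:-streamlitweb-ip}" --global \
    --format='value(address)'
}

# Print the A records currently returned for a host name.
resolved_ips() {
  dig +short A "$1"
}

# Hypothetical helper: prints "ready" if the host resolves to the reserved IP,
# "pending" otherwise.
# Usage: record_status HOST_NAME
record_status() {
  local ip
  ip="$(reserved_ip)"
  if [ -n "$ip" ] && resolved_ips "$1" | grep -qxF "$ip"; then
    echo "ready"
  else
    echo "pending"
  fi
}
```

For the host name in this post, that would be `record_status stream.ruicosta.blog`; substitute your own.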
Using Google-managed SSL certificates
With Google-managed SSL certificates, certificates are provisioned, renewed, and managed for your domain names.
Before we apply the certificate, edit the certificate.yaml from the cloned repo and change the host name to the one you chose. That host name needs an A record pointing to the IP address reserved above.
apiVersion: networking.gke.io/v1beta1
kind: ManagedCertificate
metadata:
  name: streamlit-certificate
spec:
  domains:
    - stream.ruicosta.blog
Once you have edited certificate.yaml, run the command below. kubectl is a command-line tool for controlling Kubernetes clusters.
kubectl apply -f certificate.yaml
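After applying the certificate, you may want to watch its provisioning state. The sketch below pulls one field out of `kubectl describe managedcertificate`; the function names are hypothetical, and the "Certificate Status" label is an assumption about the describe output, so adjust the pattern if your output differs.

```shell
# Describe the managed certificate. The name defaults to the one in certificate.yaml.
describe_certificate() {
  kubectl describe managedcertificate "${1:-streamlit-certificate}"
}

# Read describe output on stdin and print the value of the
# "Certificate Status" field (label assumed, see the note above).
certificate_status() {
  awk -F': *' '/Certificate Status:/ { print $2; exit }'
}
```

Running `describe_certificate | certificate_status` prints the current state. Expect it to stay in a provisioning state until the Ingress and the DNS record are both in place, which can take some time.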
Creating a Kubernetes deployment
Let’s edit the provided deployment.yaml file to describe a desired state in our Deployment.
Update the deployment.yaml and replace [PROJECT_ID] with your project ID.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: streamlitweb
  labels:
    app: streamlit
spec:
  selector:
    matchLabels:
      app: streamlit
      tier: web
  template:
    metadata:
      labels:
        app: streamlit
        tier: web
    spec:
      containers:
      - name: streamlit-app
        image: gcr.io/[PROJECT_ID]/app
        ports:
        - containerPort: 8501
Once you update the yaml file, execute the command below.
kubectl apply -f deployment.yaml
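To confirm the Deployment actually came up, you can wait for the rollout and list the pods it created. This is a small sketch: `verify_deployment` is a hypothetical wrapper, the 120-second timeout is an arbitrary choice, and the resource name and labels come from the deployment.yaml above.

```shell
# Wait for the rollout to finish, then list the pods matching the
# Deployment's labels. Pods are listed even if the rollout times out,
# since their status helps with debugging.
verify_deployment() {
  kubectl rollout status deployment/streamlitweb --timeout=120s
  kubectl get pods -l app=streamlit,tier=web
}
```

If a pod is stuck pulling its image, the image path in deployment.yaml is the first thing to check.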
Apply the Ingress Configuration
    internet
        |
   [ Ingress ]
   --|-----|--
   [ Services ]
An Ingress makes internal services reachable from outside your cluster. Edit ingress.yaml and update the following annotations to match the name of your reserved public IP address and the name of your certificate:
kubernetes.io/ingress.global-static-ip-name: streamlitweb-ip
networking.gke.io/managed-certificates: streamlit-certificate
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: streamlitweb
  annotations:
    kubernetes.io/ingress.global-static-ip-name: streamlitweb-ip
    networking.gke.io/managed-certificates: streamlit-certificate
  labels:
    app: streamlit
spec:
  backend:
    serviceName: streamlitweb-backend
    servicePort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: streamlitweb-backend
  labels:
    app: streamlit
spec:
  type: NodePort
  selector:
    app: streamlit
    tier: web
  ports:
  - port: 8501
    targetPort: 8501
Once you have edited ingress.yaml, execute the command below.
kubectl apply -f ingress.yaml
You can now head over to the Google Cloud console and under Kubernetes Engine -> Services & Ingress you can see the Ingress being created.
Wait for the Ingress to be created before you continue.
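If you would rather wait from the command line than from the console, the sketch below polls the Ingress until it reports an address. The function names are hypothetical and the attempt count and delay are arbitrary; the jsonpath expression reads the standard load balancer status field of an Ingress.

```shell
# Print the external IP reported by the Ingress, or nothing if none yet.
ingress_ip() {
  kubectl get ingress "${1:-streamlitweb}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Hypothetical helper: poll until the Ingress has an address.
# Usage: wait_for_ingress [NAME] [ATTEMPTS] [DELAY_SECONDS]
wait_for_ingress() {
  local name="${1:-streamlitweb}" attempts="${2:-30}" delay="${3:-10}" ip i=1
  while [ "$i" -le "$attempts" ]; do
    ip="$(ingress_ip "$name" 2>/dev/null || true)"
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

An address showing up here does not guarantee the load balancer is already serving traffic or that the certificate is active, so allow a few more minutes before testing the HTTPS URL.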
Once completed, you can now visit your deployed application. For our demo you can visit: https://stream.ruicosta.blog
Enabling IAP for GKE
You can now secure your application with Identity-Aware Proxy (IAP).
To learn how to deploy IAP for Streamlit, please see my other post: https://ruicosta.blog/2020/04/29/secure-streamlit-io-running-on-google-cloud-kubernetes/