Google Kubernetes Engine (GKE)

This guide walks you through deploying the Takeoff Stack on Google Kubernetes Engine (GKE) in GCP. As an example, we will deploy a llama3-8b model alongside the jina-v2-code-embed embedding model.

How to Deploy Takeoff Stack in GCP GKE

Prerequisites

Ensure you have the following:

Access to the GCP Console.
gcloud CLI: For interacting with Google Cloud Platform from the terminal.
kubectl: For interacting with the Kubernetes cluster.
helm: For deploying Helm charts.
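
Before starting, it can help to confirm that the three command-line tools are on your PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name introduced here for illustration and is not part of the Takeoff tooling.

```shell
# Print the name of every listed tool that cannot be found on PATH.
# (missing_tools is a hypothetical helper, not part of Takeoff.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools gcloud kubectl helm)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All prerequisite tools found."
fi
```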

Step 1: Set Up a GKE Autopilot Cluster

Navigate to the GCP Console and select "Kubernetes Engine".
Click "Create" to set up a new cluster.
Provide a cluster name and region, and leave the other settings at their defaults.
Hit the "Create" button, and GCP will handle the rest.
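
If you prefer the terminal to the console, an Autopilot cluster can also be created with the gcloud CLI. This is a minimal sketch, assuming you are already authenticated and your default project is set; `create_autopilot_cluster` is a hypothetical wrapper, and the name and region in the example comment are placeholders.

```shell
# Hypothetical wrapper around `gcloud container clusters create-auto`.
# $1 = cluster name, $2 = region.
create_autopilot_cluster() {
  gcloud container clusters create-auto "$1" --region "$2"
}

# Example (placeholder values, replace with your own):
#   create_autopilot_cluster autopilot-takeoff-cluster us-central1
```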

Step 2: Connect to the Cluster

Log in to your GCP account from the terminal:
```
gcloud auth login
```
Configure your Kubernetes credentials for the cluster:
- Replace <YOUR_CLUSTER_NAME> and <YOUR_CLUSTER_REGION> with your cluster name and region.
```
gcloud container clusters get-credentials <YOUR_CLUSTER_NAME> --region <YOUR_CLUSTER_REGION>
```

You should see output like this:

```
Fetching cluster endpoint and auth data.
kubeconfig entry generated for autopilot-takeoff-cluster.
```
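
To double-check that kubectl now points at the new cluster, you can inspect the active context. GKE names its kubeconfig contexts `gke_<project>_<location>_<cluster>`, so a small check along these lines may be useful; `using_cluster` is a hypothetical helper written for this guide.

```shell
# Succeed when the active kubeconfig context belongs to the named cluster.
# Relies on the GKE context naming scheme gke_<project>_<location>_<cluster>.
using_cluster() {
  case "$(kubectl config current-context)" in
    gke_*_"$1") return 0 ;;
    *) return 1 ;;
  esac
}

# Example (placeholder cluster name):
#   using_cluster autopilot-takeoff-cluster && echo "kubectl is pointing at the cluster"
```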

Step 3: Prepare the Namespace and Secrets

Create the Namespace
- Use takeoff as the namespace (or choose a different name):
```
kubectl create namespace takeoff
```

Create Secrets

Docker Credentials: Provide the path to your Docker config file (typically ~/.docker/config.json), which is stored in the secret under the .dockerconfigjson key:
```
kubectl create secret generic takeoff-regcred --namespace takeoff \
  --from-file=.dockerconfigjson=<PATH_TO_YOUR_DOCKERCONFIG.JSON> \
  --type=kubernetes.io/dockerconfigjson
```

Takeoff Secrets: Replace <LICENSE_KEY> and <TOKEN> with your actual values:

```
kubectl create secret generic takeoff-secrets --namespace takeoff \
  --from-literal=LICENSE_KEY=<LICENSE_KEY> \
  --from-literal=TAKEOFF_ACCESS_TOKEN=<TOKEN>
```
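
Once both secrets are created, a quick lookup confirms they exist in the namespace before you install the chart. A minimal sketch, assuming the secret names used above; `verify_takeoff_secrets` is a hypothetical helper.

```shell
# List both secrets; kubectl exits non-zero if either one is missing.
# $1 = namespace (defaults to takeoff).
verify_takeoff_secrets() {
  kubectl get secret takeoff-regcred takeoff-secrets --namespace "${1:-takeoff}"
}

# Example:
#   verify_takeoff_secrets takeoff
```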

Step 4: Deploy the Takeoff Stack Using Helm

Navigate to the Helm chart directory:
```
cd pantheon/hades/pantheon-helm/
```

Install the Takeoff Stack:

```
helm install takeoff-stack ./ --namespace takeoff -f values.yaml -f overwrites/gcp_gke.yaml
```
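
Pods may take some time to become ready after the install. One way to block until they are is `kubectl wait`; the sketch below assumes the pods have already been created (the command errors out if nothing matches) and that every pod in the namespace is expected to reach Ready. `wait_for_pods` is a hypothetical helper, and the 600s default timeout is an arbitrary choice.

```shell
# Block until every pod in the namespace reports Ready, or the timeout expires.
# $1 = namespace (defaults to takeoff), $2 = timeout (defaults to 600s).
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "${1:-takeoff}" --timeout "${2:-600s}"
}

# Example:
#   wait_for_pods takeoff 900s
```

If the wait times out, `kubectl get pods --namespace takeoff` shows which pods are still pending.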

Step 5: Test and Validate the Service

Port-Forward the Service

Forward the service port to your local machine:

```
kubectl port-forward service/takeoff-controller-svc 3000:3000 3001:3001 --namespace takeoff
```

Access the Frontend
- Open your browser and navigate to:
  - http://localhost:3000 for the main frontend.
  - http://localhost:3001 for the management frontend.

Validate the Model with a Test Request
- Use curl to send a request to the service and verify the model is working:
```
curl -X POST http://localhost:3000/generate -N \
  -d '{"text": ["How are you?"], "consumer_group": "generate"}' \
  --header "Content-Type: application/json"
```
- A successful request returns the model's response, which confirms that the model is loaded and serving requests.
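
The model may not answer immediately after the port-forward is established, so it can be convenient to repeat the test request until it succeeds. The sketch below is one way to do that; `retry` and `probe_generate` are hypothetical helpers, and the attempt count and delay in the example are arbitrary.

```shell
# Run a command until it succeeds: up to $1 attempts, $2 seconds apart.
retry() {
  retry_max="$1"; retry_delay="$2"; shift 2
  retry_count=1
  while ! "$@"; do
    [ "$retry_count" -ge "$retry_max" ] && return 1
    retry_count=$((retry_count + 1))
    sleep "$retry_delay"
  done
}

# Send the test request from this guide, discarding the response body.
# --fail makes curl exit non-zero on HTTP error responses.
probe_generate() {
  curl --fail --silent --output /dev/null -X POST http://localhost:3000/generate \
    -d '{"text": ["How are you?"], "consumer_group": "generate"}' \
    --header "Content-Type: application/json"
}

# Example: try for up to 5 minutes (30 attempts, 10 seconds apart).
#   retry 30 10 probe_generate && echo "model is responding"
```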
