Google Kubernetes Engine (GKE)


This guide walks you through deploying the Takeoff Stack on Google Kubernetes Engine (GKE) in Google Cloud Platform (GCP). As an example, we deploy a llama3-8b model alongside an embedding model, jina-v2-code-embed.

Prerequisites

Ensure you have the following:

  • Access to the GCP Console.
  • gcloud CLI: For interacting with Google Cloud Platform from the terminal.
  • kubectl: For interacting with the Kubernetes cluster.
  • helm: For deploying Helm charts.
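
Before continuing, it can help to confirm that the three command-line tools are on your PATH. The following is a minimal sketch; the helper name check_tools is illustrative and not part of the Takeoff tooling.

```shell
#!/bin/sh
# Report any of the given commands that are not on PATH.
# Returns 0 if all are found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_tools gcloud kubectl helm || echo "Install the missing tools before continuing." >&2
```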

Step 1: Set Up a GKE Autopilot Cluster

  1. Navigate to the GCP Console and select "Kubernetes Engine".
  2. Click "Create" to set up a new cluster.
  3. Provide a cluster name and region, then leave other configurations as default.
  4. Hit the "Create" button, and GCP will handle the rest.
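
If you prefer the terminal to the console, an Autopilot cluster can also be created with gcloud. The sketch below prints the command by default so you can review it first. The region europe-west2 is a placeholder assumption, and the cluster name is borrowed from the sample output later in this guide; replace both with your own values.

```shell
#!/bin/sh
# Placeholder values (assumptions); override via environment variables.
CLUSTER_NAME="${CLUSTER_NAME:-autopilot-takeoff-cluster}"
CLUSTER_REGION="${CLUSTER_REGION:-europe-west2}"

# Print the command by default; set DRY_RUN=0 to actually create the cluster.
create_cluster() {
  set -- gcloud container clusters create-auto "$CLUSTER_NAME" --region "$CLUSTER_REGION"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

create_cluster
```

Running the script with DRY_RUN=0 requires that you are already authenticated with gcloud and have a default project configured.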

Step 2: Connect to the Cluster

  1. Log in to your GCP account from the terminal:

    gcloud auth login
  2. Configure your Kubernetes credentials for the cluster:

    • Replace <YOUR_CLUSTER_NAME> and <YOUR_CLUSTER_REGION> with your cluster name and region.

      gcloud container clusters get-credentials <YOUR_CLUSTER_NAME> --region <YOUR_CLUSTER_REGION>
  3. You should see output like this:

    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for autopilot-takeoff-cluster.
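
To double-check that kubectl now points at the new cluster, you can compare the active context with the name GKE generates, which follows the pattern gke_<project>_<location>_<cluster>. This is a sketch only: the project ID my-project and the region europe-west2 are hypothetical placeholders, and expected_context is an illustrative helper.

```shell
#!/bin/sh
# Build the kubeconfig context name that gcloud generates for a GKE cluster.
# $1 = project ID, $2 = region, $3 = cluster name
expected_context() {
  printf 'gke_%s_%s_%s\n' "$1" "$2" "$3"
}

# Compare against the active context only when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  current=$(kubectl config current-context 2>/dev/null || true)
  # my-project and europe-west2 are placeholders; substitute your own values.
  want=$(expected_context "my-project" "europe-west2" "autopilot-takeoff-cluster")
  if [ "$current" = "$want" ]; then
    echo "kubectl is pointing at $want"
  else
    echo "current context is '${current:-<none>}', expected '$want'" >&2
  fi
fi
```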

Step 3: Prepare the Namespace and Secrets

  1. Create the Namespace

    • Use takeoff as the namespace (or choose a different name):
      kubectl create namespace takeoff
  2. Create Secrets

    • Docker Credentials: Provide the path to your Docker config file (typically ~/.docker/config.json). The key inside the secret must be named .dockerconfigjson:

      kubectl create secret generic takeoff-regcred --namespace takeoff \
      --from-file=.dockerconfigjson=<PATH_TO_YOUR_DOCKERCONFIG.JSON> \
      --type=kubernetes.io/dockerconfigjson
    • Takeoff Secrets: Replace <LICENSE_KEY> and <TOKEN> with your actual values:

      kubectl create secret generic takeoff-secrets --namespace takeoff \
      --from-literal=LICENSE_KEY=<LICENSE_KEY> \
      --from-literal=TAKEOFF_ACCESS_TOKEN=<TOKEN>
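
A common cause of image pull failures is a Docker config file that holds no registry credentials. The sketch below runs a quick sanity check before you create the registry secret. The helper name has_registry_auths is illustrative, and the default path assumes a standard Docker installation.

```shell
#!/bin/sh
# Return 0 if the file exists and contains an "auths" key,
# the key under which Docker stores registry credentials.
has_registry_auths() {
  [ -f "$1" ] && grep -q '"auths"' "$1"
}

# Assumed default location of the Docker config; pass another path if yours differs.
config_file=~/.docker/config.json
if has_registry_auths "$config_file"; then
  echo "$config_file looks usable"
else
  echo "$config_file is missing or has no \"auths\" entry" >&2
fi
```

This check is deliberately shallow. If Docker is configured with a credential helper, the "auths" entry may exist but hold no credentials, in which case the resulting secret will not authenticate. Once both secrets are created, `kubectl get secrets --namespace takeoff` should list takeoff-regcred and takeoff-secrets.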

Step 4: Deploy the Takeoff Stack Using Helm

  1. Navigate to the Helm chart directory:

    cd pantheon/hades/pantheon-helm/
  2. Install the Takeoff Stack:

    helm install takeoff-stack ./ --namespace takeoff -f values.yaml -f overwrites/gcp_gke.yaml
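
After the install command returns, the pods may still be starting. One way to track progress is to count the pods that are not yet ready, as sketched below. The helper count_not_ready is illustrative and relies only on the standard column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the number
# of pods that are neither Completed nor fully ready (READY column x/y, x != y).
count_not_ready() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Completed" && ready[1] != ready[2]) n++
  } END { print n + 0 }'
}

# Query the takeoff namespace only when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods --namespace takeoff --no-headers 2>/dev/null | count_not_ready
fi
```

A result of 0 means every pod in the namespace reports ready. The first deployment may take several minutes while the cluster provisions nodes and pulls images.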

Step 5: Test and Validate the Service

  1. Port-Forward the Service

    • Forward the service ports (3000 and 3001) to your local machine:

      kubectl port-forward service/takeoff-controller-svc 3000:3000 3001:3001 --namespace takeoff
  2. Access the Frontend

    • Open your browser and navigate to:
      • http://localhost:3000 for the main frontend.
      • http://localhost:3001 for the management frontend.
  3. Validate the Model with a Test Request

    • Use curl to send a request to the service and verify the model is working:

      curl -X POST http://localhost:3000/generate -N \
      -d '{"text": ["How are you?"], "consumer_group": "generate"}' \
      --header "Content-Type: application/json"
    • If successful, the service returns generated text, which confirms that the model is loaded and serving requests.
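
The model may not be ready to serve the moment the port-forward is up, so the first request can fail. A small retry wrapper avoids re-running curl by hand. This is a generic sketch; the retry helper is illustrative and not part of the Takeoff tooling, and the attempt count and delay in the commented example are arbitrary.

```shell
#!/bin/sh
# Run a command until it succeeds.
# $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (not executed here): try the generate endpoint up to 30 times,
# 10 seconds apart. curl -f makes HTTP error responses count as failures.
#
#   retry 30 10 curl -sf -X POST http://localhost:3000/generate \
#     --header "Content-Type: application/json" \
#     -d '{"text": ["How are you?"], "consumer_group": "generate"}'
```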