Version: 0.21.x

Elastic Kubernetes Service (EKS)


This guide walks you through deploying the Takeoff Stack on AWS EKS. As an example, we will deploy a llama3-8b generation model alongside a jina-v2-code-embed embedding model.


Prerequisites

Ensure you have the following:

  • Access to the AWS EKS Console.
  • AWS CLI installed and configured.
  • kubectl: For interacting with the Kubernetes cluster.
  • helm: For deploying Helm charts.
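
Before continuing, you can quickly confirm that these tools are installed and on your PATH:

    aws --version
    kubectl version --client
    helm version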

Step 1: Set Up an EKS Cluster

  1. Create the EKS Cluster

    • Navigate to the AWS EKS Console and click "Create EKS cluster".
    • Follow the step-by-step configuration guide provided in the console.
      • Important: In Step 4 (Add-ons), enable the Amazon EBS CSI Driver by toggling it on. This is required for EBS volume support.
    • Complete the configuration and create the cluster.
  2. Wait for Cluster Creation

    • Cluster creation may take some time. Once completed, proceed to add a compute node group and configure the GPU driver.
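
If you prefer to script this step rather than use the console, the cluster and the EBS CSI Driver add-on can also be created with eksctl. This is a rough sketch using the example cluster name and region from this guide; depending on your account setup, the add-on may also need an IAM role attached to its service account:

    # Create the control plane only; the GPU node group is added in Step 2
    eksctl create cluster --name aws-takeoff-cluster --region eu-west-2 --without-nodegroup

    # Enable the Amazon EBS CSI Driver add-on (required for EBS volume support)
    eksctl create addon --name aws-ebs-csi-driver --cluster aws-takeoff-cluster --region eu-west-2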

Step 2: Add a Compute Node Group

(Screenshot: EKS compute node group configuration)

  1. In the EKS Console, create a Compute Node Group with the following configuration:

    • Instance Type: g5.16xlarge (1 x NVIDIA A10G GPU).
    • Scaling:
      • Minimum: 0
      • Maximum: 3
  2. After completing the setup, your node group should look like this:

    (Screenshot: the node group created successfully)
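
If you are scripting the setup, a similar node group can be created with eksctl. A sketch, assuming the example cluster from Step 1:

    eksctl create nodegroup --cluster aws-takeoff-cluster --region eu-west-2 \
      --name gpu-nodes --node-type g5.16xlarge \
      --nodes 1 --nodes-min 0 --nodes-max 3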


Step 3: Connect to the Cluster

  1. Open your terminal and connect to the cluster:

    • Replace <your-cluster-name> and <your-region> in the command below with your cluster name and region (for example, aws-takeoff-cluster and eu-west-2).
    aws eks update-kubeconfig --region <your-region> --name <your-cluster-name>
  2. Successful connection will display a message like:

    Added new context arn:aws:eks:<your-region>:<account-id>:cluster/<your-cluster-name> to ~/.kube/config
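
You can then confirm that kubectl is pointing at the new cluster and that its nodes are visible (GPU nodes only appear once the node group has scaled up):

    kubectl cluster-info
    kubectl get nodes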

Step 4: Configure the GPU Driver

  1. Deploy the NVIDIA device plugin to enable GPU support:

    kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
  2. Verify GPU availability in the nodes:

    kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
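
As an optional smoke test, you can run a short-lived pod that requests a GPU and executes nvidia-smi. This is a sketch; any CUDA image that ships nvidia-smi can be used in place of the tag shown here:

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-smoke-test
    spec:
      restartPolicy: Never
      containers:
        - name: cuda
          image: nvidia/cuda:12.4.1-base-ubuntu22.04
          command: ["nvidia-smi"]
          resources:
            limits:
              nvidia.com/gpu: 1
    EOF

    # Once the pod reports Completed, check the output and clean up
    kubectl logs pod/gpu-smoke-test
    kubectl delete pod gpu-smoke-test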

Step 5: Prepare the Namespace and Secrets

  1. Create the Namespace

    • Use takeoff as the namespace (or choose a different name):
      kubectl create namespace takeoff
  2. Create Secrets

    • Docker Credentials: Provide the path to your Docker config file (typically ~/.docker/config.json), which contains your registry credentials.

      kubectl create secret generic takeoff-regcred --namespace takeoff \
      --from-file=.dockerconfigjson=<PATH_TO_YOUR_DOCKERCONFIG.JSON> \
      --type=kubernetes.io/dockerconfigjson
    • Takeoff Secrets: Replace <LICENSE_KEY> and <TOKEN> with your actual values:

      kubectl create secret generic takeoff-secrets --namespace takeoff \
      --from-literal=LICENSE_KEY=<LICENSE_KEY> \
      --from-literal=TAKEOFF_ACCESS_TOKEN=<TOKEN>
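
You can verify that both secrets exist in the namespace before deploying:

    kubectl get secrets --namespace takeoff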

Step 6: Deploy the Takeoff Stack Using Helm

  1. Navigate to the Helm chart directory:

    cd pantheon/hades/pantheon-helm/
  2. Install the Takeoff Stack:

    helm install takeoff-stack ./ --namespace takeoff -f values.yaml -f overwrites/aws_eks.yaml
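
After the install command returns, you can check the release and watch the pods start; the model pods may take several minutes on first start while weights are downloaded:

    helm status takeoff-stack --namespace takeoff
    kubectl get pods --namespace takeoff --watch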

Step 7: Test and Validate the Service

  1. Port-Forward the Service

    • Forward the service port to your local machine:

      kubectl port-forward service/takeoff-controller-svc 3000:3000 3001:3001 --namespace takeoff
  2. Access the Frontend

    • Open your browser and navigate to:
      • http://localhost:3000 for the main frontend.
      • http://localhost:3001 for the management frontend.
  3. Validate the Model with a Test Request

    • Use curl to send a request to the service and verify the model is working:

      curl -X POST http://localhost:3000/generate -N \
      -d '{"text": ["How are you?"], "consumer_group": "generate"}' \
      --header "Content-Type: application/json"
    • If successful, the model will stream back generated text, confirming it is loaded and serving requests.
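
The embedding model can be exercised in the same way. This sketch assumes jina-v2-code-embed is registered under a consumer group named embed and served via the controller's /embed endpoint; adjust both to match your values file:

    curl -X POST http://localhost:3000/embed \
      -d '{"text": ["def hello_world(): pass"], "consumer_group": "embed"}' \
      --header "Content-Type: application/json"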