Elastic Kubernetes Service (EKS)
How to Deploy Takeoff Stack in AWS EKS Table of Contents
- Prerequisites
- Step 1: Set Up an EKS Cluster
- Step 2: Add a Compute Node Group
- Step 3: Connect to the Cluster
- Step 4: Configure the GPU Driver
- Step 5: Prepare the Namespace and Secrets
- Step 6: Deploy the Takeoff Stack Using Helm
- Step 7: Test and Validate the Service
This guide walks you through deploying the Takeoff Stack in AWS EKS. As an example, we will deploy a llama3-8b model with an embedding model jina-v2-code-embed
Prerequisites​
Ensure you have the following:
- Access to the AWS EKS Console.
- AWS CLI installed and configured.
- kubectl: For interacting with the Kubernetes cluster.
- helm: For deploying Helm charts.
Step 1: Set Up an EKS Cluster​
-
Create the EKS Cluster
- Navigate to the AWS EKS Console and click "Create EKS cluster".
- Follow the step-by-step configuration guide provided in the console.
- Important: In Step 4 (Add-ons), enable the Amazon EBS CSI Driver by toggling it on. This is required for EBS volume support.c
- Complete the configuration and create the cluster.
-
Wait for Cluster Creation
- Cluster creation may take some time. Once completed, proceed to add a compute node group and configure the GPU driver.
Step 2: Add a Compute Node Group​
-
In the EKS Console, create a Compute Node Group with the following configuration:
- Instance Type:
g5.16xlarge
(1 x A10 GPU). - Scaling:
- Minimum:
0
- Maximum:
3
- Minimum:
- Instance Type:
-
After completing the setup, your node group should look like this:
Step 3: Connect to the Cluster​
-
Open your terminal and connect to the cluster:
- Replace
aws-takeoff-cluster
andeu-west-2
with your cluster name and region.
aws eks update-kubeconfig --region <your-region> --name <your-cluster-name>
- Replace
-
Successful connection will display a message like:
Added new context arn:aws:eks:<your-region>:<account-id>:cluster/<your-cluster-name> to ~/.kube/config
Step 4: Configure the GPU Driver​
-
Deploy the NVIDIA device plugin to enable GPU support:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
-
Verify GPU availability in the nodes:
kubectl get nodes -o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu
Step 5: Prepare the Namespace and Secrets​
-
Create the Namespace
- Use
takeoff
as the namespace (or choose a different name):kubectl create namespace takeoff
- Use
-
Create Secrets
-
Docker Credentials: Ensure you provide your
.dockerconfig.json
file.kubectl create secret generic takeoff-regcred --namespace takeoff \
--from-file=.dockerconfigjson=<PATH_TO_YOUR_DOCKERCONFIG.JSON> \
--type=kubernetes.io/dockerconfigjson -
Takeoff Secrets: Replace
<LICENSE_KEY>
and<TOKEN>
with your actual values:kubectl create secret generic takeoff-secrets --namespace takeoff \
--from-literal=LICENSE_KEY=<LICENSE_KEY> \
--from-literal=TAKEOFF_ACCESS_TOKEN=<TOKEN>
-
Step 6: Deploy the Takeoff Stack Using Helm​
-
Navigate to the Helm chart directory:
cd pantheon/hades/pantheon-helm/
-
Install the Takeoff Stack:
helm install takeoff-stack ./ --namespace takeoff -f values.yaml -f overwrites/aws_eks.yaml
Step 7: Test and Validate the Service​
-
Port-Forward the Service
-
Forward the service port to your local machine:
kubectl port-forward service/takeoff-controller-svc 3000:3000 3001:3001 --namespace takeoff
-
-
Access the Frontend
- Open your browser and navigate to:
http://localhost:3000
for the main frontend.http://localhost:3001
for the management frontend.
- Open your browser and navigate to:
-
Validate the Model with a Test Request
-
Use
curl
to send a request to the service and verify the model is working:curl -X POST http://localhost:3000/generate -N \
-d '{"text": ["How are you?"], "consumer_group": "generate"}' \
--header "Content-Type: application/json" -
If successful, the model should return a response confirming it is loaded and functioning.
-