LLMs in Production: Deploying the TitanML Takeoff Server on Kubernetes

· 12 min read
Fergus Finn
CTO, TitanML

Large Language Models (LLMs) have great potential to change the way we build software. They generate text, answer questions, and write code. However, deploying these models remains challenging because of their size and the substantial compute resources they require. This post focuses on using two infrastructure tools, Docker and Kubernetes, to deploy Titan Takeoff, a Docker image that bundles optimization and serving technology designed specifically for LLMs. It follows on from our primer, which introduces Docker and Kubernetes and explains how they can be used to deploy machine learning models.