Version: Next

Launch Parameters


Docker run variables

Environment variables

The name, type and behaviour of the launched model are set with Takeoff-specific environment variables, passed to docker run with the -e flag. All mandatory variables, as well as some key optional ones, are listed below.

| Environment Variable Name | Default Value | Explanation |
| --- | --- | --- |
| TAKEOFF_MODEL_NAME | None (required) | The name of the model to use initially: either a Hugging Face model or the name of a folder mounted to /code/models. |
| TAKEOFF_DEVICE | None (required) | The device that the server should use: either "cuda" or "cpu". |
| TAKEOFF_CONSUMER_GROUP | primary | The name of the consumer group that the initial model belongs to. |
| TAKEOFF_MAX_BATCH_SIZE | 1 | The batch size the model can use. |
| TAKEOFF_ACCESS_TOKEN | None | Access token for private Hugging Face repositories when running open source models, or API key for the service when running API models. |
| TAKEOFF_CUDA_VISIBLE_DEVICES | None | (GPU only) Which GPUs are visible to the reader. The model is split over all devices listed. If not specified, all GPUs are visible but the model only uses the first available GPU. |
| TAKEOFF_QUANT_TYPE | None | (GPU only) Type of quantization used with the model. If no value is provided, AWQ is used if AWQ appears in the model name. If set to "awq", AWQ is used irrespective of the model name. |
| TAKEOFF_NVLINK_UNAVAILABLE | 0 | (GPU only) Set to 1 on a system without NVLink (e.g. L4s, 4090s) to allow multi-GPU use. |
| TAKEOFF_MAX_SEQUENCE_LENGTH | 8192 | (GPU only) The maximum foreseen length of prompt plus generated tokens. If set to 0, the model's maximum sequence length from its config file is used. |
| TAKEOFF_MAX_BODY_LENGTH_BYTES | 2097152 | The maximum size of the request body in bytes. The default is 2097152 (2MB). Request bodies above this size are rejected with a 413 error. |
| LICENSE_KEY | None (required on first run) | Takeoff license key, used for license validation. |
| OFFLINE_MODE | false | Run Takeoff in offline mode. |
| TAKEOFF_LOG_DISABLE_TIMESTAMPS | false | Do not output timestamps in Takeoff logs. |
| NO_COLOR | false | Do not output ANSI colours in Takeoff logs. |
| TAKEOFF_TENSOR_PARALLEL | deprecated | Deprecated. Specify TAKEOFF_CUDA_VISIBLE_DEVICES instead. |
Note

When docker run is used with a manifest file to configure Takeoff, only LICENSE_KEY, OFFLINE_MODE, and TAKEOFF_MAX_BATCH_SIZE are supported. None of the variables marked as required (TAKEOFF_MODEL_NAME and TAKEOFF_DEVICE) are supported in that case.
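As a rough sketch of how the environment variables above combine, the following assembles a docker run command for a model split over two GPUs on a machine without NVLink. The image reference, the model name and the numeric values are placeholders chosen for illustration and do not come from this page. The command is only printed, not executed.

```shell
# Placeholder image reference (an assumption): substitute the Takeoff image and tag you use.
IMAGE="${IMAGE:-your-registry/takeoff:tag}"

# Assemble the command as positional parameters so it can be inspected first.
# The model name and numeric values below are illustrative only.
set -- docker run --gpus all \
  -e LICENSE_KEY="${LICENSE_KEY:-your-license-key}" \
  -e TAKEOFF_MODEL_NAME="mistralai/Mistral-7B-Instruct-v0.1" \
  -e TAKEOFF_DEVICE="cuda" \
  -e TAKEOFF_CUDA_VISIBLE_DEVICES="0,1" \
  -e TAKEOFF_NVLINK_UNAVAILABLE=1 \
  -e TAKEOFF_MAX_SEQUENCE_LENGTH=4096 \
  -e TAKEOFF_MAX_BATCH_SIZE=8 \
  "$IMAGE"

# Dry run: print the assembled command. Replace the printf with "$@" to launch.
preview="$*"
printf '%s\n' "$preview"
```

Listing two devices in TAKEOFF_CUDA_VISIBLE_DEVICES is what requests the split. TAKEOFF_NVLINK_UNAVAILABLE=1 is only needed on hardware without NVLink.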

Standard Docker Options

Key Docker options are listed below. These should be provided as flags, as shown in the examples.

| Option | Purpose | Example | Use in Takeoff |
| --- | --- | --- | --- |
| -v | Mounts a volume, making a local filesystem folder available to the container, with syntax host_directory:container_directory. | -v ~/.takeoff_cache:/code/models attaches the local ~/.takeoff_cache folder to /code/models inside the container. | Allows model files hosted on the local machine to be available within the container. Model files can then be shared between instances, rather than each instance having to download a new copy. |
| -p | Forwards a container port to a host port, with syntax host_port:container_port. | -p 3005:3000 forwards the internal port 3000 (Takeoff's inference endpoint) to 3005 on the host system. | Takeoff's ports must be forwarded to make its endpoints accessible outside of the container. The container (right-hand) port should be one of 3000 (inference endpoint and playground), 3001 (management API) or 9090 (metrics endpoint). Use multiple -p options to allow access to each endpoint locally. |
| -it | Starts the container in interactive mode. | -it | Allows server logs to be monitored and interacted with, e.g. allowing Takeoff to be terminated with CTRL+C. |
| --gpus | Specifies which GPUs are available to the container. | --gpus all | Allows Takeoff to access GPUs. |
| --shm-size | Sets the amount of memory available for IPC within the container. | --shm-size 2gb | Allows the various processes in Takeoff to communicate. Strongly recommended to be set to 2gb. |

See a full reference here.
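The options in the table can be combined in a single invocation. The sketch below mounts the local model cache, forwards all three Takeoff ports to the same port numbers on the host, and sets the recommended shared memory size. The image reference and the model name are placeholders, not values from this page, and the command is printed rather than run.

```shell
# Placeholder image reference (an assumption): substitute the Takeoff image and tag you use.
IMAGE="${IMAGE:-your-registry/takeoff:tag}"

# One flag per line: Docker options first, then Takeoff environment variables.
# The model name is an arbitrary example.
set -- docker run -it \
  --gpus all \
  --shm-size 2gb \
  -v "$HOME/.takeoff_cache:/code/models" \
  -p 3000:3000 \
  -p 3001:3001 \
  -p 9090:9090 \
  -e LICENSE_KEY="${LICENSE_KEY:-your-license-key}" \
  -e TAKEOFF_MODEL_NAME="facebook/opt-125m" \
  -e TAKEOFF_DEVICE="cuda" \
  "$IMAGE"

# Dry run: print the assembled command. Replace the printf with "$@" to launch.
preview="$*"
printf '%s\n' "$preview"
```

To expose an endpoint on a different host port, change only the left-hand side of the mapping, as in -p 3005:3000.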

Manifest file structures

config.yaml

The config manifest consists of two sections:

  • server_config: Parameters which control the server as a whole. If not specified, will use the defaults from above.

  • readers_config: An array of reader configurations, with each specifying the behaviour of a specific reader. You can launch Takeoff without any readers by leaving this array empty.

Keys are the environment variable names from the table above, without the TAKEOFF_ prefix and in lowercase (for example, TAKEOFF_MODEL_NAME becomes model_name). The following should be noted:

  • LICENSE_KEY and OFFLINE_MODE can only be specified as environment variables using -e to docker run.
  • Standard Docker options can only be specified as arguments to docker run.
  • server_config variables in config.yaml can be overridden by passing environment variables to docker run.
  • Reader configuration variables (those under readers_config) must not be passed to docker run when readers are specified using a manifest file. A reader's configuration is specific to that reader, so a separate configuration must be given for each reader in the config file.
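To illustrate these rules, here is a hedged sketch of a manifest-based launch. LICENSE_KEY is passed with -e, the Docker options stay on the command line, and no reader variables such as TAKEOFF_MODEL_NAME or TAKEOFF_DEVICE are given. The container path /code/config.yaml for the manifest and the image reference are assumptions not stated on this page, so check them against your installation. The command is printed, not executed.

```shell
# Placeholder image reference (an assumption): substitute the Takeoff image and tag you use.
IMAGE="${IMAGE:-your-registry/takeoff:tag}"

# Assumed mount target for the manifest inside the container; verify before use.
MANIFEST_TARGET="/code/config.yaml"

# Reader settings (model name, device, ...) come from config.yaml, not from -e flags.
set -- docker run -it \
  --gpus all \
  --shm-size 2gb \
  -e LICENSE_KEY="${LICENSE_KEY:-your-license-key}" \
  -v "$PWD/config.yaml:$MANIFEST_TARGET" \
  -v "$HOME/.takeoff_cache:/code/models" \
  -p 3000:3000 \
  -p 3001:3001 \
  "$IMAGE"

# Dry run: print the assembled command. Replace the printf with "$@" to launch.
preview="$*"
printf '%s\n' "$preview"
```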

config.yaml is of the format:

takeoff:
  server_config:
    <AppConfig>
  readers_config:
    <ReaderName>:
      <ReaderConfig>
    <ReaderName>:
      <ReaderConfig>

Where:

<AppConfig>
  repository_path: str
<ReaderConfig>
  model_name: str *required*
  device: str *required*
  consumer_group: str *required*
  max_batch_size: int
  cuda_visible_devices: str # e.g. "0,1,2,3" or "0"
  quant_type: str
  max_sequence_length: int
  nvlink_unavailable: int
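A concrete manifest following this schema might look like the sketch below, which writes an illustrative config.yaml to the current directory. The reader names (reader1, reader2), the model names and the numeric values are hypothetical. server_config is left out here on the assumption that the server defaults then apply.

```shell
# Write an illustrative two-reader manifest to ./config.yaml.
# Reader names, model names and numeric values are examples only.
cat > config.yaml <<'EOF'
takeoff:
  readers_config:
    reader1:
      model_name: "facebook/opt-125m"
      device: "cpu"
      consumer_group: "primary"
      max_batch_size: 4
    reader2:
      model_name: "mistralai/Mistral-7B-Instruct-v0.1"
      device: "cuda"
      consumer_group: "secondary"
      cuda_visible_devices: "0,1"
      nvlink_unavailable: 1
      max_sequence_length: 4096
EOF
```

Each reader carries its own required keys (model_name, device, consumer_group), while optional keys such as cuda_visible_devices are only set where they apply.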