Version: Next

Accessing API Models

You can access models from API providers such as OpenAI through Takeoff. This makes it possible to build applications that seamlessly combine open-source, self-hosted LLMs with proprietary, closed-source models such as GPT-4.

Simultaneous Hosting of OS & API Models

Why might you want to simultaneously use self-hosted models alongside proprietary ones?

Self-Hosted Fallback Options - Create fallback options that are self-hosted. This is useful for situations when API models are unavailable or when rate limits have been hit.
Specialist Models - Use small specialist models. On the tasks they are built for, these can be as good as, or even better than, the larger generalist models.
A/B Testing - Conduct A/B tests comparing the best open-source models with proprietary models. Collect preference data from these tests. This data can be used to fine-tune open-source models.
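
The fallback pattern from the list above can be sketched in a few lines of Python. This is a minimal illustration, not part of Takeoff: `generate_with_fallback`, `api_model`, and `self_hosted_model` are hypothetical names, and the two model functions are stand-ins for real calls to an API model and a self-hosted model.

```python
from typing import Callable, List, Sequence


def generate_with_fallback(prompt: str, backends: Sequence[Callable[[str], str]]) -> str:
    """Try each backend in order and return the first successful result."""
    errors: List[Exception] = []
    for backend in backends:
        try:
            return backend(prompt)
        except Exception as exc:  # e.g. a provider outage or a rate limit
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} backends failed: {errors}")


# Hypothetical stand-in for an API model that is currently rate limited.
def api_model(prompt: str) -> str:
    raise ConnectionError("rate limit hit")


# Hypothetical stand-in for a self-hosted model that is available.
def self_hosted_model(prompt: str) -> str:
    return f"self-hosted reply to: {prompt}"


print(generate_with_fallback("Hello", [api_model, self_hosted_model]))
```

Listing the API model first and the self-hosted model second means the self-hosted model is only used when the API call raises an error.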

How to use API Models in Takeoff

You can use an API model in Takeoff exactly as you would use an open-source model. You pass in the name of the model and your access token, and set the device to api. Setting the device to api tells Takeoff not to search the model cache or Hugging Face for the model weights. The example below shows how to use GPT-3.5-turbo from OpenAI.

GPT 3.5 Turbo Example

This is an example of using an OpenAI generation model. This model is reachable using the /generate_stream endpoint.

Command Line

docker run -t --gpus all \
    -e TAKEOFF_MODEL_NAME=gpt-3.5-turbo \
    -e TAKEOFF_DEVICE=api \
    -e TAKEOFF_ACCESS_TOKEN=<open-ai-access-token>

Manifest

takeoff:
  server_config:
    port: 3000
  readers_config:
    reader1:
      model_name: gpt-3.5-turbo
      device: api
      consumer_group: primary
      access_token: <open-ai-access-token>
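
Once the server is running, a client could call the /generate_stream endpoint roughly as follows. This is a sketch under stated assumptions, not the documented client: it assumes Takeoff is reachable on localhost at the port from the manifest (3000), that the endpoint accepts a JSON body with `text` and `consumer_group` fields, and that the response can be read line by line. `build_generate_request` and `stream_generation` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed host; the port matches server_config.port in the manifest.
TAKEOFF_URL = "http://localhost:3000"


def build_generate_request(prompt: str, consumer_group: str = "primary") -> urllib.request.Request:
    """Build a POST request for /generate_stream (the body fields are assumptions)."""
    body = json.dumps({"text": prompt, "consumer_group": consumer_group}).encode("utf-8")
    return urllib.request.Request(
        f"{TAKEOFF_URL}/generate_stream",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_generation(prompt: str) -> None:
    """Send the request and print each non-empty line as it arrives."""
    with urllib.request.urlopen(build_generate_request(prompt)) as response:
        for raw_line in response:
            line = raw_line.decode("utf-8").strip()
            if line:
                print(line)


# Requires a running Takeoff server:
# stream_generation("List three uses of a language model.")
```

The `consumer_group` default mirrors the `primary` group in the manifest, so the request is sent to the reader that serves the API model.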

Text-Embedding-3-Small Example

This is an example of using an OpenAI text embedding model. This model is reachable using the /embed endpoint.

Command Line

docker run -t --gpus all \
    -e TAKEOFF_MODEL_NAME=text-embedding-3-small \
    -e TAKEOFF_DEVICE=api \
    -e TAKEOFF_ACCESS_TOKEN=<open-ai-access-token>

Manifest

takeoff:
  server_config:
    port: 3000
  readers_config:
    reader1:
      model_name: text-embedding-3-small
      device: api
      consumer_group: primary
      access_token: <open-ai-access-token>
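
A request to the /embed endpoint might look like the sketch below. The request and response shapes are assumptions rather than confirmed API details: it assumes the endpoint accepts a JSON body whose `text` field holds a list of strings, and that the response is a JSON object with the embeddings under a `result` key. The host is assumed to be localhost, and `build_embed_request`, `parse_embeddings`, and `embed` are hypothetical helper names.

```python
import json
import urllib.request
from typing import List

# Assumed host; the port matches server_config.port in the manifest.
TAKEOFF_URL = "http://localhost:3000"


def build_embed_request(texts: List[str], consumer_group: str = "primary") -> urllib.request.Request:
    """Build a POST request for /embed (the body fields are assumptions)."""
    body = json.dumps({"text": texts, "consumer_group": consumer_group}).encode("utf-8")
    return urllib.request.Request(
        f"{TAKEOFF_URL}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(raw_body: bytes) -> List[List[float]]:
    """Extract the vectors, assuming a response shaped like {"result": [[...], ...]}."""
    return json.loads(raw_body)["result"]


def embed(texts: List[str]) -> List[List[float]]:
    """Send the texts to Takeoff and return one vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts)) as response:
        return parse_embeddings(response.read())


# Requires a running Takeoff server:
# vectors = embed(["first sentence", "second sentence"])
```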

GPT4V Example

This is an example of using an OpenAI vision model. This model is reachable using the /image_generate_stream endpoint.

Command Line

docker run -t --gpus all \
    -e TAKEOFF_MODEL_NAME=gpt-4-vision-preview \
    -e TAKEOFF_DEVICE=api \
    -e TAKEOFF_ACCESS_TOKEN=<open-ai-access-token>

Manifest

takeoff:
  server_config:
    port: 3000
  readers_config:
    reader1:
      model_name: gpt-4-vision-preview
      device: api
      consumer_group: primary
      access_token: <open-ai-access-token>

Supported Model Providers

Currently, only OpenAI models are supported. We plan to add support for additional API providers such as Cohere, Anthropic, and Mistral in the near future.
