Version: 0.21.x

Generate from image (Buffered)

POST /image_generate


The /image_generate endpoint sends an image and a text prompt to a multimodal LLM and returns the complete text response in a single reply. To receive the response as a stream, token by token, use the /image_generate_stream endpoint instead.

This endpoint takes a multipart input with two required fields:

  1. 'json_data': a JSON payload in the same format used by the /generate and /generate_stream endpoints.
  2. 'image_data': the raw bytes of an image file.

Multipart request support is built into most common HTTP clients.
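To make the two-field layout concrete, here is a minimal sketch that assembles the multipart body by hand using only the Python standard library. The helper names (`build_multipart`, `post_image_generate`), the file name, the image MIME type, and the base URL are all illustrative assumptions, as is sending 'json_data' as a plain form field; in practice an HTTP client's built-in multipart support does the same job.

```python
import json
import urllib.request
import uuid
from typing import Tuple


def build_multipart(json_payload: dict, image_bytes: bytes,
                    image_name: str = "image.png",
                    image_type: str = "image/png") -> Tuple[bytes, str]:
    """Encode 'json_data' and 'image_data' as a multipart/form-data body.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="json_data"',
        b"",
        json.dumps(json_payload).encode("utf-8"),
        delimiter,
        ('Content-Disposition: form-data; name="image_data"; '
         f'filename="{image_name}"').encode("utf-8"),
        f"Content-Type: {image_type}".encode("ascii"),
        b"",
        image_bytes,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def post_image_generate(base_url: str, json_payload: dict,
                        image_bytes: bytes) -> dict:
    """POST to <base_url>/image_generate and decode the JSON reply.

    base_url is an assumption: use whatever address your server listens on.
    """
    body, content_type = build_multipart(json_payload, image_bytes)
    request = urllib.request.Request(
        base_url.rstrip("/") + "/image_generate",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Building the body separately from sending it keeps the encoding step easy to inspect before any request is made.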

To send a batch of prompts against the same image, set the text field of the JSON payload to an array of strings instead of a single string. Only one image can be supplied per request. To run generation requests against different images, send the requests in quick succession and rely on automatic batching.
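The two batching styles described above can be sketched as follows. `batch_payload` and `send_many` are hypothetical helpers, and `send` stands in for whatever function actually performs the multipart POST; a thread pool is only one way to issue requests in quick succession.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def batch_payload(prompts: Sequence[str], **params) -> dict:
    """Same image, several prompts: 'text' becomes an array of strings."""
    return {"text": list(prompts), **params}


def send_many(jobs: Sequence[Tuple[dict, bytes]],
              send: Callable[[dict, bytes], dict],
              max_workers: int = 8) -> List[dict]:
    """Different images: issue one request per (payload, image) pair.

    Requests are submitted concurrently so they arrive close together,
    leaving the server free to batch them. Results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(send, payload, image)
                   for payload, image in jobs]
        return [future.result() for future in futures]
```

For example, `batch_payload(["Describe the image", "List the colours"], max_new_tokens=50)` produces one payload that asks two questions about a single image.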

Request

Body

required

    image_data binary, required

    json_data object, required

    JSON generation payload, used in /generate, /generate_stream, /image_generate, /image_generate_stream

    constrained_decoding_backend string, nullable
    consumer_group string, nullable
    image_paths string[], nullable
    json_schema nullable
    max_new_tokens int64, nullable
    min_new_tokens int64, nullable
    no_repeat_ngram_size int64, nullable
    prompt_max_tokens int64, nullable
    regex_string string, nullable
    repetition_penalty float, nullable
    sampling_temperature float, nullable
    sampling_topk int64, nullable
    sampling_topp float, nullable

    text object, required

    A simplified input format (InputText), provided so that users do not have to construct the more verbose PayloadText directly. A mapping from InputText to PayloadText is provided below.

    oneOf

    string
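As an illustration of the request schema, the sketch below builds the string that would go in the 'json_data' field. `generation_payload` is a hypothetical helper covering only a handful of the listed parameters. Omitting unset fields rather than sending them as null is a design choice made here, not documented server behaviour.

```python
import json
from typing import List, Optional, Union


def generation_payload(text: Union[str, List[str]],
                       max_new_tokens: Optional[int] = None,
                       sampling_temperature: Optional[float] = None,
                       sampling_topk: Optional[int] = None,
                       sampling_topp: Optional[float] = None,
                       repetition_penalty: Optional[float] = None,
                       regex_string: Optional[str] = None) -> str:
    """Serialise a generation payload, leaving out parameters left unset."""
    fields = {
        "text": text,  # the only required field
        "max_new_tokens": max_new_tokens,
        "sampling_temperature": sampling_temperature,
        "sampling_topk": sampling_topk,
        "sampling_topp": sampling_topp,
        "repetition_penalty": repetition_penalty,
        "regex_string": regex_string,
    }
    return json.dumps({k: v for k, v in fields.items() if v is not None})
```

Calling `generation_payload("What is in this image?", max_new_tokens=64)` yields a JSON object containing only `text` and `max_new_tokens`.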

Responses

Takes in a JSON payload and returns the response all at once.

Schema

    text object, required

    A simplified input format (InputText), provided so that users do not have to construct the more verbose PayloadText directly. A mapping from InputText to PayloadText is provided below.

    oneOf

    string
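A small sketch of reading the buffered response. It assumes the body is a JSON object whose required `text` field is either a single string or, for batched prompts, an array of strings mirroring the request; only the string variant is shown in the schema above, so treat the array branch as an assumption. `response_texts` is a hypothetical helper.

```python
import json
from typing import List


def response_texts(raw_body: bytes) -> List[str]:
    """Return the generated text(s) from a response body as a list."""
    data = json.loads(raw_body)
    text = data["text"]  # 'text' is marked required in the schema
    if isinstance(text, str):
        return [text]
    # Assumed batched shape: one entry per prompt in the request.
    return [str(item) for item in text]
```

Normalising to a list lets calling code treat single and batched requests the same way.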
