📄️ Chat Template
The `reader_id` can be found as the model's key in the `live_readers` object, returned by the status endpoint.
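For example, a minimal sketch of reading the reader IDs from the status endpoint (the base URL and the `/status` path are assumptions here; adjust them to your deployment):

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port; adjust to your deployment

# The status endpoint returns a `live_readers` object whose keys are reader IDs.
status = requests.get(f"{BASE_URL}/status").json()  # `/status` path assumed
reader_ids = list(status["live_readers"].keys())
print("Available reader IDs:", reader_ids)
```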
📄️ Classify
To send a batch of requests all at once, the text field can be either a string or an array of strings.
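A sketch of a batched classify call, assuming a `/classify` path and a local base URL (both assumptions); the `text` field accepting a string or an array of strings comes from the description above:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

# Single request: `text` is a plain string.
single = requests.post(f"{BASE_URL}/classify", json={"text": "Is this spam?"})

# Batched request: `text` is an array of strings, classified in one call.
batch = requests.post(
    f"{BASE_URL}/classify",
    json={"text": ["Is this spam?", "Great product, would buy again."]},
)
print(single.json(), batch.json())
```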
📄️ Detokenize
The detokenization endpoint takes in a list of tokens and returns the decoded text: the text reconstructed from the tokens the model sees.
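A minimal sketch of detokenizing a list of token IDs, assuming a `/detokenize` path and a `tokens` field name (both assumptions):

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

# Send a list of token IDs and get back the decoded text.
payload = {"tokens": [101, 7592, 2088, 102]}  # illustrative token IDs only
response = requests.post(f"{BASE_URL}/detokenize", json=payload)  # path assumed
print(response.json())
```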
📄️ Embed
To send a batch of requests all at once, the text field can be either a string or an array of strings.
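A sketch of an embed call sending a batch in one request; the `/embed` path and base URL are assumptions:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

# `text` may be a single string or an array of strings; an array is embedded
# as a batch, yielding one embedding per input string.
payload = {"text": ["first document", "second document"]}
response = requests.post(f"{BASE_URL}/embed", json=payload)  # `/embed` path assumed
print(response.json())
```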
📄️ Generate from image (Streamed)
The `/image_generate_stream` endpoint is used to communicate with the LLM. Use this endpoint when you want to send an image to a multimodal LLM and stream the response back as it is generated.
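A hedged sketch of streaming a response from `/image_generate_stream`; the multipart field names and base URL are assumptions:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

with open("photo.jpg", "rb") as image_file:
    response = requests.post(
        f"{BASE_URL}/image_generate_stream",
        files={"image": image_file},             # field name assumed
        data={"text": "Describe this image."},   # field name assumed
        stream=True,
    )
    # Consume the response incrementally as the model generates it.
    for chunk in response.iter_lines():
        if chunk:
            print(chunk.decode())
```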
📄️ Generate from image (Buffered)
The `/image_generate` endpoint is used to communicate with the LLM. Use this endpoint when you want to send an image to a multimodal LLM and receive the full response in a single payload.
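A hedged sketch of the buffered variant against `/image_generate`, with the same assumed field names and base URL:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

with open("photo.jpg", "rb") as image_file:
    response = requests.post(
        f"{BASE_URL}/image_generate",
        files={"image": image_file},             # field name assumed
        data={"text": "Describe this image."},   # field name assumed
    )
# The buffered endpoint returns the whole generation in one response.
print(response.text)
```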
📄️ Generate (Streamed)
The `/generate_stream` endpoint is used to communicate with the LLM. Use this endpoint when you want to stream the response back token by token as it is generated.
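A sketch of consuming `/generate_stream` incrementally; the `text` field name and base URL are assumptions:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

response = requests.post(
    f"{BASE_URL}/generate_stream",
    json={"text": "Write a haiku about the sea."},  # `text` field assumed
    stream=True,
)
# Print chunks as they arrive instead of waiting for the full completion.
for chunk in response.iter_lines():
    if chunk:
        print(chunk.decode(), flush=True)
```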
📄️ Generate (Buffered)
The `/generate` endpoint is used to communicate with the LLM. Use this endpoint when you want to receive the full response in a single payload once generation has finished.
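A sketch of a buffered call to `/generate`, with the same assumed field name and base URL:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

response = requests.post(
    f"{BASE_URL}/generate",
    json={"text": "Write a haiku about the sea."},  # `text` field assumed
)
# The buffered endpoint blocks until generation finishes, then returns everything at once.
print(response.text)
```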
📄️ Check health of all readers
The `/healthz` endpoint is used to check if the server is running, and whether it's ready to receive requests. A 200 response indicates the server is healthy.
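A minimal health check against `/healthz` (only the base URL is an assumption):

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

response = requests.get(f"{BASE_URL}/healthz")
# A 200 status code means the server is up and ready to receive requests.
if response.status_code == 200:
    print("Server is healthy.")
else:
    print(f"Server not ready (status {response.status_code}).")
```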
📄️ Status
Returns all information about the app and its current status.
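A sketch of fetching and inspecting the status payload; the `/status` path and base URL are assumptions:

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

status = requests.get(f"{BASE_URL}/status").json()  # `/status` path assumed
# Inspect the full status payload, including the `live_readers` object.
for key, value in status.items():
    print(key, ":", value)
```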
📄️ Tokenize
The tokenization endpoint takes in text and returns its tokenized form: the way the text is seen by the model.
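A minimal sketch of tokenizing a piece of text, assuming a `/tokenize` path and a `text` field name (both assumptions):

```python
import requests

BASE_URL = "http://localhost:3000"  # assumed host/port

# Send text and get back the tokens the model actually sees.
payload = {"text": "Hello, world!"}  # `text` field assumed
response = requests.post(f"{BASE_URL}/tokenize", json=payload)  # path assumed
print(response.json())
```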