takeoff_client
This module contains the TakeoffClient class, which is used to interact with the Takeoff server.
TakeoffClient Objects
class TakeoffClient()
__init__
def __init__(base_url: str = "http://localhost",
port: int = 3000,
mgmt_port: int = None)
TakeoffClient is used to interact with the Takeoff server.
Arguments:
- base_url (str, optional): base URL that the Takeoff server runs on. Defaults to "http://localhost".
- port (int, optional): port that the main server runs on. Defaults to 3000.
- mgmt_port (int, optional): port that the management API runs on. Usually port + 1. Defaults to None.
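For example, a minimal connection sketch (assuming the class is importable from the takeoff_client package and a Takeoff server is running locally):

from takeoff_client import TakeoffClient

# Default connection: main API assumed to be reachable at http://localhost:3000.
client = TakeoffClient()

# Explicit form; the management API is conventionally on port + 1.
client = TakeoffClient(base_url="http://localhost", port=3000, mgmt_port=3001)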
get_readers
def get_readers() -> dict
Get a list of information about all readers.
Returns:
- dict: Information about all readers.
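For example (a sketch: the returned dict is assumed here to map consumer groups to lists of reader metadata; the exact shape may vary by server version):

readers = client.get_readers()
for group, infos in readers.items():
    print(group, infos)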
embed
def embed(text: Union[str, List[str]],
consumer_group: str = "primary") -> dict
Embed a batch of text.
Arguments:
- text (str | List[str]): Text to embed.
- consumer_group (str, optional): consumer group to use. Defaults to "primary".
Returns:
- dict: Embedding response.
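A usage sketch, reusing the client constructed above (the layout of the embedding response is an assumption; inspect it for your server version):

# Embed a batch of two sentences in a single request.
response = client.embed(["first sentence", "second sentence"])
print(response)  # embeddings assumed to be in the response body; key name may vary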
classify
def classify(text: Union[str, List[str], List[List[str]]],
consumer_group: str = "classify") -> dict
Classify a batch of text.
Text passed in as a list of lists of strings will be concatenated on the innermost list, with the outermost list treated as a batch of concatenated strings.
Concatenation happens server-side, as it needs information from the model tokenizer.
Arguments:
- text (str | List[str] | List[List[str]]): Text to classify.
- consumer_group (str, optional): consumer group to use. Defaults to "classify".
Returns:
- dict: Classification response.
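For example, cross-encoder style (premise, hypothesis) pairs can be sent as a list of lists; each inner pair is concatenated server-side into a single sequence (a sketch; the response shape may vary by server version):

pairs = [
    ["A man is eating food.", "A man is eating a meal."],
    ["A man is eating food.", "The man is driving."],
]
response = client.classify(pairs, consumer_group="classify")
print(response)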
generate
def generate(text: Union[str, List[str]],
sampling_temperature: float = None,
sampling_topp: float = None,
sampling_topk: int = None,
repetition_penalty: float = None,
no_repeat_ngram_size: int = None,
max_new_tokens: int = None,
min_new_tokens: int = None,
regex_string: str = None,
json_schema: Any = None,
prompt_max_tokens: int = None,
consumer_group: str = "primary",
image_path: Optional[Path] = None) -> dict
Generates text, seeking a completion for the input prompt. Output is buffered and returned all at once; for incremental output, see generate_stream.
Arguments:
- text (str | List[str]): Input prompt from which to generate.
- sampling_temperature (float, optional): Sample with randomness. Bigger temperatures are associated with more randomness.
- sampling_topp (float, optional): Sample from the set of tokens whose cumulative probability exceeds this value.
- sampling_topk (int, optional): Sample predictions from the top K most probable candidates.
- repetition_penalty (float, optional): Penalise the generation of tokens that have been generated before. Set to > 1 to penalise.
- no_repeat_ngram_size (int, optional): Prevent repetitions of ngrams of this size.
- max_new_tokens (int, optional): The maximum number of (new) tokens that the model will generate.
- min_new_tokens (int, optional): The minimum number of (new) tokens that the model will generate.
- regex_string (str, optional): The regex string which generations will adhere to as they decode.
- json_schema (dict, optional): The JSON Schema which generations will adhere to as they decode. Ignored if regex_string is set.
- prompt_max_tokens (int, optional): The maximum length (in tokens) for this prompt. Prompts longer than this value will be truncated.
- consumer_group (str, optional): The consumer group to which to send the request. Defaults to "primary".
- image_path (Path, optional): Path to the image file to be used as input. Defaults to None. Note: this is only available if the running model supports image-to-text generation, for example with LLaVA models.
Returns:
- dict: The response from Takeoff containing the generated text as a whole.
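A usage sketch (the "text" key used to read the result is an assumption; inspect the response for your server version):

# Free-form generation with sampling controls.
response = client.generate(
    "List three Roman emperors.",
    sampling_temperature=0.7,
    sampling_topp=0.9,
    max_new_tokens=128,
)
print(response["text"])  # key name assumed

# Constrained generation: decode against a JSON Schema
# (ignored if regex_string is also set).
schema = {"type": "object", "properties": {"name": {"type": "string"}}}
response = client.generate("Produce a JSON object with a name.", json_schema=schema)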
generate_stream
def generate_stream(text: Union[str, List[str]],
sampling_temperature: float = None,
sampling_topp: float = None,
sampling_topk: int = None,
repetition_penalty: float = None,
no_repeat_ngram_size: int = None,
max_new_tokens: int = None,
min_new_tokens: int = None,
regex_string: str = None,
json_schema: dict = None,
prompt_max_tokens: int = None,
consumer_group: str = "primary",
image_path: Optional[Path] = None) -> Iterator[Event]
Generates text, seeking a completion for the input prompt. Output is streamed back incrementally as server-sent events.
Arguments:
- text (str | List[str]): Input prompt from which to generate.
- sampling_temperature (float, optional): Sample with randomness. Bigger temperatures are associated with more randomness.
- sampling_topp (float, optional): Sample from the set of tokens whose cumulative probability exceeds this value.
- sampling_topk (int, optional): Sample predictions from the top K most probable candidates.
- repetition_penalty (float, optional): Penalise the generation of tokens that have been generated before. Set to > 1 to penalise.
- no_repeat_ngram_size (int, optional): Prevent repetitions of ngrams of this size.
- max_new_tokens (int, optional): The maximum number of (new) tokens that the model will generate.
- min_new_tokens (int, optional): The minimum number of (new) tokens that the model will generate.
- regex_string (str, optional): The regex string which generations will adhere to as they decode.
- json_schema (dict, optional): The JSON Schema which generations will adhere to as they decode. Ignored if regex_string is set.
- prompt_max_tokens (int, optional): The maximum length (in tokens) for this prompt. Prompts longer than this value will be truncated.
- consumer_group (str, optional): The consumer group to which to send the request. Defaults to "primary".
- image_path (Path, optional): Path to the image file to be used as input. Defaults to None. Note: this is only available if the running model supports image-to-text generation, for example with LLaVA models.
Returns:
- Iterator[sseclient.SSEClient.Event]: An iterator of server-sent events.
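A streaming sketch: iterate over the server-sent events as they arrive and read each generated fragment from the event's data attribute (per the sseclient API):

for event in client.generate_stream("Tell me a story about", max_new_tokens=64):
    print(event.data, end="", flush=True)
print()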