Create reader
POST /reader
Takes a ReaderConfig payload and loads a new Python reader.
Request
- application/json
Body (required)
- access_token: string, nullable
- backend: string, nullable
- batch_duration_millis: int64, nullable
- consumer_group: string, required
- cuda_graph_cache_capacity: int32, nullable
- cuda_visible_devices: string, nullable
- device: string, required
- disable_cuda_graph: int32, nullable
- disable_paged_attention: int32, nullable
- disable_static: int32, nullable
- internal_gateway_ip: string, nullable
- log_level: string, nullable
- max_batch_size: int64, nullable
- max_sequence_length: int32, nullable
- model_name: string, required
- nvlink_unavailable: int32, nullable
- page_cache_size: string, nullable
- quant_type: string, nullable
- tensor_parallel: int32, nullable
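As a minimal sketch of how a client might assemble this request, the Python below builds the JSON body and a `POST /reader` request object without sending it. The base URL, the helper name `build_create_reader_request`, and every field value are hypothetical placeholders, not values taken from this reference. It also assumes that nullable fields may simply be left out of the body.

```python
import json
import urllib.request

# Fields marked "required" in the request body above.
REQUIRED_FIELDS = ("consumer_group", "device", "model_name")


def build_create_reader_request(base_url, config):
    """Build (but do not send) a POST /reader request from a config dict.

    Hypothetical helper: the name and signature are illustrative only.
    """
    missing = [name for name in REQUIRED_FIELDS if config.get(name) is None]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reader",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All values below are placeholders chosen for illustration.
request = build_create_reader_request(
    "http://localhost:3000",  # hypothetical address of the server
    {
        "consumer_group": "primary",
        "device": "cuda",
        "model_name": "my-model",
        "max_batch_size": 8,
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it. Checking the required fields on the client side is a convenience; the server remains the authority on what counts as a valid payload.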
Responses
- 201
- 422
201: Takes a JSON payload and loads a new model and/or backend.
- application/json
Schema
- _reader_config: object, required
  - access_token: string, nullable
  - backend: string, nullable
  - batch_duration_millis: int64, nullable
  - consumer_group: string, required
  - cuda_graph_cache_capacity: int32, nullable
  - cuda_visible_devices: string, nullable
  - device: string, required
  - disable_cuda_graph: int32, nullable
  - disable_paged_attention: int32, nullable
  - disable_static: int32, nullable
  - internal_gateway_ip: string, nullable
  - log_level: string, nullable
  - max_batch_size: int64, nullable
  - max_sequence_length: int32, nullable
  - model_name: string, required
  - nvlink_unavailable: int32, nullable
  - page_cache_size: string, nullable
  - quant_type: string, nullable
  - tensor_parallel: int32, nullable
- _reader_id: string, required
Example (from schema)
{
  "_reader_config": {
    "access_token": "string",
    "backend": "string",
    "batch_duration_millis": 0,
    "consumer_group": "string",
    "cuda_graph_cache_capacity": 0,
    "cuda_visible_devices": "string",
    "device": "string",
    "disable_cuda_graph": 0,
    "disable_paged_attention": 0,
    "disable_static": 0,
    "internal_gateway_ip": "string",
    "log_level": "string",
    "max_batch_size": 0,
    "max_sequence_length": 0,
    "model_name": "string",
    "nvlink_unavailable": 0,
    "page_cache_size": "string",
    "quant_type": "string",
    "tensor_parallel": 0
  },
  "_reader_id": "string"
}
422: Malformed request body
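The two documented responses could be handled on the client roughly as follows. This is a sketch under stated assumptions: `parse_create_reader_response` and `MalformedRequestError` are hypothetical names, and the sample body reuses placeholder values rather than real server output. Only the `201` and `422` status codes and the `_reader_id` / `_reader_config` keys come from this reference.

```python
import json


class MalformedRequestError(Exception):
    """Hypothetical error type for a 422 (malformed request body) response."""


def parse_create_reader_response(status, body):
    """Return (reader_id, reader_config) for a 201 response.

    Raises MalformedRequestError on 422 and RuntimeError on any other
    status, since this reference documents only 201 and 422.
    """
    if status == 422:
        raise MalformedRequestError(body)
    if status != 201:
        raise RuntimeError(f"unexpected status {status}")
    payload = json.loads(body)
    return payload["_reader_id"], payload["_reader_config"]


# Placeholder body shaped like the documented 201 schema.
sample_body = json.dumps(
    {
        "_reader_config": {
            "consumer_group": "primary",
            "device": "cuda",
            "model_name": "my-model",
        },
        "_reader_id": "reader-1",
    }
)
reader_id, reader_config = parse_create_reader_response(201, sample_body)
```

Keeping the returned `_reader_id` is likely useful for later calls that refer to this reader, although this page does not describe those calls.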