Takeoff 0.21.2 is released 🎉 Speak with us to find out more: hello@titanml.co
0.21.0
Prefix Coalescing for drastically improved throughput on workloads with varying degrees of shared prefixes, enabled by default.
Support for multiple images in image-to-text workloads.
Ability to dynamically scale LLM deployments based on usage metrics, including scale to zero.
Support for the new Transformers Tokenizer file format.
Improved robustness in the connection between the server and its readers.
Improved stability for image-to-text models to prevent out-of-memory errors.
Fixed a bug that caused idle multi-GPU instances to fail.