Takeoff 0.21.2 is released 🎉 Speak with us to find out more: hello@titanml.co
- Introduced a new custom takeoff inference engine, which standardizes backend processes and offers an enhanced interface for generation models.
- Thanks to the unified backend, continuous batching now works for all generation models.
- Implemented GPU/CPU utilization tracking metrics.
- Released `takeoff_client`, a Python client package on PyPI for server interaction.
- Removed the option to select backends from the management frontend.
- Overhauled all documentation and added an API References section.
- Added support for Mixtral.