0.14.0
- Fully enabled SSD for static models
- Tokenization endpoint to get tokenized text for any live reader
- Support for Llava 1.6 models
- Introduced a new AWQ kernel with significantly lower memory overhead
- Updated LangChain integration: unified TitanTakeoff and TitanTakeoffPro; the integrations now use the management API to spin up models; added text embedding support with TitanTakeoffEmbed