0.14.0

  • Fully enabled SSD for static models
  • Tokenization endpoint to get tokenized text for any live reader
  • Support for LLaVA 1.6 models
  • Introduced a new AWQ kernel with significantly lower memory overhead
  • Updated the LangChain integration: TitanTakeoff and TitanTakeoffPro are now unified, the integrations use the management API to spin up models, and text embedding is supported through TitanTakeoffEmbed