0.8.0

  • Add continuous batching for the baseline, fast, and compress-fast backends
  • Add licence validation for Takeoff
  • Add loading readers to the management frontend
  • Add the ability to cancel requests
  • Minor bug fix to speculative decoding
  • Minor bug fix to multi-GPU backend