0.13.2 (March 27, 2024): Fixed an issue with multi-GPU inference for models that have a bias in their attention linear layers.