Model Memory Calculator
Overview
The TitanML Model Memory Calculator is an open-source tool that helps users determine whether their machine learning model can run on their available hardware. The tool supports two modes: a Standard Calculator and a more advanced Prefill Chunking Calculator. This document describes what each calculator does, explains the underlying formulas, and demonstrates how to use them effectively.
You can also access the Model Memory Calculator directly on Hugging Face or on GitHub Pages.
Purpose of the Tool
This tool serves several purposes:
- Memory Management: Assists users in managing memory requirements efficiently to prevent out-of-memory errors.
- Optimization: Helps in optimizing batch sizes, model configurations, and deployment strategies based on hardware capabilities.
- Cost Management: Reduces cloud computing costs by optimizing hardware and model configurations.
The calculator caters to a wide range of users, including those who want to ensure that their current hardware supports a specific machine learning model and those looking to purchase hardware that meets their model requirements.
1. Standard Calculator
Introduction
The Standard Calculator lets users estimate the memory footprint of their machine learning model from a few key variables: the number of model parameters, the hidden size, the number of layers, and the memory available on the target hardware. This estimate helps users gauge whether their hardware can support the model.
Key Formulas
- Model Memory Needed: The memory required to store model parameters depends on the precision used.
- Available Memory: The memory remaining after accounting for the model's memory usage.
- Model Memory Per Input: Calculated based on the model’s hidden size and the number of layers.
- Maximum Input Size: Determines the maximum input size the device can handle without exceeding memory limits.
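The four quantities above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculator's actual implementation: the function names, the precision table, and in particular the per-token formula (one key vector and one value vector of width hidden size cached per layer) are assumptions made for this example. Models that use grouped-query attention or a quantized cache would need a different per-token figure.

```python
# Hypothetical helpers for illustration; names and formulas are assumptions,
# not taken from the TitanML calculator's source code.

# Assumed storage width, in bits, for each parameter precision.
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "bf16": 16, "int8": 8, "int4": 4}


def model_memory_bytes(num_params: int, precision: str) -> int:
    """Memory needed to store the model weights at the given precision."""
    return num_params * BITS_PER_PARAM[precision] // 8


def available_memory_bytes(device_bytes: int, model_bytes: int) -> int:
    """Device memory left over once the weights are loaded (may be negative)."""
    return device_bytes - model_bytes


def memory_per_token_bytes(hidden_size: int, num_layers: int,
                           cache_precision: str = "fp16") -> int:
    """Assumed per-token cost: a key and a value vector cached in every layer."""
    return 2 * num_layers * hidden_size * BITS_PER_PARAM[cache_precision] // 8


def max_input_tokens(available_bytes: int, per_token_bytes: int) -> int:
    """Largest number of input tokens that fits in the remaining memory."""
    return max(available_bytes, 0) // per_token_bytes


# Example: a 7-billion-parameter model in fp16 on a 24 GB device
# (decimal gigabytes), with hidden size 4096 and 32 layers.
weights = model_memory_bytes(7_000_000_000, "fp16")     # 14_000_000_000 bytes
free = available_memory_bytes(24_000_000_000, weights)  # 10_000_000_000 bytes
per_token = memory_per_token_bytes(4096, 32)            # 524_288 bytes
print(max_input_tokens(free, per_token))                # prints 19073
```

Working in integer bytes avoids floating-point rounding near the boundary, and clamping negative free memory to zero makes a model that does not fit report a maximum input size of 0 instead of a negative number.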