Compute
Compute refers to the computational resources (processing power, memory, and energy) used to train and run machine learning models. It powers the mathematical operations that models depend on, and in general the larger and more complex the model, the more compute it requires.
The term has different meanings depending on context:
- Training compute powers the initial process of building a model: feeding it data, tuning its parameters, and repeating this cycle over days or months; and
- Inference compute is consumed each time a trained model responds to a query, and it adds up significantly at scale across millions of users.
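To make the distinction concrete, the following is a rough sketch of how the two kinds of compute might be compared. It assumes a common rule of thumb for dense transformer models, not something stated in this entry: roughly 6 floating-point operations (FLOPs) per parameter per training token, and roughly 2 FLOPs per parameter per token processed at inference. The function names and the example model size, dataset size, and tokens per query are hypothetical values chosen for illustration.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate one-time training cost.

    Assumes ~6 FLOPs per parameter per training token
    (a rule of thumb for dense transformers, not an exact figure).
    """
    return 6.0 * n_params * n_tokens


def inference_flops(n_params: float, tokens_per_query: float, n_queries: float) -> float:
    """Approximate cumulative inference cost.

    Assumes ~2 FLOPs per parameter per token processed.
    """
    return 2.0 * n_params * tokens_per_query * n_queries


if __name__ == "__main__":
    # Hypothetical model and workload, for illustration only.
    n_params = 7e9          # 7 billion parameters
    n_train_tokens = 1e12   # 1 trillion training tokens
    tokens_per_query = 500  # tokens processed per user query

    train = training_flops(n_params, n_train_tokens)
    per_query = inference_flops(n_params, tokens_per_query, 1)

    # Number of queries after which total inference compute
    # matches the one-time training compute.
    breakeven_queries = train / per_query

    print(f"Training:   {train:.2e} FLOPs")
    print(f"Per query:  {per_query:.2e} FLOPs")
    print(f"Break-even: {breakeven_queries:.2e} queries")
```

Under these assumptions a single query costs a tiny fraction of the training run, but inference compute catches up with training compute after a few billion queries, which is why it becomes significant at scale.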
The hardware that delivers compute consists of specialized processors: Graphics Processing Units (GPUs), which are the most widely used and are supplied chiefly by Nvidia; Tensor Processing Units (TPUs), developed by Google; and custom ASICs and NPUs built by other chip designers and cloud providers.