TPU (Tensor Processing Unit)
A Tensor Processing Unit (TPU) is a custom processor designed to accelerate machine learning workloads, especially those based on neural networks. It is an application-specific integrated circuit (ASIC), meaning it is built for a specific purpose rather than general-purpose computing. TPUs are optimized for the large matrix calculations that dominate deep learning, making them very efficient for both training and running AI models.
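To make the "large matrix calculations" concrete, the sketch below shows the core operation of a dense neural-network layer using plain NumPy. This is an illustration of the kind of computation a TPU's matrix unit accelerates, not TPU-specific code; the shapes and names are arbitrary examples.

```python
import numpy as np

# A dense neural-network layer is essentially one big matrix multiply:
#   outputs = activation(inputs @ weights + bias)
# This matmul-plus-accumulate pattern is the workload TPUs are built around.
# All sizes here are illustrative.
batch, in_features, out_features = 32, 128, 64

rng = np.random.default_rng(0)
inputs = rng.standard_normal((batch, in_features))
weights = rng.standard_normal((in_features, out_features))
bias = np.zeros(out_features)

# The `inputs @ weights` product is the matrix-heavy step a TPU accelerates.
outputs = np.maximum(inputs @ weights + bias, 0.0)  # ReLU activation

print(outputs.shape)  # (32, 64)
```

Training and inference repeat this pattern millions of times over much larger matrices, which is why dedicated matrix hardware pays off.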
Google began using TPUs internally in 2015 and later made them available to others through Google Cloud. They support popular machine learning frameworks such as TensorFlow, JAX, and PyTorch, and are widely used for tasks like training large language models, recommendation systems, computer vision, and speech processing.
Compared to CPUs and GPUs, TPUs support fewer types of operations but perform them extremely well. CPUs are flexible and suited to a wide range of tasks; GPUs excel at parallel processing and are widely used in AI; TPUs are purpose-built to maximize speed and energy efficiency for matrix-heavy AI computations.