
GPU

A Graphics Processing Unit that accelerates AI model training and inference through parallel computation.


Technical explanation

A GPU, or Graphics Processing Unit, was originally designed for rendering video games and graphics. But it turns out the same architecture that's great for pushing pixels is also perfect for machine learning. GPUs excel at parallel processing, meaning they can handle thousands of simple calculations simultaneously. This makes them ideal for the matrix multiplications that neural networks rely on.
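To make the core operation concrete, here is a minimal pure-Python sketch of the matrix-vector product at the heart of a neural network layer (the layer sizes and weights are made up for illustration). On a GPU, each output element's dot product would run on its own core simultaneously rather than one after another.

```python
# A single neural-network layer is essentially a matrix-vector product:
# each output neuron is the dot product of the input with one weight row.
# A GPU computes all of these dot products in parallel, one per core.

def layer_forward(weights, inputs):
    """Compute one layer's outputs: each row of weights dotted with inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Hypothetical 3-neuron layer taking a 2-dimensional input.
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]
inputs = [2.0, 3.0]

print(layer_forward(weights, inputs))  # [2.0, 3.0, 5.0]
```

Real frameworks hand this same computation to highly optimized GPU kernels, but the shape of the work — many independent dot products — is exactly what makes it parallelizable.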

When you're training a model, you're essentially doing the same math operation across millions of data points. A CPU works through them a few at a time, which is far too slow at this scale. A GPU splits the work across thousands of cores and finishes in a fraction of the time. That's why NVIDIA became such a big deal in AI: its GPUs, especially the A100 and H100 series, are the workhorses behind most large language models.
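The split-and-combine pattern can be sketched in plain Python. This is a conceptual illustration only — the worker count and the operation are arbitrary, and Python threads don't achieve true GPU-style parallelism — but it shows how the same computation divides into independent chunks that could each run on their own core.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

data = list(range(1_000))

# CPU-style: one operation after another, in order.
sequential = [square(x) for x in data]

# GPU-style (conceptually): split the data across workers, process each
# chunk independently, then stitch the results back together.
def process_chunk(chunk):
    return [square(x) for x in chunk]

n_workers = 4  # a GPU would have thousands of cores, not four
chunks = [data[i::n_workers] for i in range(n_workers)]  # round-robin split
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    results = list(pool.map(process_chunk, chunks))

# Reassemble in the original order (undo the round-robin split).
parallel = [0] * len(data)
for offset, chunk_result in enumerate(results):
    for j, value in enumerate(chunk_result):
        parallel[offset + j * n_workers] = value

assert parallel == sequential
```

Because no chunk depends on any other, the work scales almost linearly with the number of workers — the property that lets a GPU's thousands of cores pay off.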

For developers, understanding GPUs matters because they directly impact what you can build. Training a small model? A consumer GPU might work. Training something like GPT-4? You'll need clusters of enterprise GPUs and a massive budget. Cloud providers like AWS, Google Cloud, and Azure rent GPU time, which is how most startups access this compute power without buying hardware.

The main alternatives are TPUs (Google's custom AI chips) and newer entrants like AMD's MI series and Intel's Gaudi accelerators. Apple's M-series chips also pack neural engines, though they're more for inference than training. For most AI work today, NVIDIA GPUs remain the default choice.
