
CUDA

NVIDIA's parallel computing platform that enables developers to use GPUs for general-purpose processing and AI workloads.


Technical explanation

CUDA, which stands for Compute Unified Device Architecture, is NVIDIA's proprietary platform for GPU programming, and a key reason NVIDIA dominates AI infrastructure. When you run PyTorch or TensorFlow on a GPU, CUDA does the heavy lifting under the hood, translating your high-level code into instructions the GPU hardware can execute.
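To make "GPU programming" concrete, here is a minimal CUDA C++ sketch of the classic vector-add kernel (a generic illustration, not code from any particular framework). A `__global__` function runs on the GPU, and the `<<<blocks, threads>>>` launch syntax spreads the work across thousands of parallel threads:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A minimal CUDA kernel: each GPU thread adds one pair of elements.
__global__ void vec_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) out[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *out;
    // Unified memory is accessible from both CPU and GPU.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&out, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256, blocks = (n + threads - 1) / threads;
    vec_add<<<blocks, threads>>>(a, b, out, n);
    cudaDeviceSynchronize();  // wait for the GPU to finish

    printf("out[0] = %.1f\n", out[0]);  // 1.0 + 2.0 = 3.0
    cudaFree(a); cudaFree(b); cudaFree(out);
    return 0;
}
```

Compiled with NVIDIA's `nvcc` compiler, this is roughly the level of code that frameworks generate or call into on your behalf.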

The platform includes a runtime, libraries, and development tools. Libraries like cuDNN (for deep learning) and cuBLAS (for linear algebra) are hand-optimized to squeeze every bit of performance from NVIDIA hardware. Most popular ML frameworks have CUDA integration baked in, so developers usually don't write CUDA code directly. But it's there, accelerating every GPU operation a framework dispatches.
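As a sketch of what those libraries look like when called directly, here is a minimal cuBLAS example computing SAXPY (`y = alpha * x + y`). Frameworks issue calls like this for you; the values here are illustrative:

```cuda
#include <cstdio>
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main() {
    const int n = 4;
    float hx[n] = {1, 2, 3, 4}, hy[n] = {10, 20, 30, 40};
    float *dx, *dy;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));

    cublasHandle_t handle;
    cublasCreate(&handle);
    // Copy the input vectors to GPU memory via cuBLAS helpers.
    cublasSetVector(n, sizeof(float), hx, 1, dx, 1);
    cublasSetVector(n, sizeof(float), hy, 1, dy, 1);

    // y = alpha * x + y, computed by NVIDIA's tuned SAXPY routine.
    const float alpha = 2.0f;
    cublasSaxpy(handle, n, &alpha, dx, 1, dy, 1);

    // Copy the result back and inspect it.
    cublasGetVector(n, sizeof(float), dy, 1, hy, 1);
    printf("y[0] = %.1f\n", hy[0]);  // 2*1 + 10 = 12.0

    cublasDestroy(handle);
    cudaFree(dx); cudaFree(dy);
    return 0;
}
```

The point is that the heavily tuned kernel behind `cublasSaxpy` is NVIDIA's, not yours: the library, not your code, is where most of the performance lives.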

CUDA's dominance creates a lock-in problem. Code written for CUDA doesn't run on AMD or Intel GPUs. This has sparked alternatives like ROCm (AMD's platform) and oneAPI (Intel's approach), but ecosystem support for both still lags well behind CUDA's. Most AI libraries test on CUDA first, and bugs on other platforms often take longer to get fixed. It's frustrating, but that's the reality.

For developers, CUDA matters because it shapes hardware choice and cost: if your software stack depends on CUDA, you're buying NVIDIA GPUs. NVIDIA's moat isn't just hardware. It's the decade-plus of software optimization in CUDA. If you're doing serious AI work, you'll inevitably interact with CUDA, even if only indirectly through higher-level frameworks.
