NVIDIA GPU Pipeline — Zero to Expert

How a GPU Turns One Pixel Into a Full Image

A zero-to-hero visual journey through NVIDIA GPU architecture. Watch data flow from your CPU through Streaming Multiprocessors, CUDA threads, and memory hierarchies — all the way to your screen.

CPU: Loads Image → PCIe: Transfer → VRAM: GPU Memory → SM: Compute → Display: Output

The Full Journey

From JPEG on Disk to Pixels on Screen

Every image you see passes through five distinct hardware and software stages: the CPU loads the image, PCIe transfers it, VRAM holds it in GPU memory, the SMs compute on it, and the display outputs it.

STEP 01 (~8 MB for a 1080p image)

CPU Loads the Image

Host → System RAM

Your CPU reads the image file from disk (JPEG/PNG), decodes it into raw pixel data (RGBA bytes), and stores it in system RAM. Each pixel is 4 bytes — Red, Green, Blue, Alpha.

snippet.cu

CUDA

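#include "stb_image.h"  // stb single-header image decoder
int width, height, channels;
// Request 4 channels (RGBA): 1920 x 1080 x 4 bytes is ~8 MB in system RAM
uint8_t* pixels = stbi_load("photo.jpg", &width, &height, &channels, 4);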
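Once the pixels sit in system RAM, the next stages named at the top of the page (PCIe transfer, VRAM, SM compute) can be sketched roughly as follows. This is a minimal illustration rather than the page's own code: the kernel `invert_rgb`, the helper `process_on_gpu`, and the synthetic fill value are hypothetical, and a filled buffer stands in for the decoded photo so the example has no file dependency.

```cuda
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread inverts the R, G, B bytes of one RGBA pixel.
__global__ void invert_rgb(uint8_t* pixels, int num_pixels) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per pixel
    if (i >= num_pixels) return;                    // guard the last partial block
    uint8_t* p = pixels + 4 * i;
    p[0] = static_cast<uint8_t>(255 - p[0]);
    p[1] = static_cast<uint8_t>(255 - p[1]);
    p[2] = static_cast<uint8_t>(255 - p[2]);
    // p[3] (alpha) is left unchanged
}

// Copies the image to VRAM, runs the kernel, and copies the result back.
// Returns true if every CUDA call succeeded.
bool process_on_gpu(std::vector<uint8_t>& host_pixels) {
    const size_t bytes = host_pixels.size();
    const int num_pixels = static_cast<int>(bytes / 4);

    uint8_t* dev_pixels = nullptr;
    if (cudaMalloc(&dev_pixels, bytes) != cudaSuccess) return false;  // allocate VRAM

    // Host -> device copy: this is the transfer over PCIe.
    bool ok = cudaMemcpy(dev_pixels, host_pixels.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;
        const int blocks = (num_pixels + threads - 1) / threads;
        invert_rgb<<<blocks, threads>>>(dev_pixels, num_pixels);  // runs on the SMs
        // The device -> host copy waits for the kernel to finish first.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host_pixels.data(), dev_pixels, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev_pixels);
    return ok;
}

int main() {
    // Synthetic stand-in for a decoded 1080p RGBA image (every byte = 10).
    std::vector<uint8_t> pixels(1920 * 1080 * 4, 10);
    if (!process_on_gpu(pixels)) {
        std::fprintf(stderr, "CUDA error\n");
        return 1;
    }
    std::printf("first pixel: %d %d %d %d\n",
                pixels[0], pixels[1], pixels[2], pixels[3]);
    return 0;
}
```

In a real display pipeline the result would usually stay in VRAM rather than being copied back; the return copy here only makes the result easy to inspect.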

Memory Hierarchy

Speed vs. Size: The Memory Pyramid

GPUs have 5 levels of memory, each trading speed for capacity. Understanding this pyramid is the key to writing fast GPU code.
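To see some of these capacities on your own GPU, the CUDA runtime exposes them through `cudaGetDeviceProperties`. The sketch below prints four sizes the runtime reports; how they map onto the five levels of the pyramid is an assumption on our part, and the helper `register_file_bytes` is a hypothetical name.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Registers are 32 bits wide, so the register file size in bytes is count * 4.
size_t register_file_bytes(int register_count) {
    return static_cast<size_t>(register_count) * 4;
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {  // device 0
        std::fprintf(stderr, "No CUDA device found\n");
        return 1;
    }
    std::printf("Device:               %s\n", prop.name);
    std::printf("Registers per SM:     %zu KB\n",
                register_file_bytes(prop.regsPerMultiprocessor) / 1024);
    std::printf("Shared memory per SM: %zu KB\n",
                prop.sharedMemPerMultiprocessor / 1024);
    std::printf("L2 cache:             %d KB\n", prop.l2CacheSize / 1024);
    std::printf("Global memory (VRAM): %zu MB\n",
                prop.totalGlobalMem / (1024 * 1024));
    return 0;
}
```

An SM with 65,536 32-bit registers reports a 256 KB register file, matching the capacity quoted for registers in the pyramid.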

▲ Faster & Smaller

▼ Slower & Larger

Registers (Per Thread)

Latency: ~1 cycle
Bandwidth: Highest of any level
Capacity: 256 KB / SM
Managed by: Compiler

Registers are the fastest storage on the GPU. Each CUDA thread gets its own private registers, like a calculator's display, and their contents disappear when the thread ends.

example.cu

float r = pixel.r; // stored in register
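To put that line in context, here is a small grayscale kernel sketch showing which values are loaded from global memory and which are thread-private locals that the compiler would normally place in registers. The `Pixel` struct, the kernel `to_gray`, and the helper `gray_of` are hypothetical names chosen for illustration, and whether a given local really ends up in a register is decided by the compiler.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical pixel layout matching the RGBA description used on this page.
struct Pixel { uint8_t r, g, b, a; };

__global__ void to_gray(const Pixel* in, uint8_t* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // private to this thread
    if (i >= n) return;
    Pixel pixel = in[i];   // load from global memory (VRAM)
    float r = pixel.r;     // r, g, b, gray: thread-private locals,
    float g = pixel.g;     // normally kept in registers by the compiler
    float b = pixel.b;
    float gray = 0.299f * r + 0.587f * g + 0.114f * b;  // Rec. 601 luma weights
    out[i] = static_cast<uint8_t>(gray + 0.5f);         // store to global memory
}

// Converts a single pixel on the GPU; returns -1 on any CUDA error.
int gray_of(Pixel p) {
    Pixel* in = nullptr;
    uint8_t* out = nullptr;
    if (cudaMallocManaged(&in, sizeof(Pixel)) != cudaSuccess) return -1;
    if (cudaMallocManaged(&out, sizeof(uint8_t)) != cudaSuccess) {
        cudaFree(in);
        return -1;
    }
    *in = p;
    to_gray<<<1, 32>>>(in, out, 1);  // one warp; only thread 0 passes the guard
    int result = (cudaDeviceSynchronize() == cudaSuccess) ? *out : -1;
    cudaFree(in);
    cudaFree(out);
    return result;
}

int main() {
    std::printf("gray(white) = %d\n", gray_of(Pixel{255, 255, 255, 255}));
    return 0;
}
```

Once the thread returns, `r`, `g`, `b`, and `gray` are gone; only the byte written to `out` survives, because it lives in global memory.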

By the Numbers

16,384 CUDA Cores

in RTX 4090

Parallel compute units

~1 TB/s

VRAM Bandwidth

GDDR6X memory bus (RTX 4090)

Memory throughput
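As a rough worked example of where a VRAM bandwidth figure comes from, peak bandwidth is bus width times per-pin data rate. The numbers below take the RTX 4090's published memory configuration (384-bit bus, 21 Gbps per pin); these are not stated on this page and are used here as assumptions. The last line estimates how long one pass over the ~8 MB 1080p frame from Step 01 would take at that peak rate.

```latex
\begin{aligned}
\text{Peak bandwidth} &= \frac{\text{bus width (bits)} \times \text{data rate per pin (Gbit/s)}}{8\ \text{bits/byte}} \\
                      &= \frac{384 \times 21}{8}\ \text{GB/s} = 1008\ \text{GB/s} \approx 1\ \text{TB/s} \\[4pt]
\text{Frame size}     &= 1920 \times 1080 \times 4\ \text{B} = 8{,}294{,}400\ \text{B} \\[4pt]
\text{Time per pass}  &\approx \frac{8{,}294{,}400\ \text{B}}{1.008 \times 10^{12}\ \text{B/s}} \approx 8.2\ \mu\text{s}
\end{aligned}
```

This is a theoretical ceiling; real kernels reach a fraction of it, depending on their memory access patterns.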

5 stages

Pipeline Stages

from disk to display

End-to-end processing

Faster than RAM

Ready to go deeper?

Explore Every Transistor.
Understand Every Cycle.

Two interactive learning paths — pick your depth level and start exploring the hardware that powers modern computing.

Interactive

GPU Architecture Explorer

Click on individual SM components, watch data flow animations, and see the real math behind every pixel.

Code + Visuals

CUDA Deep Dive

Thread hierarchy, kernel launches, memory access patterns — learn to program the GPU from scratch.
