Triton Kernel Lifecycle

The term "kernel" comes up constantly when working with Triton, so it is worth defining before going further.

What is a Kernel? 💡
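A kernel is a function that runs on the GPU. It defines the operations that every thread in a grid executes in parallel, each on a different piece of the data. In Triton you write the kernel from the point of view of one program instance that handles one block of data, and the compiler maps that work onto GPU threads. Kernels are the building blocks of GPU computation.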
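
To make the "same code, different data" idea concrete, here is a plain-Python sketch. It is not Triton code and it runs on the CPU: the `launch` helper and the kernel signature are hypothetical, and the instances run one after another, whereas a GPU would run many of them at once.

```python
def add_kernel(pid, x, y, out, n_elements, block_size):
    """Kernel body: identical code runs for every program id (pid)."""
    start = pid * block_size
    # The bound check plays the role of a mask: the last block may be partial.
    for i in range(start, min(start + block_size, n_elements)):
        out[i] = x[i] + y[i]


def launch(kernel, grid_size, *args):
    """Hypothetical launcher: one kernel call per program instance, run sequentially."""
    for pid in range(grid_size):
        kernel(pid, *args)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(x)

block_size = 2
grid_size = (len(x) + block_size - 1) // block_size  # ceiling division: 3 instances

launch(add_kernel, grid_size, x, y, out, len(x), block_size)
print(out)  # [11.0, 22.0, 33.0, 44.0, 55.0]
```

Each instance only needs its own id to work out which elements it owns, which is why the same kernel scales from five elements to millions.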

Lifecycle

Here’s what happens when you write and run a Triton kernel:

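Definition:
- You write the kernel as a Python function decorated with @triton.jit.
JIT Compilation:
- The first time the kernel is launched, Triton compiles it through its own intermediate representation and LLVM down to PTX (on NVIDIA GPUs), which is then assembled into machine code for the target GPU. The compiled kernel is cached and reused by later launches with the same argument types and compile-time constants.
Kernel Launch:
- The compiled kernel is launched on the GPU over a grid of program instances. Each instance runs as a thread block and handles one part of the workload.
Execution on GPU:
- The GPU’s Streaming Multiprocessors (SMs) execute the thread blocks, with the threads in each block running in parallel under SIMT (Single Instruction, Multiple Threads).
Results:
- Results are written to GPU global memory. They stay on the GPU until you explicitly copy them to the CPU, for example by calling .cpu() on a PyTorch tensor.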
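
The five steps above map onto a few lines of code. What follows is a minimal sketch of an element-wise vector add, assuming PyTorch and Triton are installed and a CUDA GPU is available. The names `add_kernel` and `add`, the tensor size, and the block size of 1024 are illustrative choices, not something this page prescribes.

```python
import torch
import triton
import triton.language as tl


# 1. Definition: a Python function decorated with @triton.jit.
@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                        # which program instance is this?
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                        # guard the partial last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)      # 5. results go to global memory


def add(x, y):
    out = torch.empty_like(x)
    n_elements = out.numel()
    # 3. Launch: one program instance per BLOCK_SIZE elements.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    # 2. JIT compilation happens on the first call; later calls reuse the cached kernel.
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out                                         # still a GPU tensor


if torch.cuda.is_available():                          # assumption: a CUDA device
    x = torch.rand(10_000, device="cuda")
    y = torch.rand(10_000, device="cuda")
    out = add(x, y)                                    # 4. executes on the GPU
    host_copy = out.cpu()                              # explicit copy back to the CPU
    assert torch.allclose(host_copy, (x + y).cpu())
```

The mask matters because 10,000 is not a multiple of 1024: without it, the last program instance would read and write past the end of the tensors.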

In short: Python → Triton IR → LLVM IR → PTX → GPU machine code → blazing-fast GPU execution. 🚀
