2024-06-30

GPUs Part 3 - Going from here

Hopefully, you have read part 1 and part 2 of the Learning about GPUs series. This part is an index of useful resources for building a more advanced understanding of GPUs.

Learning about the fundamentals

[Book] Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk and Wen-mei W. Hwu
1. This is the best resource to learn about parallel programming and GPUs. The first 4 chapters explain the fundamentals of GPU hardware and its programming model
[YouTube playlist] 12 to 14 videos in COS 436
CUDA Mode
1. Very good resource for learning about GPUs/CUDA/Triton. They also have a very active Discord
CUDA C++ programming guide
1. Official guide from Nvidia which can be used as a reference
[YouTube playlist] CUDA teaching center
1. Short series to get started in CUDA and get a refresher on GPU hardware

Notable talks

Notable blogs

What every developer should know about GPU computing
1. Gentle introduction to the GPU programming model
What Shapes Do Matrix Multiplications Like?
1. Puzzles to test your understanding of GPU hardware
Making Deep Learning Go Brrrr From First Principles
How is LLaMa.cpp possible?

Programming tutorials

Tiled matrix multiplication in CUDA
Matrix multiplication in pure CUDA: How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog
GPU puzzles by Srush
Triton puzzles by Srush
llm.c: LLM training in raw C/CUDA

Citations

For attribution, please cite this as:

@article{romit2024gpus3,
  title   = {GPUs Part 3},
  author  = {Jain, Romit},
  journal = {cmeraki.github.io},
  year    = {2024},
  month   = {June},
  url     = {https://cmeraki.github.io/gpu-part3.html}
}