GPUs Part 3 - Going from here
Written by Romit Jain
Hopefully, you have read part 1 and part 2 of the Learning about GPUs series. This part is an index of useful resources for building a more advanced understanding of GPUs.
Learning about the fundamentals
- [Book] Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk and Wen-mei W. Hwu
  - This is the best resource to learn about parallel programming and GPUs. The first 4 chapters explain the fundamentals of GPU hardware and its programming model.
- [YouTube playlist] 12 to 14 videos in COS 436
- CUDA Mode
  - Very good resource for learning about GPUs/CUDA/Triton. They also have a very active Discord.
- CUDA C++ programming guide
  - Official guide from NVIDIA, which can be used as a reference.
- [YouTube playlist] CUDA teaching center
  - Short series to get started in CUDA and get a refresher on GPU hardware.
Notable talks
- GTC 2021 - How GPU Computing Works
- GPU Optimization session hosted by Chip Huyen
- GTC 2022 - How CUDA Programming Works - Stephen Jones, CUDA Architect, NVIDIA
- Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler
Notable blogs
- What every developer should know about GPU computing
- Gentle introduction to the GPU programming model
- What Shapes Do Matrix Multiplications Like?
- Puzzles to test your understanding of GPU hardware
- Making Deep Learning Go Brrrr From First Principles
- How is LLaMa.cpp possible?
Programming tutorials
- Tiled matrix multiplication in CUDA
- Matrix multiplication in pure CUDA: How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog
- GPU puzzles by Srush
- Triton puzzles by Srush
- llm.c: LLM training in raw C/CUDA
Citations
For attribution, please cite this as:
@article{romit2024gpus3,
  title = {GPUs Part 3},
  author = {Jain, Romit},
  journal = {cmeraki.github.io},
  year = {2024},
  month = {June},
  url = {https://cmeraki.github.io/gpu-part3.html}
}