GPUs Part 3 - Going from here
Written by Romit Jain
Hopefully, you have read part 1 and part 2 of the Learning about GPUs series. This part is an index of useful resources for building a more advanced understanding of GPUs.
Learning about the fundamentals
- [Book] Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk and Wen-mei W. Hwu
  - This is the best resource to learn about parallel programming and GPUs. The first 4 chapters explain the fundamentals of GPU hardware and its programming model.
- [YouTube playlist] 12 to 14 videos in COS 436
- CUDA Mode
  - Very good resource for learning about GPUs, CUDA, and Triton. The community also has a very active Discord.
- CUDA C++ Programming Guide
  - Official guide from NVIDIA, which can be used as a reference.
- [YouTube playlist] CUDA teaching center
  - Short series to get started with CUDA and get a refresher on GPU hardware.

Notable talks
- GTC 2021 - How GPU Computing Works
- GPU Optimization session hosted by Chip Huyen
- GTC 2022 - How CUDA Programming Works - Stephen Jones, CUDA Architect, NVIDIA
- Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler

Notable blogs
- What every developer should know about GPU computing
  - Gentle introduction to the GPU programming model.
- What Shapes Do Matrix Multiplications Like?
  - Puzzles to test your understanding of GPU hardware.
- Making Deep Learning Go Brrrr From First Principles
- How is LLaMa.cpp possible?

Programming tutorials
- Tiled matrix multiplication in CUDA
- Matrix multiplication in pure CUDA: How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog
- GPU puzzles by Srush
- Triton puzzles by Srush
- llm.c: LLM training in raw C/CUDA

Citations
For attribution, please cite this as:
@article{romit2024gpus3,
  title   = {GPUs Part 3},
  author  = {Jain, Romit},
  journal = {cmeraki.github.io},
  year    = {2024},
  month   = {June},
  url     = {https://cmeraki.github.io/gpu-part3.html}
}