CASTIEL2 Multi-GPU AI
Instructor’s guide
Why we teach this lesson
Intended learning outcomes
Timing
Preparing exercises
For example: what to do the day before to set up common repositories.
Other practical aspects
Interesting questions you might get
Typical pitfalls