CASTIEL2 Multi-GPU AI

RAG Tutorial Notebooks

Note

The notebooks are located in content/it/rag/notebooks.

These notebooks guide you through implementing Retrieval Augmented Generation (RAG) pipelines.

  1. 0_download_data.ipynb: Download and prepare data for RAG pipelines

  2. 1_chunking_indexing_data.ipynb: Data chunking, embedding, and vector index creation

  3. 2_creating_a_chatbot.ipynb: Build a complete RAG-powered chatbot
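The three stages listed above (prepare data, chunk and index it, answer questions from the retrieved context) can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code used in the notebooks: the function names, the word-window chunking, and the bag-of-words "embedding" are all assumptions chosen so that the example runs without external libraries. A real pipeline would typically replace `embed` with a neural embedding model and `build_index` with a vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into overlapping windows of words (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Store each chunk next to its vector; stands in for a vector index."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical usage with made-up example chunks.
example_index = build_index([
    "the cat sat on the mat",
    "gpus accelerate deep learning",
    "ray scales python workloads",
])
example_prompt = build_prompt(
    "deep learning on gpus",
    retrieve(example_index, "deep learning on gpus", k=1),
)
```

The generation step is deliberately left out: `example_prompt` is the string a chatbot would pass to its language model, which is where the third notebook picks up.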

Notebooks

Download Data

  • General
  • Imports
  • Env config
  • Scraping
  • Download data

Chunking and indexing data

  • General
  • Imports
  • Env config
  • Why do we need RAG systems?
  • Classical information retrieval approaches
  • Semantic search
  • Finding the right chunk size
  • Chunk and index all documents
  • Evaluating k-size in the retrieval step
  • The final retriever
  • Improving the system
  • A combined approach
  • Contextual retrieval
  • A few considerations about contextual retrieval
  • Final remarks
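
One common way to compare values of k in the retrieval step is to measure, over a set of test questions, how often the chunk known to contain the answer appears among the top k results. The sketch below illustrates that idea under stated assumptions: the metric name `hit_rate_at_k`, the chunk ids, and the sample rankings are all hypothetical and are not taken from the notebook, which may use a different metric.

```python
def hit_rate_at_k(ranked_ids_per_query, relevant_id_per_query, k):
    """Fraction of queries whose relevant chunk id is within the top-k results."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not relevant_id_per_query:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_id_per_query)


# Made-up retriever output: one ranked list of chunk ids per test question.
rankings = [
    ["c1", "c4", "c2"],
    ["c3", "c2", "c5"],
    ["c5", "c1", "c4"],
    ["c2", "c3", "c1"],
]
# Made-up ground truth: the chunk that answers each question.
relevant = ["c1", "c2", "c4", "c9"]

for k in (1, 2, 3):
    print(k, hit_rate_at_k(rankings, relevant, k))
# prints:
# 1 0.25
# 2 0.5
# 3 0.75
```

A larger k raises the hit rate but also passes more, and often less relevant, text to the language model, so the value is usually chosen where the curve flattens.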

Creating a chatbot

  • General
  • Imports
  • Env config
  • The collections
  • Workflow
  • Agents and agentic RAG
  • Considerations
  • LangChain
  • Next steps

© Copyright 2026, The contributors.