5M: Retrieval Augmented Generation (RAG)
Code
Note
This directory contains materials for implementing Retrieval Augmented Generation (RAG) pipelines using large language models.
Setup and Configuration
0_env_config.sh: Environment configuration script1_start_jupyter.sh: SLURM job script for starting Jupyter with RAG environmentrequirements_rag.txt: Python dependencies for RAG implementation
The data/ directory contains sample input and output files for RAG pipelines.
Data Files
Notebooks
Getting Started
To run the RAG environment on Leonardo:
Submit the job script:
sbatch 1_start_jupyter.shFollow the SSH tunnel instructions to access the Jupyter server
The environment includes:
vLLM endpoint for model serving
Pre-configured Hugging Face cache
All necessary RAG dependencies
Key Features
Multi-GPU support with tensor parallelism
vLLM for efficient LLM serving
ChromaDB for vector storage
LangChain integration