5M: Retrieval Augmented Generation (RAG)

Code

Note

This directory contains materials for implementing Retrieval Augmented Generation (RAG) pipelines using large language models.

Setup and Configuration

0_env_config.sh: Environment configuration script
1_start_jupyter.sh: SLURM job script for starting Jupyter with RAG environment
requirements_rag.txt: Python dependencies for RAG implementation

The data/ directory contains sample input and output files for RAG pipelines.

Data Files

RAG Data Input Files

Notebooks

RAG Tutorial Notebooks

Getting Started

To run the RAG environment on Leonardo:

Submit the job script:
```
sbatch 1_start_jupyter.sh
```
Follow the SSH tunnel instructions to access the Jupyter server
The environment includes:
- vLLM endpoint for model serving
- Pre-configured Hugging Face cache
- All necessary RAG dependencies

Key Features

Multi-GPU support with tensor parallelism
vLLM for efficient LLM serving
ChromaDB for vector storage
LangChain integration

Additional Resources