Wurzel

Wurzel is an open-source Python library built to address advanced Extract, Transform, Load (ETL) needs for Retrieval-Augmented Generation (RAG) systems. It is designed to streamline ETL processes while offering essential features like multi-tenancy, cloud-native deployment support, and job scheduling.

The repository includes initial implementations for widely used tools in the RAG ecosystem, such as Qdrant, Milvus, and Hugging Face, giving users a solid starting point for building scalable and efficient RAG pipelines.

Features

  • Advanced ETL Pipelines: Tailored for the specific needs of RAG systems.
  • Multi-Tenancy: Easily manage multiple tenants or projects within a single system.
  • Cloud-Native Deployment: Designed for seamless integration with Kubernetes, Docker, and other cloud-native platforms.
  • Scheduling Capabilities: Schedule and manage ETL tasks using built-in or external tools.
  • Framework Integrations: Pre-built support for popular tools like Qdrant, Milvus, and Hugging Face.
  • Type Security: Uses pydantic and pandera to enforce type safety for the data moving through a pipeline.
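
To make the type-security point concrete, here is a minimal sketch of the kind of validation pydantic provides. It uses plain pydantic only and is not Wurzel's own API; the `DocumentChunk` model, its fields, and the `validate_chunks` helper are hypothetical names chosen for illustration.

```python
from pydantic import BaseModel, ValidationError


class DocumentChunk(BaseModel):
    """Hypothetical record passed between two ETL steps (not a Wurzel class)."""

    source_url: str
    text: str
    token_count: int


def validate_chunks(rows):
    """Split raw dicts into validated records and rejected rows."""
    valid, rejected = [], []
    for row in rows:
        try:
            # pydantic checks (and where possible coerces) every field.
            valid.append(DocumentChunk(**row))
        except ValidationError:
            # Rows that do not match the schema are kept aside for inspection.
            rejected.append(row)
    return valid, rejected


rows = [
    {"source_url": "https://example.com/a", "text": "hello", "token_count": 1},
    # "many" cannot be interpreted as an integer, so this row is rejected.
    {"source_url": "https://example.com/b", "text": "world", "token_count": "many"},
]
valid, rejected = validate_chunks(rows)
print(len(valid), len(rejected))
```

Validating records at the boundary between steps means a malformed row surfaces where it was produced rather than in a later stage such as embedding or indexing. pandera plays a comparable role for DataFrame-shaped data.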

Naming