Backend Architecture in Wurzel¶
What are Backends?¶
Backends in Wurzel are powerful abstractions that transform your pipeline definitions into executable configurations for different orchestration platforms. Think of them as translators that take your high-level pipeline logic and convert it into the specific format required by your target execution environment.
Why Backends are Great¶
🚀 Write Once, Deploy Anywhere¶
Define your data pipeline logic once using Wurzel's intuitive API, then deploy it to multiple platforms without rewriting code. Whether you need local development with DVC, cloud-native execution with Argo Workflows, or future platforms like GitHub Actions - your pipeline logic remains the same.
🔧 Platform-Specific Optimization¶
Each backend is specifically designed to leverage the unique capabilities of its target platform:
- DVC Backend: Optimizes for data versioning, experiment tracking, and reproducible ML workflows
- Argo Backend: Leverages Kubernetes-native features like horizontal scaling, resource management, and cloud-native scheduling
🎯 Environment-Aware Configuration¶
Backends automatically handle environment-specific concerns:
- Container orchestration and resource allocation
- Storage and artifact management
- Scheduling and triggering mechanisms
- Security and access control integration
📈 Scalability Without Complexity¶
Start with simple local execution and seamlessly scale to enterprise-grade orchestration platforms. Backends abstract away the complexity of different deployment targets while preserving the power and flexibility of each platform.
How Backends Work¶
- Pipeline Definition: You define your pipeline using Wurzel's step classes and the
WZ
utility - Backend Selection: Choose the appropriate backend for your target environment
- Code Generation: The backend generates platform-specific configuration files
- Execution: Deploy and run using the native tools of your chosen platform
Available Backends¶
- DVC Backend: For data versioning and ML experiment tracking
- Argo Workflows Backend: For Kubernetes-native pipeline orchestration
Future Backends¶
Wurzel's extensible architecture supports adding new backends for:
- GitLab CI/CD: For generating
.gitlab-ci.yml
pipelines - GitHub Actions: To produce
workflow.yml
for GitHub-native automation - Apache Airflow: For DAG-based orchestration and scheduling
- LocalBackend: Execute steps locally without an external orchestrator
- Kubernetes CronJobs: Direct Kubernetes-native
CronJob
manifests