
Getting Started

This guide covers setting up your development environment, understanding the development workflow, and running your first tests with Wurzel.

Development Setup

Prerequisites

Before starting, ensure you have completed the installation process. If you're setting up for development, you should have:

  • Python 3.11 or 3.12
  • Wurzel installed with development dependencies
  • Pre-commit hooks configured

Verify Your Installation

# Check that Wurzel is properly installed
python -c "import wurzel; print('Wurzel installed successfully')"

# Verify CLI access
wurzel --help
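
The prerequisites above can also be checked from Python. This is a minimal sketch, assuming only that the package is importable as wurzel (as in the command above). The helper names python_is_supported and package_is_installed are hypothetical and not part of Wurzel.

```python
import importlib.util
import sys

# Interpreter versions listed in the prerequisites.
SUPPORTED = {(3, 11), (3, 12)}


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the major.minor version is one of SUPPORTED."""
    return (version_info[0], version_info[1]) in SUPPORTED


def package_is_installed(name: str = "wurzel") -> bool:
    """Return True if a top-level package can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


print("Python supported:", python_is_supported())
print("wurzel importable:", package_is_installed())
```

Using find_spec instead of a plain import avoids running the package's import-time code, so the check stays cheap even when the installation is broken.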

Development Workflow

Code Quality & Linting

Wurzel uses pre-commit hooks to enforce consistent code quality and formatting. These hooks run automatically on every commit, so style and lint problems are caught before they enter the history.

Set Up Pre-commit Hooks

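If you used make install, pre-commit hooks are already configured. Otherwise, the standard pre-commit command, pre-commit install, run from the repository root, should set them up.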

Run Linting Manually

You can trigger the linting process manually at any time:

make lint

This runs all configured linters and formatters across the codebase.

Running Tests

Before submitting changes, ensure all tests pass:

# Run the complete test suite
make test

Test Structure

Wurzel's test suite includes:

  • Unit tests: Testing individual components
  • Integration tests: Testing component interactions
  • End-to-end tests: Testing complete pipeline flows

Documentation

Local Documentation Development

Preview documentation changes locally:

# Serve documentation locally (auto-reloads on changes)
make documentation

# Build documentation without serving
mkdocs build

When served locally, the documentation is available at http://127.0.0.1:8000/ (the MkDocs default address).

Documentation Structure

Wurzel uses MkDocs for documentation management:

  • Source files: Located in docs/
  • Configuration: mkdocs.yml
  • Auto-generated docs: Built from docstrings

Development Guidelines

Commit Strategy

Wurzel maintains a clean Git history through a structured commit strategy.

Commit Message Format

Follow this structure for commit messages:

<tag>: <short description>

<longer description (optional)>

Commit Types

  • Breaking Changes: For changes that are not backward-compatible
      • Use tag: breaking
  • Features: For new features or enhancements that are backward-compatible
      • Use tags: feat, feature
  • Fixes and Improvements: For bug fixes, performance improvements, or small patches
      • Use tags: fix, hotfix, perf, patch

Allowed Tags

Ensure consistency by using these approved tags:

  • feat, feature, fix, hotfix, perf, patch
  • build, chore, ci, docs, style, refactor, ref, test

Examples

# Good commit messages
git commit -m "feat: add semantic text splitter for German documents"
git commit -m "fix: resolve memory leak in embedding generation"
git commit -m "docs: update installation guide with Docker instructions"

# Bad commit messages
git commit -m "updated stuff"
git commit -m "fixed bug"
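
The format and tag rules above can be checked mechanically. The following is a minimal sketch, not Wurzel's actual tooling: the function name is_valid_commit_message is hypothetical, and including breaking in the tag set is an assumption based on the Commit Types list, since it does not appear under Allowed Tags.

```python
import re

ALLOWED_TAGS = {
    "feat", "feature", "fix", "hotfix", "perf", "patch",
    "build", "chore", "ci", "docs", "style", "refactor", "ref", "test",
    "breaking",  # assumption: named under Commit Types, not under Allowed Tags
}

# First line must look like "<tag>: <short description>".
HEADER = re.compile(r"^(?P<tag>[a-z]+): (?P<description>\S.*)$")


def is_valid_commit_message(message: str) -> bool:
    """Return True if the first line is '<tag>: <description>' with an allowed tag."""
    lines = message.splitlines()
    if not lines:
        return False
    match = HEADER.match(lines[0])
    return bool(match) and match.group("tag") in ALLOWED_TAGS


print(is_valid_commit_message("feat: add semantic text splitter for German documents"))
print(is_valid_commit_message("updated stuff"))
```

Only the first line is inspected, so the optional longer description can contain anything.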

Merge Strategy

  • All commits are squashed when merging into the main branch
  • This maintains a clean, readable project history
  • Focus on meaningful commit messages during development

Common Development Tasks

Adding a New Feature

  1. Create a feature branch:

    git checkout -b feat/your-feature-name
    

  2. Implement your feature following the development guides

  3. Add tests for your feature:

    # Add tests in tests/ directory
    python -m pytest tests/test_your_feature.py
    

  4. Update documentation if needed

  5. Run quality checks:

    make lint
    make test
    

  6. Commit using proper format:

    git commit -m "feat: add your feature description"
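
Step 5 can be wrapped in a small script that stops at the first failing check. This is a sketch under the assumption that make lint and make test return a non-zero exit code on failure, which is the usual convention. The name run_checks is hypothetical.

```python
import subprocess

# The two quality checks from step 5, in the order they should run.
QUALITY_CHECKS = [["make", "lint"], ["make", "test"]]


def run_checks(commands=QUALITY_CHECKS) -> bool:
    """Run each command in order; return False as soon as one fails."""
    for command in commands:
        print("$", " ".join(command))
        if subprocess.run(command).returncode != 0:
            return False
    return True
```

Calling run_checks() from the repository root before committing gives a single pass/fail answer. Running the linter first keeps feedback fast, because linting usually finishes well before the test suite does.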
    

Fixing a Bug

  1. Create a fix branch:

    git checkout -b fix/bug-description
    

  2. Write a failing test that reproduces the bug

  3. Implement the fix

  4. Verify the test passes:

    python -m pytest tests/test_bug_fix.py
    

  5. Run full test suite:

    make test
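
Step 2 asks for a failing test that reproduces the bug. A minimal pytest-style sketch of that idea follows. The function normalize_whitespace and the bug it describes are hypothetical stand-ins for the code under repair, not part of Wurzel.

```python
def normalize_whitespace(text: str) -> str:
    """Hypothetical function under repair: collapse runs of whitespace.

    Imagined bug: an earlier version called text.split(" "), which left
    tabs and newlines untouched. Calling split() with no argument splits
    on any whitespace, which is the fix shown here.
    """
    return " ".join(text.split())


def test_tabs_and_newlines_are_collapsed():
    # Written before the fix, this fails against the buggy version
    # and passes once the fix is in place.
    assert normalize_whitespace("a\t\tb\nc") == "a b c"


def test_existing_behaviour_is_unchanged():
    # Guards the case that already worked before the fix.
    assert normalize_whitespace("a   b") == "a b"
```

Keeping the reproducing test in the suite after the fix turns it into a regression test, so the same bug cannot return unnoticed.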
    

Working with Dependencies

Adding New Dependencies

  1. Add to pyproject.toml in the appropriate section:

    [project]
    dependencies = [
        "existing-package>=1.0.0",
        "new-package>=2.0.0",
    ]
    

  2. For optional dependencies:

    [project.optional-dependencies]
    your-extra = ["optional-package>=1.0.0"]
    

  3. Update installation:

    make install
    

Direct Dependencies

For packages that can't be installed from PyPI (like spaCy models):

  1. Add to DIRECT_REQUIREMENTS.txt:

    https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-3.7.0/de_core_news_sm-3.7.0-py3-none-any.whl
    

  2. Document the dependency in the installation guide if it is user-facing

Next Steps

Now that you have your development environment set up:

  1. Build Your First Pipeline - Learn core pipeline concepts
  2. Create Custom Steps - Build your own processing components
  3. Understand Data Contracts - Learn about type-safe data exchange
  4. Explore Backends - Understand deployment options

Additional Resources