LakeFS Storage
Store test datasets and results in LakeFS for versioned, cloud-based storage.
What is LakeFS?
LakeFS provides Git-like version control for data lakes. Benefits:
- Versioning: Track changes to test datasets
- Branching: Experiment with test variations
- Rollback: Revert to previous dataset versions
- Collaboration: Share test datasets across teams
Installation
Setup
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| host | str | Required | LakeFS server URL |
| username | str | Required | LakeFS username |
| password | str | Required | LakeFS password |
| repo_id | str | Required | LakeFS repository ID |
| tests_prefix | str | "tests/" | Path prefix for test datasets |
| results_prefix | str | "results/" | Path prefix for results |
| branch_name | str | "main" | LakeFS branch to use |
| enabled_suites | list[str] \| None | None | Filter to specific suites |
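As a rough illustration of how the required and optional parameters fit together, the sketch below mirrors the table in a plain dataclass. The class name `LakeFSSettings` and all example values are hypothetical; this is not the library's own storage class, only a way to see which fields must be supplied and which fall back to defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LakeFSSettings:
    """Hypothetical container mirroring the parameter table (not the library's class)."""

    # Required parameters: no defaults, so omitting one raises TypeError.
    host: str
    username: str
    password: str
    repo_id: str
    # Optional parameters with the defaults listed in the table.
    tests_prefix: str = "tests/"
    results_prefix: str = "results/"
    branch_name: str = "main"
    enabled_suites: Optional[list[str]] = None  # None means no suite filter


# Example values are placeholders, not real endpoints or credentials.
settings = LakeFSSettings(
    host="https://lakefs.example.com",
    username="example-user",
    password="example-password",
    repo_id="example-repo",
)
```

Only the four required fields are passed here, so the prefixes, branch, and suite filter take the defaults from the table.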
Repository Structure
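One way to picture the layout implied by the parameters: test datasets sit under `tests_prefix` and results under `results_prefix`, both on the configured branch. The sketch below builds lakeFS object URIs of the form `lakefs://<repo>/<ref>/<path>`. The helper `object_uri`, the repository name, and the file names are hypothetical and only illustrate how the prefixes combine.

```python
def object_uri(repo_id: str, branch_name: str, prefix: str, name: str) -> str:
    """Build a lakeFS URI for an object stored under a prefix on a branch.

    Hypothetical helper: it only joins strings and does not contact a server.
    """
    # Normalize slashes so "tests/" + "/suite.json" does not produce "tests//suite.json".
    path = prefix.rstrip("/") + "/" + name.lstrip("/")
    return f"lakefs://{repo_id}/{branch_name}/{path}"


# Placeholder repository and file names, using the default prefixes and branch.
dataset_uri = object_uri("example-repo", "main", "tests/", "smoke/cases.json")
result_uri = object_uri("example-repo", "main", "results/", "smoke/run-001.json")
```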
Methods
load_datasets
Load test datasets from LakeFS:
save_results
Save execution results to LakeFS:
Working with Branches
Use Different Branches
Complete Example
Environment Variables
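Keeping credentials out of source code usually means reading them from the environment. The sketch below shows one possible approach. The variable names (`LAKEFS_HOST`, `LAKEFS_USERNAME`, `LAKEFS_PASSWORD`, `LAKEFS_REPO_ID`, `LAKEFS_BRANCH`) and the helper `settings_from_env` are assumptions made for illustration; the names this library actually reads may differ.

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping from parameter names to environment variable names.
REQUIRED_VARS = {
    "host": "LAKEFS_HOST",
    "username": "LAKEFS_USERNAME",
    "password": "LAKEFS_PASSWORD",
    "repo_id": "LAKEFS_REPO_ID",
}
# Optional parameter: (variable name, default taken from the parameter table).
OPTIONAL_VARS = {"branch_name": ("LAKEFS_BRANCH", "main")}


def settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect connection settings from environment variables.

    Raises ValueError listing every required variable that is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    settings = {key: env[var] for key, var in REQUIRED_VARS.items()}
    for key, (var, default) in OPTIONAL_VARS.items():
        settings[key] = env.get(var, default)
    return settings
```

The resulting dictionary uses the same keys as the parameter table, so it could be unpacked into whatever constructor accepts those parameters.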
Use Cases
CI/CD Integration
Version test datasets alongside code
A/B Testing
Branch datasets for different test scenarios
Audit Trail
Track all changes to test datasets
Team Collaboration
Share datasets across distributed teams