Benchmarking

render-tag includes utilities to track the performance of the generation pipeline and predefined experiments to benchmark detector performance.

Standard Benchmarking Datasets

The project defines several "Gold Standard" datasets, collectively called the Locus Bench, for tracking detector performance across versions. They are the primary datasets for verifying detector stability.

Dataset Name	Config Path	Focus
Locus Pose Baseline	`configs/experiments/locus_pose_baseline.yaml`	Pose estimation stability under distance and angle sweeps.

Generating Benchmark Datasets

Benchmark datasets are defined as experiments and should be run using the experiment run command.

1. Fast Verification (Workbench)

Use the workbench renderer for rapid logic verification, or when you only need bounding boxes without photorealistic noise.

uv run render-tag experiment run --config configs/experiments/locus_pose_baseline.yaml --renderer-mode workbench

2. Production Baseline (Cycles)

Generate high-fidelity data for training or final verification using the cycles renderer. Use --workers to parallelize the generation.

uv run render-tag experiment run --config configs/experiments/locus_pose_baseline.yaml --workers 4 --renderer-mode cycles
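When both modes are run regularly, it can help to assemble the command line in one place. The following is a minimal sketch that uses only the flags shown above (--config, --workers, --renderer-mode); the helper names build_experiment_command and run_benchmark are hypothetical and not part of render-tag.

```python
import subprocess
from typing import List, Optional

CONFIG = "configs/experiments/locus_pose_baseline.yaml"


def build_experiment_command(
    config: str, renderer_mode: str, workers: Optional[int] = None
) -> List[str]:
    """Assemble an `experiment run` invocation from the documented flags."""
    if renderer_mode not in ("workbench", "cycles"):
        raise ValueError(f"unknown renderer mode: {renderer_mode!r}")
    cmd = ["uv", "run", "render-tag", "experiment", "run", "--config", config]
    if workers is not None:
        if workers < 1:
            raise ValueError("workers must be a positive integer")
        cmd += ["--workers", str(workers)]
    cmd += ["--renderer-mode", renderer_mode]
    return cmd


def run_benchmark(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


# Print the two invocations instead of executing them.
print(" ".join(build_experiment_command(CONFIG, "workbench")))
print(" ".join(build_experiment_command(CONFIG, "cycles", workers=4)))
```

Calling run_benchmark on either list would start the actual generation, so the sketch only prints the commands.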

3. Uploading Resulting Subsets

After a benchmark generation completes, the results should be pushed to the Hugging Face Hub as a versioned subset:

uv run render-tag hub push-dataset \
    output/locus_pose_v1 \
    NoeFontana/render-tag-bench \
    --config-name pose_baseline_v1

Dataset Auditing & Quality Gates

After a benchmark generation completes, use the audit command to verify quality and check for statistical drift.

1. Run Audit

Generates a comprehensive quality report, including geometric coverage and integrity checks.

uv run render-tag audit run --dir output/locus_pose_v1

2. Compare Datasets (Drift Detection)

Compare two datasets to detect performance regressions or distribution shifts.

uv run render-tag audit diff --base output/baseline --experimental output/variant_a
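To make "distribution shift" concrete, the sketch below computes a two-sample Kolmogorov-Smirnov statistic between one numeric attribute of two datasets (for example, tag distance). It only illustrates the idea; this page does not say which metrics audit diff uses, and the function name and sample values are made up for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(base: Sequence[float], experimental: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of two samples (0 = identical, 1 = disjoint)."""
    if not base or not experimental:
        raise ValueError("both samples must be non-empty")
    a = sorted(base)
    b = sorted(experimental)
    max_gap = 0.0
    # The empirical CDFs only change at sample points, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Illustrative values only: distances (in metres) sampled in two datasets.
baseline_distances = [0.5, 1.0, 1.5, 2.0, 2.5]
variant_distances = [1.5, 2.0, 2.5, 3.0, 3.5]
print(f"KS statistic: {ks_statistic(baseline_distances, variant_distances):.2f}")
```

A value near 0 suggests the two datasets cover the attribute similarly, while a value near 1 suggests the variant samples a different range.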

Performance Tracking & Telemetry

render-tag includes a built-in telemetry system that monitors worker health and rendering throughput in real time.

Automated Collection

The UnifiedWorkerOrchestrator automatically collects telemetry from all active workers. This data is used for:

- Resource Guarding: Automatically restarting workers if VRAM exceeds thresholds.
- Throughput Analysis: Measuring renders-per-second and total execution time.
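The two uses of telemetry can be sketched in plain Python as follows. This is not the UnifiedWorkerOrchestrator's code: the WorkerSample fields, the function names, and the 8000 MB threshold are assumptions chosen for illustration, with each sample standing for the latest reading from one worker.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class WorkerSample:
    """Hypothetical latest telemetry reading for one worker."""
    worker_id: str
    vram_mb: float
    renders_done: int
    elapsed_s: float


def workers_over_vram(samples: Sequence[WorkerSample], threshold_mb: float) -> List[str]:
    """Resource guarding: IDs of workers whose VRAM use exceeds the threshold."""
    return sorted({s.worker_id for s in samples if s.vram_mb > threshold_mb})


def renders_per_second(samples: Sequence[WorkerSample]) -> float:
    """Throughput: total renders divided by the longest worker runtime.

    Workers run in parallel, so wall-clock time is approximated by the
    maximum elapsed time rather than the sum.
    """
    total_renders = sum(s.renders_done for s in samples)
    wall_time = max((s.elapsed_s for s in samples), default=0.0)
    return total_renders / wall_time if wall_time > 0 else 0.0


samples = [
    WorkerSample("worker-0", vram_mb=3000.0, renders_done=40, elapsed_s=20.0),
    WorkerSample("worker-1", vram_mb=9000.0, renders_done=60, elapsed_s=25.0),
]
print("restart candidates:", workers_over_vram(samples, threshold_mb=8000.0))
print("renders per second:", renders_per_second(samples))
```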

Analysis

Telemetry is typically saved as telemetry.csv in the dataset output directory. You can analyze this data using the TelemetryAuditor or by inspecting the summary in the audit command output.

from pathlib import Path

from render_tag.audit.auditor import TelemetryAuditor

auditor = TelemetryAuditor()
# ... (Orchestrator adds entries during run) ...
# Write the collected entries to the dataset output directory.
auditor.save_csv(Path("output/dataset_01/telemetry.csv"))
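Because the telemetry is a plain CSV file, it can also be inspected with the standard library alone. The sketch below reports the peak value of one numeric column per worker. The column names worker_id and vram_mb are assumptions, since the schema of telemetry.csv is not documented here; adjust them to match the header of the real file.

```python
import csv
import io
from typing import Dict, TextIO


def peak_by_worker(
    handle: TextIO,
    worker_column: str = "worker_id",  # assumed column name
    value_column: str = "vram_mb",     # assumed column name
) -> Dict[str, float]:
    """Return the maximum of `value_column` seen for each worker."""
    peaks: Dict[str, float] = {}
    for row in csv.DictReader(handle):
        worker = row[worker_column]
        value = float(row[value_column])
        peaks[worker] = max(value, peaks.get(worker, value))
    return peaks


# In-memory stand-in for a telemetry file, with made-up values.
example = io.StringIO("worker_id,vram_mb\nw0,100\nw1,250\nw0,180\n")
print(peak_by_worker(example))
```

For a real run, open the file with open("output/dataset_01/telemetry.csv", newline="") and pass the handle to peak_by_worker.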

Optimizations

render-tag implements several performance optimizations:

ZMQ Hot Loop: Persistent workers avoid Blender startup overhead.
Mesh Pooling: Blender objects are reused across scenes.
Lazy Assets: HDRIs and textures are cached in VRAM.
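The pooling idea behind the second item can be illustrated with a generic object pool. This is a simplified sketch, not render-tag's implementation: ObjectPool is a hypothetical class, and a plain dict stands in for a Blender object that is expensive to create.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class ObjectPool(Generic[T]):
    """Hand out reusable objects, creating new ones only when none are free."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._free: List[T] = []
        self.created = 0  # number of objects built by the factory so far

    def acquire(self) -> T:
        """Reuse a released object if one is available, else build a new one."""
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def release(self, obj: T) -> None:
        """Return an object to the pool so a later scene can reuse it."""
        self._free.append(obj)


pool: ObjectPool[dict] = ObjectPool(dict)
first_scene_mesh = pool.acquire()
pool.release(first_scene_mesh)
second_scene_mesh = pool.acquire()  # same object, no second allocation
print(second_scene_mesh is first_scene_mesh, pool.created)
```

The saving comes from skipping repeated creation and teardown; a caller would still need to reset per-scene state (pose, material) on each reused object.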