# vernier
A parity-preserving COCO-style evaluator for instance segmentation, panoptic
segmentation, boundary IoU, OKS keypoints, semantic segmentation, and LVIS
federated evaluation. Bit-exact against pycocotools==2.0.11, panopticapi,
and lvis-api in strict parity mode; semantic mIoU is calibrated against
a vendored mmsegmentation IoUMetric. See the README §Status & validation for
the full per-paradigm matrix, plus a documented quirks survey of every place
the reference implementations disagree with themselves.
## Why vernier
pycocotools==2.0.11 is the reference implementation for COCO evaluation, but it is slow,
unmaintained, and full of edge-case quirks that downstream tools either
silently fix or silently inherit. Faster reimplementations exist, but each
chooses its own quirk dispositions, leaving users to discover the divergences
empirically. vernier takes a third path:
- Auditable parity. Every divergence from `pycocotools` is filed in the quirks survey under ADR-0002 as either `strict` (bit-equal output, even when vernier's implementation is structurally different) or `corrected` (opt-in opinionated fix). The default is strict; corrected fixes are itemized, so you know exactly when your numbers diverge from a reference run.
- A unified evaluation toolkit. bbox / segm / keypoints AP, boundary IoU, panoptic PQ, semantic mIoU, and LVIS federated evaluation all live in one package, behind one Python API and one CLI. No more wrestling with fragmented `pycocotools`, `boundary-iou-api`, `panopticapi`, `lvis-api`, and `mmsegmentation` installs (each has a per-paradigm migration guide).
- Drop-in shim. `vernier.patch_pycocotools()` swaps the `COCOeval` symbol in place, so existing pycocotools-based scripts switch with one line (ADR-0007); see the sketch after this list.
- Rust core. The matching kernel is Rust with runtime SIMD dispatch via `pulp`; the FFI layer is data conversion only. The CLI ships as a static binary, so CI pipelines can call vernier without provisioning a Python interpreter.
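A minimal sketch of the shim path, assuming `patch_pycocotools()` rebinds the `COCOeval` symbol before the rest of the script imports it (per the ADR-0007 description above); everything after the patch call is ordinary pycocotools usage:

```python
import vernier

# Swap pycocotools' COCOeval for vernier's strict-parity implementation
# before downstream modules import it.
vernier.patch_pycocotools()

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Standard pycocotools evaluation loop, now backed by vernier's kernel.
coco_gt = COCO("instances_val2017.json")
coco_dt = coco_gt.loadRes("detections.json")
ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()
```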
## Install
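A minimal sketch for the Python package, assuming it is published on PyPI under the project name `vernier`:

```bash
# Assumed distribution name; adjust if the published package differs.
pip install vernier
```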
For the CLI binary on its own:
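One plausible route, assuming the CLI is published as a crate of the same name (prebuilt release binaries may be the intended channel instead):

```bash
# Assumed crate name; check the project's release notes for the actual channel.
cargo install vernier
```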
## 60-second example
```python
from pathlib import Path

from vernier.instance import Bbox, CocoDataset, Evaluator

# COCO-format ground truth and detection results, passed as raw JSON bytes.
gt_bytes = Path("instances_val2017.json").read_bytes()
dt_bytes = Path("detections.json").read_bytes()

# Build the dataset once, then run a bbox-AP evaluation against it.
dataset = CocoDataset.from_json(gt_bytes)
summary = Evaluator(iou=Bbox()).evaluate(dataset, dt_bytes)

for line in summary.pretty_lines():
    print(line)
```
## Three evaluation paradigms
vernier ships three sibling submodules — pick the one whose input shape matches your model's output:
```python
import vernier

# Detections (bbox / segm / boundary / keypoints) with scores → AP fold
vernier.instance.Evaluator()

# RGB-encoded panoptic PNGs + segments_info JSON → PQ
vernier.panoptic.Evaluator()

# Single-channel class-id label maps → mIoU / FWIoU / pAcc / mAcc
vernier.semantic.Evaluator()
```
The submodules are mutually exclusive (different data models, different matching rules, different parity oracles). See Three paradigms for when to use which.
## Where to go next
- New to vernier? Start with Tutorials.
- Migrating from pycocotools, faster-coco-eval, panopticapi, lvis-api, or mmsegmentation? See Migrate.
- Comparing alternatives? How vernier compares is a per-library decision aid (when to pick vernier, when to keep what you have).
- Curious about speed? Benchmarks carries the per-cell medians and methodology.
- Looking for a specific recipe? See How-to.
- Need API details? See Reference.
- Want to understand the design? See Explanation or browse the ADRs.