vernier.instance

Detection / segmentation / boundary / keypoints evaluation. Built around the Evaluator config + evaluate(...) call, with a BackgroundEvaluator variant for overlapping the kernel with the rest of the training step.

Instance-segmentation / detection / keypoints evaluation surface.

Per ADR-0029, the AP-fold evaluation paradigm (bbox, segm, boundary, keypoints) lives under vernier.instance. Sibling to :mod:vernier.panoptic and :mod:vernier.semantic.

CompressedRLE

Bases: TypedDict

COCO compressed RLE shape (6-bit ASCII bytes, as emitted by pycocotools.mask.encode).

counts is the compressed bytes payload, validated as UTF-8 ASCII at ingest. size is (height, width) in COCO order.

Detections

Bases: TypedDict

One per-image detection batch in array form.

Fields are gated by iou_type:

  • bbox: image_id, boxes, scores, labels.
  • segm / boundary: above plus rles.
  • keypoints: image_id, boxes, scores, labels, keypoints.

Required dtypes (no silent promotion — opt in via cast_inputs=True):

  • boxes: float64 (N, 4) C-contiguous, xywh.
  • scores: float64 (N,).
  • labels: int64 (N,).
  • rles[i] (uncompressed dict): counts: uint32 1-D contiguous, size: (h, w).
  • rles[i] (compressed dict): counts: bytes (COCO 6-bit ASCII), size: (h, w).
  • rles[i] (bitmask): 2-D bool or uint8, shape (H, W), C- or F-order.
  • keypoints: float64 (N, K, 3) C-contiguous.
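The dtype discipline above can be checked mechanically before submission. A minimal numpy sketch with toy values (the helper name is illustrative, not part of the API):

```python
import numpy as np

# Sketch: assembling one bbox-mode Detections dict with the exact dtypes
# the strict ingest path (cast_inputs=False) requires. Toy values only.
def make_bbox_detections(image_id: int, n: int) -> dict:
    return {
        "image_id": image_id,
        "boxes": np.zeros((n, 4), dtype=np.float64, order="C"),  # xywh
        "scores": np.linspace(1.0, 0.5, n, dtype=np.float64),    # (N,)
        "labels": np.ones(n, dtype=np.int64),                    # (N,)
    }

det = make_bbox_detections(image_id=7, n=3)
assert det["boxes"].dtype == np.float64 and det["boxes"].flags["C_CONTIGUOUS"]
assert det["scores"].dtype == np.float64 and det["scores"].shape == (3,)
assert det["labels"].dtype == np.int64
```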

UncompressedRLE

Bases: TypedDict

COCO RLE shape on the array-ingest path (uncompressed counts).

counts is the uncompressed run-length array (uint32, contiguous). size is (height, width) in COCO order.
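For orientation, the uncompressed counts expand into a bitmask as alternating runs of background and foreground in COCO's column-major order. A pure-numpy sketch of that expansion (illustrative; the real decoding lives in the kernel):

```python
import numpy as np

# Sketch: expanding an uncompressed COCO RLE (counts start with the run of
# zeros, column-major pixel order) into a (h, w) bitmask. Toy 3x4 example.
def rle_to_mask(counts, size):
    h, w = size
    flat = np.zeros(h * w, dtype=np.uint8)
    pos, val = 0, 0                  # runs alternate 0, 1, 0, 1, ...
    for run in counts:
        flat[pos:pos + int(run)] = val
        pos += int(run)
        val ^= 1
    return flat.reshape((w, h)).T    # COCO RLE is column-major

rle = {"counts": np.array([2, 3, 7], dtype=np.uint32), "size": (3, 4)}
mask = rle_to_mask(rle["counts"], rle["size"])
assert mask.shape == (3, 4)
assert int(mask.sum()) == 3          # one run of 3 foreground pixels
```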

BackgroundEvaluator

BackgroundEvaluator(
    gt: bytes | CocoDataset,
    *,
    iou_type: Literal[
        "bbox", "segm", "boundary", "keypoints"
    ] = ...,
    parity_mode: Literal["strict", "corrected"] = ...,
    max_dets: list[int] = ...,
    use_cats: bool = ...,
    memory_budget_bytes: int | None = ...,
    dilation_ratio: float = ...,
    sigmas: dict[int, list[float]] | None = ...,
    queue_capacity: int = ...,
    worker_affinity: int | None = ...,
    worker_nice: int = ...,
    shutdown_timeout_seconds: float = ...,
    retain_iou: bool = ...,
    cast_inputs: bool = ...,
    rank_id: int | None = ...,
    record_latency_samples: bool = ...,
)

Background-evaluator surface (ADR-0014). Wraps a worker thread that owns the StreamingEvaluator<K>; every public method either sends on the channel or reads atomic counters. Not frozen — finalize() and __exit__ need to mutate state.

detections_seen property

detections_seen: int

Mirror of StreamingEvaluator::detections_seen(). Advisory.

images_seen property

images_seen: int

Mirror of StreamingEvaluator::images_seen(). Advisory — updated by the worker after each successful submit.

memory_used_bytes property

memory_used_bytes: int

Mirror of StreamingEvaluator::memory_used_bytes(). Advisory.

queue_depth property

queue_depth: int

Approximate count of Update messages waiting in the channel.

drain_latency_samples_ns method descriptor

drain_latency_samples_ns() -> list[int]

Drain the worker's accumulated submit-latency samples (B5).

Each sample is the wall-time of one submit() call's channel-send leg, in nanoseconds. The buffer is reset to empty on each call so subsequent submits keep accumulating; returns an empty list when the evaluator was constructed without record_latency_samples=True (the default) or after finalize has consumed the worker.

finalize method descriptor

finalize() -> Summary

Drain the queue, finalize the evaluator, and join the worker. Subsequent calls raise "already finalized".

finalize_to_partial method descriptor

finalize_to_partial() -> bytes

ADR-0031 / ADR-0035: drain the queue, serialize the worker's final state as a partial blob, and shut the worker down. Subsequent calls raise "already finalized".

finalize_with_tables method descriptor

finalize_with_tables(
    per_image: bool = False,
    per_class: bool = False,
    per_detection: bool = False,
    per_pair: bool = False,
    per_pair_iou_floor: float = 0.1,
    per_pair_max_rows: int = 10000000,
    per_detection_with_geometry: bool = False,
) -> _TablesResult

Tables-aware finalize. Drains the queue and consumes the worker.

submit method descriptor

submit(
    detections: DetectionsInput,
    *,
    timeout: float | None = None,
) -> None

Submit a detection batch to the worker. Accepts either loadRes-shaped JSON bytes (legacy) or an ADR-0030 Detections dict / sequence of Detections dicts (numpy/DLPack). timeout controls backpressure:

  • None (default) → block until a slot is free
  • 0.0 → single non-blocking attempt; raise QueueFullError if the queue is full
  • t > 0.0 → wait up to t seconds; raise QueueFullError on timeout
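The three timeout regimes mirror standard bounded-queue semantics. A stdlib illustration with queue.Queue standing in for the worker channel (queue.Full plays the role QueueFullError plays in the real API):

```python
import queue

# Illustration of the three submit() backpressure regimes on a bounded
# stdlib queue. This is a semantic sketch, not the real channel.
q = queue.Queue(maxsize=1)
q.put("batch-0")                      # fill the single slot

# timeout=0.0 -> single non-blocking attempt
try:
    q.put("batch-1", block=False)
    raised = False
except queue.Full:
    raised = True
assert raised

# timeout=t > 0 -> wait up to t seconds, then give up
try:
    q.put("batch-1", timeout=0.01)
    raised = False
except queue.Full:
    raised = True
assert raised

# timeout=None -> would block until the consumer drains a slot
q.get()
q.put("batch-1")                      # succeeds immediately now
assert q.qsize() == 1
```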

Breakdown

Python wrapper around [Breakdown] / [ClassGroupBreakdown].

axis property

axis: str

Axis name (e.g., "area", "vehicle_taxonomy").

buckets property

buckets: list[tuple[str, float, float]]

Range buckets as a list of (label, lo, hi) triples in construction order.

Raises AttributeError if this Breakdown was built via from_class_groups. Use class_groups instead.

class_groups property

class_groups: list[tuple[str, list[int]]]

Class-id groups as a list of (label, class_ids) pairs in construction order.

Raises AttributeError if this Breakdown was built via from_ranges. Use buckets instead.

kind property

kind: Literal['range', 'class_groups']

Variant discriminator: "range" for from_ranges-constructed breakdowns, "class_groups" for from_class_groups-constructed ones. Use this to dispatch in validators that accept a Breakdown of a specific shape.

from_class_groups builtin

from_class_groups(
    axis: str, groups: Sequence[tuple[str, Sequence[int]]]
) -> Breakdown

Construct from class-id-keyed groups.

groups is a sequence of (label, class_ids) pairs, one per group. Group order on input determines the group axis index (first pair is index 0). Strict partition discipline is enforced — no class id may appear in two groups.

Raises ValueError on:

  • empty groups;
  • any group with empty class_ids;
  • duplicate group labels;
  • the same class id appearing in more than one group.
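The strict-partition rules above can be restated as a short validation pass. A plain-Python sketch of the documented checks (the real validation lives in the extension module):

```python
# Sketch of the documented from_class_groups validation rules.
def validate_groups(groups):
    if not groups:
        raise ValueError("empty groups")
    seen_labels, seen_ids = set(), set()
    for label, class_ids in groups:
        if not class_ids:
            raise ValueError(f"group {label!r} has empty class_ids")
        if label in seen_labels:
            raise ValueError(f"duplicate group label {label!r}")
        seen_labels.add(label)
        for cid in class_ids:
            if cid in seen_ids:
                raise ValueError(f"class id {cid} appears in two groups")
            seen_ids.add(cid)

validate_groups([("vehicles", [2, 3]), ("people", [1])])   # ok
try:
    validate_groups([("a", [1, 2]), ("b", [2])])           # id 2 overlaps
    overlap_rejected = False
except ValueError:
    overlap_rejected = True
assert overlap_rejected
```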

from_ranges builtin

from_ranges(
    axis: str, buckets: Sequence[tuple[str, float, float]]
) -> Breakdown

Construct from f64-keyed buckets.

buckets is a sequence of (label, lo, hi) triples, one per bucket. [lo, hi] is closed on both ends per ADR-0016 (quirk D6); an annotation whose key sits exactly on a boundary lands in both adjacent buckets.

Raises ValueError on:

  • empty buckets;
  • NaN or infinite lo / hi;
  • lo < 0;
  • lo > hi;
  • duplicate bucket labels.
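The closed-interval membership rule (quirk D6) is easy to see on toy buckets; note the double membership at a shared boundary:

```python
# Sketch of the closed-interval bucket rule: a key that sits exactly on a
# shared boundary is a member of both adjacent buckets. Toy area buckets.
def buckets_for(key, buckets):
    return [label for label, lo, hi in buckets if lo <= key <= hi]

area_buckets = [("small", 0.0, 1024.0), ("medium", 1024.0, 9216.0)]
assert buckets_for(500.0, area_buckets) == ["small"]
assert buckets_for(1024.0, area_buckets) == ["small", "medium"]  # boundary
assert buckets_for(5000.0, area_buckets) == ["medium"]
```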

CocoDataset

Parsed-once COCO ground-truth dataset.

Construct with [PyDataset::from_json]; pass to the evaluate_*_summary_with_dataset family. Reusing the same instance across evaluate calls reuses the GT-side derivations the cached kernels populate on first use (per ADR-0020). The handle is frozen — its identity is the cache key.

boundary_cache_len property

boundary_cache_len: int

Observability-only: count of GT annotations whose boundary band is currently cached (ADR-0020). Useful for debugging or tests that need to assert cache reuse; not a stable contract, and the value can change shape as new cache slots are added.

category_frequency property

category_frequency: (
    Mapping[int, LvisFrequencyLiteral] | None
)

Per-category frequency tag as the LVIS single-letter form ("r" / "c" / "f"; quirk AB1). None when this dataset is not federated.

is_federated property

is_federated: bool

True when this dataset carries LVIS federated metadata — equivalent to pos_category_ids is not None. Cheap shortcut for orchestration code that gates behaviour on the federated flag.

neg_category_ids property

neg_category_ids: Mapping[int, frozenset[int]] | None

Per-image negative-category set (quirk AA2). None when this dataset is not federated.

not_exhaustive_category_ids property

not_exhaustive_category_ids: (
    Mapping[int, frozenset[int]] | None
)

Per-image not-exhaustive-category set (quirk AA3). None when this dataset is not federated.

num_annotations property

num_annotations: int

Number of GT annotations carried by the dataset.

num_categories property

num_categories: int

Number of categories.

num_images property

num_images: int

Number of images.

pos_category_ids property

pos_category_ids: Mapping[int, frozenset[int]] | None

Per-image positive-category set (quirk AA1, derived from GTs at load). None when this dataset was loaded via [Self::from_json] (COCO path) rather than [Self::from_lvis_json].

segm_cache_len property

segm_cache_len: int

Observability-only: count of GT annotations whose segm bbox+area derivation is currently cached (ADR-0020). Same caveats as [Self::boundary_cache_len].

clear_cache method descriptor

clear_cache() -> None

Drops every cached GT-side derivation. Reset point for users who want to free memory between long-lived training cycles without dropping the dataset itself.

from_json staticmethod

from_json(gt_json: bytes) -> CocoDataset

Parses a COCO ground-truth JSON payload into a reusable [Dataset] handle. Raises ValueError on malformed JSON.

from_lvis_json staticmethod

from_lvis_json(gt_json: bytes) -> CocoDataset

Parses an LVIS v1 ground-truth JSON payload into a reusable [Dataset] handle. The handle exposes the federated metadata (pos_category_ids, neg_category_ids, not_exhaustive_category_ids, category_frequency) the orchestrator reads to apply LVIS evaluation semantics (ADR-0026).

Raises ValueError on malformed JSON, on the disjointness violations of quirk AA7 (a category in both not_exhaustive and neg, or a neg category that has GT on the same image), or on missing frequency tags (quirk AB6).

Migration guide for users coming from lvis-api: docs/explanation/lvis-migration.md. Lead with the silent-federated-semantics gotcha — loading LVIS-shaped JSON via Dataset.from_json (the COCO loader) silently drops the federated extras and produces systematically lower AP under COCO semantics.

MemoryBudgetWarning

Bases: builtins.UserWarning

Streaming evaluator's memory usage crossed the soft-warn threshold.

OutOfBudgetError

Bases: builtins.RuntimeError

Memory budget for the streaming evaluator was exceeded.

Attributes: used_bytes, budget_bytes, breakdown.

PartialDatasetMismatch

Bases: builtins.RuntimeError

Distributed-eval partial was computed against a different dataset than the receiving evaluator.

Attributes: expected (bytes), actual (bytes).

PartialFormatMismatch

Bases: builtins.RuntimeError

Distributed-eval partial blob is structurally malformed (magic / version / CRC / kernel kind / parity / retain_iou / grid dimensions / rkyv archive).

Attributes: kind (string discriminator).

PartialParamsMismatch

Bases: builtins.RuntimeError

Distributed-eval partial was computed against different evaluation params than the receiving evaluator.

Attributes: expected (bytes), actual (bytes).

PartialPartitionOverlap

Bases: builtins.RuntimeError

Two distributed-eval partials cover the same image_id (sampler bug).

Attributes: rank_a, rank_b, image_id.

PartialRankCollision

Bases: builtins.RuntimeError

Two strict-mode distributed-eval partials share a rank_id.

Attributes: rank_id.

QueueFullError

Bases: builtins.RuntimeError

Background evaluator's submit queue was full.

Attributes: queue_capacity, timeout.

Summary

Pythonic view over a [vernier_core::Summary]. Frozen — the underlying value is constructed once by [evaluate_bbox_summary] and never mutated (per ADR-0006).

stats property

stats: list[float]

12 detection stats in canonical pycocotools order.

pretty_lines method descriptor

pretty_lines() -> list[str]

One pretty-printed line per stat, matching the pycocotools Average Precision (AP) @[ ... ] = 0.xxx shape.

FpIouHistogram dataclass

FpIouHistogram(
    iou_same: ndarray,
    iou_cross: ndarray,
    kernel: KernelName,
    t_f: float,
    n_total_dts: int,
    n_fps: int,
)

FP-IoU histogram for ADR-0022 t_b ratification.

For every detection that bin assignment classifies as a false positive (Cls / Loc / Both / Dupe / Bkg — anything that's not TP and not Ignore), :attr:iou_same and :attr:iou_cross carry the best same-class and cross-class IoUs at the time of the bin pick.

Bin-as-Bkg fraction at a candidate t_b is::

max_iou = np.maximum(h.iou_same, h.iou_cross)
bkg_fraction = (max_iou < t_b).mean()

Sweeping t_b over a range and plotting bkg_fraction(t_b) surfaces the "valley" between genuine backgrounds (IoU≈0) and near-misses; the right t_b sits in that valley.

See the analysis CLI at tests/python/integration/real_models/tide/extract_fp_histogram.py for the recipe.
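The sweep described above can be run on synthetic arrays to see the shape of the curve. A numpy sketch with deterministic toy IoUs (80 genuine backgrounds at IoU 0, 20 near-misses spread over [0.2, 0.45]):

```python
import numpy as np

# Sketch: sweeping candidate t_b values over synthetic per-FP IoUs to
# locate the valley between genuine backgrounds and near-misses.
iou_same = np.concatenate([np.zeros(80), np.linspace(0.2, 0.45, 20)])
iou_cross = np.zeros_like(iou_same)
max_iou = np.maximum(iou_same, iou_cross)

sweep = {t_b: float((max_iou < t_b).mean()) for t_b in (0.05, 0.1, 0.15, 0.25)}
# Below the near-miss cluster the fraction plateaus at the background mass;
# once t_b crosses into the cluster it starts absorbing near-misses.
assert sweep[0.05] == sweep[0.1] == sweep[0.15] == 0.8
assert abs(sweep[0.25] - 0.84) < 1e-9
```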

TideConfig dataclass

TideConfig(t_f: float, t_b: float, kernel: KernelName)

Resolved TIDE configuration recorded on every :class:TideReport.

Mirrors :class:vernier_core::tide::report::TideConfig (see crates/vernier-core/src/tide/report.rs). The fields carry the resolved thresholds the call ran under — never None — so a report screenshot is self-describing per ADR-0022.

cross_class_topk (per ADR-0023) is intentionally absent from this Python surface in 0.5.0. The Rust default (None = materialize the full per-detection cross-class IoU vector) is the only behavior reachable from Python today; the knob will be exposed once a workload demands it.

TideReport dataclass

TideReport(
    baseline_map: float,
    delta: dict[str, float],
    delta_all_fp_removed: float,
    config: TideConfig,
)

Six-bin TIDE decomposition of a detection model's mAP gap.

Returned by :func:error_decomposition. Each :attr:delta entry is the mAP increase the model would achieve if every detection assigned to that bin were corrected; :attr:baseline_map is the headline number before any correction; :attr:delta_all_fp_removed is the paper's "perfect rejection" upper bound (what mAP would be if every FP were dropped). The per-bin deltas should sum to at most :attr:delta_all_fp_removed — useful as a sanity check that the rewrite layer is internally consistent.

Fields mirror :class:vernier_core::tide::report::TideReport (see crates/vernier-core/src/tide/report.rs). The :attr:config field is the resolved :class:TideConfig so a single report tells the reader which thresholds it was produced under (ADR-0022).

Bin names follow the TIDE paper:

  • cls — Classification: matched a GT of the wrong class.
  • loc — Localization: right class, IoU in [t_b, t_f).
  • both — Both classification and localization wrong.
  • dupe — Duplicate: a higher-scoring detection already matched.
  • bkg — Background: IoU < t_b against every GT.
  • missed — Missed GT: no detection survived to match.

See the debugging tutorial (docs/tutorials/debugging-with-tide.md) and ADR-0021 for the algorithmic spec.
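The internal-consistency check the docstring describes can be applied mechanically to a report's fields. Toy numbers below, not real output of :func:error_decomposition:

```python
# Sketch: the sanity check on a TideReport's fields, with made-up deltas.
delta = {"cls": 2.1, "loc": 1.4, "both": 0.6,
         "dupe": 0.3, "bkg": 1.2, "missed": 2.0}
delta_all_fp_removed = 8.0

# Per the docstring, the per-bin deltas should sum to at most the
# perfect-rejection upper bound.
assert sum(delta.values()) <= delta_all_fp_removed
```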

CategoryFilterAll dataclass

CategoryFilterAll()

Match every category. The COCO default.

CategoryFilterByGrouping dataclass

CategoryFilterByGrouping(label: str)

Match every class id in the named group of the active class_grouping breakdown.

Only meaningful when the Evaluator's class_grouping is also set; the validator at __post_init__ rejects ByGrouping when no grouping is configured or when label is not a grouping label.

CategoryFilterByIds dataclass

CategoryFilterByIds(ids: frozenset[int])

Match an explicit set of class / category ids.

CategoryFilterFrequency dataclass

CategoryFilterFrequency(tag: Literal['r', 'c', 'f'])

Match by LVIS frequency tag ("r", "c", "f").

Valid only on instance evaluation against an LVIS-shaped dataset (ADR-0026). Semantic and panoptic Evaluators reject this variant at construction time per ADR-0041 / ADR-0042 — frequency tags are a sum type that doesn't generalize to non-numeric axes; class groupings carry the user's per-group rollup intent on those paradigms.

EvalResult dataclass

EvalResult(
    summary: Summary | None,
    _per_image_batch: object | None = None,
    _per_class_batch: object | None = None,
    _per_detection_batch: object | None = None,
    _per_pair_batch: object | None = None,
)

Result of an opt-in result-tables evaluate(...) call.

Returned only when tables= is non-None on :meth:vernier.Evaluator.evaluate. The default tables=None path still returns the bit-identical :class:vernier.Summary it always has.

Tables are exposed as cached :class:polars.DataFrame properties; polars is imported lazily on first attribute access (installed via the vernier[tables] extra). pandas / duckdb / pyarrow consumers can round-trip on the returned DataFrame, or call the underlying Arrow producer (self._per_image_batch.__arrow_c_array__()) directly — the leading-underscore name signals implementation detail.

stats property

stats: list[float]

Pass-through to self.summary.stats. Raises :class:AttributeError on ADR-0040 custom-grid results — the slot-indexed summary doesn't apply; read per-axis tables instead.

per_image cached property

per_image: DataFrame

One row per image rollup. Raises RuntimeError if per_image was not in the tables= request.

per_class cached property

per_class: DataFrame

One row per category. Raises RuntimeError if per_class was not in the tables= request.

per_detection cached property

per_detection: DataFrame

One row per detection. Raises RuntimeError if per_detection was not in the tables= request.

per_pair cached property

per_pair: DataFrame

One row per (DT, GT) pair. Raises RuntimeError if per_pair was not in the tables= request.

IncompatibleSummaryPlan

IncompatibleSummaryPlan(
    *,
    field: str,
    value: object,
    plan: str,
    remediation: str,
)

Bases: ValueError

Raised when a custom evaluation grid is incompatible with the canonical fixed-shape summary plan.

The COCO 12-stat / keypoints 10-stat / LVIS 13-stat summary plans address slots in the (T, R, K, A, M) accumulator by hardcoded indices — AP_S is "the second area-bucket entry of the all-IoU slice at maxDet=100", not "the small-area slot". A custom iou_thresholds ladder, recall_thresholds ladder, or area_ranges breakdown breaks this index assumption.

Per ADR-0040, custom-grid users get the result-tables surface (Evaluator.evaluate_tables(...), ADR-0019), which carries explicit labels per row and composes cleanly with arbitrary grid layouts. The remediation field on this exception names the method to call.

InvalidEvalParams

InvalidEvalParams(
    *, field: str, value: object, remediation: str
)

Bases: ValueError

Base for paradigm-specific Evaluator construction errors.

Raised at Evaluator.__post_init__ time by every paradigm in response to invalid parameter values (out of range, wrong shape, duplicate, conflicting, etc.). Per ADR-0039, validation runs at construction so misconfiguration surfaces fast — evaluate() cannot fail on misconfigured params, only on bad data.

Each subclass carries the offending field name, the offending value, and a one-line remediation pointer (typically the relevant ADR or doc page).

InvalidInstanceParams

InvalidInstanceParams(
    *, field: str, value: object, remediation: str
)

Bases: InvalidEvalParams

Invalid vernier.instance.Evaluator parameter (ADR-0040).

TablesConfig dataclass

TablesConfig(
    per_pair_iou_floor: float = 0.1,
    per_pair_max_rows: int = 10000000,
    per_detection_with_geometry: bool = False,
)

Configuration knobs for the expensive result tables. Inert when the corresponding flag is not requested via tables=.

Bbox dataclass

Bbox()

Bounding-box IoU kernel selector. No parameters.

Segm dataclass

Segm()

Segmentation-mask IoU kernel selector. No parameters.

Boundary dataclass

Boundary(dilation_ratio: float = DEFAULT_DILATION_RATIO)

Boundary IoU kernel selector (ADR-0010).

dilation_ratio is the boundary band width as a fraction of the image diagonal. 0.02 is the COCO default; 0.008 is the LVIS variant.

Keypoints dataclass

Keypoints(
    sigmas: Mapping[int, tuple[float, ...]] = {},
)

OKS (Object Keypoint Similarity) kernel selector (ADR-0012).

sigmas maps category_id -> per-keypoint sigma tuple. An empty mapping (the default) uses pycocotools' COCO-person 17-sigma table for every category. Per-category overrides honor quirk F1 ("corrected"): pycocotools hard-codes the COCO-person sigmas; vernier accepts a per-category mapping while keeping the default byte-identical on single-category-person datasets.

Evaluator dataclass

Evaluator(
    iou: IouKind = Bbox(),
    parity_mode: ParityMode = "corrected",
    max_dets: tuple[int, ...] | None = None,
    iou_thresholds: tuple[float, ...] | None = None,
    recall_thresholds: tuple[float, ...] | None = None,
    area_ranges: Breakdown | None = None,
    use_cats: bool = True,
    cast_inputs: bool = False,
)

Extended-API COCO-style evaluator.

The instance is immutable per ADR-0006: construct once, call :meth:evaluate per dataset/detections pair. To change a parameter, use :meth:with_options (which returns a new evaluator).

Defaults match pycocotools' detection eval grid, except for parity_mode, which defaults to "corrected" (the ADR-0002 recommendation for net-new users); migrating users wanting bit-exact pycocotools behavior should set parity_mode="strict".

The iou field is a discriminated dataclass union (:data:IouKind); each variant carries its own kernel-specific parameters (per ADR-0011). Use Bbox() / Segm() / Boundary(dilation_ratio=...).

max_dets defaults to None, meaning "use the canonical ladder for the selected iou kernel" (ADR-0012). Resolution happens at dispatch via :data:_KERNEL_MAX_DETS; explicit values always win. The current three kernels all resolve to (1, 10, 100).

cast_inputs (ADR-0030) gates one-shot f32→f64 / i32→i64 promotion when array-form Detections are passed to :meth:evaluate; off by default to preserve the strict ADR-0004 boundary. JSON-bytes detections ignore this flag.

with_options

with_options(
    *,
    iou: IouKind | None = None,
    parity_mode: ParityMode | None = None,
    max_dets: tuple[int, ...] | None | _UnsetType = _UNSET,
    iou_thresholds: tuple[float, ...]
    | None
    | _UnsetType = _UNSET,
    recall_thresholds: tuple[float, ...]
    | None
    | _UnsetType = _UNSET,
    area_ranges: Breakdown | None | _UnsetType = _UNSET,
    use_cats: bool | None = None,
    cast_inputs: bool | None = None,
) -> Evaluator

Return a copy of this evaluator with the given fields overridden.

Sentinel-keyed fields (max_dets, iou_thresholds, recall_thresholds, area_ranges) are three-valued: the default _UNSET leaves the field unchanged, None resets to the kernel-canonical default, and a value sets an explicit override.

evaluate

evaluate(
    gt: bytes | CocoDataset,
    dt: DetectionsInput,
    *,
    tables: None = None,
    tables_config: TablesConfig | None = None,
) -> Summary
evaluate(
    gt: bytes | CocoDataset,
    dt: DetectionsInput,
    *,
    tables: Literal["all"] | tuple[TableName, ...],
    tables_config: TablesConfig | None = None,
) -> EvalResult
evaluate(
    gt: bytes | CocoDataset,
    dt: DetectionsInput,
    *,
    tables: Literal["all"]
    | tuple[TableName, ...]
    | None = None,
    tables_config: TablesConfig | None = None,
) -> Summary | EvalResult

Run the evaluation pipeline against a GT/DT pair.

dt accepts the COCO loadRes-shaped JSON payload as bytes, or the array-form Detections shapes introduced by ADR-0030 (a single per-image dict or a sequence of them). The array path skips JSON serialization end-to-end and reads NumPy / DLPack buffers directly into the kernel.

gt is either the GT JSON bytes (parse-and-discard, identical to prior behavior) or a :class:CocoDataset handle (parsed-once, with the cache reused across calls — see ADR-0020).

tables= is the opt-in keyword for result tables. Defaults to None, returning :class:Summary (existing behavior, bit-identical to 0.0.1). Pass "all" or a tuple of :data:TableName\ s to opt into the wider :class:EvalResult return type.

Per ADR-0040, raises :class:IncompatibleSummaryPlan when iou_thresholds / recall_thresholds / area_ranges is set explicitly: the canonical 12-stat summary plan is keyed on hardcoded slot indices that don't generalize. Use :meth:evaluate_tables for tabular output that carries explicit labels per row.

evaluate_tables

evaluate_tables(
    gt: bytes | CocoDataset,
    dt: DetectionsInput,
    *,
    tables: Literal["all"] | tuple[TableName, ...] = "all",
    tables_config: TablesConfig | None = None,
) -> EvalResult

Tables-only evaluate path (ADR-0040 redirect target).

Equivalent to :meth:evaluate with tables= set, but bypasses the :class:IncompatibleSummaryPlan redirect so custom-grid users can reach the result tables. Honors iou_thresholds / recall_thresholds / area_ranges when set, falling through to the canonical COCO grid otherwise.

evaluate_to_partial

evaluate_to_partial(
    gt: bytes, dt: DetectionsInput, *, rank_id: int
) -> bytes

Run the evaluation as a per-rank streaming submit and return the serialized partial bytes (ADR-0031, ADR-0035).

rank_id identifies this evaluator's rank in a multi-process eval. The partial bytes can be gathered across ranks (e.g. via torch.distributed.all_gather_object) and merged on the head rank with :meth:from_partials to produce a global Summary bit-equal to a batch :meth:evaluate over the union (in parity_mode="strict" once the (score, rank_id, local_position) tiebreak lands; under ADR-0004's 4-ULP envelope today).

Per ADR-0040, raises :class:InvalidInstanceParams when any of iou_thresholds / recall_thresholds / area_ranges is set: extending the ADR-0031 wire format to carry the resolved custom grid + bumping params_hash to cover the new fields is a follow-up. Batch :meth:evaluate_tables already honors the custom grid; pair it with a single-rank run until the streaming follow-up ships.

from_partials classmethod

from_partials(
    gt: bytes,
    partials: Sequence[bytes],
    /,
    *,
    iou: IouKind | None = None,
    parity_mode: ParityMode = "corrected",
    max_dets: tuple[int, ...] | None = None,
    use_cats: bool = True,
    cast_inputs: bool = False,
) -> Summary

Merge partials (one per rank) into a global :class:Summary (ADR-0031, ADR-0035).

The kwargs mirror :class:Evaluator's config fields and must match what each rank used to produce its partial. Mismatches raise the structured Partial* errors (re-exported on this module).

background

background(
    gt: bytes | CocoDataset,
    *,
    memory_budget_bytes: int | None = None,
    queue_capacity: int = 8,
    worker_affinity: int | None = None,
    worker_nice: int = 5,
    shutdown_timeout_seconds: float = 5.0,
    retain_iou: bool = False,
    rank_id: int | None = None,
    record_latency_samples: bool = False,
) -> BackgroundEvaluator

Build a :class:BackgroundEvaluator (ADR-0014, ADR-0020) that shares this evaluator's iou, parity_mode, max_dets, use_cats, and cast_inputs.

Passing a :class:CocoDataset for gt reuses the parsed-once handle's per-kernel GT-side derivation caches across every submit() round (ADR-0020). For boundary IoU this collapses the dominant per-epoch cost — building the GT band per annotation — from O(epochs) to O(1). Bbox and keypoints have no GT-side cache today, so the win there is just the JSON parse; segm sits between.

The five queueing / scheduling knobs mirror the keyword-only parameters on :class:BackgroundEvaluator's constructor.

confusion_matrix

confusion_matrix(
    gt: bytes | CocoDataset,
    dt: bytes,
    *,
    iou: IouKind | None = None,
    t_f: float = 0.5,
    max_dets_per_image: int = 100,
    use_cats: bool = True,
    parity_mode: ParityMode = "corrected",
) -> DataFrame

Confusion matrix counts in long format (ADR-0023).

Counts (true_class, predicted_class) pairs across the dataset using the same cross-class IoU side pass that powers :func:vernier.error_decomposition. One per-image walk produces:

  • Diagonal cells (gt_class == dt_class) — true positives.
  • Off-diagonal cells (gt_class != dt_class) — classification confusion (a detection of class B was the best overlap of a GT of class A at IoU >= t_f).
  • __none__ row (gt_class == "__none__") — false positives: the detection had no overlapping GT at the threshold.
  • __none__ column (dt_class == "__none__") — missed GTs: a non-ignore GT was not covered by any detection at the threshold.

Output is a :class:polars.DataFrame with three columns:

  • gt_class: str — the true class id as a decimal string, or the sentinel "__none__" for false-positive rows.
  • dt_class: str — the predicted class id as a decimal string, or "__none__" for missed-GT columns.
  • count: i64 — number of (gt_class, dt_class) pairs in the dataset.

The class columns are typed str rather than mixed int|str because polars does not have a clean dtype for the union (the __none__ sentinel is fundamentally not an integer). Callers wanting numeric ids can cast with df.with_columns(pl.col("gt_class").cast(pl.Int64, strict=False)); __none__ rows surface as null, the natural representation of "no class".

Output is long-format (one row per non-zero cell) rather than wide-format (a square matrix) because long-format composes better with polars' filter / group / agg idioms. Pivot to wide via df.pivot(values="count", index="gt_class", on="dt_class") if needed for visualization.
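The long-to-wide pivot can be sketched in plain Python on toy long-format rows (the real call returns a polars DataFrame, where df.pivot does this in one line):

```python
# Sketch: pivoting toy long-format confusion rows into a wide matrix.
# Rows are (gt_class, dt_class, count); classes are decimal strings
# plus the "__none__" sentinel, matching the documented schema.
long_rows = [
    ("1", "1", 40), ("1", "2", 3),         # class 1: 40 right, 3 as class 2
    ("2", "2", 25), ("__none__", "2", 5),  # 5 false positives of class 2
    ("1", "__none__", 7),                  # 7 missed GTs of class 1
]

classes = sorted({r[0] for r in long_rows} | {r[1] for r in long_rows})
wide = {g: {d: 0 for d in classes} for g in classes}
for gt_class, dt_class, count in long_rows:
    wide[gt_class][dt_class] += count

assert wide["1"]["1"] == 40
assert wide["1"]["2"] == 3               # off-diagonal: confusion
assert wide["__none__"]["2"] == 5        # FP row
assert wide["1"]["__none__"] == 7        # missed-GT column
```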

Parameters:

Name Type Description Default
gt bytes | CocoDataset

Ground-truth COCO JSON payload as bytes (the same shape pycocotools.COCO(...) consumes). The :class:CocoDataset handle from ADR-0020 is not yet wired through this path — passing one raises :class:NotImplementedError.

required
dt bytes

Detection COCO JSON payload as bytes.

required
iou IouKind | None

Kernel selector. Pass :class:Bbox() (default), :class:Segm(), or :class:Boundary(dilation_ratio=...). :class:Keypoints is rejected per ADR-0024 (OKS is single-class in COCO; cross-class confusion is undefined).

None
t_f float

Foreground IoU threshold for declaring a (gt, dt) pair matched. Default 0.5 matches the COCO convention.

0.5
max_dets_per_image int

Per-image detection cap (matches the matching path's cap). Default 100.

100
use_cats bool

Reserved; must be True. A category-collapsed evaluation has no meaningful confusion matrix (every cell collapses to a single virtual class).

True
parity_mode ParityMode

"strict" or "corrected" per ADR-0002. Defaults to "corrected".

'corrected'

Returns:

Name Type Description
A DataFrame

class:polars.DataFrame with columns gt_class,

DataFrame

dt_class, count.

Raises:

Type Description
NotImplementedError

iou=Keypoints(...) (ADR-0024) or gt is a :class:CocoDataset handle (ADR-0020 forward-compat marker not yet wired through).

ValueError

t_f outside [0, 1], max_dets_per_image < 1, or use_cats=False.

ImportError

polars not installed (install via pip install 'vernier[tables]').

Example

import vernier
import polars as pl

df = vernier.confusion_matrix(gt_bytes, dt_bytes, iou=vernier.Bbox())
df.filter(pl.col("gt_class") != pl.col("dt_class"))            # only mistakes
df.pivot(values="count", index="gt_class", on="dt_class")      # wide

error_decomposition

error_decomposition(
    gt: bytes | CocoDataset,
    dt: bytes,
    *,
    iou: object = None,
    t_f: float | None = None,
    t_b: float | None = None,
    max_dets_per_image: int = 100,
    use_cats: bool = True,
    parity_mode: ParityMode = "corrected",
) -> TideReport

TIDE error decomposition (Bolya et al. 2020).

Splits the gap between a model's measured mAP and the perfect-mAP upper bound into six interpretable bins (Cls / Loc / Both / Dupe / Bkg / Missed), telling the user which kind of error is costing them the most points. Eight evaluation passes per call (one baseline plus one per bin plus the all-FPs-removed sanity total); expect roughly 6x the cost of a single :class:Evaluator.evaluate call.

gt is the GT JSON bytes (the same shape pycocotools' COCO constructor consumes). dt is the detections JSON bytes (the shape COCO.loadRes consumes). The :class:vernier.CocoDataset parsed-once handle (ADR-0020) is accepted in the type signature for forward-compat but raises :class:NotImplementedError today — the TIDE FFI is not yet wired through the CocoDataset cache. Tracked as a 0.5.x follow-up.

iou selects the kernel: Bbox() (default), Segm(), or Boundary(dilation_ratio=...). Keypoints(...) raises :class:NotImplementedError per ADR-0024 — TIDE on OKS has no published convention and the Cls/Both bins are structurally empty on COCO keypoints (single-class).

t_f (foreground / match threshold) and t_b (background threshold) carve the bin assignment. None resolves to the per-kernel defaults from ADR-0022:

+-----------+--------+--------+
| Kernel    | t_f    | t_b    |
+===========+========+========+
| bbox      | 0.5    | 0.1    |
+-----------+--------+--------+
| segm      | 0.5    | 0.1    |
+-----------+--------+--------+
| boundary  | 0.5    | 0.05   |
+-----------+--------+--------+

The bbox row matches the TIDE paper; segm and boundary rows are defensible-by-extrapolation defaults (segm) and geometry-anchored (boundary), both tentative pending the empirical work tracked in ADR-0022's "Decision gate" section. Override per call by passing explicit t_f / t_b floats; the report's :attr:config records the resolved values either way.

max_dets_per_image defaults to 100 (the largest rung of the standard COCO detection ladder). use_cats defaults to True (per-class evaluation, the COCO standard); set False for class-agnostic decomposition.

parity_mode follows :class:vernier.Evaluator: "corrected" (default) applies vernier's opinionated fixes for known pycocotools quirks; "strict" reproduces pycocotools bit-exactly (per ADR-0002).

Returns a :class:TideReport carrying the six per-bin ΔmAP values, the baseline mAP, the all-FPs-removed sanity total, and the resolved :class:TideConfig.

See the debugging tutorial (docs/tutorials/debugging-with-tide.md) for a worked example, ADR-0021 for the algorithmic spec, ADR-0022 for the threshold defaults, and ADR-0024 for the keypoints deferral.

.. note:: The opt-in mode="per_threshold" variant of TIDE (10x passes, one per IoU threshold in the AP grid) is not exposed in 0.5.0; planned as a 0.5.x follow-up. The single-t_f form is the paper-faithful default. Per-class drill-down on :class:TideReport is similarly deferred to a 0.5.x follow-up; composing this call with :class:vernier.Evaluator's tables="per_class" path is the recommended workaround until it lands.

fp_iou_histogram

fp_iou_histogram(
    gt: bytes | CocoDataset,
    dt: bytes,
    *,
    iou: object = None,
    t_f: float | None = None,
    max_dets_per_image: int = 100,
    use_cats: bool = True,
    parity_mode: ParityMode = "corrected",
) -> FpIouHistogram

Extract per-FP (iou_same, iou_cross) for ADR-0022 ratification.

Sister entry point to :func:error_decomposition. Same dispatch logic (kernel selection, parity mode, max-dets) but emits the raw IoU pairs instead of the six-bin ΔmAP. Caller bins the values Python-side to compute the bin-as-Bkg fraction at candidate t_b.

Parameters:

Name Type Description Default
gt bytes | CocoDataset

GT JSON bytes (CocoDataset handle deferred, same as :func:error_decomposition).

required
dt bytes

Detection JSON bytes.

required
iou object

Kernel selector — :class:vernier.Bbox (default), :class:vernier.Segm, or :class:vernier.Boundary. :class:vernier.Keypoints raises per ADR-0024.

None
t_f float | None

Foreground threshold for identifying TP / Ignore. Defaults to 0.5 (ADR-0022 standard); the t_b parameter on :func:error_decomposition is not consumed here.

None
max_dets_per_image int

Per-image detection cap; same default as :func:error_decomposition.

100
use_cats bool

Per-class evaluation; same default.

True
parity_mode ParityMode

Same as :func:error_decomposition.

'corrected'

Returns:

Type Description
FpIouHistogram

A :class:FpIouHistogram carrying parallel iou_same / iou_cross numpy arrays plus the metadata the report consumer needs.

Type aliases