Annotation Queues¶
An Annotation Queue is the central object in the annotations system. It defines what to review (the items), how to review it (the schema), who reviews it (the assignees), and how many reviews each item needs.
Annotation Schema¶
Before creating a queue, you need to decide what fields reviewers will fill in. These fields are defined as part of the queue's schema.
Field Types¶
| Type | Description | Options |
|---|---|---|
| integer | Whole number input | Optional min/max constraints |
| float | Decimal number input | Optional min/max constraints |
| string | Free-text input | Optional max length |
| choices | Dropdown with predefined options | List of allowed values (required) |
Each field has a name and an optional description to guide reviewers.
Example schema:
| Field Name | Type | Description |
|---|---|---|
helpfulness |
integer (1–5) | How helpful was the assistant's response? |
tone |
choices (professional, neutral, inappropriate) | Describe the tone of the conversation |
notes |
string | Any additional observations |
Creating a Queue¶
Navigate to Annotation Queues in the left sidebar and click New Queue.
| Field | Description |
|---|---|
| Name | Unique name for this queue within your team |
| Description | Optional context for reviewers |
| Schema | Add one or more annotation fields (see above) |
| Assignees | Team members who will review items in this queue |
| Reviews required | Number of independent reviews needed per item (1–10, default: 1) |
Multiple reviews
Setting Reviews required to more than 1 is useful when you want multiple reviewers to independently annotate the same item — for example, to measure inter-rater agreement or to get a consensus rating. An item is marked Completed only when it has received the required number of reviews.
Queue Status¶
| Status | Meaning |
|---|---|
| Active | Open for annotation |
| Paused | Temporarily closed; reviewers cannot submit new annotations |
| Completed | All items have been reviewed |
| Archived | Queue is closed and no longer active |
Adding Items¶
Items can be added to a queue in three ways.
1. Session Selector¶
From the queue detail page, click Add Sessions. This opens a filterable table of experiment sessions. You can then choose how many sessions to add:
| Mode | Description |
|---|---|
| Selected only (default) | Add only the sessions you've checked |
| All matching filters | Add every session matching the current filter criteria |
| Sample | Add a random percentage of sessions matching the current filters |
Note
A confirmation prompt is shown for bulk operations (all matching, or large samples).
Sessions that are already in the queue are excluded from the selector.
2. Import from Evaluations Dataset¶
Click Import from Dataset on the queue detail page to add sessions from an existing evaluations dataset. This is useful when you want to annotate the same set of sessions you used for automated evaluations.
3. From a Session Detail Page¶
On any session detail page, use the Add to Queue button to add that session directly to one or more annotation queues.
Monitoring Progress¶
The queue detail page shows a progress summary:
- Total items in the queue
- Completed items (reached required number of reviews; for multi-reviewer queues, an authoritative annotation has also been selected)
- Flagged items
- Overall review progress as a percentage (reviews done / reviews needed)
For multi-reviewer queues, the summary also surfaces two additional counts:
- Resolved — items that have an authoritative annotation, shown as
X / N items resolved - Awaiting resolution — items where all required reviews are in but no authoritative annotation has been selected yet (see Resolving Multi-Reviewer Conflicts)
Aggregate Scores¶
After annotations are submitted, aggregate scores are automatically computed and displayed on the queue detail page.
| Field Type | Aggregates Shown |
|---|---|
| Numeric (integer, float) | Mean, median, min, max, standard deviation |
| Categorical (choices) | Mode, distribution percentages per option |
Aggregates are recomputed after each annotation submission, so you always see up-to-date stats.
Multi-reviewer aggregation
For multi-reviewer queues, aggregates prefer the authoritative annotation per item when one is set. Items without an authoritative pick fall back to averaging across all submitted annotations for that item.
Exporting Results¶
From the queue detail page (requires queue management permissions), you can export all submitted annotations:
| Format | Description |
|---|---|
| CSV | Spreadsheet-friendly, one row per annotation |
| JSONL | One JSON object per line, suitable for programmatic processing |
The export includes the item details, reviewer, and all annotation field values. It also includes flagged items that have no reviewer annotations, so every flagged item appears in the export regardless of review status.
Each exported record contains the following fields:
| Field | Description |
|---|---|
| Item details | Source data for the reviewed item (e.g. message content, session metadata) |
| Reviewer | The team member who submitted the annotation |
| Annotation field values | One value per schema field defined on the queue |
session_id |
External UUID of the session linked to the annotation item |
flagged |
Boolean indicating whether the item is flagged |
flagged_reason |
Full list of flag entries recorded on the item |
is_authoritative |
Boolean indicating whether this annotation is the authoritative answer for the item (always true on single-reviewer queues; set per-item by a queue admin on multi-reviewer queues) |
Note
In CSV exports, flagged_reason is JSON-serialized as a string. Use JSONL format if you need to process flag entries programmatically without parsing JSON within a field.
Managing Assignees¶
Use Manage Assignees on the queue detail page to add or remove reviewers. Assignees receive access to the queue and can see it in their annotation queue list.
Note
If a queue has no assignees, any team member with the annotation permission can annotate items in it. Once assignees are added, only they can submit annotations.