feat: Add orchestrator queue health metrics#77

Draft
morgan-wowk wants to merge 1 commit into metrics-pipeline-outcomes from metrics-queue-health

Collaborator

morgan-wowk commented Feb 2, 2026

Tracks orchestrator queue processing activity and errors without costly COUNT(*) queries.

Metrics added:

  • orchestrator_queue_processing_errors_total: Counter of queue processing errors by queue_type
  • orchestrator_executions_processed_total: Counter tracking queue sweeps by queue_type and whether work was found

These metrics enable monitoring of orchestrator health, processing rates, and error rates without database overhead.
Processing rates can be derived from the found_work labels to understand queue activity patterns.
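
As a rough illustration of deriving rates from the found_work label, the sketch below compares two snapshots of orchestrator_executions_processed_total taken some seconds apart. The snapshot shape, the label values "true"/"false", and the queue name "pending" are assumptions made for this example; the PR does not specify them.

```python
from typing import Dict, Tuple

# Hypothetical snapshot of orchestrator_executions_processed_total:
# keys are (queue_type, found_work) label pairs, values are cumulative counts.
Snapshot = Dict[Tuple[str, str], float]


def sweep_rates(earlier: Snapshot, later: Snapshot, seconds: float) -> Dict[str, Dict[str, float]]:
    """Return per-queue sweep rate and the fraction of sweeps that found work."""
    result: Dict[str, Dict[str, float]] = {}
    queue_types = {queue for queue, _ in later}
    for queue in sorted(queue_types):
        # Difference between snapshots gives the sweeps in the interval.
        found = later.get((queue, "true"), 0.0) - earlier.get((queue, "true"), 0.0)
        empty = later.get((queue, "false"), 0.0) - earlier.get((queue, "false"), 0.0)
        total = found + empty
        result[queue] = {
            "sweeps_per_second": total / seconds,
            # Guard against an interval with no sweeps at all.
            "found_work_ratio": found / total if total else 0.0,
        }
    return result


# Assumed label values: found_work is "true" or "false"; "pending" is a made-up queue_type.
earlier = {("pending", "true"): 10.0, ("pending", "false"): 90.0}
later = {("pending", "true"): 40.0, ("pending", "false"): 120.0}
rates = sweep_rates(earlier, later, seconds=60.0)
```

A low found_work_ratio alongside a steady sweep rate would suggest the orchestrator is polling a mostly idle queue, while a ratio near 1 suggests the queue is consistently busy.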

morgan-wowk changed the base branch from metrics-pipeline-outcomes to graphite-base/77 February 3, 2026 06:55
morgan-wowk changed the base branch from graphite-base/77 to metrics-pipeline-outcomes February 3, 2026 06:55
if _orchestrator_processing_errors_counter is None:
    _orchestrator_processing_errors_counter = meter.create_counter(
        name="orchestrator_queue_processing_errors_total",
        description="Total number of orchestrator queue processing errors",
Contributor

This resets after every restart/deploy, correct?

Contributor

What if I wanted to see a graph of the number of errors over the last few months to look at trends?
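
For context on the two questions above: an in-process cumulative counter does start again from zero when the process restarts, but many metrics backends (Prometheus's `increase()` and `rate()` functions, for example) treat a drop in the value as a reset and still report a correct total over the window. The sketch below shows that reset-aware calculation in plain Python. The PR does not say which backend stores these metrics, so this is an assumption about how the data would be queried, and how far back a trend graph can go depends on that backend's retention.

```python
from typing import Iterable


def increase_with_resets(samples: Iterable[float]) -> float:
    """Total increase of a cumulative counter across samples, tolerating resets."""
    total = 0.0
    previous = None
    for value in samples:
        if previous is not None:
            if value >= previous:
                total += value - previous
            else:
                # The counter went down: assume the process restarted from zero,
                # so everything in the new value happened since the reset.
                total += value
        previous = value
    return total


# Scraped values of a counter; the drop from 5 to 1 models a restart/deploy.
errors_seen = increase_with_resets([0, 3, 5, 1, 4])
```

Errors that occur between the last scrape and a restart are not visible to this calculation, which is a general limitation of scraped cumulative counters.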

2 participants