Guide

Quick Start

Stand up one real, testable workflow before you model the rest of the organization.

Before you start

  • Pick one real service.
  • Pick one alert source for that service.
  • Pick one team that should own the page.
  • Pick one runbook you would want during the first live incident.

Do this

  1. In Catalog, create one system. Add one function only if that function changes ownership or routing.
  2. In Alerts, create one source and save one real sample.
  3. Map summary, severity, status, dedupe, environment, and destination.
  4. In Coverage, create one schedule for the owning team.
  5. Test the page path through the schedule and the user escalation policy.
  6. In Incidents, start one test incident and scope it to the same service.
  7. In Runbooks, create one short runbook for a repeated response step.
  8. In Retrospectives, close the loop with one written retro and one action item.
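
The mapping in step 3 can be sketched in code to show what each of the six fields is for. This is a minimal illustration, not the product's mapping syntax: the payload shape, the SEVERITY and STATUS tables, and the map_alert function are all hypothetical. Substitute the fields from the real sample you saved in step 2.

```python
import hashlib

# Hypothetical raw sample, standing in for whatever your alert source sends.
SAMPLE = {
    "title": "High error rate on checkout-api",
    "level": "critical",
    "state": "firing",
    "labels": {
        "service": "checkout-api",
        "env": "production",
        "check": "http_5xx_rate",
    },
}

# Assumed translation tables; adjust them to your source's vocabulary.
SEVERITY = {"critical": "sev1", "warning": "sev3", "info": "sev4"}
STATUS = {"firing": "open", "resolved": "resolved"}


def map_alert(raw):
    """Map a raw alert to the six fields named in step 3."""
    labels = raw.get("labels", {})
    service = labels.get("service", "unknown")
    env = labels.get("env", "unknown")
    check = labels.get("check", "unknown")
    # Build the dedupe key from what identifies the problem (service,
    # environment, check), not from message text that changes per event.
    identity = f"{service}:{env}:{check}"
    return {
        "summary": raw.get("title", "(no summary)"),
        "severity": SEVERITY.get(raw.get("level"), "sev4"),
        "status": STATUS.get(raw.get("state"), "open"),
        "dedupe": hashlib.sha256(identity.encode()).hexdigest()[:16],
        "environment": env,
        "destination": service,
    }


if __name__ == "__main__":
    for field, value in map_alert(SAMPLE).items():
        print(f"{field}: {value}")
```

In this sketch, a firing alert and its later resolved event produce the same dedupe value, so they would collapse into one alert instead of two. The destination is simply the service name, which is the value you would expect to match the system or function created in step 1.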

Check it worked

  • The alert lands on the right system or function.
  • Coverage > On-call shows the expected responder as currently on call.
  • A test page follows the expected channel order and timing.
  • A test incident opens with the right scope and environment.
  • The runbook launches from the incident.
  • The retrospective captures the action item you want to follow up.
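
To check channel order and timing, it helps to write down the expected page timeline before you send the test page. The sketch below is one way to do that, under assumptions: the policy is modeled as a hypothetical list of (channel, wait_minutes) pairs, and expected_timeline and first_mismatch are illustrative helpers, not part of the product.

```python
def expected_timeline(steps):
    """Return (minutes_from_start, channel) for each step, in firing order.

    Each step is a (channel, wait_minutes) pair, where wait_minutes is the
    delay after the previous step.
    """
    elapsed = 0
    timeline = []
    for channel, wait_minutes in steps:
        elapsed += wait_minutes
        timeline.append((elapsed, channel))
    return timeline


def first_mismatch(expected, observed, tolerance_minutes=1):
    """Return the index of the first step that differs, or None if all match."""
    for index, (want, got) in enumerate(zip(expected, observed)):
        wrong_channel = want[1] != got[1]
        wrong_time = abs(want[0] - got[0]) > tolerance_minutes
        if wrong_channel or wrong_time:
            return index
    if len(expected) != len(observed):
        # One side has extra or missing steps after the common prefix.
        return min(len(expected), len(observed))
    return None


if __name__ == "__main__":
    # Assumed policy: push immediately, SMS 5 minutes later, phone 10 after that.
    policy = [("push", 0), ("sms", 5), ("phone", 10)]
    expected = expected_timeline(policy)
    observed = [(0, "push"), (5, "sms"), (15, "phone")]  # noted during the test page
    print(expected)
    print(first_mismatch(expected, observed))
```

Record the observed times by hand during the test page. A mismatch index points at the escalation step to inspect first.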

If it does not work

  • If the alert lands in the wrong place, fix mapping and replay before changing schedules or runbooks.
  • If paging looks wrong, check ownership, schedule layers, and user escalation steps in that order.
  • If the incident feels empty, tighten the source mapping and add one useful runbook before expanding the model.

Next