Troubleshooting — Network/Power Failover Triage (Generic Example)

Version: v1.0 — 2026-01-01 · Sanitized portfolio sample — not operational guidance.

Troubleshooting triage decision tree

Symptoms

  • Telemetry offline
  • Remote commands time out
  • Intermittent connection / flapping alarms

Fast checks (2 minutes)

  1. Confirm the monitoring platform is healthy (rule out a platform outage).
  2. Check whether multiple sites are affected (scope).
  3. Note the time of the last known good telemetry (start of incident).

Decision tree

  • If multiple sites are affected → treat as platform/network issue; escalate.
  • If only one site is affected → continue with the telemetry-age checks below.
  • If last update < 5 min ago → wait 5 min and re-check (avoid thrash).
  • If last update 5 to 10 min ago → keep waiting and re-check at the 10-minute mark.
  • If last update > 10 min ago → initiate comms recovery steps (if permitted).
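
The branches above can be sketched as a small function. This is a minimal illustration, not part of any real tooling: the names triage and Action are hypothetical, and it assumes that a telemetry age from 5 to 10 minutes is handled like the "wait" case.

```python
from datetime import datetime, timedelta
from enum import Enum


class Action(Enum):
    ESCALATE_PLATFORM = "treat as platform/network issue; escalate"
    WAIT_AND_RECHECK = "wait and re-check"
    START_RECOVERY = "initiate comms recovery steps (if permitted)"


# Threshold taken from the decision tree.
STALE = timedelta(minutes=10)


def triage(sites_affected: int, last_update: datetime, now: datetime) -> Action:
    """Map incident scope and telemetry age to the next action."""
    if sites_affected > 1:
        return Action.ESCALATE_PLATFORM
    if now - last_update > STALE:
        return Action.START_RECOVERY
    # Assumption: ages from 5 to 10 minutes are treated like the
    # "< 5 min" case, i.e. keep waiting until the 10-minute mark.
    return Action.WAIT_AND_RECHECK
```

Checking scope before telemetry age keeps a widespread outage from triggering per-site recovery actions.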

Recovery steps (low-risk first)

  1. Re-poll / refresh telemetry once.
  2. Check comms device status (generic indicator if available).
  3. If permitted: restart the comms service (single attempt) and monitor for 10 minutes.
  4. If still offline: escalate to a field tech with the escalation package below.
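
The ordering above (low-risk first, permission-gated restart, escalation as the fallback) could be expressed as follows. This is a sketch only: Step and run_recovery are hypothetical names, each action is a placeholder callable that returns True once telemetry is back, and the 10-minute monitoring period is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    label: str
    action: Callable[[], bool]      # returns True once telemetry is back
    needs_permission: bool = False  # e.g. the comms service restart


def run_recovery(steps: List[Step], permitted: bool) -> Tuple[str, List[str]]:
    """Run steps in order; stop at the first one that restores telemetry."""
    attempted: List[str] = []
    for step in steps:
        if step.needs_permission and not permitted:
            continue  # skip actions the operator is not allowed to take
        attempted.append(step.label)
        if step.action():
            return "recovered", attempted
    # Nothing worked: hand over to a field tech with the attempted actions.
    return "escalate", attempted
```

The returned list of attempted labels maps directly onto the "Actions attempted" line of the escalation package.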

Verification

  • Telemetry stable for 10 minutes.
  • No repeated comms alarms during verification period.
  • Closeout notes entered (what/when/why).
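
The first two criteria above can be checked mechanically. The sketch below is illustrative and makes assumptions: the function name verified is hypothetical, and telemetry dropouts and comms alarms are supplied as plain lists of timestamps. Closeout notes remain a manual step.

```python
from datetime import datetime, timedelta
from typing import Iterable

WINDOW = timedelta(minutes=10)  # verification period from the checklist


def verified(
    recovered_at: datetime,
    now: datetime,
    alarm_times: Iterable[datetime],
    dropout_times: Iterable[datetime],
) -> bool:
    """True when the full window has elapsed with no alarms or dropouts."""
    window_end = recovered_at + WINDOW
    if now < window_end:
        return False  # verification period not finished yet
    events = list(alarm_times) + list(dropout_times)
    # Only events inside the verification window count against it.
    return not any(recovered_at <= t <= window_end for t in events)
```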

Escalation package (copy/paste)

Summary:
- Symptoms:
- Scope (single site vs widespread):
- Last known good telemetry:
- Fast checks completed:
- Actions attempted:
- Current status:
- Safety concerns:
- Requested next steps / owner:
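
If the package is produced by a script rather than filled in by hand, a renderer along these lines would keep the field order fixed. It is a minimal sketch: render_package, the dictionary keys, and the "TBD" placeholder for missing values are all assumptions made for illustration.

```python
from typing import Dict

# (label shown in the package, hypothetical dictionary key)
FIELDS = [
    ("Symptoms", "symptoms"),
    ("Scope (single site vs widespread)", "scope"),
    ("Last known good telemetry", "last_good"),
    ("Fast checks completed", "fast_checks"),
    ("Actions attempted", "actions"),
    ("Current status", "status"),
    ("Safety concerns", "safety"),
    ("Requested next steps / owner", "next_steps"),
]


def render_package(values: Dict[str, str]) -> str:
    """Render the copy/paste escalation summary in template order."""
    lines = ["Summary:"]
    for label, key in FIELDS:
        # Missing fields are marked rather than silently dropped.
        lines.append(f"- {label}: {values.get(key, 'TBD')}")
    return "\n".join(lines)
```

Marking missing fields explicitly makes an incomplete package obvious to whoever receives the escalation.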

Changes

  • v1.0 — Initial published sample.