Troubleshooting — Network/Power Failover Triage (Generic Example)
Version: v1.0 — 2026-01-01 · Sanitized portfolio sample — not operational guidance.
Symptoms
- Telemetry offline
- Remote commands time out
- Intermittent connection / flapping alarms
Fast checks (2 minutes)
- Confirm the monitoring platform is healthy (rule out a platform outage).
- Check whether multiple sites are affected (scope).
- Note the time of the last known good telemetry (approximate start of the incident).
Decision tree
- If multiple sites are affected → treat as a platform/network issue and escalate.
- If only one site is affected → continue.
- If the last update was < 5 min ago → wait 5 min and re-check (avoids thrashing).
- If the last update was > 10 min ago → initiate comms recovery steps (if permitted).
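The branches above can be sketched as a small function. This is an illustrative sketch only: the names `Action` and `triage` are hypothetical, and the handling of the 5 to 10 minute range, which the tree does not cover, is an assumption (it is treated as "wait and re-check").

```python
from datetime import datetime, timedelta
from enum import Enum


class Action(Enum):
    ESCALATE_PLATFORM = "treat as platform/network issue; escalate"
    WAIT_AND_RECHECK = "wait 5 min and re-check"
    START_RECOVERY = "initiate comms recovery steps (if permitted)"


def triage(sites_affected: int, last_update: datetime, now: datetime) -> Action:
    """Map scope and telemetry age to the next action in the decision tree."""
    # Multiple sites point to a platform or network problem, not a single site.
    if sites_affected > 1:
        return Action.ESCALATE_PLATFORM

    age = now - last_update
    if age > timedelta(minutes=10):
        return Action.START_RECOVERY

    # Under 5 min the tree says wait. The 5-10 min range is not defined in
    # the tree; ASSUMPTION: also wait and re-check rather than act early.
    return Action.WAIT_AND_RECHECK
```

For example, a single site whose last update arrived 15 minutes ago maps to `Action.START_RECOVERY`, while the same site at 2 minutes maps to `Action.WAIT_AND_RECHECK`.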
Recovery steps (low-risk first)
- Re-poll / refresh telemetry once.
- Check comms device status (use a generic status indicator, if available).
- If allowed: restart the comms service (single action, do not repeat) and monitor for 10 minutes.
- If still offline: escalate to a field tech with the incident summary.
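The ordering of these steps (least risky first, one restart at most, then escalate) can be sketched as below. Everything here is hypothetical: `run_recovery` and the three callables stand in for whatever tooling a real platform provides, and `restart_and_monitor` is assumed to perform the single restart plus the 10-minute watch and return whether telemetry came back.

```python
from typing import Callable, List, Tuple


def run_recovery(
    repoll: Callable[[], bool],               # hypothetical: True if telemetry is back
    read_device_status: Callable[[], str],    # hypothetical: generic status text
    restart_and_monitor: Callable[[], bool],  # hypothetical: restart once, watch 10 min
    restart_allowed: bool,
) -> Tuple[str, List[str]]:
    """Run the recovery steps in order and return (outcome, action log)."""
    log: List[str] = []

    # Step 1: a single re-poll, the lowest-risk action.
    if repoll():
        log.append("re-poll: telemetry online")
        return "recovered", log
    log.append("re-poll: still offline")

    # Step 2: record device status for the escalation package.
    log.append(f"comms device status: {read_device_status()}")

    # Step 3: at most one restart, and only if permitted.
    if restart_allowed:
        if restart_and_monitor():
            log.append("restart: telemetry online after monitoring")
            return "recovered", log
        log.append("restart: still offline after monitoring")
    else:
        log.append("restart: not permitted, skipped")

    # Step 4: nothing low-risk is left, so hand off to a field tech.
    return "escalate", log
```

The returned log doubles as the "Actions attempted" entry when an escalation is needed.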
Verification
- Telemetry stable for 10 minutes.
- No repeated comms alarms during verification period.
- Closeout notes entered (what was done, when, and why).
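The first two criteria can be checked mechanically. The sketch below is one possible reading, not a definition from this document: "stable" is assumed to mean no gap between telemetry samples longer than `max_gap` (default 1 minute, an assumed reporting interval), and any comms alarm inside the window is assumed to count as a repeat. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable


def verification_passed(
    sample_times: Iterable[datetime],
    alarm_times: Iterable[datetime],
    start: datetime,
    window: timedelta = timedelta(minutes=10),
    max_gap: timedelta = timedelta(minutes=1),  # ASSUMPTION: expected reporting interval
) -> bool:
    """Return True if telemetry was gap-free and alarm-free for the whole window."""
    end = start + window
    in_window = sorted(t for t in sample_times if start <= t <= end)

    # Include the window edges so missing data at the start or end also fails.
    checkpoints = [start, *in_window, end]
    stable = all(b - a <= max_gap for a, b in zip(checkpoints, checkpoints[1:]))

    # ASSUMPTION: any comms alarm during the window counts as a repeat.
    no_alarms = not any(start <= t <= end for t in alarm_times)
    return stable and no_alarms
```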
Escalation package (copy/paste)
Summary:
- Symptoms:
- Scope (single site vs widespread):
- Last known good telemetry:
- Fast checks completed:
- Actions attempted:
- Current status:
- Safety concerns:
- Requested next steps / owner:
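If the package is generated rather than typed by hand, a small formatter keeps the field order consistent. This is a minimal sketch; the class name `EscalationPackage` and its attribute names are hypothetical, while the labels are taken from the template above.

```python
from dataclasses import dataclass


@dataclass
class EscalationPackage:
    symptoms: str
    scope: str
    last_known_good: str
    fast_checks: str
    actions_attempted: str
    current_status: str
    safety_concerns: str
    next_steps: str

    def render(self) -> str:
        """Return the package as plain text in the template's field order."""
        rows = [
            ("Symptoms", self.symptoms),
            ("Scope (single site vs widespread)", self.scope),
            ("Last known good telemetry", self.last_known_good),
            ("Fast checks completed", self.fast_checks),
            ("Actions attempted", self.actions_attempted),
            ("Current status", self.current_status),
            ("Safety concerns", self.safety_concerns),
            ("Requested next steps / owner", self.next_steps),
        ]
        lines = ["Summary:"] + [f"- {label}: {value}" for label, value in rows]
        return "\n".join(lines)
```

Calling `render()` on a filled-in instance yields text that can be pasted directly into a ticket or chat message.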
Changes
- v1.0 — Initial published sample.