Wolf Incident Postmortem

post by jefftk (jkaufman) · 2023-01-09T03:20:03.723Z · LW · GW · 13 comments

Contents

  Incident #210
    Status
    Summary
    Impact
    Root causes
    Trigger
    Resolution
    Detection
    Action Items
    Lessons Learned
      What went well
      What went wrong
      Where we got lucky
    Timeline
13 comments

Incident #210

Status

Complete, one action item outstanding.

Summary

Sentinel consumed by wolf after repeated false alarms.

Impact

Loss of sentinel. No flock impact.

Root causes

Sentinel generated noisy alerts due to premature deployment, incomplete training, and an overly monotonous task. Oncalls failed to respond to true positive due to alert fatigue.

Trigger

Wolf.

Resolution

Gathered flock. Deployed replacement sentinel.

Detection

Sentinel did not report at end of shift.

Action Items

Priority  Action Item                                            Type      Status
P0        Gather flock                                           mitigate  complete
P0        Deploy replacement sentinel                            mitigate  complete
P1        Update playbook for wolf alerts                        prevent   complete
P2        Update remaining sentinels                             prevent   complete
P2        Revise sentinel training program                       prevent   complete
P2        Investigate equipping sentinels with flutes or slings  prevent   in progress

Lessons Learned

What went well

What went wrong

Where we got lucky

Timeline

All times local

March 3rd:

March 4th:

March 5th:

March 6th:


13 comments

Comments sorted by top scores.

comment by Raemon · 2023-01-09T05:24:49.514Z · LW(p) · GW(p)

Man this is some fuckin' poetry.

comment by jimrandomh · 2023-01-09T16:45:13.854Z · LW(p) · GW(p)

This historical incident report fails to mention the true root cause, which has since been addressed: Wolves were not yet locally driven to extinction.

Replies from: jkaufman, lahwran
comment by jefftk (jkaufman) · 2023-01-09T17:46:41.874Z · LW(p) · GW(p)

I thought the true root cause was that people were still raising animals for human consumption?

comment by the gears to ascension (lahwran) · 2023-01-10T02:59:35.014Z · LW(p) · GW(p)

In sufficiently complex systems, there is often no single root cause, even if you could see the entire causal graph. To me, this seems like a case where that applies.

... but also, what jeff said

comment by DragonGod · 2023-01-09T14:23:42.016Z · LW(p) · GW(p)

I don't understand what I just read.

Replies from: swarriner
comment by swarriner · 2023-01-09T14:32:17.287Z · LW(p) · GW(p)

It's the "Boy who cried wolf" fable in the format of an incident report such as what might be written in the wake of an industrial disaster. Whether the fictional report writer has learned the right lessons I suppose is an exercise left for the reader.

comment by FeepingCreature · 2023-01-09T16:36:46.475Z · LW(p) · GW(p)

See also: Swiss cheese model

tl;dr: don't overanalyze the final cause of disaster; usually it was preceded by serial failure of prevention mechanisms, any one or all of which can be improved for risk reduction.

Replies from: None
comment by [deleted] · 2023-01-10T01:56:02.109Z · LW(p) · GW(p)

Yeah, but false positives. Whenever anyone points to all the ignored warnings, they never try to calculate how many times the same warning occurred and everything was fine.

It's easy to point to O-rings after the Space Shuttle is lost. But how many thousand other weak links were NASA/contractor engineers concerned about?

comment by Metacelsus · 2023-01-10T23:39:32.126Z · LW(p) · GW(p)

OK, but why equip sentinels with flutes?

Replies from: jkaufman
comment by jefftk (jkaufman) · 2023-01-11T00:15:14.791Z · LW(p) · GW(p)

To make the task less monotonous. This is also a major benefit of slings.

comment by Oliver Sourbut · 2023-01-16T21:53:14.765Z · LW(p) · GW(p)

Oh boy, this is terrifyingly familiar from my oncall days!

comment by greylag · 2023-01-09T07:01:53.119Z · LW(p) · GW(p)

(Epistemic status: lyrics)

I’m not too clear about what you just spoke. Is that a parable, or a very subtle joke?

Replies from: aphyer
comment by aphyer · 2023-01-09T14:16:47.853Z · LW(p) · GW(p)

If you're making false claims of your incomprehension, it's clear that you've missed the moral dimension. When you truly can't get what someone is saying, remember today and the games you were playing. It takes people effort to give added proof...and they won't put that in for the boy who cries wolf.