FM: design for diagnosis inputs/analysis preparation phase #10073

@hawkw

RFD 603 kind of punted on designing this, but if we are going to actually perform FM analysis, we need to figure out how to determine which inputs get passed into the FM analysis code, and how those inputs are collected and accessed.

Some notes:

  • The current inventory is an input, but that's one of the easier bits (I hope)
    • We probably need something that guards against doing FM analysis using an older inventory collection than the one in the parent sitrep?
  • FM-driven observations of health endpoints/other states
    • Eventually the sitrep will have to be able to request that something goes and polls a health endpoint, both in response to things like data loss ereports and to check up on ongoing cases
    • Requesting an observation is an output from one sitrep and the result of the observation is an input to a subsequent one
    • Querying Oximeter could probably fit in here unless it makes more sense for it not to?
    • We gotta figure out how to model this part eventually...
  • Ereports
    • We need some way to be aware of which ereports are new since the current sitrep
      • This could just mean tracking which ones aren't in cases, but that could become an expensive query
      • RFD 603 suggested doing this by tracking the last seen ENA for each reporter restart ID, and querying anything newer than that or with new restart IDs.
      • That could work but would require some finagling. In particular, we want to avoid keeping around "old" restart IDs once a thing has restarted again, so that whatever tracks this doesn't grow without bound.
      • Or, perhaps we could just track this in memory while we are loading a sitrep: as we load the ereports, also track the latest ENA from each reporter. This way we are stuffing less state in the database forever...
    • But, we will eventually also want some way to be able to look back at historical ereports that we already saw but which are still in the db
      • The easiest way to do this is to "just query the database if you want to look at older ereports", but we would like to decouple the analysis logic from CRDB, and we would really like analysis to be a pure function of inputs to outputs...
      • @smklein suggested that perhaps we model querying historical ereports as a type of observation request, as discussed above.
        • This feels conceptually very tidy but means we have to figure out observation requests before we can query old ereports. Which is maybe fine...
      • This may not be needed immediately so we can probably figure it out after we figure out some of the other stuff
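
To make the "track the latest ENA per reporter in memory while loading a sitrep" idea a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: the `Ereport`, `Watermark`, and `Watermarks` names, the integer stand-ins for reporter and restart IDs, and especially the `collected_at` field. It assumes that every ereport from an old restart is collected before any ereport from the restart that replaced it, which is what lets a single entry per reporter replace the old restart ID instead of accumulating them.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical stand-ins; the real types would presumably be UUIDs,
// ENAs, and timestamps.
type ReporterId = u32;
type RestartId = u128;

#[derive(Debug, Clone, Copy)]
struct Ereport {
    reporter: ReporterId,
    restart_id: RestartId,
    ena: u64,
    // Assumption: a collection time that orders restarts of one reporter.
    collected_at: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Watermark {
    restart_id: RestartId,
    last_ena: u64,
    last_collected_at: u64,
}

/// At most one watermark per reporter, so this cannot grow with the
/// number of restarts.
#[derive(Debug, Default)]
struct Watermarks {
    by_reporter: HashMap<ReporterId, Watermark>,
}

impl Watermarks {
    /// Called for each ereport as the sitrep is loaded.
    fn observe(&mut self, e: &Ereport) {
        let fresh = Watermark {
            restart_id: e.restart_id,
            last_ena: e.ena,
            last_collected_at: e.collected_at,
        };
        match self.by_reporter.entry(e.reporter) {
            Entry::Vacant(slot) => {
                slot.insert(fresh);
            }
            Entry::Occupied(mut slot) => {
                let w = slot.get_mut();
                if w.restart_id == e.restart_id {
                    // Same restart: advance the high-water marks.
                    w.last_ena = w.last_ena.max(e.ena);
                    w.last_collected_at = w.last_collected_at.max(e.collected_at);
                } else if e.collected_at > w.last_collected_at {
                    // Newer restart: drop the old restart ID entirely.
                    *w = fresh;
                }
                // Otherwise the ereport is from a superseded restart; ignore it.
            }
        }
    }

    /// Whether an ereport is newer than anything the loaded sitrep saw.
    fn is_new(&self, e: &Ereport) -> bool {
        match self.by_reporter.get(&e.reporter) {
            None => true,
            Some(w) if w.restart_id == e.restart_id => e.ena > w.last_ena,
            Some(w) => e.collected_at > w.last_collected_at,
        }
    }
}
```

The idea would be to call `observe` on every ereport while loading the parent sitrep, then use `is_new` (or an equivalent query predicate) to pick out the ereports to feed into the next analysis. One limitation of this sketch: an ereport from a superseded restart that was never seen is treated as not new, so it only works if the ordering assumption above actually holds.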

Labels

fault-management: Everything related to the fault-management initiative (RFD480 and others)