Component Coverage Standard
Classification: CONFIDENTIAL - Internal Use Only
Document: governance/component-coverage-standard.md · v1.0 · 2026-05-12 · GPUS-IT
1. Purpose & scope
This standard requires that every host, hypervisor, network appliance,
storage device, power device, and managed cloud service documented
anywhere in the GPUS-IT infrastructure docs portal also exist in the
central inventory (inventory.yaml at the repo root) and appear in each
of the three operations portals at a level appropriate to its
monitoring posture.
Coverage is verified by a pre-build script. Drift fails the Cloud Build
pipeline. Exceptions require a dated entry in .coverage-exceptions.yaml.
The standard's purpose is to prevent infrastructure that exists on paper but is invisible to operations — and infrastructure that exists in operations but is undocumented.
2. The three-portal coverage requirement
A documented infrastructure entity is covered when all three are true:
- mkdocs-portal: referenced from at least one page under docs/architecture/, docs/infrastructure/, docs/hostregistry/, or docs/response-plans/. The documented_in field on the inventory entry lists which pages. The coverage script verifies the listed pages exist and contain the entity ID (case-insensitive).
- status-site: appears as a card in the appropriate category section, showing a monitoring_status dot.
- soc-site: appears in the appropriate tile under the wdc (or future location-scoped) tab, with at least one telemetry source named in its monitoring_intent field.
A PR that adds an infrastructure document without the matching inventory entry, status-site card, or soc-site tile is incomplete and must not merge.
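The mkdocs-portal condition can be sketched roughly as follows. This is an illustration only, not the contents of scripts/check-component-coverage.py: the function name check_documented_in, the entry layout (an id key plus a documented_in list of repo-relative paths), and the sample page text are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def check_documented_in(entry: dict, repo_root: Path) -> list[str]:
    """Return findings for one inventory entry (hypothetical layout).

    Assumes the entry has an "id" key and a "documented_in" list of
    repo-relative page paths; the real schema may differ.
    """
    findings = []
    entity_id = entry["id"].lower()
    pages = entry.get("documented_in", [])
    if not pages:
        findings.append(f"{entry['id']}: no documented_in pages listed")
    for rel_path in pages:
        page = repo_root / rel_path
        if not page.is_file():
            findings.append(f"{entry['id']}: missing page {rel_path}")
        elif entity_id not in page.read_text(encoding="utf-8").lower():
            # Case-insensitive match, as the standard describes.
            findings.append(f"{entry['id']}: not mentioned in {rel_path}")
    return findings


# Demonstration against a throwaway directory tree with invented content.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs/architecture").mkdir(parents=True)
    (root / "docs/architecture/network.md").write_text(
        "Access point WDC-WAP-1 serves the main floor.", encoding="utf-8"
    )
    ok = check_documented_in(
        {"id": "wdc-wap-1", "documented_in": ["docs/architecture/network.md"]}, root
    )
    bad = check_documented_in(
        {"id": "wdc-wap-1", "documented_in": ["docs/infrastructure/missing.md"]}, root
    )
    print(ok)
    print(bad)
```

The status-site and soc-site conditions are not covered here because the standard does not describe how cards and tiles are represented.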
3. The four monitoring states
| State | Meaning | Required fields | UI |
|---|---|---|---|
| live | Telemetry actively wired and reporting today. | monitoring_intent (the active source) | Green dot |
| planned | Documented and committed; telemetry not yet wired. | monitoring_intent (the eventual source) | Striped-yellow dot |
| unmonitored | Exists, but no telemetry intent. | justification (why) | Grey dot |
| decommissioned | No longer in service; display-only. | decommissioned_reason | Strike-through, non-interactive |
planned is the default for newly documented infrastructure. Promotion
to live happens when the named telemetry source produces data the
relevant backend can ingest.
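The required-field rule for each state can be expressed as a small lookup. The sketch below assumes an inventory entry is a plain dictionary; the function name validate_state and the placeholder field values are hypothetical, and the authoritative rules live in the Inventory Schema.

```python
# Required field per monitoring state, taken from the table above.
REQUIRED_FIELD = {
    "live": "monitoring_intent",
    "planned": "monitoring_intent",
    "unmonitored": "justification",
    "decommissioned": "decommissioned_reason",
}


def validate_state(entry: dict) -> list[str]:
    """Return a list of problems with an entry's monitoring state."""
    status = entry.get("monitoring_status")
    if status not in REQUIRED_FIELD:
        return [f"unknown monitoring_status: {status!r}"]
    field = REQUIRED_FIELD[status]
    # An empty or whitespace-only value counts as missing.
    if not str(entry.get(field, "")).strip():
        return [f"{status} entries require {field}"]
    return []


# Placeholder values, for illustration only.
print(validate_state({"monitoring_status": "live",
                      "monitoring_intent": "example-telemetry-source"}))
print(validate_state({"monitoring_status": "unmonitored"}))
print(validate_state({"monitoring_status": "retired"}))
```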
4. The inventory file
inventory.yaml at the repository root is the single source of truth.
See Inventory Schema for the full schema, allowed
field values, and required-field matrix.
The inventory feeds:
- mkdocs-portal indirectly via documentation references
- status-site + soc-site via inventory.json baked at Docker build time
- status-backend + soc-backend via the generated servers.py (linux_hosts section only; non-SSH entities don't appear)
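To illustrate the last feed, the sketch below renders a servers.py module from the linux_hosts section alone. The output format is not specified in this standard, so the SERVERS variable name, the per-host keys, the network_appliances category name, and the sample hosts are all assumptions; scripts/regenerate-servers-py.py remains the real generator.

```python
import pprint


def render_servers_py(inventory: dict) -> str:
    """Render the text of a servers.py module from linux_hosts only.

    The SERVERS name and host keys are assumptions made for this sketch.
    """
    # Sort by ID so the output is stable regardless of inventory order.
    hosts = sorted(inventory.get("linux_hosts", []), key=lambda h: h["id"])
    body = pprint.pformat(hosts, sort_dicts=True)
    return (
        "# Generated from inventory.yaml; do not edit by hand.\n"
        f"SERVERS = {body}\n"
    )


inventory = {
    "linux_hosts": [
        {"id": "host-b", "address": "192.0.2.11"},
        {"id": "host-a", "address": "192.0.2.10"},
    ],
    # Non-SSH categories never reach servers.py.
    "network_appliances": [{"id": "wdc-wap-1"}],
}

generated = render_servers_py(inventory)
namespace = {}
exec(generated, namespace)  # the rendered text is importable Python
print([host["id"] for host in namespace["SERVERS"]])
```

Because the rendering is deterministic, comparing the committed file's text with a fresh render is enough to detect drift.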
5. The coverage check
The script at scripts/check-component-coverage.py runs as a pre-build
step in every service's Cloud Build pipeline. It verifies:
- Inventory schema integrity (required fields, enum values, conditional fields like justification and decommissioned_reason).
- Referential integrity: every reference in powers:, powered_by:, fed_from:, hosted_on:, vms: resolves to a known entity or to an external_references entry.
- documented_in paths exist and mention the entity ID.
- Identifiers cited in docs/hostregistry/*.csv appear in the inventory.
- The committed status-backend/servers.py and soc-backend/servers.py match what would be generated from inventory.linux_hosts.
- Site inventory.json artifacts (if present) match inventory.yaml.
Drift unresolved by an exception fails the build, blocking deploy.
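The referential-integrity item can be sketched as below. The five field names come from this section; everything else is assumed for illustration: a flat mapping of entity ID to entry, reference values that are either a single ID or a list of IDs, and the sample entities.

```python
REFERENCE_FIELDS = ("powers", "powered_by", "fed_from", "hosted_on", "vms")


def unresolved_references(entities: dict, external_references: set) -> list:
    """List (entity_id, field, target) triples that resolve to nothing.

    `entities` maps entity ID to its entry; this flat layout is an assumption.
    """
    known = set(entities) | set(external_references)
    dangling = []
    for entity_id, entry in sorted(entities.items()):
        for field in REFERENCE_FIELDS:
            value = entry.get(field, [])
            # Accept either a single ID or a list of IDs.
            targets = [value] if isinstance(value, str) else value
            for target in targets:
                if target not in known:
                    dangling.append((entity_id, field, target))
    return dangling


# Invented sample data: vm-2 is referenced but never defined.
entities = {
    "hv-1": {"vms": ["vm-1", "vm-2"], "powered_by": "ups-1"},
    "vm-1": {"hosted_on": "hv-1"},
    "ups-1": {"powers": ["hv-1"], "fed_from": "utility-feed"},
}
dangling = unresolved_references(entities, {"utility-feed"})
print(dangling)
```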
Triggers wired to inventory changes
Inventory-only edits (inventory.yaml or .coverage-exceptions.yaml at
the repo root) fire builds on five Cloud Build triggers:
- gpus-deploy-mkdocs-portal
- gpus-deploy-status-site
- gpus-deploy-soc-site
- gpus-deploy-status-backend
- gpus-deploy-soc-backend
Each of those triggers includes inventory.yaml and
.coverage-exceptions.yaml in its includedFiles list, so an
inventory-only commit rebuilds every consumer that bakes
inventory.json or runs the coverage check.
gpus-deploy-security-backend is out of scope for the coverage
standard at present — it has neither the coverage-check pre-build step
nor the inventory-bake step. It is intentionally excluded from the
includedFiles patch.
To extend coverage to a new service, both the service's
cloudbuild.yaml pre-build step and the Cloud Build trigger's
includedFiles need updating; missing either half leaves the standard
documented but not enforced.
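One half of that pairing, the includedFiles list, lends itself to a quick audit. The sketch below assumes the trigger definitions have already been loaded into a dictionary keyed by trigger name; how they are fetched is left out, and the function name and the soc-site/** glob are hypothetical. It does not check the cloudbuild.yaml pre-build step.

```python
REQUIRED_FILES = {"inventory.yaml", ".coverage-exceptions.yaml"}
IN_SCOPE_TRIGGERS = (
    "gpus-deploy-mkdocs-portal",
    "gpus-deploy-status-site",
    "gpus-deploy-soc-site",
    "gpus-deploy-status-backend",
    "gpus-deploy-soc-backend",
)


def triggers_missing_inventory_files(triggers: dict) -> dict:
    """Map trigger name to the required files absent from includedFiles."""
    gaps = {}
    for name in IN_SCOPE_TRIGGERS:
        included = set(triggers.get(name, {}).get("includedFiles", []))
        missing = sorted(REQUIRED_FILES - included)
        if missing:
            gaps[name] = missing
    return gaps


# Hypothetical trigger definitions; one of them is deliberately incomplete.
triggers = {
    name: {"includedFiles": ["inventory.yaml", ".coverage-exceptions.yaml"]}
    for name in IN_SCOPE_TRIGGERS
}
triggers["gpus-deploy-soc-site"] = {"includedFiles": ["soc-site/**", "inventory.yaml"]}

gaps = triggers_missing_inventory_files(triggers)
print(gaps)
```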
6. Exceptions: .coverage-exceptions.yaml
Coverage exceptions go in .coverage-exceptions.yaml at the repo root.
Every exception requires an expires date in ISO format. Indefinite
exceptions are not permitted — the schema forces conscious renewal.
```yaml
exceptions:
  - finding_id: csv:meraki-hostregistry.csv:wdc-wap-1
    expires: 2026-06-15
    rationale: Sample exception entry.
    owner: rajesh.chhetry@greenpeace.us
```
Expired exceptions are ignored — the underlying finding becomes a build failure again.
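The expiry rule can be sketched as a filter over findings. The function names are hypothetical, and treating the expires date itself as the last valid day is an assumption; the standard does not say whether the boundary is inclusive.

```python
from datetime import date


def active_exception_ids(exceptions: list, today: date) -> set:
    """Return finding IDs whose exception has not yet expired."""
    active = set()
    for exc in exceptions:
        expires = exc["expires"]
        # Some YAML loaders return date objects for bare ISO dates;
        # accept plain strings as well.
        if isinstance(expires, str):
            expires = date.fromisoformat(expires)
        # Assumption: the exception still applies on its expiry date.
        if today <= expires:
            active.add(exc["finding_id"])
    return active


def failing_findings(findings: list, exceptions: list, today: date) -> list:
    """Findings that should fail the build on the given day."""
    active = active_exception_ids(exceptions, today)
    return [finding for finding in findings if finding not in active]


exceptions = [{
    "finding_id": "csv:meraki-hostregistry.csv:wdc-wap-1",
    "expires": "2026-06-15",
}]
findings = ["csv:meraki-hostregistry.csv:wdc-wap-1"]

print(failing_findings(findings, exceptions, date(2026, 6, 1)))
print(failing_findings(findings, exceptions, date(2026, 6, 16)))
```

Before the expiry date the finding is suppressed; one day after, it fails the build again.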
7. Workflow: adding new infrastructure
- Add an entry to inventory.yaml under the appropriate category, with monitoring_status: planned and a monitoring_intent value.
- Add a documentation page (or extend an existing one) under docs/architecture/ and list its path in the inventory entry's documented_in: field.
- If the entity is a Linux host, run scripts/regenerate-servers-py.py to regenerate both backends' servers.py.
- Add a card entry to the status-site front-end (read from inventory.json at runtime).
- Add a tile entry to the soc-site front-end (read from inventory.json at runtime).
- Run python3 scripts/check-component-coverage.py locally before pushing.
- Push the branch. The Cloud Build pre-build step verifies coverage.
8. Workflow: promoting from planned to live
- Wire the telemetry source named in monitoring_intent.
- Verify the relevant backend ingests the data.
- Change the inventory entry's monitoring_status from planned to live.
- Commit + push. The portals' UI flips the status dot from striped yellow to green on next build.
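The preconditions for promotion can be written down as guard clauses. This is a sketch under assumptions: the helper promote_to_live and its ingest_verified flag are hypothetical, and in practice the change is a manual edit to inventory.yaml rather than a function call.

```python
def promote_to_live(entry: dict, ingest_verified: bool) -> dict:
    """Return a copy of the entry with monitoring_status set to live."""
    if entry.get("monitoring_status") != "planned":
        raise ValueError("only planned entries can be promoted")
    if not entry.get("monitoring_intent"):
        raise ValueError("monitoring_intent must name a telemetry source")
    if not ingest_verified:
        raise ValueError("backend ingestion has not been verified")
    # Copy rather than mutate, so the caller keeps the original entry.
    return {**entry, "monitoring_status": "live"}


# Placeholder entry, for illustration only.
planned = {
    "id": "host-a",
    "monitoring_status": "planned",
    "monitoring_intent": "example-telemetry-source",
}
promoted = promote_to_live(planned, ingest_verified=True)
print(promoted["monitoring_status"], planned["monitoring_status"])
```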
9. Compliance alignment
| Framework | Reference |
|---|---|
| CIS Controls v8 | Control 1 — Inventory & Control of Enterprise Assets |
| CIS Controls v8 | Control 12 — Network Infrastructure Management |
| NIST CSF 2.0 | ID.AM-01 — Hardware assets managed |
| NIST CSF 2.0 | ID.AM-02 — Software platforms managed |
| NIST SP 800-53 | CM-8 — Information System Component Inventory |
| NIST SP 800-171 | 3.4.1 — Establish and maintain baseline configurations |
| PCI-DSS v4.0 | 9.5 — Physical security of media and systems |