Inventory Schema — inventory.yaml

Classification: CONFIDENTIAL - Internal Use Only
Document: governance/inventory-schema.md · v1.1 · 2026-05-14 · GPUS-IT


1. Purpose

inventory.yaml at the repository root is the single source of truth for every host, hypervisor, network appliance, storage device, power device, and managed cloud service referenced anywhere in the GPUS-IT infrastructure documentation and portals.

This page documents its schema, allowed field values, and the build-time transformations that derive consumer artifacts from it.

2. Top-level structure

version: 1
last_updated: "YYYY-MM-DD"

inventory:
  linux_hosts:        { <ID>: {  } }
  hypervisors:        { <id>: {  } }
  power_devices:      { <id>: {  } }
  network_devices:    { <id>: {  } }
  storage:            { <id>: {  } }
  cloud_services:     { <id>: {  } }

external_references:  { <id>: {  } }

The top-level inventory: key holds the six managed-infrastructure categories. The sibling external_references: key is a separate allow-list for cross-reference targets that exist as concepts but are not managed as first-class inventory entities (e.g. workstations, PSU partial references, unidentified rogue assets).
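As an illustration of the layout above, the following is a minimal sketch of a top-level structure check. The helper name `check_top_level` is hypothetical, and the sketch assumes all four top-level keys are mandatory, which this page does not state explicitly. It operates on an already-parsed mapping rather than reading the YAML file itself.

```python
# Hypothetical helper; the real coverage script may be organised differently.
EXPECTED_CATEGORIES = {
    "linux_hosts", "hypervisors", "power_devices",
    "network_devices", "storage", "cloud_services",
}
# Assumption: every top-level key shown in the skeleton is required.
TOP_LEVEL_KEYS = ("version", "last_updated", "inventory", "external_references")


def check_top_level(doc: dict) -> list[str]:
    """Return a list of layout problems; an empty list means the layout is fine."""
    problems = [f"missing top-level key: {key}"
                for key in TOP_LEVEL_KEYS if key not in doc]
    categories = set(doc.get("inventory") or {})
    problems += [f"missing inventory category: {name}"
                 for name in sorted(EXPECTED_CATEGORIES - categories)]
    problems += [f"unknown inventory category: {name}"
                 for name in sorted(categories - EXPECTED_CATEGORIES)]
    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report every layout issue in one run.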

3. ID conventions

| Category | ID case | Examples |
| --- | --- | --- |
| linux_hosts | UPPERCASE | SKY, RAIN, OAK |
| hypervisors | lowercase | water, fire, flower |
| power_devices | lowercase, snake_case for compounds | pickle, fennel, power_strip_1 |
| network_devices | snake_case | wdc_fw_primary, wdc_stack_0, gl5_firewall |
| storage | snake_case | nas, visuals_storage_exp_1, ocean |
| cloud_services | snake_case | gpus_forms_db |
| external_references | snake_case | stone, kvm_switch, fire_psu_1 |

The Linux-host UPPERCASE convention is preserved because servers.py (a generated artifact, see §6) keys on those names and several front-end literals already use them.
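The conventions above could be checked mechanically. The sketch below is one interpretation of the table, not the coverage script's actual rule: the regular expressions and the `id_matches_convention` helper are assumptions, and it treats "lowercase" as a single-word case of snake_case.

```python
import re

# Assumed patterns derived from the table; adjust if real IDs use other characters.
UPPERCASE_ID = re.compile(r"^[A-Z][A-Z0-9_]*$")
SNAKE_CASE_ID = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def id_matches_convention(category: str, entity_id: str) -> bool:
    """Check an ID against the case convention assumed for its category."""
    pattern = UPPERCASE_ID if category == "linux_hosts" else SNAKE_CASE_ID
    return bool(pattern.match(entity_id))


# IDs taken from the examples column of the table.
for category, entity_id in [("linux_hosts", "SKY"),
                            ("network_devices", "wdc_stack_0"),
                            ("storage", "visuals_storage_exp_1")]:
    assert id_matches_convention(category, entity_id)
```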

4. Required fields per entity

Universal (all entities in the six inventory categories)

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| desc or role | string | one of these | Short human description. |
| location | enum | yes | wdc, gcp, mdec, dc_apartments, oakec |
| monitoring_status | enum | yes | See §5 |
| monitoring_intent | enum | yes | See §5 |
| documented_in | list[path] | yes | Doc paths referencing this entity. Each path ends in .md or .csv; see §8. |

Category-specific

| Category | Additional required fields |
| --- | --- |
| linux_hosts | ip, fqdn, user, services, tags, prom_ip |
| hypervisors | fqdn, ip, hypervisor, vms |
| power_devices | type (one of ups, ups_extension, pdu, power_strip_unmanaged), powers |
| network_devices | type (one of firewall, switch, wireless_ap), model |
| storage | type (one of nas, storage, storage_controller, vm) |
| cloud_services | type (e.g. cloud_sql, cloud_run, pub_sub, cloud_storage_bucket) |

Optional but common

ip, mac, serial, mgmt_port, dashlane, email_alerts, hardware_model, hypervisor, powers, powered_by, fed_from, hosted_on, vms, criticality, eol_note, risk, note, justification, decommissioned_reason, action_required.

Relaxations by status

Entities with monitoring_status: unmonitored or decommissioned skip some required fields that don't apply:

  • unmonitored skips model, ip, fqdn (unmanaged devices often have none).
  • decommissioned skips vms, powers, ip, fqdn, hypervisor, model.

The coverage check enforces these relaxations automatically.
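The interaction between universal fields, category-specific fields, and the status relaxations can be expressed as set arithmetic. The following is a minimal sketch under that reading; the `missing_fields` helper and the `desc|role` label are hypothetical, and the real coverage script may implement this differently.

```python
UNIVERSAL = {"location", "monitoring_status", "monitoring_intent", "documented_in"}
CATEGORY_REQUIRED = {
    "linux_hosts": {"ip", "fqdn", "user", "services", "tags", "prom_ip"},
    "hypervisors": {"fqdn", "ip", "hypervisor", "vms"},
    "power_devices": {"type", "powers"},
    "network_devices": {"type", "model"},
    "storage": {"type"},
    "cloud_services": {"type"},
}
# Fields that stop being required for a given monitoring_status.
RELAXED = {
    "unmonitored": {"model", "ip", "fqdn"},
    "decommissioned": {"vms", "powers", "ip", "fqdn", "hypervisor", "model"},
}


def missing_fields(category: str, entity: dict) -> list[str]:
    """List required fields absent from an entity, after status relaxations."""
    relaxed = RELAXED.get(entity.get("monitoring_status"), set())
    required = (UNIVERSAL | CATEGORY_REQUIRED[category]) - relaxed
    missing = sorted(field for field in required if field not in entity)
    # desc and role are alternatives: at least one must be present.
    if "desc" not in entity and "role" not in entity:
        missing.append("desc|role")
    return missing
```

Under this sketch, an unmonitored network device without `model` passes, while the same entity with `monitoring_status: live` is reported as missing `model`.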

5. Enum values

monitoring_status

| Value | Meaning |
| --- | --- |
| live | Telemetry actively wired and reporting today. |
| planned | Exists or committed to exist; telemetry not yet wired. UI renders striped-yellow status dot. |
| unmonitored | Exists, no telemetry intent. Requires justification. UI renders grey status dot. |
| decommissioned | No longer in service. Requires decommissioned_reason. UI renders strike-through, not interactive. |

monitoring_intent

| Value | Applies to |
| --- | --- |
| ssh+prometheus | Linux hosts |
| vmware_exporter+syslog | ESXi hypervisors |
| snmpv3 | APC PDU / UPS (default) |
| snmpv2c | Legacy APC PDUs (Fennel exception) |
| snmp+dsm_syslog | Synology NAS / storage |
| meraki_api+syslog | Meraki firewalls / switches / APs |
| snmpv3+syslog | Non-Meraki firewalls (GL5) |
| cloud_sql_metrics+iam_audit_log | Managed Cloud SQL instances |
| none | Unmonitored entities (requires justification) |

criticality

Optional but recommended for new entries from 2026-05-14 forward. The coverage check rejects values outside this set. Existing pre-2026-05-14 entities are being backfilled — see mkdocs-portal/CLAUDE.md known-issues for owner/progress.

| Value | Meaning | Example entities |
| --- | --- | --- |
| critical | Outage causes immediate revenue loss, production data loss, or org-wide work stoppage. | Production Cloud SQL, primary firewalls, hypervisors hosting active workloads. |
| high | Outage degrades a major function or breaks a single site's connectivity. | Site firewalls, core switches, primary NAS. |
| medium | Outage affects a localized function or single team. | Access switches, site APs in shared workspaces. |
| low | Outage affects individual convenience or non-essential coverage. | Residential APs, conference-room AV. |
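A minimal sketch of enum validation for the value sets listed on this page follows. The `enum_errors` helper is hypothetical. It only checks fields that are present, so an omitted `criticality` passes (consistent with it being optional), and it leaves `monitoring_intent` out because this page does not say whether that list is exhaustive.

```python
ENUMS = {
    "location": {"wdc", "gcp", "mdec", "dc_apartments", "oakec"},
    "monitoring_status": {"live", "planned", "unmonitored", "decommissioned"},
    "criticality": {"critical", "high", "medium", "low"},
}


def enum_errors(entity: dict) -> list[str]:
    """Report enum fields whose value is outside the documented set."""
    errors = []
    for field, allowed in ENUMS.items():
        # Absent fields are skipped here; presence is a separate check.
        if field in entity and entity[field] not in allowed:
            errors.append(f"{field}: {entity[field]!r} is not an allowed value")
    return errors
```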

6. Generated artifacts

| Artifact | Generated from | Generator | When |
| --- | --- | --- | --- |
| servers.py (repo root) | inventory.linux_hosts | scripts/regenerate-servers-py.py | Pre-commit (human) |
| status-backend/servers.py | inventory.linux_hosts | scripts/regenerate-servers-py.py | Pre-commit (human) |
| soc-backend/servers.py | inventory.linux_hosts | scripts/regenerate-servers-py.py | Pre-commit (human) |
| status-site/inventory.json | inventory.yaml | Cloud Build step | Per deploy |
| soc-site/inventory.json | inventory.yaml | Cloud Build step | Per deploy |

The three servers.py files remain committed for local development. The two inventory.json files are baked into the site Docker images at build time and not committed.
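Because the three servers.py files are committed, they can drift from inventory.yaml. The sketch below shows one way such drift could be detected: regenerate the expected text and compare it byte-for-byte with each committed copy. The `render_servers_py` output format and the `SERVERS` variable name are invented for illustration; the layout produced by scripts/regenerate-servers-py.py is not documented on this page.

```python
from pathlib import Path
from pprint import pformat

ARTIFACT_PATHS = ["servers.py", "status-backend/servers.py", "soc-backend/servers.py"]


def render_servers_py(linux_hosts: dict) -> str:
    """Render a deterministic module body (hypothetical format, keys sorted)."""
    body = pformat(linux_hosts, indent=4, sort_dicts=True)
    return f"# Generated from inventory.yaml. Do not edit by hand.\nSERVERS = {body}\n"


def drifted_artifacts(repo_root: Path, linux_hosts: dict) -> list[str]:
    """Return the artifact paths that are missing or differ from the rendering."""
    expected = render_servers_py(linux_hosts)
    drifted = []
    for rel in ARTIFACT_PATHS:
        path = repo_root / rel
        if not path.is_file() or path.read_text() != expected:
            drifted.append(rel)
    return drifted
```

Sorting keys during rendering keeps the output stable regardless of the order in which hosts appear in the YAML, so a comparison only fails on real content changes.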

7. Optional fields

risk (free text)

Documents known risks associated with the entity. Read by humans only; the coverage script does not act on it. Examples: SPOF callouts, missing credentials, legacy model warnings.

justification (free text)

Required when monitoring_status: unmonitored. Documents why monitoring is deferred. Coverage check fails if missing on an unmonitored entity.

decommissioned_reason (free text)

Required when monitoring_status: decommissioned. Documents what removed the entity from service. Coverage check fails if missing on a decommissioned entity.

8. documented_in — accepts .md and .csv

The documented_in: field is a list of relative paths pointing at documentation pages or host-registry CSVs that reference the entity.

  • .md paths: the coverage check verifies the file exists and contains the entity ID as a case-insensitive substring of its content.
  • .csv paths: the coverage check verifies the file exists and that some column value in some row contains the entity ID (case-insensitive). CSV rows are treated as legitimate documentation: host registries are part of the docs portal nav and are authoritative for hostname/MAC/serial data.

When a listed page exists but does not mention the entity, the resulting warning is informational unless escalated via .coverage-exceptions.yaml.
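The two matching rules for documented_in can be sketched as follows. The `path_mentions` helper is hypothetical and assumes UTF-8 files; it returns False for missing files and for any extension other than .md or .csv.

```python
import csv
from pathlib import Path


def path_mentions(path: Path, entity_id: str) -> bool:
    """Check whether a .md or .csv file mentions the entity ID, ignoring case."""
    if not path.is_file():
        return False
    needle = entity_id.lower()
    if path.suffix == ".md":
        # Markdown: substring match against the whole file content.
        return needle in path.read_text(encoding="utf-8").lower()
    if path.suffix == ".csv":
        # CSV: substring match against each cell of each row.
        with path.open(newline="", encoding="utf-8") as handle:
            return any(needle in cell.lower()
                       for row in csv.reader(handle)
                       for cell in row)
    return False
```

Because the match is a case-insensitive substring test, a short ID can match unrelated text: under this sketch, the ID OAK would match any file that merely contains the location value oakec.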

9. Referential integrity

The powers:, powered_by:, fed_from:, hosted_on:, and vms: fields reference the IDs of other entities. The coverage check (§10) verifies that every reference resolves to either an inventory entity or an external_references entry.
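Reference resolution can be sketched as a set-membership test over all known IDs. The `dangling_references` helper below is hypothetical, and it assumes each reference field holds either a single ID string or a list of ID strings, and that IDs are compared exactly (case-sensitive); this page does not specify either point.

```python
REFERENCE_FIELDS = ("powers", "powered_by", "fed_from", "hosted_on", "vms")


def dangling_references(doc: dict) -> list[tuple[str, str, str]]:
    """Return (entity_id, field, target) for every reference that does not resolve."""
    inventory = doc.get("inventory") or {}
    # Known IDs: every inventory entity plus every external_references entry.
    known = set(doc.get("external_references") or {})
    for entities in inventory.values():
        known.update(entities or {})

    dangling = []
    for entities in inventory.values():
        for entity_id, entity in (entities or {}).items():
            for field in REFERENCE_FIELDS:
                value = entity.get(field) or []
                targets = [value] if isinstance(value, str) else list(value)
                dangling += [(entity_id, field, target)
                             for target in targets if target not in known]
    return dangling
```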

10. Coverage check

The script at scripts/check-component-coverage.py (see component-coverage-standard.md) enforces:

  1. Every entity has all required fields (with status-based relaxations).
  2. Every monitoring_status: unmonitored has justification.
  3. Every monitoring_status: decommissioned has decommissioned_reason.
  4. Every entity has documented_in with at least one path that exists.
  5. Every doc path in documented_in mentions the entity ID (case-insensitive content match for .md, column-value match for .csv).
  6. Every reference in powers: / powered_by: / fed_from: / hosted_on: / vms: resolves to a known ID.
  7. The committed servers.py files match what would be generated from inventory.linux_hosts.
  8. Identifiers cited in docs/hostregistry/*.csv appear in the inventory (excluding workstations, desktops, printers, and other assets outside WDC scope).

Drift fails the build unless covered by an active exception in .coverage-exceptions.yaml.