DNS Failover Procedure¶
Classification: CONFIDENTIAL — Internal Use Only
Scenario: SKY is down¶
RAIN automatically assumes all DNS queries when SKY is unreachable. No manual intervention is required for the failover itself.
Recovery steps (restore SKY):
- Diagnose SKY — check VM console in ESXi
- If OS is recoverable, SSH in and restart BIND:
sudo systemctl restart named - If OS is unrecoverable, restore from ESXi snapshot (< 5 min) or rebuild from config backup (< 30 min)
- After SKY is restored, verify zone transfer from SKY to RAIN:
rndc notify wdc.us.gl3 - Confirm both servers are authoritative:
dig @192.168.120.1 sky.wdc.us.gl3 && dig @192.168.120.2 sky.wdc.us.gl3 - Run Post-Change Checklist
Scenario: Both SKY and RAIN are down¶
- Add critical hosts to
/etc/hostson affected workstations as emergency fallback - Restore SKY from snapshot first (primary); RAIN will re-sync automatically on startup
- Verify zone replication before removing
/etc/hostsoverrides
Verify DNS Health¶
# Forward resolution (both servers)
for s in 192.168.120.1 192.168.120.2; do
echo "=== $s ==="
dig @$s sky.wdc.us.gl3 +short
dig @$s rain.wdc.us.gl3 +short
done
# DNSSEC validation
dig @192.168.120.1 wdc.us.gl3 DNSKEY +dnssec
Dns Failover · v1.1 · 2026-03-14 · GPUS-IT · Classification: CONFIDENTIAL — Internal Use Only