This week I had the privilege of having a broken machine at work and trying to fix it in a whole host of ways.
The problem was that last Friday my machine was working and on Monday, after a reboot, it wasn’t working.
Literally nothing changed.
Except for a reboot.
So how would that break things?
Well, as it turns out, we do service discovery via DNS service records (SRV records). We look them up with a standard name server call. No problem.
But then you don’t realize that the service names in the config files you’re looking at are domain names.
Because we don’t use complete domain names.
And you don’t know that the machine changed names.
So the root cause of the problem is that the DNS search path changed. Everything worked when the machine was originally set up. When the name and domain of the machine were changed, /etc/resolv.conf didn’t change.
When the machine was rebooted, the files were regenerated.
And broke the name resolution.
Which broke the service discovery.
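
To make that concrete, here is a rough sketch of how a resolver turns a short name into the full names it actually queries, using the search line from /etc/resolv.conf. It is a simplified model for illustration, not our real lookup code. The service name `_myservice._tcp` and the domains `old.example.com` and `new.example.com` are made up, and the function names are hypothetical.

```python
def parse_search_domains(resolv_conf_text):
    """Return the search domains from resolv.conf-style text.

    If several 'search' lines are present, the last one wins.
    """
    domains = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            domains = fields[1:]
    return domains


def candidate_names(name, search_domains, ndots=1):
    """List the fully qualified names a resolver would try, in order.

    Simplified model: a name with a trailing dot is absolute and skips the
    search list. Otherwise a name with at least `ndots` dots is tried as-is
    first, and a name with fewer dots is tried as-is last.
    """
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}." for domain in search_domains]
    as_is = f"{name}."
    if name.count(".") >= ndots:
        return [as_is] + expanded
    return expanded + [as_is]


# Hypothetical resolv.conf contents before and after the reboot.
before = "search old.example.com\nnameserver 10.0.0.1\n"
after = "search new.example.com\nnameserver 10.0.0.1\n"

# The same short name from the config file, expanded under each search path.
for label, text in (("before", before), ("after", after)):
    print(label, candidate_names("_myservice._tcp", parse_search_domains(text)))
```

Same config file, same short name, different questions sent to the name server. If the SRV records only exist under the old domain, every lookup after the reboot comes back empty.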