Marimosa and The Legend of the Single Point of Failure
Marimosa look after the systems we’ve built after go-live; this is what we like to call our Sidekick Support!
However, we don’t always host the servers; sometimes the client wishes to do that themselves. This is fine until something happens and a server needs rebooting. This happened to a client of ours recently.
The system became slow and we were notified 20 minutes after the problem started. Within a minute we’d ascertained that one of the servers was under heavy load and was inaccessible. Whilst it might have become unblocked in a few minutes, we recommended that it be rebooted. However, only one person within the client’s business had the credentials to log into the hosting portal to reboot the server, and he was still on his way home from work (these things never happen during the day, do they?).
As soon as Mr Credentials got home, he rebooted the servers and we made sure the system was operational as expected. The process took less than 3 minutes once the reboot was initiated.
Whilst the moral of the tale might be that more than one person should have access to carry out business-critical or technology-critical tasks, I think it is relevant to business as a whole.
Are there any single points of failure within your business? Does one person just crack on with a mission-critical job that no one else really knows how to do?
Does your operational software run on a “server” under someone’s desk?
Just some food for thought.