Linux and Open Source articles, topics and discussion with your questions answered by dmourati, a Linux expert with over eight years production experience.
Wednesday, January 30, 2008
Why Sysadmins Suck at Documentation
I just read an interesting post titled System Administration and Documentation at The Branch. Allen writes that documentation is missing from the standard identify, troubleshoot, resolve methodology applied by most sysadmins.
Sysadmins suck at documentation. We can't be bothered. We think documentation is beneath us. We are the gatekeepers of information.
Want your application in the DMZ to be able to talk to the backend? Better go see your sysadmin (or netadmin), because without them you are dead in the water.
"Open a ticket!" we bark.
Not writing documentation gives us power. Power to hold over our users and our colleagues. Job security. Power to be needed.
Without us, the workplace grinds to a halt. Need a new user account?
"File a ticket, our sysadmin is out today. He will get to it when he gets back from the conference."
Hours go by, work piles up. The ticket queue is now full of requests from the past day's absence from the office.
"Ha, all those users are hopeless without me. Let's see who I've kept waiting while I was enjoying my precious PTO."
"Hmm, Fran needs a new printer...low priority, she can wait. James can't get his application to talk through the firewall, what a loser. Aha, here's one, install this new Wiki package for inline editing. That sounds like fun, off I go working on that. Where's that coffee?"
Meanwhile, Fran and James are stuck. They are patient people, but they filed their issues yesterday, while we were out. By lunchtime, they are wondering what is up. If they are feeling up to it, they may swing by your office to "get an update."
"Update?" you shout. "Yeah, I'm working on it."
"Okay, do you have an ETA? I'm kinda stuck without that printer," Fran says.
"I'll get to it."
Meanwhile, more work piles in. You're off in Wiki-land reading *gasp* documentation.
"How do I install this plugin? What version am I running? Let's see. Gosh, these developers suck, why couldn't they make this easier to follow..."
By funneling all this work through us, we have succeeded in creating a logjam. The work we find interesting is often time-consuming and rarely time-critical, while the work our users are waiting on is usually quick and urgent. This makes for a bad situation.
The users suffer. They are stuck because we have taken away their ability to be productive. Fran still needs that printer, James still needs connectivity.
If any of the above scenarios sound familiar, it is because they are true. This is how things go in a real office. What can we learn from this pattern? What is the way out? How can a guy take a day off without grinding things to a halt?
Organizations don't get very far with just one admin. They usually add help over time to meet the growing needs. This could be for IT, networking, security, systems administration, database administration or any related job description. The point is, we are part of a larger team.
So, how do we leverage this team to make our lives easier?
Let's go back to the Wiki example and forget about Fran and James. We'll come back to them.
So, the trusty sysadmin now has the time to figure out all about the fancy Wiki upgrade he wants to make. Download this plugin, check. Note the version number, check. Unzip the file, check. Go to the page and try the new feature. Hey, it works. Cool, done, right?
To be truly done you have to move the ball forward.
What does that mean?
You have to make sure that the same task can be performed more easily the next time.
How do we do that?
You guessed it, documentation.
The irony is that we were just mucking around in a Wiki. What a perfect place to write down the steps on how we did the upgrade.
It can start out simple. Just write down the steps. This is a great start and will provide 80% of what you need. (See the 80/20 rule for more info.)
But how can we get it to 100%? The answer is with more care. We have to write down not only the what but the why. Why are we installing this new feature? Who needs it? What are its benefits?
The how is usually easier. We can link to the Wiki where we downloaded the plugin. We can copy/paste the instructions we used to unzip the file and the steps we took during installation. We can link to our new page showing the newly installed functionality.
What are the benefits of all this work?
Allen touches on it when he talks about "less experienced, and thus cheaper labor" but misses the key word:
Documentation enables delegation.
Solving a problem the first time is always the hardest. If we take the time to write down what we've done, we create power through delegation.
Now, back to Fran and her printer.
Hers is not the first printer set up in this office, is it?
Of course not.
Rewind to day 1, when we set up the office. What did we do to set up the printers?
Well, something like this:
Pick a model
Give it a name
Give it an IP address
Set it up in the domain
Add a share
Install some drivers
If we had written those steps down the first time around, we would have 80% of what we need to delegate that task to the new IT guy or another team member. Heck, even the DBA could set up a printer if given the right documentation.
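Writing those steps down does not need to be fancy. Here is a minimal sketch in Python, assuming nothing more than a made-up `Runbook` record (the class name, its fields, and the `render` method are all invented for this illustration, not part of any Wiki package). It captures the printer steps from the list above and leaves a placeholder for the why, which can be filled in later.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Runbook:
    """Hypothetical, minimal record of how a task was done."""

    title: str
    # Placeholder for the "why"; meant to be filled in as the page matures.
    why: str = "TODO: why are we doing this, and who needs it?"
    steps: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the runbook as plain text that can be pasted into a wiki page."""
        lines = [self.title, "", f"Why: {self.why}", "", "Steps:"]
        lines += [f"{n}. {step}" for n, step in enumerate(self.steps, start=1)]
        return "\n".join(lines)


# The printer steps from day 1, written down once.
printer_setup = Runbook(
    title="Set up an office printer",
    steps=[
        "Pick a model",
        "Give it a name",
        "Give it an IP address",
        "Set it up in the domain",
        "Add a share",
        "Install some drivers",
    ],
)

print(printer_setup.render())
```

The format matters far less than the habit. A plain numbered list typed straight into the Wiki does the same job.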
Over time, we fill in that 80% to get it to 100%. Edit, link, re-write, clarify. It will get there. So the DBA couldn't quite get the printer set up correctly following your instructions? You missed something. "Edit the wiki," you tell him. That grows the documentation and makes it that much easier for the next printer, and the next one.
In the end game, Fran may be the one setting up her printer herself. How cool is that? Self service.
Same thing goes for James and his firewall issue. This was not the first firewall change made in the company's history. No, someone in some back room somewhere banged out many a cryptic invocation to make that firewall do their bidding for their own needs. It has happened many times.
Is there documentation on that?
"Security considerations!" your firewall guy will chime. "I don't trust anyone else making these changes, so I'm not writing anything down for you guys." This is simplemindedness. Allen also brings up my favorite scenario, "getting hit by the proverbial bus." So, what happens when that indispensable firewall guy doesn't show up to work tomorrow?
With a little bit of coaching, the IT guy, DBA, sysadmin, or whoever else can make that change and James can be on his way.
Let's tackle self-service on the firewall another day though.