Did you ever have that helpless feeling when something on your computer goes wrong? Most regular users have that feeling daily. My wife was petrified to buy something on a web site yesterday because she didn't see the little lock in the corner. I was (naturally) thrilled to hear that she was even looking for the little lock. It turns out the site had some embedded images that were not secure - even though the page itself was secure.
That's a pretty typical user experience. You see something you don't expect, but you have no context or experience to tell you why something is behaving the way it is. But what if you are supposed to be the expert....
No one working in IT can escape troubleshooting. It may be as benign as fixing a paper jam or as onerous as finding that cursed memory access violation in your program. Even the most experienced programmer or network engineer may end up scratching his or her head and saying, "It never did that before." When this happens to you, there are a few simple rules to keep in mind.
Although you may not see it right now, there definitely is a reason that the problem occurred. If you are the one who's been tasked with solving the problem, chances are you will be able to find and understand that reason eventually. That fact should give you confidence. You should be able to tackle the problem with an upbeat attitude, knowing you will eventually be able to resolve it. If you don't believe that you will be able to solve the problem, you will give in to helplessness. Helplessness is not a useful emotion when it comes to IT. Computers and applications are quite cold and pitiless when you cry and whine and bemoan your plight. Instead, lay aside frustration and negative emotions and tell yourself, "This problem has a solution and I'm going to find it... and when I do they will think I'm a genius!!" Trust me, a positive attitude makes you a much more productive troubleshooter.
Have you ever seen the shotgun troubleshooter? He's the guy who walks up to a server and just starts running patches and updating drivers and the like. He figures that if he does enough things to the server, one of them is bound to work. This troubleshooter is sometimes effective (yes, it's true!). He's sort of like the physician's assistant I see when I can't get in to see my doctor. She's not sure if I have an infection, but she's going to give me antibiotics all the same.
The problem with this approach is that it does not add to your pool of knowledge and does not definitively answer the big "why" question. Granted, sometimes it's important to get something up and running now and you have to begin to try things. Most of the time, however, a troubleshooter is an investigator. His task is to gather all the evidence and then arrange his guesses in order from most to least likely (or sometimes from easiest to most challenging to fix). Then he takes action on each guess in turn and tests the result before moving on.
This is the basic task of the troubleshooter. Think analytically, then attempt one fix at a time. Sometimes this means updating a driver, rebooting and testing, then updating a piece of software, rebooting and testing, and on and on till you find the one that needs repair. Patience is your friend. When you are done you will have a catalog of experience (and some notes), and the next time you face the same issue you are likely to solve it more quickly - making you a better troubleshooter.
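For the programmers in the audience, the routine above (rank your guesses, try one fix, test, write down the result, move on) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Hypothesis class, the likelihood and minutes_to_try fields, and the troubleshoot function are hypothetical names invented for this sketch, and the numbers you feed it are just your own rough guesses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """One guess about the cause (hypothetical structure for illustration)."""
    description: str                 # what you suspect is wrong
    likelihood: float                # your rough guess, 0.0 to 1.0
    minutes_to_try: int              # rough cost of attempting the fix
    apply_fix: Callable[[], None]    # the single change to make


def rank(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    # Most likely first; among equally likely guesses, cheapest first.
    return sorted(hypotheses, key=lambda h: (-h.likelihood, h.minutes_to_try))


def troubleshoot(
    hypotheses: List[Hypothesis],
    problem_is_fixed: Callable[[], bool],
    notes: List[str],
) -> Optional[Hypothesis]:
    """Apply one fix at a time, test after each, and keep notes."""
    for h in rank(hypotheses):
        h.apply_fix()
        fixed = problem_is_fixed()
        notes.append(f"{h.description}: {'fixed' if fixed else 'no change'}")
        if fixed:
            return h   # stop here, so you know which change did it
    return None        # out of guesses; time to gather more evidence
```

The point of stopping at the first fix that works is the same as in the prose: you end up knowing which single change solved the problem, and the notes list becomes your catalog for next time.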
Don't start tearing things apart, changing INI files and deleting registry entries. Start out with your notepad (or your iPAQ if you've mastered the stylus). Before making a change, write down what the setting was. This is especially true of changing settings, registry entries and network address configurations (ever try to get the right numbers down for a non-standard subnet mask? ouch!). Sure, it takes longer, but think of how much time you will save if you have to put it back the way it was. In the words of Pippin in The Lord of the Rings, "Short cuts make long delays."
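If your notepad happens to be a script, the same habit looks something like the sketch below: record the old value before every change so you can put things back. Everything here is a stand-in of my own. The settings dictionary, the change_setting and undo_all functions, and the example keys are hypothetical, and a real tool would read and write actual INI files or registry keys rather than a Python dict.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

# Each log entry: (timestamp, key, old value, new value)
ChangeLog = List[Tuple[str, str, Any, Any]]

_MISSING = object()  # marks a setting that did not exist before the change


def change_setting(settings: Dict[str, Any], log: ChangeLog,
                   key: str, new_value: Any) -> None:
    """Write down what the setting was BEFORE changing it."""
    old_value = settings.get(key, _MISSING)
    stamp = datetime.now().isoformat(timespec="seconds")
    log.append((stamp, key, old_value, new_value))
    settings[key] = new_value


def undo_all(settings: Dict[str, Any], log: ChangeLog) -> None:
    """Put everything back, most recent change first."""
    while log:
        _, key, old_value, _ = log.pop()
        if old_value is _MISSING:
            settings.pop(key, None)   # the setting was newly added; remove it
        else:
            settings[key] = old_value
```

Undoing in reverse order matters when the same setting was changed more than once: the last entry popped for a key is the earliest one, so the value that ends up restored is the true original.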
Hey - it's not lazy to do the "easy stuff" first - it's just good thinking. It takes 30 seconds to change a keyboard and 3 hours to reinstall the operating system. Before you tear that box apart and rip out the hard drive, doesn't it make sense to eliminate the small possibilities first?
I'm a minimalist. I like clean, simple web sites with lots of readable text. I skip over the garish Flash intros and the marketing copy and go right to the support or product pages (ha). Most PCs are so loaded down with programs and TSRs that it's a wonder they work at all. I worked on one PC where the user was having trouble and just kept installing antivirus software, hoping the problem would go away. The box had PC-cillin, McAfee and Symantec all competing to scan files and resources. Sometimes the first thing you do is turn off the screen saver, disable power management, unplug the scanner, fax and coffee maker (just kidding about the coffee maker - but wouldn't that be cool!) and uninstall the widgets and gadgets that have accumulated. Get rid of spyware, clean out the file cache, check the services for unneeded stuff and in general get the machine back to doing what its main task is (unless its main task is collecting spyware).
This is especially true of servers! The default Windows Server installation comes with lots and lots of things enabled that you don't need. For example, I can't tell you how many web servers I've seen with the spooler service running. I've been in a data center where all the servers were running the 3-D pipes screen saver - someone thought it looked cool (it does - it's way cool). The best server has the best processor, RAM, drive system and bandwidth available - and as few peripherals as possible. Why put a 128 meg GeForce video card in there with TV output? Think of the heat and overhead! Of course, if the techs want to play Doom on the LAN late at night it's pretty useful ....
Remember that feeling of helplessness? Its counterpart is the Voilà! complex. This is when you are so desperate for a clue that you jump on the first idea that hops into your brain and fail to pursue alternatives. Remember, your first job is to "catalog" all the possibilities and arrange them in a sensible order (easiest to hardest, most to least likely, or some combination). If you latch onto your first idea you may miss something simpler or more obvious.
It sounds trivial and it may go against your opinion of yourself as a full-fledged guru (or maybe you're a man's man who doesn't read directions on principle), but get the documentation for the devices or applications you are working with. The web is a great resource for this. I found 5-year-old manuals for a used gas stove the other day (don't ask). It seems like every manual in the world is published as a PDF. In addition, there are support forums and chat rooms and email lists. The resources are endless. A big part of what you do as a troubleshooter doesn't depend on what you know, but on how well you are able to find things out! One word of caution - don't trust everything you see on the web. My Mom still thinks Bill Gates is sending her emails and wants to give her $10,000 :)
My last piece of advice is so obvious I hesitate to offer it. Find out the last thing that was changed. About 80% of the time your answer lies in discovering the last update, software, driver or device that was added to the system. The user isn't on board with this. To him or her, everything was fine and suddenly the monitor started going crazy. That little Felix the Cat emoticon running around the screen that Aunt Sally sent this morning has nothing to do with it! Happy troubleshooting.