Hardware Failure: a Network Admin’s Nightmare

Posted on March 30th, 2006 in X-Files by Bayo Oyekole

Yes. It’s the thing most likely to keep a network admin up at night or give him the horrors: complaints from customer service, or clients themselves (who by some stroke of luck happen to have your mobile number) calling you at 11pm to say “Hey, I can’t connect to the net! What kind of internet service is this?” And you know straight away that either:

1. Power has failed and the man in charge of changing over to the generating set has fallen asleep
2. There is no more fuel in the generator (oops! somebody’s going to get a query tomorrow)
3. A cable has slipped from its socket (improbable but possible; no problem)
4. Worst of all: your network hardware has failed!

If it’s number 4, you have to spring out of bed and get to the server room. Start tracing: Is it the access point? Is it the Ethernet cable? Is it the router? Did a power adapter blow up?
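That tracing can be partly scripted. What follows is only a rough sketch, not what I actually ran that night: it pings each hop from nearest to farthest and prints the first one that does not answer. The names `probe` and `first_dead_hop` are made up for illustration, the addresses are placeholders, and the ping flags assume a Linux-style ping where `-W` is a timeout in seconds (BSD ping treats that flag differently).

```shell
#!/bin/sh
# Probe a single host with one ping.
# Assumption: Linux-style ping (-c count, -W timeout in seconds).
probe() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Walk the hops in the order given (nearest first) and print the
# first one that fails to answer. Returns 0 if every hop answers,
# 1 as soon as one does not.
first_dead_hop() {
    for hop in "$@"; do
        if ! probe "$hop"; then
            echo "$hop"
            return 1
        fi
    done
    return 0
}

# Example call (placeholder addresses: access point, gateway, upstream):
# first_dead_hop 192.168.1.2 192.168.1.1 203.0.113.1
```

A dead hop only tells you where to start looking. The cable, the power adapter and the box itself still have to be checked by hand.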

I have had a horrible week. Sleeping in the office (staying awake and slowly going crazy, more like). First, someone starts to run BitTorrent on the shared 128K link, choking the bandwidth and locking 12 other users out (the MRTG logs tell me this).

MRTG Traffic Logs

Then for some reason, the hard disk on the internet gateway goes kaput. In the middle of the night! On a weekend. The madness begins. First I have to find a distro of FreeBSD that will run as a live CD and can save config files to floppy. Then I find out that the floppy drive is dead. I try to switch to another PC while I put the faulty one aside.

And I realise that I have to move the three NICs (yes: WAN, LAN and PRIVATE) from the bad PC to the good one. Then I find that the good PC has a SCSI disk and FreeBSD refuses to touch it [for some crazy reason]. So I start hunting for a spare ATA disk pending the time we can order a new drive. Ah, found one in the store. Now, to transfer the image to disk - I don’t have blank disks to write an ISO from my store of Linux ISOs.

This is a really bad day. I start looking at my options. It’s almost 5 am. By 7 am I will start getting calls (shit!). I remember I have m0n0wall, a FreeBSD distro that I can transfer to an IDE drive as a byte-for-byte image. I have to open up my PC, slave the newly found drive in it, and boot to Linux (I could have used physdiskwrite.exe from the m0n0wall site, but I don’t have internet access right now, see?)
So I go:

gunzip -c /media/hda5/downloads/iso/generic-pc-1.22.img | dd of=/dev/hdX bs=16k

After transferring 440 blocks of data (or something, my brain is quite frazzled now), the process is complete. So I put the new disk into the troublesome PC and boot up. At last, I see a boot screen and the familiar FreeBSD bootloader. After a while I get a menu (nice!)
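With a clearer head, I would check the write before moving the disk. A minimal sketch, assuming a POSIX shell with `cksum` and a `head` that accepts `-c`: it compares a checksum of the decompressed image with a checksum of the same number of bytes read back from the target. `verify_image` is a name I made up, and `/dev/hdX` stays a placeholder exactly as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: check that the start of TARGET matches the
# decompressed contents of the gzipped IMAGE.
# Usage: verify_image IMAGE TARGET
verify_image() {
    image=$1
    target=$2
    # Number of bytes in the decompressed image (strip wc's padding).
    size=$(gunzip -c < "$image" | wc -c | tr -d ' ') || return 1
    # Checksum of the image as it should appear on the disk.
    want=$(gunzip -c < "$image" | cksum)
    # Checksum of the same number of bytes read back from the target.
    # The disk is usually larger than the image, so only the first
    # $size bytes are compared.
    got=$(head -c "$size" "$target" | cksum)
    [ "$want" = "$got" ]
}

# Example call (the device name is a placeholder):
# verify_image /media/hda5/downloads/iso/generic-pc-1.22.img /dev/hdX && echo "image OK"
```

Reading the disk back right after writing may be served from cache, so a match is a sanity check rather than proof that the drive is healthy.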

monowall menu

Configuring the interfaces was cool, and the box is up and running again. Phew. It’s 9 am already.

I think I am going to get an embedded PC from Soekris, maybe the net4801. At least that one won’t have a failed hard disk, and m0n0wall is optimized for them; this current setup is a waste of power, space and a good 12GB hard disk!

I can sleep now.
