Maybe some of you noticed, but my blog, vendors, etc. stopped working for a couple of days. Today I’m going to tell you exactly what happened:
Sunday evening, June 13:
After a routine kernel upgrade, I decided I had to reboot the ethernia server briefly. I warned everybody on voice chat; it was supposed to take only a minute or so.
What happened? The machine didn’t come back up, and I’m not entirely sure why. Somehow in the last 15 days I “broke” something in the system, or maybe it was simply a bad kernel config? To this day I don’t know, and I probably never will.
I quickly rebooted the machine in rescue mode and prodded at various things, including an earlier kernel version. I simply couldn’t bring the machine back online in normal mode.
I dropped a mail to support to get a KVM plugged in so I could see what kind of startup error I was having, and visited the newsgroups in search of an answer. Late in the night, after checking the price of a new machine, I decided that if they refused to assign me a KVM I would take my business somewhere else and get a new machine.
In the meantime I start downloading the last backups of the server.
I finally get to bed around 3am.
Monday, June 14:
I wake up around 9am, no answers from my ISP. I finish getting the last files that didn’t get through, and around noon I rent a new machine. Their site says they have spare ones available in under 1 hour.
- Roadblock 1: I’m a new customer with a new credit card, so they have to verify it manually, which brings me to 2PM.
- Roadblock 2: My old ISP finally answers, saying they don’t have any KVM in this sector and to try again later… fuck off.
- Roadblock 3: One of their office drones finally lets my card check through. But now they want copies of proof of address, ID card, credit card…
2-3 hours later I finally get the damn machine. I start with my usual manual installation method: pop into rescue, partition the disk and grab a Gentoo stage3 tarball.
After a couple of hours of making stupid errors and somewhat progressing, I finally get to boot the barebones system. And well, it does boot, and goes online… BUT IT DOESN’T LET ME SSH IN! Which makes for a pretty useless server, and there’s no way to check what happened because apparently even the logging facility isn’t running.
Running out of patience, I swallow my pride and install the ISP’s Gentoo ghost. I didn’t want a 64-bit system, so I figured I would use a 2007.0 and upgrade from there.
That was yet another bad move: the 2007.0 worked fine but wouldn’t upgrade. I tried doing an intermediate upgrade through a 2008.0 portage, but no luck with that either.
So there I am partitioning AGAIN, installing a Gentoo 10.1 64-bit. It’s getting pretty late but the system is running. So I ssh in and start transferring the backups from my home computer, which takes a LONG … LONG time…
That’s still today actually. I’ve been compiling and upgrading the system at the same time as I’m uploading the huge MySQL backup and installing Apache, MySQL, PHP, suPHP, sudo, Webmin, well… most of the things I use.
It was when I started deploying the backups that I realised that the backups were… well, most of them, pretty OLD. Somehow the backup cron job had been failing for a while and I hadn’t noticed. Which made me feel quite sad at the prospect of another 24 hours of upload.
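A side note on that silently failing cron job: something like the sketch below, run from a second cron entry, would probably have caught it. It is not what I actually had in place. The function name `stale_backups`, the 24-hour threshold and the directory in the usage note are made up for illustration.

```python
import os
import time


def stale_backups(backup_dir, max_age_hours=24, now=None):
    """Return the names of files in backup_dir that are older than
    max_age_hours, oldest first. `now` is injectable for testing."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for entry in os.scandir(backup_dir):
        if entry.is_file():
            mtime = entry.stat().st_mtime  # last modification time
            if mtime < cutoff:
                stale.append((mtime, entry.name))
    # Sort by modification time so the oldest backup comes first.
    return [name for _, name in sorted(stale)]
```

Calling `stale_backups("/var/backups/ethernia")` (a hypothetical path) and mailing yourself whenever the list is non-empty would be enough. It only looks at file timestamps, so it says nothing about whether a backup is actually restorable.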
That’s when my good friend Leo dropped on my poor uneducated head another of her absolutely awesome suggestions: using scp from the old machine to the new machine. The results were amazing; transferring the MySQL database took 2 minutes instead of 8 hours.
This is really what saved my day.
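For the curious, here is roughly what a server-to-server push looks like when scripted. This is a minimal sketch, not the exact command I ran: the helper names, host name and paths are placeholders. Only the `scp` flags are real (`-p` preserves timestamps, `-C` enables compression). The speedup presumably comes from the copy no longer going through a home connection's upload link.

```python
import subprocess


def build_scp_command(local_path, remote_user, remote_host, remote_path,
                      compress=True):
    """Build the argument list for pushing one file to another machine."""
    cmd = ["scp", "-p"]  # -p: preserve modification times and modes
    if compress:
        cmd.append("-C")  # -C: compress data in transit
    cmd += [local_path, f"{remote_user}@{remote_host}:{remote_path}"]
    return cmd


def push_file(local_path, remote_user, remote_host, remote_path):
    """Run scp and raise CalledProcessError if the copy fails."""
    subprocess.run(
        build_scp_command(local_path, remote_user, remote_host, remote_path),
        check=True,
    )
```

Run on the old machine, `push_file("/backup/db.sql.gz", "root", "new-server.example", "/restore/")` would copy the dump straight across, assuming SSH access from the old box to the new one is already set up.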
As I’m talking, ethernia is back up and running. Most of the hosted sites are working, and those that aren’t are down solely because the DNS servers take a while to refresh.
I’m crossing my fingers that the machine will reboot properly next time, because the next step is to upgrade the kernel…