In another chapter of "The Cloud Never Crashes", I woke up Sunday to find one of my AWS instances "crashed", with a notice reading "Amazon EC2 Instance scheduled for retirement". Retirement? What does that mean? I went to check my email and realized that the "retired" instance was the email server. Doh! It took me a little while to figure out what they meant. According to Amazon: "An instance is scheduled to be retired when AWS detects irreparable failure of the underlying hardware hosting the instance." This serves as a good reminder that the cloud is really just someone else's server.
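Retirement notices can also be read from the EC2 API, not only from email. The following is a minimal sketch, not anything I actually ran that Sunday: it assumes you already have the dictionary returned by boto3's `describe_instance_status` call, and the helper name `find_retirements` is made up for illustration. It ignores pagination.

```python
def find_retirements(status_response):
    """Return (instance_id, not_before) pairs for scheduled retirements.

    `status_response` is assumed to have the shape returned by the EC2
    DescribeInstanceStatus API: a dict with an "InstanceStatuses" list,
    where each entry may carry an "Events" list.
    """
    found = []
    for status in status_response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            # "instance-retirement" is the event code EC2 uses for retirements.
            if event.get("Code") == "instance-retirement":
                found.append((status["InstanceId"], event.get("NotBefore")))
    return found
```

With boto3 installed and credentials configured, the input would come from something like `boto3.client("ec2").describe_instance_status(IncludeAllInstances=True)`.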
In theory this is an easy fix. The instructions at Amazon claim that stopping and restarting the instance will launch it on new hardware. In practice I could not get the instance to stop. This is where having physical hardware and a power cord to pull would have been nice. Because the instance would not stop, I could not detach the EBS root volume. Even force detaching the EBS root volume didn't work. This is where daily snapshots of EBS volumes come in handy. I was able to launch a new EC2 instance, convert the last snapshot to an EBS volume, and attach that to the new EC2 instance. Then I moved the Elastic IP from the "Retired" instance to the new instance and hit "start". Full recovery!
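I did all of this by hand in the console, but the same recovery could be scripted. Here is a rough sketch under several assumptions: `ec2` is a boto3 EC2 client, the replacement instance is stopped with its own root volume already detached, the root device name is `/dev/sda1` (it varies by AMI), and the Elastic IP is a VPC address identified by an allocation ID. The function names are hypothetical.

```python
def latest_snapshot_id(ec2, volume_id):
    """Return the ID of the most recent completed snapshot of volume_id."""
    resp = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[
            {"Name": "volume-id", "Values": [volume_id]},
            {"Name": "status", "Values": ["completed"]},
        ],
    )
    snapshots = resp["Snapshots"]
    if not snapshots:
        raise RuntimeError(f"no completed snapshots for {volume_id}")
    return max(snapshots, key=lambda s: s["StartTime"])["SnapshotId"]


def restore_onto(ec2, snapshot_id, instance_id, zone, allocation_id,
                 device="/dev/sda1"):
    """Build a volume from a snapshot, attach it, move the Elastic IP, start.

    `zone` must be the availability zone of the replacement instance,
    because a volume can only attach to an instance in the same zone.
    """
    volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=zone
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    # Attach as the root device of the (stopped) replacement instance.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])

    # Take the Elastic IP away from the retired instance.
    ec2.associate_address(
        AllocationId=allocation_id,
        InstanceId=instance_id,
        AllowReassociation=True,
    )
    ec2.start_instances(InstanceIds=[instance_id])
    return volume_id
```

A real run would pass `boto3.client("ec2")` as `ec2`. Passing the client in as a parameter also makes it easy to rehearse the sequence against a stub before you need it on a Sunday morning.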
Now I'm left with a hanging EC2 instance that is still "Stopping" and an EBS volume that I cannot use, detach, delete, etc. I tried reissuing stop commands a couple of times. Eventually I noticed a "Force Stop" option. I do not remember seeing this on earlier attempts, and I do not know whether it shows up after the first failed stop attempt or after several. I'm not sure, but I think that sends a trained monkey into the datacenter to pull the power cord. In any case it worked. This let me detach my original EBS volume. From there I was able to stop the new instance, detach its restored EBS volume, and attach my original EBS root volume in its place. Now I have full recovery and I was able to clean up the loose ends.
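The "try a normal stop, then escalate" routine can be sketched in code as well. This is an illustration rather than what I did: I am assuming the console's "Force Stop" corresponds to the `Force=True` flag on the `stop_instances` API call, that `ec2` is a boto3 EC2 client, and the helper names and retry counts are invented.

```python
import time


def instance_state(ec2, instance_id):
    """Return the state name (e.g. 'stopping', 'stopped') of one instance."""
    resp = ec2.describe_instances(InstanceIds=[instance_id])
    return resp["Reservations"][0]["Instances"][0]["State"]["Name"]


def wait_until_stopped(ec2, instance_id, attempts=20, delay=15,
                       sleep=time.sleep):
    """Poll until the instance reports 'stopped'; False if we give up."""
    for _ in range(attempts):
        if instance_state(ec2, instance_id) == "stopped":
            return True
        sleep(delay)
    return False


def stop_instance(ec2, instance_id, **wait_kwargs):
    """Try a normal stop first, then fall back to a forced stop."""
    ec2.stop_instances(InstanceIds=[instance_id])
    if wait_until_stopped(ec2, instance_id, **wait_kwargs):
        return "stopped"

    # Forced stop: the instance gets no chance to flush file system caches.
    ec2.stop_instances(InstanceIds=[instance_id], Force=True)
    if wait_until_stopped(ec2, instance_id, **wait_kwargs):
        return "force-stopped"
    return "stuck"
```

Once this returns anything other than `"stuck"`, a plain `detach_volume` call on the old root volume should go through, which matches what I saw in the console.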
Amazon Web Services has given us a new euphemism. "Retired" means "It's dead, Jim!"
CF Webtools is an Amazon Web Services Partner. Our Operations Group can build, manage, and maintain your AWS services. We also handle migration of physical servers into AWS Cloud services. If you are looking for professional AWS management, our Operations Group is standing by 24/7. Give us a call at 402-408-3733, or send a note to operations at cfwebtools.com.