Amazon Web Services made two huge announcements today. The vast majority of Mashery runs on AWS infrastructure, and both announcements cover features we've been asking for since they launched. Both will save us a ton of money and improve the reliability of the service we provide our customers.
The first new service is what Amazon calls "Elastic IP addresses". Amazon loves the term Elastic - it is the "E" in "EC2". Everyone at Mashery calls them Static IP Addresses, but either way, we're thrilled they exist. Why is this so huge? Well, we at Mashery have dozens of EC2 server instances up at any given time. And from time to time, we need to shut down one instance and replace it with another. It's one of the great things about EC2 - we don't have "maintenance" downtime for a server; we merely shoot the old one and activate a new one.
The problem is that with a new instance comes a new IP address. That is easily remapped if the instance is only called internally, from other EC2 instances, but if it's being accessed from outside of EC2, the external hostname needs to be pointed at the new IP address. This is done by changing your DNS records, and those changes take considerable time to reach the millions of caching nameservers around the world. Until a nameserver has picked up the new address, any calls it resolves for that hostname will try to hit the old (now dead) EC2 instance.
Before Elastic IP addresses, you had two choices:
- If you could plan the switchover in advance, you could have both servers up and taking calls at the same time - one at the old IP, and one at the new IP - until all the nameservers were updated and the old server stopped seeing any traffic; at that point, you could shut it down. Of course, if the instance just died spontaneously, you didn't have that luxury. And like any server anywhere, sometimes EC2 instances do just die.
- You could have a very short TTL (Time to Live) on your DNS record. The TTL tells nameservers around the world how long they may cache your record before checking back with your DNS to see if the address has changed. With a 24 hour TTL, each nameserver (there are over 11 million nameservers in the world) that handles a call to that hostname will only check with your DNS if the IP address it has on file is older than 24 hours. 24 hours is fine if your IP addresses never change without warning. But if you have to replace an EC2 instance handling critical traffic, you could have an "outage" of up to 24 hours, until every nameserver's cached record reaches its 24 hour expiration.
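To make the TTL trade-off concrete, here is a minimal sketch of a single caching nameserver. It is a toy model, not a real DNS implementation: the hostname, the IP addresses, and the names `CachingResolver` and `seconds_serving_dead_ip` are all made up for illustration, and time is counted in whole seconds.

```python
HOST = "api.example.com"   # hypothetical hostname
OLD_IP = "203.0.113.10"    # documentation-range addresses, not real servers
NEW_IP = "203.0.113.20"


class CachingResolver:
    """Toy caching nameserver that re-fetches a record only after its TTL expires."""

    def __init__(self, zone, ttl_seconds):
        self.zone = zone              # stands in for the authoritative DNS
        self.ttl_seconds = ttl_seconds
        self.cached_ip = None
        self.fetched_at = None

    def resolve(self, now):
        expired = self.fetched_at is None or now - self.fetched_at >= self.ttl_seconds
        if expired:
            # Cache miss or expired record: ask the authoritative DNS again.
            self.cached_ip = self.zone[HOST]
            self.fetched_at = now
        return self.cached_ip


def seconds_serving_dead_ip(ttl_seconds, change_at):
    """Count the seconds during which the resolver still answers with the old IP."""
    zone = {HOST: OLD_IP}
    resolver = CachingResolver(zone, ttl_seconds)
    stale = 0
    for now in range(0, change_at + ttl_seconds + 1):
        if now == change_at:
            zone[HOST] = NEW_IP       # the instance is replaced at this moment
        if now >= change_at and resolver.resolve(now) == OLD_IP:
            stale += 1
        elif now < change_at:
            resolver.resolve(now)     # normal traffic before the change
    return stale
```

In this model, if the instance is replaced one second after the resolver last refreshed, a 2 minute TTL leaves callers pointed at the dead address for 119 seconds, while a 24 hour TTL leaves them there for 86,399 seconds.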
So at Mashery, we set our TTLs to 2 minutes. That way, all the nameservers (well, those that actually obey the stated TTL) would never be more than 2 minutes out of date. The problem, of course, is that our DNS had to handle calls from millions of nameservers, and a lot of them were hitting us every 2 minutes. We pay for DNS services from an outsourced DNS provider, and they charge by how many DNS inquiries we receive each month. It got very expensive, but we had no choice.
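The cost side is simple arithmetic. The sketch below gives a rough upper bound, assuming every nameserver sees constant traffic and re-queries the moment its cached record expires; the figure of 10,000 busy nameservers is invented for illustration and is not Mashery's actual number.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def max_monthly_queries(busy_nameservers, ttl_seconds):
    """Upper bound on billable lookups: one per nameserver per TTL window."""
    return busy_nameservers * (SECONDS_PER_MONTH // ttl_seconds)


BUSY_NAMESERVERS = 10_000  # hypothetical figure

short_ttl = max_monthly_queries(BUSY_NAMESERVERS, 2 * 60)        # 2 minute TTL
long_ttl = max_monthly_queries(BUSY_NAMESERVERS, 24 * 60 * 60)   # 24 hour TTL
print(short_ttl, long_ttl, short_ttl // long_ttl)
```

Under those assumptions, a 2 minute TTL generates up to 216,000,000 lookups a month against 300,000 for a 24 hour TTL, a factor of 720. The ratio depends only on the two TTLs, not on the nameserver count.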
But with Elastic IP addresses, we can keep the same (static) IP address, and simply remap it from an old EC2 instance to a new one. The DNS does not need to be updated at all, and we can increase our TTLs. Longer TTL = fewer DNS calls = we save money.
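The idea can be sketched as an extra layer of indirection between the DNS record and the instance. This is a toy model only, not the EC2 API: `ElasticAddressTable`, the instance ids, the hostname, and the address are all hypothetical.

```python
class ElasticAddressTable:
    """Toy stand-in for the mapping from a static public IP to a running instance."""

    def __init__(self):
        self._bindings = {}  # public IP -> instance id

    def associate(self, public_ip, instance_id):
        """Point the address at an instance; return the instance it replaced, if any."""
        previous = self._bindings.get(public_ip)
        self._bindings[public_ip] = instance_id
        return previous

    def route(self, public_ip):
        return self._bindings[public_ip]


# The DNS record is set once and never edited again.
dns_zone = {"api.example.com": "203.0.113.10"}

addresses = ElasticAddressTable()
addresses.associate("203.0.113.10", "i-old")


def serving_instance(hostname):
    """Follow hostname -> static IP -> whichever instance currently holds it."""
    return addresses.route(dns_zone[hostname])
```

Replacing a server then comes down to `addresses.associate("203.0.113.10", "i-new")`. Every nameserver's cached record stays correct, because the address it cached never changed.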
Their second announcement is simpler to explain, though no less important. Amazon calls it "Availability Zones". We call it "multiple datacenters". Until this announcement, Amazon would randomly assign newly activated EC2 instances to a particular physical location; we had no way of knowing where they were. Now, Amazon has explicitly said they have more than one datacenter in Virginia, and you can specify which of your instances should be in one datacenter, and which should be in the other. Those that frequently talk to each other should be together. And now it is finally possible to ensure redundant infrastructure in more than one physical facility, so if one goes down you still have access to the other. Both performance and reliability should improve.
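As a rough illustration of the redundancy half of that argument, the sketch below spreads the replicas of each server role across zones in round-robin order and checks whether every role survives the loss of one zone. The role names, zone names, and both function names are assumptions made up for this example; it does not call any Amazon API.

```python
from itertools import cycle


def spread_across_zones(replicas_per_role, zones):
    """Assign each role's replicas to zones in round-robin order.

    Returns a list of (role, replica_index, zone) tuples.
    """
    placement = []
    for role, count in replicas_per_role.items():
        zone_iter = cycle(zones)
        for index in range(count):
            placement.append((role, index, next(zone_iter)))
    return placement


def survives_zone_loss(placement, lost_zone):
    """True if every role still has at least one replica outside the lost zone."""
    all_roles = {role for role, _, _ in placement}
    alive_roles = {role for role, _, zone in placement if zone != lost_zone}
    return all_roles == alive_roles


ZONES = ["zone-a", "zone-b"]  # illustrative names for the two datacenters
placement = spread_across_zones({"proxy": 2, "database": 2}, ZONES)
```

With two replicas per role, `survives_zone_loss(placement, zone)` holds for either zone. A role with a single replica is lost along with whichever zone it landed in, which was the situation for everything before you could choose placement.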
Plus, if you read the fine print, Amazon says that EC2 "currently" only exposes availability zones in one region (a region being a particular geographic area or country), which suggests that someday (hopefully soon) they will expose more regions. Then we will be able to quickly, easily and inexpensively provide a geographically distributed service through Amazon (currently, Mashery relies on Limelight Networks for its clients who need to have our services provided from somewhere other than Virginia).
Thanks for the new features, Amazon, and bring on those new regions!