Amazon RDS: Poison or Pill
October 29, 2009
As soon as I read the AWS newsletter about Amazon RDS, I started looking for a megaphone to start shouting at folks: keep away! Amazon RDS, or Relational Database Service, places Amazon into the mire of shared hosting and AWS users into a position of false confidence. Harsh words, considering that, overall, I feel Amazon’s service offerings are best-in-class. AWS offerings have historically pushed the envelope with regard to practical usage-based computing, something which ancient providers such as Sun and IBM have attempted to accomplish for decades; in this case I define practical as both usable and cost effective for small and large tasks. Up until now such systems weren’t trivialized to x86 hardware and required special programming considerations, access to academic institutions, and/or a large budget. By combining SLA-supported x86 virtualization with application services such as S3, SQS, and SimpleDB, AWS has provided a usage-based, on-demand computing solution which is simpler than task-based computing and as secure and reliable as virtualized or shared hosting. With its on-demand nature, AWS is cost effective for everything from small tasks to those requiring a datacenter of processors.
So why is Amazon RDS so bad that you shouldn’t use it?
Well, there’s no easy answer; the better question to ask yourself is: why do you think AWS will be better than your own MySQL deployment? There is no right answer because almost any answer will probably, one day, bite you in the ass. Hard. I mean data loss, and it won’t be Amazon’s fault.
RDBMS systems, and the applications which depend on them, are built from the ground up to rely on persistence, integrity, and static data models (schemas). In contrast, AWS has been built for distribution, decentralization, and the “cloud”. For Amazon, this service is somewhat of a U-turn from their original direction and has also placed a stamp on their forehead which says “That MySQL Guy”, which is not good. I have nothing against MySQL; however, as the de facto entry-level (free, open source) database, it has accrued a strong following of immature software. Such software has nothing to do with the basic purposes of AWS or MySQL, but it has everything to do with how Amazon’s support and engineering staff will be spending their time: supporting users and software which aren’t built for the cloud.
I hope that RDS won’t be a situation of butterflies & hurricanes, but here’s a quick list of why the relative cost of RDS is high both for Amazon (the company) and for all of its AWS users:
- Cost for Amazon (operations, engineers, and products)
  - MySQL, like most open source systems, has historically been buggy software with a trailing release+testing+production schedule, which requires continuous testing between production releases for large deployments (such as RDS).
  - MySQL has a large set of features which vary across releases and which share equal presence in production; in other words, Amazon will need to provide production support for multiple versions, not just the latest stable version.
  - Amazon has no control over the features and capabilities of MySQL and is thus limited to what MySQL provides; while MySQL provides many “good things”, Amazon will still be obligated to maintain it through the bad. AWS Map Reduce shares this disadvantage via Hadoop; however, it is mostly mitigated there because Map Reduce is such a low-level distributed system.
  - MySQL is very flexible and scales very well; however, it doesn’t do so by itself and requires significant effort to be properly configured for the data being managed. All the folks who don’t know this will default to thinking Amazon will do this for them and will be disappointed when it doesn’t “just work”. Whether they ditch RDS or bug Amazon’s support, either way, it’s not a positive situation.
- Cost for AWS (primarily EC2) users
  - Potential degradation of service and support for EC2 instances
    - With RDS available, Amazon can deflect issues with running MySQL on EC2 instances into a recommendation for RDS; this will be a terrible waste of time for both parties.
    - MySQL is a very centralized system, and by transitioning the decision of where MySQL resides in the AWS cloud from the user to Amazon, Amazon will be further centralizing the impact of MySQL on the cloud. Whereas users will have MySQL deployed randomly across any EC2 instance, Amazon will be appointing MySQL to specific hardware; this is based on the assumption that Amazon is clustering RDS deployments onto local hardware and not randomly deploying instances in the cloud. This is somewhat of a compromise for security and adds significant SLA risks (read: cost) for Amazon. In short, when a MySQL cluster dies, a LOT of folks are going to be VERY unhappy: their support tickets will be a burden to staff and their requests for credits will be a financial cost. Moreover, support staff will be giving priority to these customers over other services because of the implicit severity.
  - Increased cost
    - RDS instances cost >10% more than regular instances and only come with the added benefit of backups, something which every system should already have in place. If you do choose to delegate the task of backups to RDS, you’re paying extra for a task you’ve already thought about doing yourself.
    - The cost of keeping your database, its backups, and its history all within AWS is multiplicative, and if you grow to the point where you’re ready to move off, you’ll be charged to transfer all the data to an external system. While this is a subjective cost, it’s still worth pointing out; if folks aren’t already doing backups right, they’ll likely not know that cost effective database backups make use of binary logging facilities, not filesystem snapshots, and use significantly less disk space (and thus I/O).
  - False confidence
    - As I’ve mentioned before, letting other folks control your backups for you is a mistake. Failure is a matter of when, not if, and you’ll be in better control of responding if you understand what you’re dealing with. Just because RDS is doing your backups doesn’t mean you’re safe.
    - RDS users will expect MySQL to scale on-demand, as everything else works that way with AWS, and it’s just not that simple. Scaling a database requires analysis and a balanced combination of server settings, data normalization, and indexes; all of these things will still be the user’s responsibility, and Amazon’s solution of “throw hardware at it” is a haunted path to send its users down.
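To put rough numbers on the backup point in the list above, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, the sizes are hypothetical, and it assumes each filesystem snapshot is stored as a full copy of the data (incremental or copy-on-write snapshot schemes would narrow the gap). It simply compares keeping one full copy per day against keeping a single base backup plus each day's binary logs.

```python
def full_snapshot_storage_gb(db_size_gb: float, days: int) -> float:
    """Storage used when a full copy of the database is kept for each day.

    Assumption: every snapshot is a complete copy, not an incremental one.
    """
    return db_size_gb * days


def binlog_storage_gb(db_size_gb: float, daily_change_gb: float, days: int) -> float:
    """Storage used by one full base backup plus one day's binary logs per day."""
    return db_size_gb + daily_change_gb * days


if __name__ == "__main__":
    # Hypothetical workload: 50 GB database, 1 GB of logged changes per day,
    # 7 days of retention. None of these figures come from Amazon or MySQL.
    snapshots = full_snapshot_storage_gb(50, 7)
    binlogs = binlog_storage_gb(50, 1, 7)
    print(f"daily full copies:  {snapshots:.0f} GB")
    print(f"base + binary logs: {binlogs:.0f} GB")
```

The real ratio depends entirely on how write-heavy the database is; a workload that rewrites most of its data every day would see far less benefit from binary logs than this sketch suggests.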
Overall, I feel that Amazon could quickly cannibalize the value and quality of AWS if they (continue to) introduce trivial services. Supporting open source software they have no control over is a significant increase in relative support and operations cost. Amazon seems to be approaching this by making the cost of RDS instances more than EC2 which is a mistake because the real cost is the lost opportunity of engineers spending their time on systems which are more efficient for cloud computing – Amazon could charge 3 times an EC2 instance and their engineers would still be better off building technologies for cloud-based systems and not centralized RDBMS-dependent web applications.
Where I feel Amazon has fallen short the most is that RDS only provides single-instance MySQL support and nothing more. No load balancing, replication, Hadoop integration, or any other form of data abstraction which could make it functional in a cloud computing context. Not implementing these features is a very clear indicator that AWS is focused more on short-term revenue-generating features than on cost effective cloud computing systems or on improving the shortfalls of legacy centralized systems.
With all this said, I have to consider the possibility of this being a good move for Amazon. I present the potential issues with RDS simply to warn folks against relying on it as a crutch, and to point out that the new direction AWS has veered in leads into choppy waters. There are several aspects of RDS which will give Amazon insight into correlations among and between the varying systems of data storage and processing: comparing SimpleDB, MapReduce, MySQL, and general resource consumption could shed light on how their cloud is being used at a higher level than processors and bandwidth. Last, Amazon might be aware that MySQL is a crutch and is putting the service out there as a way to wean folks off of centralized systems.
OpenSolaris: Just call it “Open Source”
April 26, 2008
Genetic Open Source doesn’t sound too bad.
When it comes down to it, nobody knows what Sun is doing with OpenSolaris. No different than any other company which must reinvent itself every five years, Sun is changing the way they do things. What’s different is that open sourcing a product creates an irreversible social event in the lifecycle of the product being released.
What’s difficult is that we’re all so used to Apache, BSD, Mozilla, and MySQL – other open source systems which have been around for more than a decade. Not only do these products have maturity in their communities but the products themselves are mature. While Sun has a mature product with OpenSolaris, the product is a newborn to open source.
Let’s also not forget that Linux is still just a kernel. OpenSolaris should be compared with SUSE and RedHat rather than with Linux. And while yes, kernel development is being done with OpenSolaris, that is not relevant to the inevitable result of a community developing an operating system and not just the kernel.
Sun is putting their OS out there and saying “hey, we’re putting our code where our mouth is, now you can too”.
If it’s anything like Mozilla, let’s not forget Firefox was a rebel project (called Phoenix); as such, I doubt we can expect anything out of OpenSolaris from Sun, the company. The best we can expect will be from a side project which Sun may or may not take under its wing. Neither was possible before.
Regarding the TCO report and Suncritters, let’s not forget that Sun has to make money too. The only thing RedHat has that Sun doesn’t is experience with an open source operating system. One question worth asking in that context is: what type of support did RedHat provide when it first started? The answer is free and by mailing lists; then RedHat became commercial (<1yr) and of course has 24/7 support now. OpenSolaris is a new OS for Sun, relatively speaking of course. Solaris has been around a while, but as an OS built through-and-through by a community, it’s new. In fact, as you’ve pointed out, OpenSolaris doesn’t have a large community yet other than Sun engineers, so why are your expectations so high (link)? Moreover, TCO analysis is all crap with regard to open source, and ‘end users’ typically don’t understand any better. How many folks who download open source actually modify the code? Last, the linked survey is from 2005 and, not to be a chump, but open source has exploded and changed dramatically since that survey. Firefox wasn’t even big yet and MySQL 5 was still beta.
Organic or not, marketing or not, community or not, OpenSolaris is still an open source Unix operating system. Open sourcing code is great, no matter how it’s done.
Let’s just call OpenSolaris “Open Source”.
Re: What Sun was trying to do with OpenSolaris
April 25, 2008
OpenSolaris vs Conceived Intentions
As from /., Ted at thunk.org has compiled a rant post “essay” with regard to Sun’s OpenSolaris community.
About Ted:
I’m a systems programmer working at IBM. This allows me to get paid for what I used to do for fun, which is definitely neat hack. I’ve worked on the Linux kernel since 1991, and am probably the first Linux Kernel developer in North America. I am currently on the board of the Free Standards Group, as well as Usenix, where I organize the annual Linux Kernel Summit, which brings together the top 75 Linux Kernel Developers in the world every year. The Kernel Summit takes place in Ottawa, Canada, right before the annual Ottawa Linux Symposium.
Ted mentions Roy’s watching the ripples post, which is a great infomercial on day.com and its “open source” developers and their “closed source” CRM; criticizing Sun is all the rage and there are no substantial suggestions for what Sun should do with regard to Sun’s community. I won’t bother mentioning Ted is a Linux developer working for IBM and complaining about Sun. While entertaining, I would much rather see fights between the PostgreSQL and Sun’s MySQL folks just because I like to mis-pronounce PostgreSQL and MySQL is quirky.
I really don’t understand the blanket criticism. Somebody please let me know. Comments about trademarks = delete. It’s Sun’s toy, ok? If they want to share it, it’s up to them as to how and who they want to share it with. If Sun screwed you in some way, their execs are being open source media whores, or you’ve just gotten shit from Sun (and I don’t mean a 1996 newsgroup post), then please do tell your story. Complaining about Sun not liking your idea is about as entertaining as adopting a new Linux task scheduler or arguing over the transactional functionality of the MySQL TRUNCATE command.
Open Source (Linux) vs. Open Source (Sun)
Let’s get back to the topic of “open source” – the synthetic kind. Since blog posts are essays now, and comments are syntactic qualifications which require research and justification, I’m going to throw a new term out there. Please use wikipedia, google, or anything else you would like to verify this newfound term, invented right here, and today – “Synthetic Open Source Community“.
Personally, I think non-organic is a little raw and vague – is it alien, poison, silicon-based? Synthetic works because while we’re not dealing with chemicals, we are dealing with “Computer Science” and we might as well qualify open source developer contributions in tandem with corporate oversight as flaming a chemical process.
So let’s not call Sun’s OpenSolaris community non-organic, we’ll call it synthetic. Right now Sun is trying to get bang for their buck and they’re going to use “open source” to do it. Linux is “open source” and OpenSolaris is “open source”; in that aspect, OpenSolaris is ‘just like’ Linux. Not really, but where do you see any Sun exec expressing and/or advertising Solaris this way anyways?
If you’ve run across some poo-poo posts on OpenSolaris, I recommend two posts (below) by Stephen at RedMonk. He concisely presents both “sides” of the “just like Linux” adjunct. The other place to look is the OpenSolaris mailing list archives. I would recommend `-trademark -legal` when searching so you get posts about OpenSolaris and not the trademark whining.