If you’ve been an Oracle data professional for any length of time, you’ve undoubtedly had a conversation about reducing your total number of databases by consolidating application schemas from each individual database into separate schemas in one monolithic database. Certainly, in the days before shared dictionaries, pluggable databases, and thin cloning, this argument could be made easily because of the high price of the database license. Consolidation is a winning argument, but doing it at the cost of data agility turns it into a loser. Let’s explore why.
The Argument for Schema Consolidation
Essentially, the pro side of this argument is cost driven. Using 18 schemas in 1 database instead of 18 databases with 1 schema each means the following (a toy cost sketch follows the list):
• Fewer database licenses – and this usually holds true even if you use CPU-based pricing.
• Fewer hosts – even if you need a slightly larger host to handle the traffic, it’s usually cheaper than N individual hosts, especially when you consider the management cost.
• Less Storage – the binaries and dictionary objects are often nearly identical, yet we end up storing them on disk and in memory N times, and then backing them up to disk even more times.
• Less CPU and I/O – 18 databases simply require more CPU and I/O than 18 schemas in one database, even if you push the same traffic to the 18 schemas.
• Less Database Management Overhead – fewer databases to manage means less time spent managing.
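To make that arithmetic concrete, here is a toy back-of-the-envelope model in Python. Every figure and multiplier in it is a made-up placeholder rather than real Oracle pricing; the point is only the shape of the comparison, in which N standalone databases multiply every line item by N while a consolidated database pays most of them once on somewhat larger hardware.

```python
# Toy cost comparison: 18 standalone databases vs. 18 schemas in one database.
# All figures and multipliers are hypothetical placeholders, not real pricing.
N_APPS = 18

# Hypothetical annual cost per standalone database.
per_db = {"license": 50_000, "host": 10_000, "storage": 4_000, "admin": 3_000}

# Option 1: every line item is paid N times.
separate = {item: cost * N_APPS for item, cost in per_db.items()}

# Option 2: one license, a bigger host, shared binaries and dictionary on disk,
# and more (but not 18x more) admin time. The multipliers are guesses.
consolidated = {"license": 50_000, "host": 10_000 * 3,
                "storage": 4_000 * 2, "admin": 3_000 * 4}

print(f"18 databases, 1 schema each: {sum(separate.values()):>9,}")
print(f"1 database, 18 schemas:      {sum(consolidated.values()):>9,}")
```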
The Argument Against Schema Consolidation
The con side of this argument is very sensitive to conditions. In some cases, I’ve seen these cons amount to very little cost and effort. In other cases, even without the newer technologies, the cost was so high that the approach was abandoned. Key things that usually cause consolidation efforts trouble include:
• Namespace Clobber – especially when you are consolidating applications that weren’t designed for it, all sorts of namespace conflicts can arise. I’ve seen synonyms, links, and packages wreak havoc on a consolidation effort, sometimes even requiring large re-coding efforts because of the nature of the beast. (A sketch for catching these collisions up front follows this list.)
• No Isolability – traffic, resources, and limits are no longer fully isolated. Heavy traffic to one schema can affect the performance of another. A huge update may rob you of the ability to get good response times. A crash in one application can cause a domino effect, whether the failure begins at the database tier or the app tier. One failure affects all.
• Conjoined Backup and Restore – often the database is backed up as a collective and restored as one, so great care must be exercised when only a single schema belonging to a single app needs a restore. You can work around this by creating separate tablespaces for each schema and then using tablespace point-in-time recovery, but that itself takes time and resources.
• Risky Planned and Unplanned Outages – if you’ve got to patch the database, everybody goes down. If someone pulls the plug on your database host in the data center, everyone is hosed.
• Resource Pool Management Complexity – if you’ve only got one database, then you’ve probably got one storage pool. Unless you’re very carefully carving and monitoring it (which itself takes time and resources), you can cause all sorts of unintended side effects.
• One Attack Surface – if 18 apps share a database, then they share an attack surface. An exploit against one is an exploit against all.
• More Management Headaches – a lot more focus, concern, and worry about quotas, security, backing up and restoring schemas, isolation measures, and control. This is enough of a headache that a lot of work has gone into automating it.
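To illustrate the first of those cons, here is a minimal sketch of a pre-consolidation check. It assumes the python-oracledb driver and SELECT access to DBA_SYNONYMS in each source database; the connection strings, credentials, and database names below are all hypothetical placeholders. The idea is simply to ask each database you plan to fold in for its PUBLIC synonyms and flag any name that would mean different things once everything lives in one database.

```python
# A minimal pre-consolidation check: collect PUBLIC synonyms from each source
# database and flag names that point at different targets -- a classic source
# of "namespace clobber". Assumes the python-oracledb driver and SELECT access
# to DBA_SYNONYMS; all connection details below are hypothetical placeholders.
from collections import defaultdict
import oracledb

# Hypothetical source databases, one application schema each.
SOURCES = {
    "sales_db":   "dbhost1/salespdb",
    "hr_db":      "dbhost2/hrpdb",
    "billing_db": "dbhost3/billpdb",
}

def public_synonyms(dsn, user="dba_user", password="secret"):
    """Return {synonym_name: target} for the PUBLIC synonyms in one database."""
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        with conn.cursor() as cur:
            cur.execute("""SELECT synonym_name,
                                  table_owner || '.' || table_name
                             FROM dba_synonyms
                            WHERE owner = 'PUBLIC'""")
            return dict(cur.fetchall())

def report_clobbers(sources):
    """Print synonym names defined in more than one source with different targets."""
    seen = defaultdict(dict)                      # name -> {source_db: target}
    for db, dsn in sources.items():
        for name, target in public_synonyms(dsn).items():
            seen[name][db] = target
    for name, owners in sorted(seen.items()):
        if len(owners) > 1 and len(set(owners.values())) > 1:
            print(f"CLOBBER RISK {name}: {owners}")

if __name__ == "__main__":
    report_clobbers(SOURCES)
```

The same kind of check extends naturally to public database links, roles, and directory objects, which likewise occupy a single database-wide namespace.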
The Tide has Turned
First, the benefits aren’t as strong as they used to be. Oracle’s own marketing around 12c makes the case that the same amount of work, run over a collection of pluggable databases rather than standalone ones, takes up 6x less hardware. Pluggable databases, or databases with shared dictionaries, make the cost side of the equation significantly less compelling. Thin cloning technology neutralizes most of the rest of the equation, because it provides copies of the database at almost no cost, virtually eliminating the storage argument.
Then there are the cons. And this is where I contend that we have systematically ignored or underestimated the value of Data Agility.
The Value of Data Agility
Getting the right data to the right person at the right time is such a key value for organizations precisely because there are so many obstacles to doing it. And instead of understanding the value of agility, we’ve spent a lot of time, energy, and effort finding solutions that merely minimize the impact of not having that agility. Like what, for example?
• Letting developers code on smaller sets of data, or live with older data, or write their own custom “rollback” scripts.
• Encouraging testers to accept the idea that a rewind has to take 8 hours, or that they have to wait for a refresh because they might clobber someone else’s app, or that they can’t test in their schema today because another application is at a “critical” stage.
• Telling BI folks that they can’t get their data refreshed because other apps can’t be taken offline, and it just takes too long to ship the whole schema over.
• Telling all of the apps that they have to be down, like it or not, because we can’t patch one schema at a time.
Schema consolidation saves money at the cost of data agility, and it shifts that burden into places where we’ve been trained not to miss it, or where we write it off as just an IT problem.
Delphix thin cloning lets you keep your individual databases but pay the price of a consolidated one. Developers can code on a full and fresh set of data at a fraction of the cost and never write custom rollback scripts again. Testers can rewind in minutes without a huge resource drain, and without the wait times that exist mostly to avoid side effects on apps other than their own. BI folks never have to go offline to get fresh data, and a refresh is a minutes-long operation every time. Patching and upgrading can not only be accomplished on a rolling basis but, using intelligent refresh, can be performed once and distributed to all sorts of downstream systems.
So what?
If you’re considering schema consolidation, look hard at the ROI. What made sense even 3 years ago is completely upside down now. Pluggable databases destroy the cost argument at the memory tier, and Delphix thin cloning does the same at the storage tier. Schema consolidation just doesn’t make the economic sense it used to.