What’s the true price of data?

Kyle Hailey
Nov 6, 2013

By: Woody Evans

Data is Scarce

Data faces a scarcity problem. In particular, we have a data clone scarcity problem. Our need for data clones for Backup, Analytics and Development/QA environments is virtually unlimited and growing. But we live in a world of limited data. We don’t have enough disk space, bandwidth, or hours in the day to get the data where we want it, or to maintain the value of our data by keeping it fresh. And because we have insufficient resources, we’re forced to make trade-offs.

We subset our database copy so we can save on disk space. We don’t back up development because the disk usage and network load would be overwhelming. We only refresh the data once a quarter because the cost to refresh is so high. The trade-off is that sometimes our code breaks in production (when it’s most expensive) because we never tested against full datasets. Or we spend an extra 15 days of effort on our project when a developer accidentally blows away a key table in the development database and there’s no backup. Or we don’t discover key use cases because we’re not using the freshest data, and thus we repeat our coding cycle.

Data Clones are like any other commodity, and so we can establish a Supply curve and a Demand curve for them. What exactly do these curves mean? The supply curve demonstrates what we already know: if we throw more resources and money at the data cloning problem, then our IT shops are happy to accommodate us and make more clones for us. The demand curve also demonstrates what we know: as the cost of making a clone rises, our application projects are less and less willing to invest in more copies because of the expense. And finally, we arrive at our happy medium, our equilibrium, where the number of clones IT is willing to supply at a given price matches the number of clones our projects are willing to pay for at that price.

Thin Cloning and the Data Supply Curve

Database thin cloning technology allows us to gain the full value of a database copy at a fraction of the cost. As such, it represents a shift in the Cloned Data Supply Curve. Because of the technology shift, we’re able to supply significantly more clones at the same price. Our graph of Supply and Demand curves for cloned data now looks something like this:

We should make three key observations about the shift in the supply curve. First, the Supply Curve has shifted radically to the right. At any given price, we are able to produce significantly more data clones because thin cloning technology can remove 90%+ of the cost burden. Second, the slope of the Supply Curve is much flatter, meaning that the marginal cost of additional data clones is also significantly lower. Third, our equilibrium point has moved far to the right. All other things being equal, this means that we can either reduce the cost of our application project by delivering the same cloned data we need now at a fraction of the cost, or have a measurable impact on project agility, either by producing more features and capabilities on the same schedule or by delivering the current project faster.
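To make the three observations concrete, here is a minimal sketch that assumes both curves are straight lines and solves for the point where they cross. The `LinearCurve` class, the `equilibrium` helper, and every number below are hypothetical values chosen for illustration; none of them come from measured data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LinearCurve:
    """Quantity as a linear function of price: q = intercept + slope * price."""
    intercept: float
    slope: float

    def quantity(self, price: float) -> float:
        return self.intercept + self.slope * price


def equilibrium(supply: LinearCurve, demand: LinearCurve) -> Tuple[float, float]:
    """Return (price, quantity) at which quantity supplied equals quantity demanded."""
    denom = supply.slope - demand.slope
    if denom == 0:
        raise ValueError("parallel curves have no unique equilibrium")
    price = (demand.intercept - supply.intercept) / denom
    return price, supply.quantity(price)


# Hypothetical curves (illustrative numbers only).
# Demand: projects want fewer clones as the price per clone rises.
demand = LinearCurve(intercept=100.0, slope=-2.0)
# Supply with full physical copies.
full_copy_supply = LinearCurve(intercept=0.0, slope=2.0)
# Supply with thin clones: more clones at any price (larger intercept) and
# more clones per unit of extra price (larger slope here, which draws as a
# flatter line when price is on the vertical axis).
thin_clone_supply = LinearCurve(intercept=20.0, slope=6.0)

before = equilibrium(full_copy_supply, demand)
after = equilibrium(thin_clone_supply, demand)
print("full copies:", before)
print("thin clones:", after)
```

With these made-up curves the equilibrium moves from 50 clones at a price of 25 to 80 clones at a price of 10: more clones delivered, each at a lower price, which is the rightward move described above.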

The Application Feature Curve

Almost every technology leader I speak with knows that the bulk of their application projects are not on target. Industry studies at IDG Research reflect this, concluding that in an average company, 28 of a CIO’s 46 projects are behind schedule or over budget. What CIOs are unwilling to believe is that 85% of that delay is directly related to getting the right data to the right team at the right time. But the evidence is incontrovertible.

If we were to treat (idealized) Application features like a resource, we would be able to graph their supply and demand as well:

Like any Supply Curve, the supply of Application Features depends on prices, technology, expectations, and the number of suppliers. Each of these determinants can be seen through the lens of the data being supplied to our applications, and thin cloning informs each of them in some way. First, since the cost of a full-feature, full-size database clone is so much lower with thin cloning, we certainly have a change in price. Second, as shown in the previous section, thin cloning certainly provides a technology shift.

What CIOs aren’t considering

Here is where we encounter many of the things CIOs aren’t considering. Thin cloning causes expectation changes. For example, most IT shops operate on the premise that refreshes should be limited because they are so expensive. But with thin cloning, not only are refreshes cheap, it’s actually less expensive to keep fresher data. Why? The cost of a thin clone is often measured as its divergence from a baseline, not by whether the application’s last backup was incremental or full. So the closer a clone stays to the baseline, the less is stored, the fewer errors crop up because of out-of-date data, and the less rework has to be done. For one of our clients, the errors caused by stale data alone account for 20% of production errors. Thin cloning also simplifies the clone creation process enough that people we hadn’t previously considered, such as Developers, App Project Managers, or even Business end users, can manage the creation of their own data clones, effectively increasing the number of suppliers of data clones.
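The idea that a thin clone costs only its divergence from a baseline can be sketched with a toy copy-on-write model. This is an illustrative assumption about how such a clone might be modeled, not a description of how Delphix or any particular product stores data; the `ThinClone` class and its methods are hypothetical.

```python
class ThinClone:
    """Toy copy-on-write clone: shares a baseline and stores only changed blocks."""

    def __init__(self, baseline):
        self._baseline = baseline  # shared with other clones, never copied
        self._changed = {}         # block_id -> data that differs from the baseline

    def read(self, block_id):
        # Prefer the clone's own version of a block, else fall back to the baseline.
        return self._changed.get(block_id, self._baseline[block_id])

    def write(self, block_id, data):
        if self._baseline.get(block_id) == data:
            # Writing the baseline value back removes the divergence.
            self._changed.pop(block_id, None)
        else:
            self._changed[block_id] = data

    def stored_blocks(self):
        """Storage cost of the clone: the number of blocks that diverge."""
        return len(self._changed)

    def refresh(self, new_baseline):
        """Point the clone at a newer baseline and discard local divergence."""
        self._baseline = new_baseline
        self._changed.clear()


# Hypothetical 1,000-block database.
baseline = {i: f"block-{i}" for i in range(1000)}
clone = ThinClone(baseline)
clone.write(3, "edited")
clone.write(7, "edited")
print(clone.stored_blocks())  # 2 blocks stored, not 1,000

clone.refresh(dict(baseline))
print(clone.stored_blocks())  # 0 blocks stored right after a refresh
```

In this model a freshly refreshed clone stores nothing of its own, which is why staying close to the baseline is the cheap case rather than the expensive one.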

Moving the Application Feature Curve

Thin Cloning moves the supply curve for data clones, which in turn moves the application feature supply curve as well. Moreover, the quantity of features delivered moves in the same way as the data curve. Thin cloning affects all of the major factors of supply, and the data we’re collecting from major Fortune 500 clients bears this out.

At Delphix, our thin clone technology is delivering an average 30% application feature benefit to our clients. That means some teams deliver the same work 30% faster, and some teams deliver 30% more work in the same time. In some of the most extreme cases, customers are delivering 550% of the features they used to deliver.

And, when I explain how data virtualization attacks the traditional bottlenecks for data delivery, it becomes easy for CIOs to believe how much is being saved. The bottom line is this: Thin Cloning is the answer to your data scarcity problem, and Delphix is the tool to deliver it.
