Copy on Write
Copy on write is a storage or filesystem mechanism that allows storage or filesystems to create snapshots at specific points in time. Whereas Clonedb is a little known and rarely used option, storage snapshot technologies are widely known and used in the industry. These snapshots maintain an image of the storage at a specific point in time. If the active storage makes a change to a block, the original block is first read from disk in its original form and written to a save location. Once the block save is completed, the snapshot is updated to point to the new block location. After the snapshot has been updated, the active storage data block can be written out, overwriting the original version.
Figure 4. This figure shows storage blocks in green. A snapshot will point to the datablocks at a point in time as seen on the top left.
Figure 5. When the active storage changes a block, the old version of the block has to be read and then written to a new location and the snapshot updated. The active storage can then write out the new modified block.
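The sequence in Figures 4 and 5 can be sketched in a few lines of Python. This is a toy model only, not EMC's implementation: the class name CowVolume, the save_area dictionary, and the read/write counters are all assumptions made for illustration. It copies a block only on the first change after the snapshot (copy on first write).

```python
class CowVolume:
    """Toy copy-on-first-write volume with a single snapshot (illustrative only)."""

    def __init__(self, blocks):
        self.active = list(blocks)   # blocks at their original locations
        self.save_area = {}          # block index -> preserved original contents
        self.snapshot_taken = False
        self.reads = 0               # physical reads caused by writes
        self.writes = 0              # physical writes

    def take_snapshot(self):
        # The snapshot starts out pointing at the active blocks; nothing is copied.
        self.save_area = {}
        self.snapshot_taken = True

    def write(self, index, data):
        if self.snapshot_taken and index not in self.save_area:
            # First change since the snapshot: read the original block,
            # write it to the save area, and repoint the snapshot at it.
            original = self.active[index]
            self.reads += 1
            self.save_area[index] = original
            self.writes += 1
        # Only now may the active block be overwritten in place.
        self.active[index] = data
        self.writes += 1

    def read_snapshot(self, index):
        # Changed blocks come from the save area, unchanged ones from the source.
        return self.save_area.get(index, self.active[index])


vol = CowVolume(["a", "b", "c"])
vol.take_snapshot()
vol.write(1, "B")
print(vol.active[1], vol.read_snapshot(1), vol.reads, vol.writes)
```

In this sketch a single logical write to a snapshotted block costs one read and two writes, which is the overhead the following sections try to move off of production storage.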
Using storage snapshots, an administrator can snapshot the storage containing datafiles for the database and use the snapshot to create a clone of a source database. With multiple snapshots, multiple clones with shared redundant blocks can be provisioned.
On the other hand, if the source database is an important production environment then creating clone databases on the same storage as the production database is generally not a good practice. A strategy that allows the cloned database files to be stored off of the production storage environment will be more optimal for performance and stability.
EMC Snapshot with BCV
EMC has a number of technologies that can create database thin clones. In the simplest case the clone databases can share the same storage as the source databases using snapshots of the storage. The storage snapshot can be taken and used to make a thin clone. EMC supports up to 16 writeable storage snapshots, allowing up to 16 thin clones of the same source datafiles (while sharing the same storage as the source database). If the source database consists of several LUNs then snapshots must be taken of all the LUNs at the same point in time. Taking consistent snapshots of multiple LUNs requires the EMC Timefinder product, which coordinates the snapshots across the LUNs.
Taking load off of production databases and protecting production databases from possible performance degradation is an important goal of cloning. By taking snapshots of the production LUNs one incurs an extra read and extra write for every write issued by the production database. This overhead will impact both production and the clone. On top of the extra load generated by the snapshots, the clones themselves create load on the LUNs because of the I/O traffic they generate.
In order to protect the performance of the production database, clones are often provisioned on storage arrays that are separate from production. In the case where production LUNs are carved out of one set of isolated physical disk spindles and another set of LUNs are carved out of a separate set of physical spindles on the same array, it may be acceptable to run the clones within the same array. In this case, Business Continuance Volumes (BCV) can be used to mirror production LUNs onto the LUNs allocated for the clones. Then snapshots can be taken of the mirrors and those snapshots can be used for thin clones; or, in order to protect the production LUNs from the overhead generated by snapshots, the BCV mirrors can be broken and the LUNs allocated for cloning can be used to start up thin clone databases. Filesystem snapshots can be used to clone up to 16 thin clone databases using the LUNs mirrored from production.
More often than not, however, snapshots are taken of BCVs, or the BCVs are broken and then copied to a second non-production storage array where snapshots can be taken and clones provisioned off of the snapshots. Even in this case the EMC environment is limited to only 16 clones, and if those clones are from yesterday’s copy of production, then a whole new copy of production has to be made to create clones of today’s copy of production. This ends up taking more storage and more time, which goes against the goal of thin cloning.
EMC’s goal has been backup, recovery, and high availability as opposed to thin cloning; however, these same technologies can be harnessed for thin cloning.
The steps to set this configuration up on EMC’s systems are:
Create BCVs and then break the BCVs
Zone and mask a LUN to the target host
Perform a full copy of the BCV source files to target array
Perform a snapshot operation on target array
Start up the database and recover using the target array
Figure 6. Timefinder is used to snapshot multiple LUNs from the production filer to the non-production filer to be used for thin provisioned clones.
EMC is limited to 16 writeable snapshots, and snapshots of snapshots (also known as branching) are generally not allowed. On some high-end arrays it may be possible to take a single snapshot of a snapshot, but not to branch any deeper.
EMC VNX
While copy on write storage snapshots are limited to 16 snapshots, there are other options available to increase the number and to enable branching of clones. EMC has another technology called VNX which improves upon previous Snapview snapshots. The VNX technology:
requires less space
has no read+write overhead of copy on first write (COFW)
makes snapshot reads simpler
supports clones of clones (branching)
When the older Snapview snapshots were created they required extra storage space at creation time. The newer VNX snapshots don’t require any extra storage space when they are created. The older COFW feature caused more writes for the storage than before the snapshot was in place. With newer VNX Snapshots the storage writes become Redirect on Write (ROW) where each new active storage modification is written to a different location with no extra read or write overhead.
Another benefit of VNX is how blocks are read from the source LUNs: in the older Snapview, reads from snapshot had to merge data from the storage with the Reserve LUN Pool (RLP) where the original data blocks that have been modified are kept. With the newer VNX the snapshot data is read directly from the snapshot source LUN.
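To make the contrast with copy on first write concrete, here is a minimal sketch of redirect on write. It is an illustrative model under stated assumptions, not VNX internals: the class name RowVolume, the block map lists, and the counters are hypothetical, and for simplicity every write is redirected whether or not a snapshot exists.

```python
class RowVolume:
    """Toy redirect-on-write volume (illustrative only)."""

    def __init__(self, blocks):
        self.store = list(blocks)                    # physical block locations
        self.active_map = list(range(len(blocks)))   # logical index -> physical location
        self.snapshots = []                          # each snapshot is a frozen copy of the map
        self.reads = 0                               # physical reads caused by writes
        self.writes = 0                              # physical writes

    def take_snapshot(self):
        # A snapshot is only metadata: a copy of the current block map.
        self.snapshots.append(list(self.active_map))
        return len(self.snapshots) - 1

    def write(self, index, data):
        # The new version goes to a fresh location; the old block is left in
        # place for any snapshot that still points at it.
        self.store.append(data)
        self.writes += 1
        self.active_map[index] = len(self.store) - 1

    def read_active(self, index):
        return self.store[self.active_map[index]]

    def read_snapshot(self, snap_id, index):
        # No merge with a separate save area is needed: follow the snapshot's map.
        return self.store[self.snapshots[snap_id][index]]


vol = RowVolume(["a", "b", "c"])
snap = vol.take_snapshot()
vol.write(1, "B")
print(vol.read_active(1), vol.read_snapshot(snap, 1), vol.reads, vol.writes)
```

Here a logical write costs exactly one physical write and no read, and because a snapshot is just a block map, taking a snapshot of a snapshot (branching) would amount to copying another map.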
EMC’s Timefinder capability is also no longer necessary with VNX. Up to 256 snapshots can be taken in a VNX environment, and snapshots can be made of multiple LUNs simultaneously without needing additional software capabilities to create a consistent copy.
Despite all the improvements in VNX, it is still considered a lower end storage solution compared to the Symmetrix arrays, which have all the shortcomings described above.
VNX relaxes some of the constraints of the older Snapview clones; however, in both cases the problem of efficiently bringing new changes from a source array to arrays used for development still exists. After a copy is brought over to a target array from source database LUNs, changes on the source (fresh data) cannot easily be brought over to the target array without a full new copy of the source database. Multiple point in time snapshots are also difficult, as having a target database on the development array share duplicate blocks with another version of the target database (different point in time) is impossible with this architecture. Instead, multiple copies will take up excess space on the target array, and none of the benefits of block sharing in cache or on disk will apply if multi-versioned clone databases are required.
EMC Snapshots with SRDF and RecoverPoint
A major challenge of both BCVs and VNX is keeping the remote storage array used for clones up to date with the source database. EMC has two solutions to this challenge; each provides a way of continuously pulling in changes from the source database into the second storage array in order to keep it up to date and usable for refreshed databases:
Symmetrix Remote Data Facility (SRDF)
RecoverPoint
SRDF streams changes from a source array to a destination array on Symmetrix storage arrays only.
RecoverPoint is a combination of a RecoverPoint Splitter and a RecoverPoint appliance. The splitter splits writes, sending one write to the intended destination and the other to a RecoverPoint appliance. The splitter can live in the array, be fabric based, or host based. Host based splitting is implemented by installing a device driver on the host machine and allows RecoverPoint to work with non-EMC storage; however, because the drivers are implemented at the OS level the availability will depend on the operating system that has been ported. The fabric based splitters currently work with Brocade SAN switches and Cisco SANTap. Fabric splitters open up the usage of RecoverPoint with non-EMC storage. The RecoverPoint appliance can coalesce and compress the writes and send them back to a different location on the array or send them off to a different array either locally or in another datacenter.
One advantage of RecoverPoint over SRDF concerns logical corruption. SRDF immediately propagates any change from the source array to the destination. As with all instant propagation systems, if there is a logical corruption on the source (for instance, a table being dropped), it will immediately be propagated to the destination system. With RecoverPoint, changes are recorded and the destination can be rolled back to a point in time before the logical corruption.
SRDF could be used in conjunction with Timefinder snapshots to provide a limited number of consistent point-in-time recovery points for groups of LUNs. RecoverPoint on the other hand can work with consistency groups to guarantee write order collection over a group of LUNs, and provides continuous change collection. RecoverPoint tracks block changes and journals them to allow rolling back target systems in the case of logical corruption or the need to rewind the development system.
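The difference between instant propagation and journaled change collection can be sketched as follows. This is a simplified model, not RecoverPoint's actual design: the names JournaledReplica, split_write, and image_at, and the use of integer timestamps, are assumptions made for illustration.

```python
class JournaledReplica:
    """Toy target that journals changes in write order (illustrative only)."""

    def __init__(self, blocks):
        self.base = list(blocks)   # image of the source at the initial copy
        self.journal = []          # (timestamp, block index, data) in write order

    def record(self, timestamp, index, data):
        if self.journal and timestamp < self.journal[-1][0]:
            raise ValueError("journal entries must arrive in write order")
        self.journal.append((timestamp, index, data))

    def image_at(self, timestamp):
        # Replay the journal up to and including the requested point in time.
        image = list(self.base)
        for ts, index, data in self.journal:
            if ts > timestamp:
                break
            image[index] = data
        return image


def split_write(source, replica, timestamp, index, data):
    # The splitter sends one copy of the write to its intended destination
    # and the other to the journal on the replica side.
    source[index] = data
    replica.record(timestamp, index, data)


source = ["a", "b"]
replica = JournaledReplica(source)
split_write(source, replica, 10, 0, "A")
split_write(source, replica, 20, 1, "DROPPED")   # stands in for a logical corruption
print(source, replica.image_at(15))
```

The source, like an instantly propagated mirror, ends up holding the corrupted block, while the journaled replica can still produce the image from just before the corruption.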
Figure 7. EMC SRDF or RecoverPoint can propagate changes from source filer LUNs to the target filer dynamically, allowing better point in time snapshotting capabilities.
Using SRDF or RecoverPoint allows propagation of changes from a source array to a target array. On the target array, clones can be made from the source database at different points in time while still sharing duplicate blocks between the clones no matter which point in time they came from.
In all these cases, however, there are limits to the snapshots that can be taken as well as technical challenges trying to get the source changes to the target array in an easy and storage-efficient manner.
More information on EMC snapshot technologies can be found via the following website links:
http://chucksblog.emc.com/chucks_blog/2013/04/are-snaps-dead.html
http://www.emc.com/collateral/software/white-papers/h4175-recoverpoint-clr-operational-dr-wp.pdf
EMC snapshot limits 8-16 per LUN: http://www.emc.com/collateral/software/white-papers/h1349-emc-clariion-snapview-snapshots-snap-sessn-knwldgbk-wp.pdf
EMC VNX snapshots 256 per LUN: http://www.emc.com/collateral/software/white-papers/h10858-vnx-snapshots-wp.pdf
Summary
With EMC, thin cloning can only be achieved by using backup technology; in essence, the process has to be architected manually in order to support databases. How can the same goals be achieved but with database thin cloning specifically in mind? See the following blogs on Netapp, ZFS and Delphix.
Addendum
I’ve been getting questions about how EMC compares with Delphix. Delphix offers technology that is completely missing from EMC arrays.
EMC historically only supports 16 snapshots and no branching. EMC has no tools to transfer changes of a database from the production storage to the development storage. In theory one could use either SRDF, which only works between compatible Symmetrix arrays, to send changes from one array to the other, or RecoverPoint. RecoverPoint requires two additional appliances to capture changes on the wire and then play them onto different storage. Neither is set up specifically for databases, taking into account things like coordinating file system snapshots with putting the database in hot backup mode. I haven’t met anyone at EMC who thinks that EMC could do much of what Delphix does once we explained what we do.
We have 3 parts
Source sync
initial full copy
forever incremental change collection
rolling window of saved changes with older replaced data purged
DxFS storage on Delphix
storage agnostic
compression
memory sharing of data blocks (only technology AFAIK to do this)
VDB provisioning and management
self service interface
roles, security, quotas, access control
branching, refresh, rollback
Of these, EMC only has limited snapshots, which is a part of bullet 2 above, but for bullet 2 we also have unlimited, instantaneous snapshots that work on any storage, be it EMC, Netapp or JBODs. Also, if one is considering a new SSD solution like Pure Storage, Violin, Fusion-io etc., only Delphix can support them for snapshots. We also compress data by 1/3 typically along data block lines. No one else AFAIK is data block aware and capable of this kind of compression and fast access. There is no detectable overhead for compression on Delphix.
No one in the industry does point 1 above of keeping the remote storage in sync with the changes. Netapp tries with a complex set of products and features but even with all of that they can’t capture changes down to the second.
Finally point 3, provisioning. No one has a full solution except us. Oracle tries to with EM 12c but they are nothing without ZFS or Netapp storage, plus their provisioning is extremely complicated. Installation takes between 1 week and 1 month, and it’s brand new in 12c so there are bugs. And it doesn’t provide provisioning down to any second nor branching etc.
Delphix goes way beyond just data
SAP endorsed business solution
EBS automated thin cloning of full stack – db, app, binaries
Application stack thin cloning
Delphix customers have seen an average application development throughput of 2x.
One SAP customer was able to expand their development environments from 2 to 6 and increased the project output from 2 projects every 6 months to over 10.
Points to consider
• Storage Flexibility: EMC cloning solutions only work with EMC storage – increasing lock-in at the storage tier. In contrast, Delphix is storage vendor agnostic and can be deployed on top of any storage solution. As companies move towards public clouds, influence over the storage tier vendor diminishes. Unlike EMC, Delphix remains relevant on-premise and in the cloud (private or public).
• Application Delivery: Database refresh and provisioning tasks can take days to weeks of coordinated effort across teams. The sheer effort becomes an inhibitor to application quality and a barrier to greater agility. Delphix is fundamentally designed for use by database and application teams, enabling far greater organizational independence. Delphix fully automates various functions like refreshing and promoting database environments, re-parameterizing init.ora files, changing SIDs, and provisioning from SCNs. As a result, with Delphix, database provisioning and refresh tasks can be executed in 3 simple clicks. The elimination of actual labor as well as process overhead (i.e. organizational inter-dependencies) has allowed Delphix customers to increase application project output by up to 500%. In contrast, EMC cloning products increase cross-organizational dependencies and are primarily designed for storage teams.
• Storage Efficiency: While EMC delivers storage efficiency simply through copy on write cloning, Delphix adds intelligent filtering and compression to deliver up to 2-4x greater efficiency (even on EMC storage!). Additionally, most customers realize more value from other Delphix benefits (application delivery acceleration; faster recovery from downtime etc.) that EMC does not offer or enable.
• Data Protection and Recovery: While EMC only allows for static images or snapshots of databases at discrete points in time, Delphix provides integrated log shipping and archiving. This enables provisioning, refresh, and rollback of virtual copies to any point in time (down to the second or SCN) with a couple of clicks. It also enables an extended logical, granular recovery window for edge-case failures and far better RPO and RTO compared to disk, tape or EMC clones. Many Delphix customers have wiped out the cost of backup storage as well as 3rd party backup tools for databases with this Delphix “Timeflow” capability.
• 2nd Level Virtualization: Delphix can create VDBs (virtual databases) from existing VDBs, which is extremely valuable given the natural flow of data in application lifecycles from development to QA to staging etc. For example, a downstream QA team may request a copy of the database that contains recent changes made by a developer. EMC cloning tools can only create first generation snapshots of production databases and do not reflect the real need or data flow within application development lifecycles.
• Integrated Data Delivery: Many enterprise applications (ex: Oracle EBS, SAP ECC etc.) are comprised of multiple modules and databases that have to be refreshed to the same point in time for data warehousing, business intelligence, or master data management projects. Delphix uniquely supports integrated and synchronized data delivery to the exact same point in time or to the same transaction ID.
• Resource Management: Delphix offers resource management and scheduling functionality such as retention period management, refresh scheduling, and capacity management per VDB that is lacking in EMC’s cloning products. For example, some VDBs for a specific source database may be retained for a few weeks while specific quarter ending copies can be retained for extended durations (for compliance). Delphix also supports prioritizing server resources allocated to process IO requests per VDB. This is important in environments where DBA teams must meet SLAs that vary by lines of business or criticality of applications.
• Security and Auditability: Physical database copies and EMC clones alike constantly proliferate and increase the risk of audit failures and data breaches when sensitive data is involved. Delphix delivers a full user model, centralized management, retention policies (for automated de-provisioning), and complete auditing for VDBs. Delphix also integrates with homegrown and 3rd party masking tools so virtual copies can be centrally obfuscated – avoiding tedious masking steps per copy.
• V2P (Virtual to Physical): In the event that customers experience downtime across primary and standby databases, Delphix can quickly convert a VDB (from any point in time) from virtual to physical form to minimize the cost and risk of downtime. This provides an extended recovery layer and also a quick path to creating physical copies for other purposes like performance testing.