High-Fidelity Data & The Myth Busters

High-fidelity. For audio, it means minimal distortion in sound reproduction. If you’re old enough, you’ll remember the “Hi-fi” as a high-end vinyl record player. Before “stereo” the best sound available was “Hi-fi”. Audiophiles still use the term to define sound that is reproduced accurately, exactly.

High-fidelity data serves the same objective. Accuracy. Like the most precise concert recording, it fully reproduces the original production data set in every detail. It’s complete. And like a concert recording, it hasn’t been so simple to achieve. Until now.

Rapid, well-crafted application development depends on the quality of test data. So why do developers use low-fidelity test data? Aged subsets of production data? The leading reasons are space, expense, security, time – and a few myths.

Space: IT doesn’t have the storage space to create 10 or 15 production data copies.

Expense: There’s no staffing, hardware or software budget to create and maintain multiple production copies across multiple purposes and applications.

Security: Production data contains proprietary and personal information that must be masked for protection from unauthorized access.

Time: Even if space, expense, and security are dealt with, the time and resources required for creating, masking, refreshing and processing endless copies are overwhelming.

As application developers have tapped the power of on-demand compute access supplied by private or public cloud, access to complete production data sets has remained a challenge. Development teams have been forced to accept an inadequate approach that subsets production data to create test copies. It’s a partial solution that stretches development time and impacts project quality. And still, there are the myths.

It’s common for Development and QA teams to have difficulty reproducing problems found in user acceptance testing (UAT). It happens despite care in provisioning Dev and QA environments with the same OS, patches and App binaries as production. Bugs surface as late as UAT because Dev, Build and QA environments don’t typically have copies of production databases. They have data subsets or dummy data. For example, production and UAT might have a 30 TB Oracle database, but Dev & QA have a 100 GB subset.


This low-fidelity test data doesn’t expose boundary conditions or assumptions that software developers might make. Beyond time delays, this drives up expenses for finding & reproducing the problem, then fixing and testing the patch late in release cycles. Low-fidelity test data creates one of the biggest challenges for every development methodology, from Waterfall to Agile, to DevOps. And it frustrates all involved.

The DevOps world has an extensive tool set to manage code check-ins, build creation, VM instantiation from templates, build deployments, and automated test case execution. But few vendors and few tools focus on ensuring that test environments are supplied with high-fidelity test data. Why?


The difficulty of supplying high-fidelity test data to test environments is sustained by eight myths. Created and repeated over 20 or 30 years, the myths have calcified into software development fact. Let’s look at an example where production has a 30 TB Oracle DB and there are 10 test environments, including sandbox, unit testing, build testing, QA tests, & UAT.

Watch Actifio bust these myths.

Myth # 1: 30 TB x 10 test environments means 300 TB of storage needed.

Not any more. Actifio creates virtual copies of the production database and feeds them to test environments. Ten copies of a 30 TB database still occupy only 30 TB of storage. Aside from some extra space to save changes made by testers, that’s it.
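The arithmetic behind this myth can be sketched in a few lines. This is a back-of-the-envelope illustration only: the function names and the 2% per-environment change rate are assumptions made for the example, not figures from Actifio.

```python
def full_copy_storage_tb(db_size_tb: float, environments: int) -> float:
    """Storage needed when every environment gets its own physical copy."""
    return db_size_tb * environments


def virtual_copy_storage_tb(db_size_tb: float, environments: int,
                            change_rate: float) -> float:
    """One shared base image plus per-environment space for changed blocks.

    change_rate is an assumed fraction of the database that the testers in
    each environment overwrite (0.02 means 2%).
    """
    return db_size_tb + db_size_tb * change_rate * environments


# The example from the text: a 30 TB database and 10 test environments.
full = full_copy_storage_tb(30, 10)
virtual = virtual_copy_storage_tb(30, 10, change_rate=0.02)  # assumed 2%

print(f"physical copies: {full:.0f} TB")
print(f"virtual copies:  {virtual:.0f} TB")
```

With the assumed change rate, the shared base dominates the total, and the extra space grows only with what testers actually change.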


Myth # 2: Virtual copies are read-only. They cannot be writable and scalable.

With Actifio, virtual copies are not only writable and scalable, all 10 test environments can execute read and write I/Os simultaneously. Without any interference. All test environments, especially UAT, mimic production storage protocols. Fibre Channel or iSCSI for provisioning – your choice.
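One common way to make shared copies writable without interference is copy-on-write. The text does not say which mechanism Actifio uses, so treat the sketch below as a generic illustration of the idea. The class names are hypothetical.

```python
class BaseImage:
    """Read-only block store standing in for the single shared copy."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class VirtualCopy:
    """Writable view: reads fall through to the base, writes stay local."""

    def __init__(self, base):
        self._base = base
        self._overlay = {}  # block index -> data written by this copy

    def read(self, index):
        if index in self._overlay:
            return self._overlay[index]
        return self._base.read(index)

    def write(self, index, data):
        if not 0 <= index < len(self._base):
            raise IndexError(index)
        self._overlay[index] = data

    def changed_blocks(self):
        return len(self._overlay)


base = BaseImage(["a", "b", "c", "d"])
qa = VirtualCopy(base)
uat = VirtualCopy(base)

qa.write(1, "QA-edit")    # visible only to the QA copy
uat.write(3, "UAT-edit")  # visible only to the UAT copy
```

Each copy stores only the blocks it has overwritten, which is why the writes of one environment never show up in another.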

Myth # 3: There’s too much process in creating and maintaining production database copies.

Self-service provisioning of virtual databases eliminates complex processes. Authorized Dev, QA and UAT testers can log in to Actifio and provision their own virtual data. No DBA or complexity involved.

Myth # 4: Sensitive data content prevents use of production data in test environments.

Not a problem. Actifio automates the masking of sensitive data, using masking scripts defined by IT Ops or DBAs. Testers see only masked copies.
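As a rough idea of what a masking script might do, here is a minimal sketch that replaces sensitive fields with a deterministic token. The column names, the salt and the hashing approach are all assumptions made for illustration. Real masking scripts are defined by IT Ops or DBAs and will differ.

```python
import hashlib

# Hypothetical list of columns that must never reach a test environment.
SENSITIVE_COLUMNS = {"email", "ssn"}


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Return a short deterministic token in place of the real value."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def mask_row(row: dict) -> dict:
    """Mask the sensitive columns of one row and leave the rest untouched."""
    return {
        column: mask_value(str(value)) if column in SENSITIVE_COLUMNS else value
        for column, value in row.items()
    }


row = {"id": 7, "email": "pat@example.com", "ssn": "123-45-6789", "plan": "gold"}
masked = mask_row(row)
print(masked)
```

Because the token is deterministic, the same input always masks to the same value, so joins on a masked column still line up across tables.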

Myth # 5: IT will lose control of production data access.

Using Actifio, administrators can specify WHO can access WHAT copies on WHICH test servers. They have complete CONTROL and multiple user authentication tools.

Myth # 6: Copy creation impacts production.

After the initial copy is created, Actifio extracts only changed blocks from production databases. Production impact is negligible. At the same time, test data can be refreshed with regular frequency, giving testers steady access to the latest virtual, masked, refreshed copy.
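The idea of moving only changed blocks after the initial copy can be sketched as follows. This toy version compares block fingerprints, which is an assumption made to keep the example short. The text does not describe how Actifio detects changes, and the block size and function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut the data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprints(blocks):
    """One SHA-256 fingerprint per block."""
    return [hashlib.sha256(block).hexdigest() for block in blocks]


def changed_block_indexes(old_prints, new_blocks):
    """Indexes of blocks that are new or differ from the previous copy."""
    new_prints = fingerprints(new_blocks)
    return [
        i for i, fp in enumerate(new_prints)
        if i >= len(old_prints) or fp != old_prints[i]
    ]


initial = b"AAAABBBBCCCCDDDD"    # state at the time of the initial copy
later = b"AAAABxBBCCCCDDDDEEEE"  # one block edited, one block appended

old_prints = fingerprints(split_blocks(initial))
to_transfer = changed_block_indexes(old_prints, split_blocks(later))
print(to_transfer)
```

Only the edited block and the appended block are selected for transfer. The three untouched blocks stay where they are.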

Myth # 7: High-fidelity test data can’t be fed to test machines in Continuous Integration (CI) environments.

Actifio can be integrated with CI tools like EnterpriseBee, Jenkins or other workflow engines, and its APIs and CLIs can provision masked virtual databases automatically. That integration enables automation like this:

11:00 p.m.: Developer checks in code
11:15 p.m.: Peer review done
11:20 p.m.: Incremental build starts
11:30 p.m.: Incremental build done
11:35 p.m.: Jenkins job deploys build on test machines
11:40 p.m.: Jenkins job invokes scripts that call Actifio APIs to provision or refresh virtual databases on multiple test machines
11:45 p.m.: Automated test scripts are executed on multiple test machines against virtual DBs
12:00 a.m.: 70% of functionality and performance testing done

So, within an hour of someone checking in code, the majority of testing is done on copies of masked production databases. All fully automated.
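The 11:40 and 11:45 steps in the timeline above could be driven by a script along these lines. This is a sketch under stated assumptions: provision_virtual_db and run_tests are hypothetical placeholders, not Actifio or Jenkins APIs, and the host names are invented. A real script would replace the placeholders with calls to the vendor's documented API or CLI.

```python
from concurrent.futures import ThreadPoolExecutor


def provision_virtual_db(host: str) -> str:
    """Hypothetical placeholder for the real provisioning or refresh call."""
    return f"vdb-on-{host}"


def run_tests(host: str, vdb: str) -> bool:
    """Hypothetical placeholder for the automated test suite on one host."""
    return vdb.endswith(host)


def refresh_and_test(hosts):
    """Provision a virtual DB on every host in parallel, then test each one."""
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        vdbs = list(pool.map(provision_virtual_db, hosts))
        results = list(pool.map(run_tests, hosts, vdbs))
    return dict(zip(hosts, results))


# Invented host names standing in for the test machines.
report = refresh_and_test(["test-01", "test-02", "test-03"])
print(report)
```

A CI job would call a script like this after deploying the build, and fail the run if any host reports False.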

Myth # 8: It’s very expensive and insecure to store production copies in AWS.

Actifio delivers secure, masked copies in AWS. It also keeps bandwidth and storage costs low with dedup and data virtualization.

Time to Deliver

High-fidelity test data means delivery of:

On-time projects
Self-service & convenience for testers
High-fidelity test environments that catch bugs early
Improved application quality
Solid data security control
CI Automation
Safe and economic use of private or public clouds for testing

The virtual data access provided by Actifio means that test cases will be different – more timely, comprehensive and effective. Queries can be optimized. Analytics streamlined. Actifio has reduced application development times by as much as 25%. Support teams are seeing similar reductions in quality issues. It’s like giving the teams three months extra time every year. Suppose sales had an extra quarter per year to sell, but not your competition? Priceless.

Actifio delivers a better way to develop, test, release and support applications. Hi-fi production data creates a fast, secure and scalable structure that consumes minimal storage. It’s a high value, cost-effective, self-service Development system. A myth buster.

Click here for a video where three copies of a 1 TB Oracle DB are provisioned in 5 minutes.

Click here for a white paper on Actifio Test Data Management in AWS.
