Data Compliance

Compliance Without Compromise

GDPR asks for data privacy by design, which is exactly what Delphix is doing. The Data Platform builds data masking into automated data delivery.

Jes Breslaw

Mar 12, 2018

Iiiiiiiin the blue corner, the Animal of Automation... Dev 'special' Ops.

Iiiiiiin the red corner, the Professor of Protection... Geee Deee Pee Arghhhhh!!

OK, I'm being a little unfair on both, but the reality is these two sides have very different agendas. DevOps wants you to attack with speed; GDPR wants you to defend and block. The trouble with any form of IT defence is that it normally carries a tax on performance: your AV slows down your PC, your VPN slows down connectivity, encryption slows access. No one would argue against the importance of these security measures, but how do you balance the need to deliver fast and often with the importance of protecting personal data?

Who is the champ, agility or security?

FACT: software development requires good data. Teams need to test changes against real-world information in order to prevent bugs and defects. Managing and delivering test data is already a slow, manual process, and anonymising or 'masking' that data adds a further tax on performance.

Traditional masking tools put agility on the ropes.

Compromise 1: Lower quality

Old-school tools take a long time to mask large, multi-terabyte databases, so their vendors recommend subsetting the data: mask 10% of your database and it takes a tenth of the time. Synthetic data is another workaround, removing the need to use production data entirely. Subsetting and synthetic data are necessary and useful in some circumstances, but when it comes to pre-production testing, complete and current datasets are needed to cover all possible scenarios. The alternatives mean more defects, lower quality and production issues.

Compromise 2: More complexity

Subsets and synthetic data are problematic in complex environments. If you have an application that requires integrated data from multiple sources and database types, creating the links and maintaining consistency is massively complex, manual and time-consuming. This delays projects, makes automation hard, and requires subject matter experts who understand how the data is stored and related in each of the integrated systems.

Compromise 3: Increased costs

Traditional tools require you to allocate large amounts of storage and infrastructure. Data needs to be copied, then masked and copied again, and then delivered somewhere as a third copy. Again, the vendors scream from the expensive seats that "subsetting is the answer", but the reality is very different for the reasons outlined above. All the manual work also means expensive people spending more of their time on the same repetitive tasks.

No more fighting over data

What's needed is a completely different approach that keeps you secure AND lets you go fast. Delphix Data Masking works as part of the Delphix Data Platform. Yes, that is a lot of D's, but stay with me. The platform begins by taking a backup of all your production databases, which is continuously updated from that point on. Within your protected zone, we then apply the masking policy you've built using the (super fast and easy) profiling tool to create a virtualized, masked copy of the latest production data. This copy has no storage footprint and preserves the integration between your different data sources, meaning common fields in each source are masked to the same value, even between, say, an Oracle and a SQL Server database!

diagram 1
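To make the consistent-masking idea concrete, here's a minimal sketch in Python. This is not Delphix code; the secret key, field names and mask format are made up for illustration. It shows deterministic masking, where the same input value always produces the same masked value, so a field shared between an Oracle table and a SQL Server table still lines up after masking:

```python
# Illustrative sketch only -- not the Delphix masking engine or its API.
# The secret key, field names and mask format below are hypothetical.
import hashlib
import hmac

SECRET_KEY = b"keep-this-inside-the-protected-zone"  # assumption for the sketch

def mask_email(email: str) -> str:
    """Deterministically mask an email: identical inputs give identical outputs,
    so referential integrity across sources survives masking."""
    digest = hmac.new(SECRET_KEY, email.lower().encode(), hashlib.sha256).hexdigest()
    return f"user_{digest[:12]}@example.com"

# Hypothetical records pulled from two different sources:
oracle_row = {"customer_email": "jane.doe@acme.com", "order_id": 1001}
sqlserver_row = {"email": "jane.doe@acme.com", "ticket_id": "T-42"}

masked_oracle = {**oracle_row, "customer_email": mask_email(oracle_row["customer_email"])}
masked_sqlserver = {**sqlserver_row, "email": mask_email(sqlserver_row["email"])}

# Both sources end up with the same masked value, so integration tests
# that join on email still work against the masked copies.
assert masked_oracle["customer_email"] == masked_sqlserver["email"]
print(masked_oracle["customer_email"])
```

Because the mapping is derived from a key rather than a stored lookup table, every data source masked under the same policy stays consistent without the sources ever needing to talk to each other.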

From here, DBAs, developers, testers, analysts - in fact, anyone with permission - can request, via self-service, a Data Pod containing their own complete copy of the masked data. This copy is also virtualized, so it requires no additional infrastructure. The masked data can even be automatically replicated into the cloud.
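In an automated pipeline, that self-service request could just as easily come from a CI job as from a person. Purely as an illustration of what such a request might look like, here's a hypothetical REST call; the endpoint, URL and payload are invented for this sketch and are not the actual Delphix API:

```python
# Hypothetical self-service request -- the endpoint and payload are invented
# for illustration and are NOT the actual Delphix API.
import json
from urllib import request

payload = {
    "source": "masked-production",    # the continuously updated, masked copy
    "requested_by": "test-team-a",
    "target_environment": "qa-cloud"  # masked data can also be delivered to cloud targets
}

req = request.Request(
    "https://data-platform.example.com/api/self-service/data-pods",  # hypothetical URL
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# In a real pipeline this call would be made by a CI job or a developer,
# and the platform would provision a virtualized Data Pod in minutes.
with request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```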

GDPR asks for data privacy by design, which is exactly what Delphix is doing. The Data Platform builds data masking into automated data delivery. This now means:

  1. You have visibility and control over all copies of data

  2. Data is delivered to users/teams in minutes

  3. Personal data is removed, enabling GDPR compliance

  4. Data is complete and current

  5. All data sources are masked consistently to aid integration testing

  6. Costs are reduced, as masking no longer depends on additional infrastructure and manual intervention

Oh, and if you need to use data subsets or context-aware synthetic data, the platform can ingest these in much the same way, allowing you to maintain low-storage-footprint versions along with the self-service and automation benefits.

aaaaaaaand the winner is... a draw!

Delphix: Compliance without compromise