What is Data Virtualization?
Data virtualization is a technology that helps IT organizations secure, manage, and deliver application data more efficiently. Instead of relying on complex, manual processes to control and deliver application data, data virtualization allows IT to automatically deliver virtual copies of production data for non-production use cases. Common scenarios for data virtualization include application development, testing, reporting, archiving, and data migration.
Data Virtualization is VMware for the Data Layer
Just as server virtualization unlocks efficiencies at the compute layer, data virtualization drives similar benefits for data in repositories such as databases, data warehouses, and file systems. Data virtualization technology decouples application data from physical hardware, allowing end users to access storage-efficient virtual data copies through a self-service model.
How Does Data Virtualization Work?
Data virtualization platforms work in three key steps. First, data virtualization software installed on-premises or in the cloud collects data from production sources and stays synchronized with those sources as they change over time. Next, the data virtualization platform serves as a single point of control for administrators to secure, archive, replicate, and transform data. Finally, it allows users to provision fully functional virtual data copies that consume significantly less storage than physical copies.
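The first and third steps can be illustrated with a small copy-on-write sketch. This is a toy model under stated assumptions, not how any particular product is implemented: the class names `VirtualizationEngine` and `VirtualCopy`, the block-number keys, and the in-memory dictionaries are all hypothetical, and the administrative control step is not modeled.

```python
class VirtualCopy:
    """A lightweight copy: shared base blocks plus this copy's own changes."""

    def __init__(self, base):
        self._base = base      # block references captured at provisioning time
        self._private = {}     # blocks written by this copy only

    def read(self, n):
        if n in self._private:
            return self._private[n]
        return self._base[n]

    def write(self, n, data):
        # Copy-on-write: the shared block is never modified.
        self._private[n] = data

    def private_block_count(self):
        return len(self._private)


class VirtualizationEngine:
    """Hypothetical engine holding one shared copy of the source blocks."""

    def __init__(self):
        self._blocks = {}      # block number -> bytes

    def sync(self, source):
        """Ingest only blocks that differ from what is held; return the count.

        Simplification: blocks deleted from the source are not handled.
        """
        changed = {n: b for n, b in source.items() if self._blocks.get(n) != b}
        self._blocks.update(changed)
        return len(changed)

    def provision(self):
        """Create a virtual copy reflecting the engine's current state."""
        # Copies only the block map (references); block contents are shared.
        return VirtualCopy(dict(self._blocks))


source = {0: b"customers", 1: b"orders", 2: b"invoices"}
engine = VirtualizationEngine()
initial = engine.sync(source)       # first sync ingests every block

dev = engine.provision()
qa = engine.provision()
dev.write(1, b"orders-test")        # visible to dev only

source[2] = b"invoices-v2"          # production changes later
incremental = engine.sync(source)   # only the changed block is ingested
```

In this sketch each virtual copy stores only the blocks it has overwritten, which is why provisioning is cheap in both time and storage; a copy provisioned after the second sync would see the updated block, while `dev` and `qa` keep the state they were provisioned from.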
Key Benefits of Data Virtualization
Speed and Agility Benefit–Legacy infrastructure and high-touch manual processes often bottleneck data delivery. In large organizations, the end-to-end process of provisioning a new copy of production data can take days or weeks. With data virtualization, end users can access full copies of multi-terabyte data sources in minutes. Moreover, data virtualization offers additional control over those copies: users can refresh, bookmark, rewind, integrate, and branch data copies to improve enterprise agility and collaboration.
Data Efficiency Benefit–Over 90% of the data in environments used for development, testing, and analytics is redundant. Data virtualization platforms consolidate this data into a single compressed and deduplicated footprint before sharing data blocks across all downstream environments. Rather than making and moving new data blocks, data virtualization solutions intelligently share common data blocks to drive storage efficiency.
Data Security Benefit–By automating data delivery, data virtualization solutions reduce administrative touchpoints that drive privileged user access risk. In addition, coupling data virtualization with data masking solutions allows IT to secure and deliver virtual data without exposing confidential information.
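The data efficiency benefit above rests on storing each distinct block once and compressing it. The following is a minimal sketch of that idea, assuming fixed-size 4 KiB blocks, SHA-256 digests as block identifiers, and zlib compression; none of these choices comes from the text, and `BlockStore`, `ingest`, and `restore` are hypothetical names used only for illustration.

```python
import hashlib
import zlib


class BlockStore:
    """Content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self._blocks = {}   # SHA-256 hex digest -> compressed block

    def put(self, block):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self._blocks:          # deduplication
            self._blocks[digest] = zlib.compress(block)
        return digest

    def get(self, digest):
        return zlib.decompress(self._blocks[digest])

    def unique_blocks(self):
        return len(self._blocks)

    def stored_bytes(self):
        return sum(len(b) for b in self._blocks.values())


def ingest(store, data, block_size=4096):
    """Split data into fixed-size blocks and return the list of digests."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def restore(store, recipe):
    """Rebuild the original bytes from a list of digests."""
    return b"".join(store.get(d) for d in recipe)


store = BlockStore()

# Ten distinct 4 KiB blocks stand in for a production data source.
prod = b"".join(bytes([i]) * 4096 for i in range(10))
# A development copy that differs from production in its last block only.
dev = prod[:9 * 4096] + b"\xff" * 4096

prod_recipe = ingest(store, prod)
dev_recipe = ingest(store, dev)
```

Here the two datasets total twenty blocks, but the store holds only eleven, because the nine blocks they have in common are shared. Each environment is then just a list of digests pointing into the same footprint.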
Key Use Cases of Data Virtualization
Application Development / DevOps–Data virtualization can provide read-write virtual data copies that can be quickly spun up or torn down. These copies can be shared among teams or branched and versioned just like code, eliminating dependencies on ticketing systems to deliver key data.
Test Data Management (TDM)–Data virtualization can complement or replace traditional test data management solutions such as subsetting or synthetic data generation. Fast delivery of full datasets can compress testing cycles and increase software quality.
Backup and Disaster Recovery–Continuous data protection, granular recovery-point accuracy, and significantly reduced storage requirements make data virtualization an ideal fit for backups.
Cloud or Datacenter Migration–By more easily provisioning data for testing and validation environments, data virtualization eliminates dependencies between migration teams and production teams, reducing downtime and accelerating migration.
Packaged Application Projects–Delivery of high-quality production data to development teams accelerates implementations, customizations, and upgrades for packaged applications such as ERP.
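The idea in the Application Development / DevOps use case, that data copies can be bookmarked, rewound, and branched much like code, can be sketched as follows. This is a simplified illustration: `VirtualDataset` and its methods are hypothetical names, and the sketch copies small dictionaries where a real platform would share data blocks between versions.

```python
class VirtualDataset:
    """Toy dataset with named bookmarks, rewind, and branching."""

    def __init__(self, data=None, bookmarks=None):
        self.data = dict(data or {})
        self.bookmarks = dict(bookmarks or {})

    def bookmark(self, name):
        # Record the current state under a name, like tagging a commit.
        self.bookmarks[name] = dict(self.data)

    def rewind(self, name):
        # Discard later changes and return to the bookmarked state.
        self.data = dict(self.bookmarks[name])

    def branch(self, name):
        # Start an independent dataset from a bookmarked state.
        return VirtualDataset(self.bookmarks[name], self.bookmarks)


ds = VirtualDataset({"orders": "v1"})
ds.bookmark("before-migration")

ds.data["orders"] = "v2-broken"            # a test run damages the data

feature = ds.branch("before-migration")    # another team starts from the clean state
feature.data["orders"] = "v1-feature"

ds.rewind("before-migration")              # the original copy is restored
```

After these steps the original dataset is back at its bookmarked state, while the branch carries its own change without affecting it.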
Things Data Virtualization Is Not
Server Virtualization – Server virtualization transformed data centers by enabling higher utilization of both server infrastructure and IT resources. Data virtualization affects the data underlying enterprise applications, bringing the efficiency of server virtualization to the data layer.
Service Virtualization – Service virtualization technologies simulate the behavior of applications that are inaccessible for testing purposes because they are too complex, not yet fully functional, or outside of organizational control.
Storage Cloning – Enterprise storage arrays provide efficient, read-write snapshots. However, they lack the functionality and transactional awareness that data virtualization offers to solve the complex data delivery requirements for application projects.
Data Federation – These solutions provide an abstraction layer that maps multiple autonomous database systems into a single federated database, often for the purposes of analytics or reporting. While they may also be referred to as "data virtualization," these solutions do not aim to provide similar functionality or benefits.
Replication – Replication solutions provide a mechanism to move data from one place to another, often for analytics or data protection purposes. However, they provide neither the storage-efficiency nor the agility benefits of data virtualization.
Application Virtualization – Application virtualization encapsulates software from the underlying operating system, allowing it to run consistently regardless of the environment in which it is installed. This provides consistency and ease-of-use benefits.
Desktop Virtualization – Desktop virtualization gives end users the experience they need to perform their work while removing dependencies on a particular PC, sometimes storing key data remotely. The simplest implementations increase ease of use; more sophisticated solutions also decrease data risk.
Network Virtualization – Network virtualization creates logical networks that are independent of site-specific constraints, improving data security or making distant sites available for local operations.
Storage Virtualization – Storage virtualization eliminates dependencies on physical storage, allowing efficient use of resources and flexible management.