The Cancer "Moonshot": Big Data Analytics

In a commentary in the Wall Street Journal, former Senator Dr. Thomas Coburn wrote about the promise of Big Data Analytics as a key weapon in our "Moonshot" war on cancer.

President Obama has called on America to undertake an initiative similar to the one that landed a man on the moon, this time to find a cure for cancer in the next 10 years. Vice President Joe Biden, a three-time cancer survivor, stated that this was a bold but attainable goal.

Dr. Coburn cited a recent report from the American Cancer Society showing that cancer mortality fell by more than 20% during the past 20 years. During this time, what started out as a “death sentence diagnosis” has become one that is often treatable, with many patients living longer thanks to better treatments and earlier detection. In this way, science has been the game changer and has given new meaning to the lives of those suffering from cancer as well as their families.

This is a remarkably exciting time for healthcare discoveries. Teixobactin is the first new class of antibiotic discovered in decades. This is significant because many disease-causing bacteria have evolved resistance to current antibiotics. CRISPR/Cas9 gene editing, possibly one of the most significant discoveries ever, can modify genes much as a word processing program edits text.

However, despite the remarkable advances being made, Dr. Coburn points out that the many privacy regulations both here and globally may cause us to enter this race at a disadvantage. First and foremost is the requirement to maintain patient privacy under the Health Insurance Portability and Accountability Act (“HIPAA”). To be compliant with HIPAA 164.312(a)(1), organizations must implement technical policies and procedures that allow access only to those persons and business associates that absolutely require access to Protected Health Information (“PHI”).

HIPAA does provide an option for us, though. Section 164.514 provides a standard and specification for the de-identification of PHI. It states that health information is not individually identifiable if it does not identify an individual and if there is no reasonable basis to believe it can be used to identify an individual.
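As a rough illustration of what de-identification can look like in practice, here is a minimal Python sketch. It is not a complete or compliant implementation: the field names, the sample record, and the `deidentify` helper are all hypothetical, and the rules shown (dropping direct identifiers, keeping only the first three ZIP digits, reducing a birth date to its year) are simplified versions of the kind of generalization a real process would apply under additional conditions.

```python
from datetime import date

# Hypothetical set of fields treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "ssn", "phone", "email", "address"}


def deidentify(record: dict) -> dict:
    """Return a copy of a patient record with identifying detail reduced."""
    out = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # drop direct identifiers entirely
        if key == "zip":
            out[key] = str(value)[:3]  # keep only the first three ZIP digits
        elif key == "birth_date":
            out["birth_year"] = value.year  # keep the year, drop month and day
        else:
            out[key] = value  # pass clinical fields through unchanged
    return out


# Invented sample record, for illustration only.
patient = {
    "name": "Jane Example",
    "ssn": "000-00-0000",
    "zip": "02139",
    "birth_date": date(1950, 7, 31),
    "diagnosis": "melanoma",
}
print(deidentify(patient))
```

The clinical field that researchers care about survives, while the fields that point to one person are removed or coarsened.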

But what are the chances that patient information can be re-identified once it has been de-identified? There have been some notable cases of re-identification. In one instance, researchers from the Whitehead Institute were able to identify individuals from the 1000 Genomes Project. In another, Massachusetts Governor William Weld was identified using a dataset released by the Massachusetts Group Insurance Commission and a voter list.
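The voter-list case is an example of what is often called a linkage attack: two datasets that look harmless on their own are joined on fields they share. The sketch below shows the idea in Python, assuming the shared fields are ZIP code, birth date, and sex. Every record, name, and the `link` helper are invented for illustration and do not reproduce the actual datasets.

```python
# All records below are invented for illustration.
medical = [
    {"zip": "02138", "birth_date": "1950-01-15", "sex": "M", "diagnosis": "condition A"},
    {"zip": "02139", "birth_date": "1970-01-01", "sex": "F", "diagnosis": "condition B"},
]
voters = [
    {"name": "Voter One", "zip": "02138", "birth_date": "1950-01-15", "sex": "M"},
    {"name": "Voter Two", "zip": "02139", "birth_date": "1970-01-01", "sex": "F"},
    {"name": "Voter Three", "zip": "02139", "birth_date": "1970-01-01", "sex": "F"},
]

# Fields present in both datasets (an assumption of this sketch).
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def link(medical_rows, voter_rows):
    """Pair a medical row with a voter when exactly one voter shares its key."""
    index = {}
    for v in voter_rows:
        key = tuple(v[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(v["name"])
    matches = []
    for m in medical_rows:
        names = index.get(tuple(m[q] for q in QUASI_IDENTIFIERS), [])
        if len(names) == 1:  # a unique match means the row is re-identified
            matches.append((names[0], m["diagnosis"]))
    return matches


print(link(medical, voters))
```

Only the first medical row is linked to a name. The second row matches two voters, so it stays ambiguous, which is why coarsening fields such as ZIP code and birth date lowers the risk.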

Does this mean that we throw our hands up and give up? No! We did not give up when there were obstacles in the moon race, and we cannot give up now.

New tools make discovery of PHI in databases faster and easier. Finding where PHI exists, sometimes in hidden places, is a major part of the battle. De-identification has become much faster and easier, giving researchers the ability to run many de-identification scenarios in a short time and without an army of developers. Finally, distributing this de-identified data is now faster and easier than ever. Researchers can now do data refreshes themselves in less than an hour on multi-terabyte data stores.
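To make the discovery step concrete, here is a minimal Python sketch of scanning rows for values that look like PHI, including values buried in free-text columns. It is an assumption-laden toy, not a description of any particular product: the patterns, column names, sample rows, and the `scan_columns` helper are hypothetical, and real discovery tools use far broader rules.

```python
import re

# Hypothetical patterns for a few identifier formats (US-style).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan_columns(rows):
    """Return {column: set of pattern names found in that column's values}."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(label)
    return findings


# Invented sample rows; the "notes" column hides identifiers in free text.
rows = [
    {"id": 1, "notes": "Call patient at 555-010-1234 after biopsy", "contact": "pat@example.org"},
    {"id": 2, "notes": "SSN on file 000-12-3456", "contact": ""},
]
print(scan_columns(rows))
```

A scan like this flags the free-text `notes` column as well as the obvious `contact` column, which is the kind of hidden location the paragraph above describes.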

The problem is clear, the need is apparent, and the time is right. Healthcare researchers can use new tools and techniques to discover ways to cure and even prevent cancer, as well as a host of other diseases, in our lifetimes. We have the tools for researchers to speed up their analysis, maintain privacy, and achieve amazing results.

Joe Santangelo
Delphix Corp.