By: Bill Tolson on December 4th, 2017


My Healthcare Data is Where?


According to IDC, healthcare data is one of the fastest-growing segments of the digital universe, rising from 153 exabytes in 2013 to an estimated 2,314 exabytes in 2020, a 48% annual growth rate. So where will the healthcare industry put all of this critical and sensitive data, and how long must it be held?
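
As a quick sanity check on the IDC figures quoted above, the compound annual growth rate can be worked out directly from the two endpoints. This is only an illustrative sketch; the function name `cagr` is my own and is not taken from the IDC report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints quoted from IDC: 153 exabytes (2013) and 2,314 exabytes (2020).
rate = cagr(153, 2314, 2020 - 2013)
print(f"Implied annual growth rate: {rate:.1%}")
```

The result comes out a little over 47%, which is in line with the roughly 48% annual growth rate cited.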


What’s driving this data growth?

Diagnostic devices such as CT scanners, MRI machines, and X-ray machines generate huge amounts of imaging data. And as the technology improves, the images get even bigger. The challenge with these images is that they must be saved for diagnostic activities and regulatory compliance. A patchwork of country and local regulations specifies how long healthcare providers must maintain patient data, including diagnostic images. Complicating the data storage issue, new healthcare regulations introduced several years ago directed that all patient health records be converted to electronic formats.

A Picture is worth megabytes

The need for data storage is immense and growing rapidly. Over 600 million imaging procedures, including CT scans, X-rays, ultrasounds, and MRIs, are performed each year by U.S.-based health-care providers.

Hospitals are generally required to keep images for seven years, but many keep them much longer. They also retain backup copies as part of disaster-recovery planning and to comply with the federal health-care privacy laws. As a result, image archives are increasing by as much as 40% annually.

The IoT and healthcare

As the Internet of Things (IoT) has gained a sizable foothold in business environments, the healthcare industry has also experienced a growing incursion. These connected devices generate growing amounts of data which are causing data management and storage problems for the healthcare industry. Examples of healthcare IoT devices are:

  • Patient monitors
  • Drug delivery systems
  • In-room devices such as monitors and controls
  • RFID readers
  • Tracking devices and sensors for physiological measurements
  • Video cameras

These and future devices will generate huge amounts of data subject to regulatory and legal requirements. The question is: can these new data sources be captured, secured, indexed, searched, exported, and managed for the long term? Another issue the healthcare industry faces is how to handle the expanding variety of data formats from these devices. Can the different data formats be stored in a common archive and made easily searchable?

Dark data is valueless (and dangerous) data

Because much of this IoT data is not yet stored centrally, as much as 80% is considered “dark data”: it cannot be easily managed, searched, or exported for analysis. To be useful as well as meet regulatory requirements, data needs to be captured, secured, tagged, and stored so that it can be managed and searched efficiently. Because data is now coming from numerous dissimilar sources, it will need either to be converted into a common format or to be managed by a system that can recognize and work with the varying data formats.


The healthcare industry is still not protecting its data appropriately. According to IDC, 93% of all healthcare data reaches the level requiring stringent regulatory protection; however, it estimates that only 57% is “somewhat protected” and that 43% is not “adequately protected.” A lack of effective security dramatically raises liability for healthcare organizations. The consequences of a data breach and loss include extremely large financial penalties, legal costs, and negative public opinion. For example, the two most recognizable U.S. healthcare regulations, HIPAA and HITECH, carry huge fines for privacy violations: with the introduction of the HITECH Act, the maximum penalty per identical violation per calendar year is now $1.5 million.

With the quantities of data being generated by the healthcare industry, the regulatory climate and the specter of very public lawsuits again raise the original question: where will the healthcare industry store and protect all of this critical and sensitive data? In today’s climate, this question carries a much higher priority.

Secure, scalable, inexpensive: the cloud is the only viable solution

As I have alluded to, healthcare data storage requirements are quadrupling every three years. This tidal wave of electronic medical information, including unstructured data, the IoT, and imaging records, will continue to put tremendous strain on individual healthcare data centers. In reality, the cloud offers the only viable solution for the out-of-control growth of healthcare industry data.

So if the cloud is the eventual destination for all medical data, what should healthcare organizations consider when creating their overall cloud strategy?

The first step in beginning a cloud strategy should be to fully understand what data is being generated and where it resides. As I mentioned earlier in this blog, up to 80% of healthcare data is “dark” because it is spread across numerous single-point repositories and cannot be easily managed, searched, or exported for analysis, regulatory request, or eDiscovery.

Questions to address before you start purchasing technology include:

Strategy-related questions:

  • On which devices and in which locations can your organization’s healthcare data be found?
  • What type of storage is it residing on?
  • Does the data storage meet current regulatory requirements?
  • What is the fully loaded annual cost of storing, securing and searching your data?
  • What would these costs (estimated) be if you moved to a cloud solution?
  • What would it cost, and what would be the benefits, to centralize all healthcare data so it can be automatically moved to a cloud solution?
  • Should you migrate all of your current data to the cloud, or keep it in your current on-premises system and only move new data to the cloud?

To migrate or not to migrate – the answer is obvious

The last question about data migration is an important one because of the potential cost of migration versus the cost of keeping data on premises for what could be many years. This decision will directly affect your TCO and ROI calculations.

What is the fully loaded annual cost of storing, managing, and securing large amounts of sensitive data locally versus in the cloud, plus the required data migration? In reality, the comparison is straightforward. The fully loaded cost of on-premises storage, including management and security, is approximately $0.15 to $0.30 per GB per month. Non-proprietary cloud solutions that meet healthcare industry requirements will run between $0.005 (five tenths of a cent) and $0.07 per GB per month. Many cloud solutions provide extremely high levels of security (including data encryption at rest), data management capabilities, and geographical data redundancy. The cost of data migration can differ widely, ranging from free to $7,000 per TB, with the average in the $1,000 per TB range and, depending on the cloud archive provider, even lower.
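
To make the comparison concrete, here is a minimal sketch of the arithmetic using the per-GB and per-TB figures quoted above. The function names, the 100 TB example volume, and the assumptions of a fixed data volume, a one-time migration charge, and 1 TB = 1,000 GB are mine, chosen purely for illustration; real pricing and data growth will differ.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def on_prem_cost(tb: float, months: int, rate_per_gb_month: float) -> float:
    """Cumulative on-premises cost for a fixed data volume."""
    return tb * GB_PER_TB * rate_per_gb_month * months


def cloud_cost(tb: float, months: int, rate_per_gb_month: float,
               migration_per_tb: float) -> float:
    """Cumulative cloud cost: monthly storage plus a one-time migration charge."""
    storage = tb * GB_PER_TB * rate_per_gb_month * months
    return storage + tb * migration_per_tb


def break_even_months(on_prem_rate: float, cloud_rate: float,
                      migration_per_tb: float):
    """Months until storage savings repay the migration cost (None if never)."""
    monthly_saving_per_tb = (on_prem_rate - cloud_rate) * GB_PER_TB
    if monthly_saving_per_tb <= 0:
        return None
    return migration_per_tb / monthly_saving_per_tb


# Least favorable case from the quoted ranges: cheapest on-premises rate,
# most expensive cloud rate, average migration cost.
worst = break_even_months(0.15, 0.07, 1000)
# Most favorable case: most expensive on-premises rate, cheapest cloud rate.
best = break_even_months(0.30, 0.005, 1000)
print(f"Break-even: {best:.1f} to {worst:.1f} months")

# Hypothetical 100 TB archive held for seven years (84 months).
tb, months = 100, 84
print(f"On premises: ${on_prem_cost(tb, months, 0.15):,.0f}")
print(f"Cloud:       ${cloud_cost(tb, months, 0.07, 1000):,.0f}")
```

Even in the least favorable pairing of the quoted rates, the migration cost is recovered in about 12.5 months under these assumptions, and in the most favorable pairing in under four months.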

In almost every case, the TCO calculation will quickly show that migrating your current data stores to the cloud produces a highly positive ROI.

Once you have answers to the above questions, you can begin building your cloud strategy. After you have completed your strategy, you should discuss it with your legal team and insist on documented approval. I once had a General Counsel tell me that your legal team’s approval is your personal insurance policy in case legal or regulatory issues arise later.

The next step is to begin choosing the technology and vendor. You should proceed into the technology phase by addressing the questions below:  

Technology- and process-related questions:

  • Who are the biggest cloud suppliers in the healthcare industry?
  • Which one has the most healthcare industry references?
  • Can you choose and interview the references?
  • What are their SLAs compared to other vendors?
  • Do they offer differing levels of storage (e.g., Hot, Cool, Cold)?
  • Do they offer geographically redundant storage (GRS)?
  • Have they ever been hacked and how long did it take them to realize it?
  • Do they offer encryption of data at rest?
  • Do they require you to provide them your encryption keys?
  • What data formats can they work with?
  • Does their system index all the data types your organization will encounter?
  • Is the archived data easily searchable?
  • Can you quickly apply litigation holds?
  • Does the system provide the ability to create retention/disposition policies?
  • Can copies of the data be easily exported?
  • Does the system offer granular access and functionality controls?
  • Are all actions within the system audited and reportable?
  • What is the fully loaded cost per GB of storage and management?
  • Does the cloud provider undergo regular security audits and re-certification?
  • Does the cloud provider completely understand their (and your) obligations under all applicable regulatory laws?
  • Do they have systems or partners in place to help migrate your current data in a legally defensible manner?

The answers to these questions will help you choose the best healthcare cloud provider and technology.

Microsoft and Archive360 Healthcare cloud solutions

Archive360 has partnered with Microsoft to offer a cloud-managed archiving solution based on Azure cloud services, perfect for the healthcare industry. Azure offers a cloud platform on which solution providers such as Archive360 can deliver native, connected, intelligent data management and archiving applications that provide proactive, personalized healthcare data management across people, companies, and devices. And because Azure conforms to global standards in security and compliance, and because Archive360’s Archive2Azure management service is built on top of Azure, you can be assured that sensitive healthcare data stored within your company’s own Azure instance is secure.

Azure offers geographically redundant storage so your sensitive data is always replicated to ensure durability, high availability, and compliance. Azure also provides three cost-effective storage tiers (Hot, Cool, and Cold) so you can dynamically direct specific data to the most appropriate storage tier.

Archive2Azure provides the management and archiving capability that allows you to set granular retention/disposition policies, create customized, on-demand indexing, set encryption capabilities, assign access controls, and produce detailed reports based on system-wide auditing.

Archive360 also offers legally defensible data migration capabilities to the Microsoft Cloud with its FastCollect solution.

For more information on Archive360’s healthcare archiving and data management solutions, please contact us.

A new Archive360 healthcare solution brief titled "Managing and Maintaining Unstructured Healthcare Data in the Azure Cloud" is now available.



About Bill Tolson

Bill is the Vice President of Global Compliance for Archive360. Bill brings more than 29 years of experience with multinational corporations and technology start-ups, including 19-plus years in the archiving, information governance, and eDiscovery markets. Bill is a frequent speaker at legal and information governance industry events and has authored numerous eBooks, articles and blogs.