Does Archiving in Native Format Matter?
Many companies that need to archive data (usually email) because of regulatory requirements, eDiscovery responsibilities, or business needs evaluate solutions based on capabilities, cost, vendor reputation, security, and regulatory fit.
In the past, companies in need of archiving solutions purchased one of the many on-premises or cloud-based solutions that met their needs. However, many of these archiving solutions actually converted the data to enable more efficient storage, indexing, and search. The problem with data conversion is that the data can be corrupted, or its metadata changed or lost, nullifying its “golden copy” (copy of record) status. In most cases, this is not really a problem… unless you anticipate, or are in fact involved in, litigation.
eDiscovery and Data Conversion
In the case of actual or anticipated litigation, by converting data, you may in fact be inadvertently destroying evidence. With that said, let’s take a minute to revisit the data responsibilities around eDiscovery.
The 2006 revisions to the Federal Rules of Civil Procedure (FRCP) established the concept of anticipated litigation. FRCP Rule 37(e), in its current form (amended in 2015), states: “If electronically stored information that should have been preserved in the anticipation or conduct of litigation is lost because a party failed to take reasonable steps to preserve it, and it cannot be restored or replaced through additional discovery, the court:
- upon finding prejudice to another party from loss of the information, may order measures no greater than necessary to cure the prejudice; or
- only upon finding that the party acted with the intent to deprive another party of the information’s use in the litigation may:
  - presume that the lost information was unfavorable to the party;
  - instruct the jury that it may or must presume the information was unfavorable to the party; or
  - dismiss the action or enter a default judgment.”
In reality, companies are free to store or archive data in any way they choose, unless they choose a method in an obvious attempt to thwart eDiscovery. The litigation hold responsibility arises when a company should reasonably anticipate future litigation. Up to that point, data can be converted, deleted, or changed without the risk of eDiscovery repercussions. But once anticipation of possible legal action arises, data must be secured in the state (including all metadata) it was in at the time the litigation hold responsibility came into effect.
This is a long way of saying that archives that convert original data during the archiving process should be scrutinized, and temporarily suspending them should be considered once the litigation hold responsibility arises, if the archive copy is the only copy of record.
A related issue occurs when companies responding to an eDiscovery order need to migrate responsive data out of an archive that has converted it. To respond, the data must be converted back into its original format, which carries a risk of data corruption and loss. If not handled correctly, the migration process can violate the legal requirement to keep potentially responsive data unchanged from the state it was in when the litigation hold responsibility arose.
Archives that store and manage data in its original native format nullify this risk.
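One way to make "unchanged" checkable is to record a cryptographic hash of each item when the hold begins and compare it later. The following is a minimal sketch, not any vendor's method: the function names are hypothetical, and it assumes the held items are available as ordinary files in a folder. Note that a file hash covers the file's bytes (including metadata embedded in the file, such as email headers) but not file system attributes such as timestamps.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative path to its digest (hypothetical helper)."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_items(manifest: dict[str, str], folder: Path) -> list[str]:
    """List items that are missing or no longer byte-identical."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

A manifest built when the hold starts can be stored separately from the archive. Running changed_items against an exported copy then reports any item that a conversion or migration step altered or dropped.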
Conversion and Data Analytics
An obvious challenge with archived data that has been converted is running data analytics processes against it. Data analytics (DA) is the process (via specialized systems and software) of examining large data sets in order to draw conclusions about the information they contain. Data analytics is primarily conducted in business-to-consumer (B2C) and business-to-business (B2B) applications. Organizations collect and analyze data associated with customer activities such as purchasing practices and customer support, business processes, market economics, and other activities. Large data sets are categorized, stored, and analyzed to study purchasing, usage, and problem trends as well as numerous other patterns.
The issue with pointing data analytics applications at archives that have converted the data is the format. Unless the converted format is a standard such as PST or EML (and most archives that convert data use a proprietary format), the data analytics application will be unable to use the converted data, which negates the value of both the data set and the DA software.
Again, data archived in its original format is much more usable by data analytics applications.
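As a small illustration of why a standard format matters, the sketch below tallies messages per sender across a folder of EML files using only Python's standard library. It assumes the archive exposes messages as individual .eml files in a folder; the function name is hypothetical and not part of any archive product.

```python
from collections import Counter
from email import message_from_bytes, policy
from email.utils import parseaddr
from pathlib import Path


def sender_counts(folder: Path) -> Counter:
    """Count messages per sender address across the .eml files in a folder."""
    counts: Counter = Counter()
    for path in sorted(folder.glob("*.eml")):
        # Parse the raw message bytes with the standard email parser.
        message = message_from_bytes(path.read_bytes(), policy=policy.default)
        # parseaddr splits "Name <addr>" into (name, addr).
        _, address = parseaddr(str(message.get("From", "")))
        if address:
            counts[address.lower()] += 1
    return counts
```

The same few lines would not work against a proprietary container, which would first need a vendor-specific export or reconversion step.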
Data Conversion Enables Data Ransom
Companies that store content in proprietary cloud-based archives are more susceptible to being charged large amounts of money to remove their data, whatever the reason, for example dissatisfaction with the vendor. These cloud archives use the excuse that they must reconvert the data back to its original format before it can be moved, and that this reconversion process will take a great deal of time and money. In fact, some cloud archive vendors charge huge amounts for the reconversion, sometimes eclipsing the monthly storage cost by a factor of 20 or more. In reality, they are holding your data for ransom, hoping you will be unwilling to pay the exorbitant cost and will leave the data in their archive.
Obviously, when dealing with cloud archive vendors, you need to ask two questions:
- Do you store data in its original format or do you convert it?
- Can I move my data out at any time without charge or penalty?
In other words, avoid archives that do not store your content in its original format and charge you ransom to remove it.
Archive360 and Microsoft
Archive2Azure is the industry’s first managed cloud archive specifically designed for long-term archiving of compliance data on the Microsoft Azure Platform. A major differentiator is that the archived data is stored in your Azure instance with complete access and control, not some proprietary cloud archive.
Archive2Azure enables organizations to consolidate unstructured data while lowering the cost of archival cloud storage. This includes legacy email archives, journal folders, inactive or departed employee work files, PSTs, file share content, backups, system-generated data, and eDiscovery/compliance data. With infinite scalability, Archive2Azure delivers long-term, secure compliance retention and management at a great price.
And most importantly, Archive2Azure always stores archived content in its original (native) format, and NEVER charges to export data.
About Bill Tolson
Bill is the Vice President of Global Compliance for Archive360. Bill brings more than 29 years of experience with multinational corporations and technology start-ups, including 19-plus years in the archiving, information governance, and eDiscovery markets. Bill is a frequent speaker at legal and information governance industry events and has authored numerous eBooks, articles and blogs.