* The following article was written by eMag and appeared in the May Edition of e-Discovery Law & Strategy.

More often than not, elements of corporate investigations and legal proceedings come to an abrupt halt because archived electronic files cannot be accessed. However, thanks to the latest generation of restoration software applications, access can now be gained to virtually all archived electronic files. This new level of access is already having a tremendous impact on litigation and corporate record keeping.

Generations Of Computer Tapes

It has always been the prevalent opinion of the courts and others in the legal profession that most archived electronic files simply can’t be accessed through traditional electronic discovery techniques and, as such, critical evidence is often not introduced. This opinion is driven by three perceived barriers:

1. The operating systems and applications used to store information are unknown. The parties who originated the documents may no longer be available and are therefore unable to provide direction.

2. Even when the software and media are known, it is likely that the programs are obsolete and that support from the manufacturer is no longer available. Therefore, the systems needed to run the programs are unattainable.

3. Personnel with knowledge of the stored or back-up files are no longer employed by the enterprise in question. In many cases, staff members who can cull through enormous amounts of data to discern which files are important and which are not may have moved on to other positions.

With the development of non-native restoration software applications, these barriers have been eliminated and access can be provided to an extensive range of storage media. Experts estimate that more than 300 families of tape drives and hundreds of software formats have been used over the years to store electronic data. Nevertheless, non-native restoration allows firms to restore data regardless of the physical or logical format of the tape media, floppy media or optical platter – without having to re-create the originating or “native” environment, that is, the combination of hardware and software that was used at the time the original materials were preserved. Even the oldest formats can be accessed, including media from 30 or more years ago such as 8-inch floppy disks and the earliest generations of magnetic tape.

Non-native restoration applications are also able to circumvent complex security features that prevent access to certain files. For example, backup data is designed to be read by the originating application and sometimes has embedded passwords placed in the data stream. The native application restoring this data knows to prompt the user for the password at that point and, if the password does not match, access is denied. Non-native restoration can bypass this obstacle. Likewise, data can be retrieved when the medium has been damaged or the data compromised. In reality, the only time when retrieval is impossible is when data has been overwritten.

How Non-Native Restoration Works

When attempts to natively restore data are made, a complex set of variables comes into play. To start, hardware or tape drives that are compatible with the data must be identified – in other words, does the media match the hardware? Once the appropriate hardware is found, a second question is raised: Can the operating system internal to the tape drive read the media, making sense of the raw 1s and 0s that encode the information? When these hurdles are overcome, the appropriate software application is launched to translate the data into a readable and useable format.

Although this might seem like a straightforward process, it quickly becomes unmanageable when the varieties of hardware and software are considered. There are hundreds of different tape drives (DLT, LTO, AIT, 8mm and 4mm, etc.) and countless software applications (Veritas, Legato, Tivoli, ArcServe, NT, Tar, etc.). Plus, these applications have evolved through thousands of versions, and a given version is not always compatible with older ones. For instance, the current version of an application might not understand data generated by an earlier version of the same application.

In other words, when trying to create the originating environment, organisations may need to purchase and try several tape drives before identifying one that will actually access the stored data. Then, the organisation might need to purchase the applications and licensing for potential operating systems and software programs – and all potentially relevant versions that might have been released over the years. In some cases the tape hardware and backup software may no longer be available because the vendor has gone out of business or the model is obsolete. Through trial and error (a process that may never bear fruit) the proper combination of hardware and software might be discovered to retrieve the information contained on the file. In any event, the effort is prohibitively expensive, in terms of both time and money.

That need not be the case, however. When linguists need to understand a dead language – Latin, for instance – or a dialect they are not familiar with, they seek out other scholars to assist them in translation. While these colleagues are not native speakers of the alien language, they have either decoded it or been taught how to process it.

Non-native restoration bypasses this matrix of variables. Vendors of this technology, like eMag Solutions, have spent years developing the in-house capability to decode the intricacies of various drives, operating systems and software applications, relying upon a significant pool of knowledge gathered over the years. In addition, these vendors have developed the capability of converting e-mail messages from all servers and systems, including foreign scripts like Cyrillic.

Most electronic discovery vendors capable of non-native restoration work closely with major litigation consultants and law firms that handle corporate litigation issues. While each case is unique and requires a customised approach, all demand meticulous planning and standardised procedures throughout each stage of the restoration process. Specialists who are highly trained in the art and science of restoration oversee the projects and determine how best to apply advanced technology to retrieve the information.

At the start of the electronic discovery process, non-native restoration applications access the header contained at the beginning of the computer storage tape. This header contains vital information about the tape, including the system upon which it was created, the type and version of software program used to save the data, and any salient set-up information. From this, the application begins to pre-categorise the internal database and determine what is required to populate the database.
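
The idea of reading a header without the originating application can be illustrated with tar, one of the formats named earlier. The sketch below is only an illustration and not any vendor's method: it builds a small archive in memory to stand in for tape contents, then decodes the first 512-byte header block by field offset, as laid out in the POSIX ustar format. The function names `parse_ustar_header` and `make_sample_archive` and the sample file name are hypothetical.

```python
import io
import tarfile

BLOCK_SIZE = 512  # tar archives are organised in 512-byte blocks


def parse_ustar_header(block: bytes) -> dict:
    """Decode selected fields of one 512-byte ustar header block by offset."""
    if len(block) < BLOCK_SIZE:
        raise ValueError("a tar header block is 512 bytes long")

    def text(start: int, end: int) -> str:
        # Text fields are padded with NUL bytes.
        return block[start:end].split(b"\0", 1)[0].decode("ascii", "replace")

    def octal(start: int, end: int) -> int:
        # Numeric fields are ASCII octal digits padded with NULs or spaces.
        digits = block[start:end].strip(b"\0 ").decode("ascii")
        return int(digits, 8) if digits else 0

    return {
        "name": text(0, 100),        # path of the archived file
        "size": octal(124, 136),     # payload length in bytes
        "mtime": octal(136, 148),    # modification time (Unix seconds)
        "typeflag": block[156:157].decode("ascii", "replace"),
        "magic": text(257, 263),     # "ustar" identifies the format
    }


def make_sample_archive() -> bytes:
    """Build a tiny tar archive in memory (hypothetical stand-in for a tape)."""
    payload = b"quarterly figures"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w",
                      format=tarfile.USTAR_FORMAT) as archive:
        info = tarfile.TarInfo(name="reports/q1.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Read only the raw header bytes; the tar program itself is never invoked.
header = parse_ustar_header(make_sample_archive()[:BLOCK_SIZE])
print(header["name"], header["size"], header["magic"])
```

A real restoration tool would need a catalogue of many such layouts, one per backup format and version. This sketch handles a single, publicly documented one.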

As the process moves ahead, the application pulls data off the tape and moves it into today’s environment – from an obsolete AS/400, for instance, onto a state-of-the-art PC. From here, data is saved and can then be manipulated and used as required. This makes it possible to convert from one type of data structure to another (e.g., writing converted data to a different tape or operating system), from one type of media to another (e.g., from 9840 to LTO3), from one format to another (e.g., from Novell to a PC) and from one magneto-optical platform to another (e.g., from optical platters to a CD or DVD).
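
One small part of such a migration can be sketched in code. Text on IBM midrange systems like the AS/400 is commonly stored in an EBCDIC code page rather than ASCII, so character data has to be re-encoded before a PC application can read it. The example below assumes code page 037 (Python's `cp037` codec); the code page actually used on a given system may differ, and the function name `convert_record` and the sample record are hypothetical.

```python
def convert_record(raw: bytes,
                   source_codec: str = "cp037",
                   target_codec: str = "utf-8") -> bytes:
    """Re-encode one text record from a legacy code page to a modern one.

    The default source codec (EBCDIC code page 037) is an assumption for
    illustration; pass the code page the originating system really used.
    """
    return raw.decode(source_codec).encode(target_codec)


# Simulate a record as it might sit on legacy media (EBCDIC bytes).
legacy_record = "INVOICE 1042 PAID".encode("cp037")

converted = convert_record(legacy_record)

print(legacy_record.hex())        # raw EBCDIC bytes, unreadable as ASCII
print(converted.decode("utf-8"))  # the same text, readable on a PC
```

Only plain text survives this kind of conversion unchanged. Packed-decimal or binary fields in the same record would need their own handling.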

As data restoration is occurring, it is possible to de-duplicate on the fly, which also represents a tremendous saving of storage space, time and money. Each time a document is opened and revised, a copy of that version is saved within the program, thus creating duplicates. These copies are, in turn, stored on the back-up media. eMag’s application, for instance, eliminates all repetitions through a process known as de-duplication, often resulting in a 30-80 percent reduction in data. This means it is easier and faster to examine the remaining data to determine what can be productively used.
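
A common way to de-duplicate is to fingerprint each file's content with a cryptographic hash and keep only the first file seen for each fingerprint. The sketch below shows that general technique using SHA-256. It is an assumption for illustration, not a description of how eMag's application works, and it removes only byte-identical copies (for example, the same unchanged file captured on successive backups). The function name `deduplicate` and the sample file names are hypothetical.

```python
import hashlib


def deduplicate(files: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep the first file seen for each distinct content.

    Returns (unique, duplicates): `unique` maps kept paths to content,
    `duplicates` maps each dropped path to the kept path it matches.
    """
    first_seen: dict[str, str] = {}   # content digest -> path that was kept
    unique: dict[str, bytes] = {}
    duplicates: dict[str, str] = {}
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates[path] = first_seen[digest]
        else:
            first_seen[digest] = path
            unique[path] = content
    return unique, duplicates


# Hypothetical files restored from three nightly backups.
restored = {
    "monday/contract.doc": b"draft agreement v1",
    "tuesday/contract.doc": b"draft agreement v1",    # unchanged copy
    "wednesday/contract.doc": b"draft agreement v2",  # revised, so kept
    "wednesday/memo.txt": b"call counsel",
}

unique, duplicates = deduplicate(restored)
reduction = 1 - len(unique) / len(restored)
print(f"kept {len(unique)} of {len(restored)} files ({reduction:.0%} reduction)")
```

Recording which kept file each duplicate matched, rather than silently discarding it, preserves a trail showing where every copy was found.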

Electronic discovery vendors often provide online repository and storage where extracted data is hosted in the native format so client organisations can query data, and electronically cull and review document contents prior to complete conversion.

Surveying The Impact

The use of non-native restoration represents an electronic discovery breakthrough that has far-reaching significance. Essentially, if data has been stored or backed up, it can be retrieved in virtually all cases.

Most corporate organisations believe in the value of storing important data and materials – operating under the assumption that the more they keep, the less risk they face. And, to a large degree, this is correct, thanks to increasing compliance demands that have resulted from regulations and legislation such as SEC requirements, HIPAA and Sarbanes-Oxley. Failure to respond to requests for data can lead to fines, economic sanctions and litigation. With non-native data restoration, organisations now have the ability not only to access historical data, but to easily catalogue, organise and analyse the information as well. Once retrieved, data can be converted to contemporary programs so that it becomes useable and actionable.

On the other hand, easy access to electronic data can tempt organisations to keep too much archival information. Most regulatory agencies require that records be kept for a prescribed number of years. However, if files beyond that term are requested and if the organisation has indeed stored them, they must also be produced. With the advent of non-native restoration, companies can no longer contend that older data is inaccessible. On more than one occasion, the practice of keeping files longer than required has offered the opportunity for a smoking gun to be uncovered. In other words, excess data storage can result in excessive exposure to discovery risk.

In next month’s edition, we’ll conclude this article…