Forensic analysis is a huge subject, and many books have been published on it. eMag has long been a leader in providing software in this challenging field, and our main area of expertise is decoding tape and floppy disk formats. This article discusses the challenges of forensic tape analysis and how to extract the crucial data from a tape.

At the simplest level, to find out what is on a tape, you first have to be able to read the tape correctly. There are many different types of tape, and while some look the same, the drives used to read them differ, so understanding the generational differences between media and the associated hardware is essential. For instance, a DDS-4 tape cannot be read in an older-generation DDS-3 drive: the tape will fit in the drive, but the media is of a higher density than the drive supports. Matching the tape to the correct hardware is therefore usually the first obstacle, and once you have overcome it you are ready to find out more about what’s inside.

Data on a tape is laid out in a “format”. A format can be thought of as being similar to a language: it has a specific syntax, a set of rules and a definite logical layout. In computer terms we refer to bits, nibbles, bytes, words, long words, records, blocks, files, end-of-file markers, etc. These are all terms that you need to become familiar with when working out how to extract data from a tape. A tape format expert will do what is called a “tape dump”, where the contents of the tape are viewed in raw form on the screen, which in turn allows the format to be verified. This is usually as “deep” as someone needs to get into a tape. If need be, software is written to access the tape, the format is cracked and the data is extracted.
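
As a rough illustration of what a tape dump looks like, the short Python sketch below prints raw bytes as an offset, a hex column and a printable-text column. It assumes the tape block has already been read into memory; the function name hex_dump and the sample bytes are invented for this example and do not come from any eMag product.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return an offset / hex / printable-text view of raw bytes."""
    pad = width * 3 - 1  # width of a full hex column, used to align short rows
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


# Hypothetical first bytes of a tape block, invented for illustration.
sample_block = b"VOL1BACKUP\x00\x00\x01\x02"
print(hex_dump(sample_block))
```

Viewing the hex and text columns side by side is what makes labels, file names and repeating record boundaries stand out when a format is being identified.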

Cracking a format is an interesting challenge. The secret is recognizing the patterns within the data. A format is normally structured so that it has set-up information encoded in it, designed to tell the software reading it what to expect. The reading program then knows what to process and what to do with it. Some formats contain file names, dates, block sizes, etc., while others are more or less purely numerical. Text data within these formats can be encoded in ASCII (typically written by PCs and Unix machines) or EBCDIC (mainframes and older technology). Both are represented internally as numerical codes, and they are easily translated so that the data can be read back as text when viewing a tape dump.
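
The translation between the two encodings can be sketched in a few lines of Python. The example below decodes a block both ways and keeps whichever reading looks more like plain text. It assumes the EBCDIC variant is code page 037 (Python's built-in cp037 codec); real tapes may use other code pages, and the scoring rule and function names here are illustrative assumptions only.

```python
import string

# Characters we expect to see in ordinary labels and file names.
PLAIN_TEXT = set(string.ascii_letters + string.digits + " .,-_/:")


def plain_ratio(text: str) -> float:
    """Fraction of characters that look like ordinary text."""
    if not text:
        return 0.0
    return sum(1 for ch in text if ch in PLAIN_TEXT) / len(text)


def guess_text_encoding(block: bytes) -> str:
    """Guess whether a block holds ASCII or EBCDIC (cp037) text."""
    candidates = {
        "ascii": block.decode("ascii", errors="replace"),
        "cp037": block.decode("cp037", errors="replace"),
    }
    # Pick the decoding with the highest share of plain-text characters.
    return max(candidates, key=lambda name: plain_ratio(candidates[name]))


# The same label as a PC and as a mainframe would have written it.
ascii_label = "PAYROLL 1998".encode("ascii")
ebcdic_label = "PAYROLL 1998".encode("cp037")
print(guess_text_encoding(ascii_label), guess_text_encoding(ebcdic_label))
```

A simple score like this is only a first pass. It helps decide which translation to apply to the text column of a dump, not what the format is.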

If we look at our business arena, which is commercial tape and disk formats, we are primarily concerned with electronic document storage. These can be backup tapes, interchange formats and so on. Ultimately they contain files generated by a software package, whether it be word processing, databases, spreadsheets, graphical images, etc. The issue one faces when reading these tapes is whether or not you have access to the original software that created them. If you do not, then reading the tape becomes a real challenge.

The job of someone performing a forensic analysis is to take a tape and get the data off in a fast, logical, cohesive and accurate fashion. At the simplest level, someone might want to know the names of the files on a tape and when they were created. The tape is placed in a drive and read, with the metadata being output to a file. This is known as tape logging; it helps create a timeline for the tape contents and indicates the dates, names and types of files present. An investigation may involve hundreds of tapes, and the ability to log each tape and then drill down into those logs looking for certain details is very useful. The investigator can then restore just the files needed rather than the whole save set, which could be terabytes of data, saving many days and considerable expense.
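
For a format that Python can already read, tape logging can be sketched with the standard tarfile module. The example below assumes the tape contents are an uncompressed TAR archive available as a seekable file or image; proprietary backup formats need their own parsers. The function names and the two demo files are made up for illustration.

```python
import io
import tarfile
from datetime import datetime, timezone


def log_tar_image(fileobj):
    """Return (name, size, modified time) rows without extracting file data."""
    rows = []
    with tarfile.open(fileobj=fileobj, mode="r:") as archive:
        for member in archive:
            if member.isfile():
                stamp = datetime.fromtimestamp(member.mtime, tz=timezone.utc)
                rows.append((member.name, member.size, stamp.isoformat()))
    # Sorting by timestamp turns the log into a simple timeline.
    return sorted(rows, key=lambda row: row[2])


def build_demo_image():
    """Build a tiny in-memory TAR archive standing in for a tape image."""
    buffer = io.BytesIO()
    demo_files = [("ledger.xls", b"abc", 946684800), ("memo.doc", b"hello", 915148800)]
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload, mtime in demo_files:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            info.mtime = mtime
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


for row in log_tar_image(build_demo_image()):
    print(row)
```

Only the headers are interpreted here, which is why logging is so much cheaper than a full restore. The resulting rows can be written to a file and searched across many tapes.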

At a more complex level, one might want to go to a business and electronically capture every bit and byte on every disk on every computer at the office. This is typically done by making a series of disk images to tape. The tapes would then be taken to a remote site where they could be restored to other disks, thus recreating the working environment at the original site. Analysis can then begin, digging down through the various layers of data looking for signs of fraud and incriminating evidence.
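
The core of imaging is a block-by-block copy that leaves nothing out. The sketch below is a minimal Python version, assuming the source and target are ordinary file-like objects; in practice the source would be a raw disk device and the target a tape or image file, and a real imaging tool would also handle read errors and bad sectors. The function name image_device and the block size are assumptions for this example.

```python
import io


def image_device(source, target, block_size=64 * 1024):
    """Copy every byte from source to target in fixed-size blocks."""
    total = 0
    while True:
        block = source.read(block_size)
        if not block:  # an empty read means end of data
            break
        target.write(block)
        total += len(block)
    return total


# In-memory stand-ins for a disk and its image, for illustration only.
fake_disk = io.BytesIO(bytes(range(256)) * 1000)
image = io.BytesIO()
copied = image_device(fake_disk, image, block_size=4096)
print(copied, image.getvalue() == fake_disk.getvalue())
```

Because the copy works on raw blocks rather than files, the restored disk includes deleted and unallocated areas as well as the visible file system.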

Digging down forensically into a hard disk is a lot easier than working with a tape. In most situations, when you are reading a tape and you hit end of data, that is it. There are no “old or earlier versions” stored on tape. The contents of a backup tape are a snapshot in time, so if you need an older version of a file you need to read an earlier tape. It is generally not possible to rewrite a section of data on a tape, so any file you read cannot be newer than the tape’s creation date. Reading beyond end of data sometimes produces useful results, but it is usually not possible without specialized hardware, and you again need the proper software tools to evaluate and extract the contents. With these limitations in place, tape is actually a great tamper-proof backup medium.

The problem with tapes is that there are no standards for format layout, and as a result there are numerous formats and many variations within them. Developing your own code to read these tapes is therefore a major undertaking and is best left to specialists in the field.

So where does eMag fit into this picture? About fifteen years ago we started writing a restoration program to read tapes of unknown origin and format. We also did the same for floppy disk formats. Both programs have grown tremendously over the years. Today we offer our MediaMerge/PC and MediaMerge/UNIX forensic analysis software packages, which can read, automatically recognize and restore hundreds of different tape formats and variations. Our InterMedia for Windows package does the same for over 2000 floppy disk formats.

These packages give users access to data written on hardware that is usually obsolete or otherwise inaccessible. In most cases the user does not have to know anything about the tape: you just put it in the drive, tell the program to restore it, and it does the rest. The software is also used for data interchange between hardware platforms. For example, if you want to read a VAX/VMS backup tape on a PC, it can do this; likewise, if you want to read and restore an older Backup Exec dataset and convert it to TAR for use on a UNIX machine, it can do this as well. If you want to look at the raw data layout of a tape on a byte-by-byte basis and perhaps investigate what is beyond end of data, all interactively, you can do that too. We have written an extremely comprehensive package that addresses the vast majority of standard tape processing requirements, and this software has rapidly become one of the “must-have” electronic discovery tools.

If you would like to learn more about these software programs or eMag’s tape analysis and forensic capabilities, please do not hesitate to contact us. The software is under continuous development as new formats and features are requested and added, and if you are one of our many customers, then you already know how we support the program. And if you are not, then we look forward to helping you someday soon!