Zip - It!!!
OK – JUST ZIP IT!
That’s what my mother used to say to me when I was bad. But that’s not what we are talking about here.
However, we will talk about “bad” zipping software. Yes, zipping software doesn’t always perform as expected. What a revelation.
Zipping files/data is a routine, everyday operation. Even so, if you conduct security or forensic exams and "save" your results or reports for delivery to management or attorneys, you should consider what your zipping program can and cannot do with the kind of data you are zipping. Some forensic examiners zip image files, which works fine. Others zip extracted evidence after processing with a forensic suite. Still others massage the initial extracts and get the data down to a point where it can be handed to management or legal persons for review.

The problem to consider is that during your process you may inadvertently create some unusual data files. The forensic suite may extract alternate data streams, or extract files and put them in a folder which has a long filename. I have seen forensic suites extract data to long filenames as a matter of course, especially when you ask the suite to recreate the original path/folder structure and you yourself are outputting the data to a sub-directory which itself starts several levels down the tree. You then don't know how long the final extracted path will be, and when you go to zip up the data for delivery, you may miss one or two important files, or totally miss an alternate data stream which was hiding behind an important file.

Last, but not least, does the zipping program preserve the dates of the source files and, ultimately, of the unzipped files? If you have never tested your zipping program to see what it does in unusual situations, you probably should. Test not only your version, but also the version that the opposition is using.
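One way to find out ahead of time whether an extract contains over-long paths is to measure them before zipping. The following is only a minimal sketch: the function name `find_long_paths` and the default limit of 255 characters are my own illustrative assumptions, not part of any forensic suite or zipping program. On Windows, walking very deep trees may also require long-path support to be enabled or a `\\?\` prefix on the root path.

```python
import os

def find_long_paths(root, limit=255):
    """Return (length, path) pairs for every file or folder under `root`
    whose full path is longer than `limit` characters, longest first.
    Both the name and the default limit are illustrative assumptions."""
    long_paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) > limit:
                long_paths.append((len(full), full))
    return sorted(long_paths, reverse=True)

if __name__ == "__main__":
    # Example: report every over-long path under the current folder.
    for length, path in find_long_paths("."):
        print(length, path)
```

Anything this reports is a candidate for a file that a zipping program might silently skip, so those are the files to look for first in the restored output.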
First, let me state: the tests I ran are not at all scientific. I consider them practical.
Second: I am not using names here because I do not want to point fingers. I’m just pointing out what I found in my unscientific tests. If my minimal tests show a failure, why proceed further?
Third: The test suite I used was purely arbitrary, and seeded with items which, from other tests, I knew might cause problems with zipping software, but which would not be unheard of in a forensic or backup environment.
Fourth: I DID NOT test any encryption capabilities of the software. I use separate PGP encryption of the stand-alone zipped file when necessary.
Finally: Run some tests yourself. Don’t take my word for it.
I began by selecting three of the most popular zipping software packages. My versions may not be the most current, because I’m CHEAP and don’t spend money needlessly. Most had both GUI and command line capability. However, for consistency, and because most people prefer the GUI, I only used the GUI in my tests. A very important thing to remember with the GUI versions is that unless all the correct boxes are checked, you may not get the results you expect, and may not obtain results similar to mine. But if a program hid an option so well that I couldn't find it, I may not wish to use that program in the future.
Some packages made me dig through detailed settings to enable operations which I would consider mandatory, so I had to hunt for the options I wished to use. Each package presented different options for the same operation, and some had no option at all for a needed capability. I MAY have missed an important option. If I did, it only means the option was buried so deep that a normal person might also fail to look for it and find it. I tried as best I could to locate and check all the boxes which would cover the items I tested for (i.e., long filenames, Alternate Data Streams, date retention).
Then I created a test folder containing three kinds of files:
First, files containing Unicode characters in their filenames (e.g., Cyrillic names).
Second, files with long filenames (path/filename > 255 characters).
Third, files with Alternate Data Streams (ADS) inserted, both files with normal-length names and files with long filenames.
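If you want to build a similar test folder yourself, something like the following might be a starting point. This is a sketch under stated assumptions, not the procedure used for the tests described here: `make_test_tree`, the file names, the stream name `hidden_stream`, and the 300-character target are all hypothetical. The ADS step only applies on Windows with an NTFS volume, and on Windows the long-path step may need long-path support enabled or a `\\?\` prefix on the root.

```python
import os

def make_test_tree(root, min_path_len=300):
    """Create a small test folder and return the list of files created.
    All names used here are illustrative assumptions."""
    created = []
    os.makedirs(root, exist_ok=True)

    # 1. A file with a Cyrillic (Unicode) name.
    unicode_file = os.path.join(root, "тест_файл.txt")
    with open(unicode_file, "w", encoding="utf-8") as fh:
        fh.write("unicode filename test\n")
    created.append(unicode_file)

    # 2. A file whose full path is longer than min_path_len characters,
    #    built by nesting 40-character folder names.
    deep = root
    while len(deep) < min_path_len:
        deep = os.path.join(deep, "d" * 40)
    os.makedirs(deep, exist_ok=True)
    long_file = os.path.join(deep, "long_path_file.txt")
    with open(long_file, "w", encoding="utf-8") as fh:
        fh.write("long path test\n")
    created.append(long_file)

    # 3. An alternate data stream attached to the first file.
    #    Assumption: Windows with an NTFS volume; skipped elsewhere.
    if os.name == "nt":
        try:
            with open(unicode_file + ":hidden_stream", "w", encoding="utf-8") as fh:
                fh.write("alternate data stream test\n")
        except OSError:
            pass  # the volume probably does not support streams

    return created
```

Zip the resulting folder with the program under test, unzip it somewhere else, and then compare what came back against what went in.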
I created those three types of files because, in a backup scenario, or in an investigation whose evidentiary output will probably be sent to an opposing party (attorney), any or all of these types of files might have to be produced. Forensic suites and other forensic software may routinely export files with any or all of these characteristics.
Also, I have seen discussions where people ask which methods others use to store data for posterity. That’s a long time, not your rear end.
A common way of storing or delivering data is in a zip format, not only to save space, but also to bundle everything into a single file for distribution to a requesting party.
So, let's test some of the zip(ping) capability of these programs.
Again, I’m not going to name names, or identify which program failed in which area, so here are the general results. If you think some of the items may apply to your processes, you might test the software yourself. What a novel idea.
First: All three zipped and unzipped all filenames correctly (Unicode, etc.).
Second: Two of three processed long filenames (LFN) correctly (see ADS below).
Third: Only one processed ADSs in both normal and long filenames.
Fourth: None of them reset the last access date of the original files after the zip process.
Fifth: Two of three properly reset the last access date of the restored file to the original access date.
Sixth: Two of three properly reset all the MAC dates of the restored file. The third only reset the original ‘M’odified date.
Here is a quick and dirty spreadsheet showing which program did what. If you want to know the real name of a program, contact me at: dm at dmares.com
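| Program | Unicode | LFN | ADS | Reset Src Access | Reset Dest Access | Reset MAC Dest. |
|---|---|---|---|---|---|---|
| #1 | PASS | PASS | PASS | FAIL | PASS | MAC |
| #2 | PASS | FAIL | FAIL | FAIL | PASS | MAC |
| #3 | PASS | PASS | FAIL | FAIL | FAIL | M |
| #4 | PASS | PASS | PASS | PASS | PASS | MAC |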
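The date checks in the results above can be repeated with a short script that compares the source folder against the restored (unzipped) folder. The sketch below is an assumption-laden illustration, not the method used to produce the table: `compare_times` is a hypothetical name, it only looks at the modified and last access times (creation time is reported differently on different platforms, so it is left out), and the default tolerance of two seconds is there because the classic zip format stores DOS timestamps with two-second resolution.

```python
import os

def compare_times(src_root, dst_root, tolerance=2.0):
    """Return a list of (relative_path, field, src_value, dst_value)
    for every file whose modified or access time differs between the
    source tree and the restored tree by more than `tolerance` seconds.
    Files missing from the restored tree are reported as 'missing'."""
    mismatches = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.exists(dst):
                mismatches.append((rel, "missing", None, None))
                continue
            src_stat, dst_stat = os.stat(src), os.stat(dst)
            for field in ("st_mtime", "st_atime"):
                a, b = getattr(src_stat, field), getattr(dst_stat, field)
                if abs(a - b) > tolerance:
                    mismatches.append((rel, field, a, b))
    return mismatches
```

To check whether the zipping program disturbed the source files themselves, record the source access times before zipping and compare them again afterwards. This script does not see alternate data streams, so those still need a separate check.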
Program #1: even though it captured the ADSs in both the normal and long filenames, the options needed to obtain that capture proved to be very confusing. I had to create the zip file more than five times before the ADSs were properly captured.
Program #2: the GUI was nothing less than horrible to work with. So much so that I uninstalled it as soon as the tests were completed.
Special note on Program #4, which is WinRAR and is used in both Linux and Windows. It is quite inexpensive (actually I think it's shareware, but I paid for a license). When I first tested it, the program did not have the ability to reset the source last access date. However, with one simple request, and what I think was a reasonable evidentiary explanation, the programmers agreed to include the reset of source last access in the next version. As of December 11, 2019, version 5.80 has all the capabilities which I tested, and it has passed all my tests.
I personally prefer the command line, since it gives me more control. Just take a look at all the available commands and options in the command line version of WinRAR (called rar.exe). "Technically" there is a limit on the length of the path/filename in WinRAR, but it is normally not a concern. If you have an exceedingly long path/filename (> 2047 characters), I suggest you get to reading War and Peace (just a joke). The 2047 limit should be enough for most instances. In the next section I have provided a command line that seems to work very well at creating a self-extracting exe which passes all my tests. You may want to check it out.
"c:\path_to_winrar\rar" a -sfx -r -ts+ -tsp -os _DEMO.EXE -zc:\"program files"\winrar\comment.txt folders/files-to-ad -ppassword The content of the comment.txt file which contains routine required options is: The comment below contains SFX script commands which will cause the extraction of the .exe to be silent (not ask for user input) and overwrite any existing files during the extraction. The other item which begin with a semi-colon are unnecessary and are included for other purposes not needed at this time. Silent=2 Overwrite=1 ;Setup=setup.exe ;SETUP=setup16.exe not needed ;Presetup=hello.exe ;Path=C:\temp\default_unzip_path ;PATH=.\. ;SavePath the subsequent extraction/unzip command line (which is easily included in a batch file) is C:DEMO.exe -s2 -tsp -tp+ -os -ppassword
Even though the above process seems to work and passes all my forensic, evidentiary preservation requirements, this mention is in NO WAY an endorsement of WinRAR. Don't take my word for it: test for yourself any zipping program you use, and be comfortable with its operation. I have tested and use WinRAR when preparing all my test data, and it has worked admirably.
Consider any or all of the above shortcomings when you are archiving your files, or preparing them for discovery.
In short, none of the three zip programs first tested in this minimal test process passed all the tests. The tests included: Unicode filename retention, LFN, ADS, and reset of ALL appropriate MAC dates (of source and restored files).
AND: When you actually think about it, isn't a "zipping" process a sort of copy method for retention, discovery, and safe data saving? What evidence might you be missing in the zip process? Also, is next year's version of the zipping program going to be able to unzip last year's version? Or is product 'A' capable of processing the zip file of product 'B'?
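One simple way to see whether files went missing in the zip process is to compare the list of files in the source folder with the list of names stored in the archive. The sketch below uses Python's standard `zipfile` module and only applies to archives in zip format. The function name `missing_from_zip` is hypothetical, and it assumes the archive stores paths relative to the source folder; if your zipping program stores a leading folder name, the comparison would need adjusting.

```python
import os
import zipfile

def missing_from_zip(src_root, zip_path):
    """Return the sorted list of files (relative paths, '/' separated)
    that exist under src_root but have no entry in the zip archive.
    Assumes archive names are relative to src_root."""
    with zipfile.ZipFile(zip_path) as zf:
        archived = {name.rstrip("/") for name in zf.namelist()}
    expected = set()
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), src_root)
            expected.add(rel.replace(os.sep, "/"))
    return sorted(expected - archived)
```

An empty list only means every ordinary file has an entry in the archive. Alternate data streams are not listed by `os.walk`, so their presence or absence in the archive has to be verified separately after unzipping.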
A final thought, not included in the above list: I tested a recent "free" version of PGP (v8). It compresses, and lo and behold, it also had failures. However, since I don't use PGP to compress, only to encrypt, I didn't include it in the statistics.
So, which zipping program are you using to store and restore your legacy data or evidentiary file data?
Associated articles and programs of interest:
hash: program to calculate hash values.
HASH_IT_OUT: an article discussing forensic hashing of evidence.
COPY_THAT: an article discussing forensic copying of evidence.
ZIP_IT_TAKE2: an article explaining the testing of zipping software.
Test data containing about 30 files, in a self extracting executable