2

Let's say I have 50 GiB of files that weigh around 500 KiB each.

My guess is that having, for example, 5 large files of 10 GiB each with the same content archived in them would be better for hard drive performance. Am I correct?

Will there be a noticeable gain on an NTFS filesystem?

=====================================================================

Finally, which tool could I use to group the files together while retaining the ability to modify the content of the archive with zero or minor performance loss? For example, I like TrueCrypt archiving because after mounting an archive file, it creates a drive which I can use seamlessly as if it were a normal drive. The only issue with TrueCrypt is that I don't need encryption or compression, only archiving.

Oliver Salzburg

2 Answers

3

Combining files

I would expect that a single large file is only better for performance if you usually read all the data, read it sequentially and if the large file is relatively unfragmented.

TrueCrypt

Using any kind of compression or encryption will be much worse for hard drive performance.

Update:

According to an answer to this question, "there will be some drop in performance, albeit a slight one." The answer refers to a Tom's Hardware article, which says:

The benchmark shows varying performance and highly depends on the processor, followed by the drive you are about to encrypt: AES and Twofish provide highest throughput on our Core 2 Duo notebook Dell Latitude D610. Once you start combining multiple encryption algorithms, e.g. Twofish and Serpent, performance drops considerably. While this isn’t noticeable while working with Windows and popular applications, increasing system load—such as may occur during heavy multi-tasking or when taking on intensive workloads such as video transcoding—will reduce system performance considerably.

The Wikipedia article says

When using popular desktop applications in a "reasonable manner", and with only a single encryption algorithm, the performance impact of TrueCrypt on desktop applications is not generally noticeable, though that does depend on the application, and power users may complain. Using a fast multi core processor and a fast system drive, preferably a Flash SSD, makes TrueCrypt almost transparent

I don't know of any evidence showing that TrueCrypt is going to be significantly "better for hard drive performance".

1

In Windows 7, you can mount a .VHD as a drive. This is the virtual hard drive format used by virtual machines and by Windows Backup (for Complete PC backup only on Windows client, and for all backups on Windows Server). No compression or encryption. Performance slowdown during ordinary use is minimal. After all, people are running whole virtual machines this way.

NTFS metadata and disk seeking can lead to substantial overhead on small files. For example, copying 10,000 files of under 10 KB each to a USB hard drive will proceed at about 300 files per second. That's roughly 30 seconds to copy the files individually, versus about 10 seconds to copy them as a single block. (The difference becomes even more striking with internal or eSATA drives, since the block throughput rate is higher. SSDs are so good at random access that it might not matter either way.)

But 500 KB files are large enough that the impact might be limited. You'd have to benchmark it and see.
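One rough way to run that benchmark is sketched below in Python. It is only an illustration under stated assumptions: the function names (`make_files`, `combine`, `read_individually`, `read_combined`, `run_benchmark`) and the `base_dir` parameter are made up for this example, the test set is far smaller than 50 GiB, and because the files are read right after being written, the OS file cache will probably flatter both numbers. For a meaningful result, point `base_dir` at the drive in question, use a data set larger than RAM (or reboot between writing and reading), and repeat the run several times.

```python
import os
import tempfile
import time


def make_files(directory, count, size):
    """Create `count` files of `size` random bytes each and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"part_{i:05d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def combine(paths, target):
    """Concatenate the small files into one large container file."""
    with open(target, "wb") as out:
        for path in paths:
            with open(path, "rb") as f:
                out.write(f.read())


def read_individually(paths):
    """Open and read every small file; return (bytes read, elapsed seconds)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start


def read_combined(path, chunk_size=1024 * 1024):
    """Read the large file sequentially in chunks; return (bytes read, elapsed seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def run_benchmark(count=200, size=500 * 1024, base_dir=None):
    """Compare reading `count` small files with reading one combined file.

    `base_dir` (hypothetical parameter) selects the drive to test;
    None falls back to the system temporary directory.
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as workdir:
        paths = make_files(workdir, count, size)
        big = os.path.join(workdir, "combined.bin")
        combine(paths, big)
        small_bytes, small_secs = read_individually(paths)
        big_bytes, big_secs = read_combined(big)
    return {
        "small_bytes": small_bytes,
        "small_secs": small_secs,
        "big_bytes": big_bytes,
        "big_secs": big_secs,
    }


if __name__ == "__main__":
    result = run_benchmark()
    print(f"individual files: {result['small_secs']:.3f} s")
    print(f"combined file:    {result['big_secs']:.3f} s")
```

This only measures sequential reads of everything. If the real workload reads a few files at random, the comparison would need to seek to offsets inside the combined file instead.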

taoyue