Best filesystem to use for a specific type of application
On 02/13/2012 11:08 AM, Steve Flynn wrote:
Afternoon all,
I'm looking at a situation where I need to read a lot of small files.
Roughly 40,000,000 files averaging around 35KB each. Some will be
larger and some will be smaller as they are TIFF scans. I'm not sure of
the overall size of the dataset yet, but 3 TB feels about right (I haven't
got the data yet, so I can't tell exactly).
As you're probably aware, very small files are a nightmare for
throughput. So far we've been using encrypted external USB
drives to move this data between ourselves and my clients, but now that
the dataset is growing, it's time to move to something
a bit more robust. I've been looking at a couple of NAS drives to
press into action, some of which give us the option of changing the
filesystem to something more suitable.
Can anyone point me to some stats for how differing filesystems
(Reiser, XFS, JFS, Ext3, Ext4, Btrfs, etc.) stack up against each other
when dealing with a lot of very small files? I have a little bell
tinkling away at the back of my mind that Reiser was particularly good
for small files, but I could well be making that up... plus I don't
know how well that stacks up these days against the advances made in
other filesystems.
ReiserFS is deceptive. It was advertised as being very high performance
with small files, but that only took into account benchmarks that
favor it. In my experience with this kind of workload, one very
important 'benchmark' is the speed at which you can read all files in
the order they are returned by readdir() (the function that lists all
files in a directory). For the reading of small files to be efficient,
they have to be read in the order they are laid out on the hard drive.
Otherwise, the head has to thrash with lots of random seeks, and that
will drop your read performance to only a few MB/s.
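If you want to try that 'benchmark' on a candidate filesystem yourself, something along these lines should do as a rough sketch. It is not the tool I used; the function name and the command-line handling are made up for illustration. It relies on Python's os.scandir(), which yields entries in whatever order the operating system's directory listing returns them, and it only looks at a single directory (no recursion). For the numbers to mean anything, the files must not already be sitting in the page cache, so run it after a fresh mount or reboot.

```python
import os
import sys
import time


def read_in_readdir_order(directory):
    """Read every regular file in `directory` in the order the OS lists them.

    Returns (file_count, total_bytes, elapsed_seconds).
    Hypothetical helper for illustration; it does not recurse into subdirectories.
    """
    file_count = 0
    total_bytes = 0
    start = time.perf_counter()
    # os.scandir() does not sort: entries come back in directory-listing order.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                with open(entry.path, "rb") as f:
                    total_bytes += len(f.read())
                file_count += 1
    elapsed = time.perf_counter() - start
    return file_count, total_bytes, elapsed


if __name__ == "__main__":
    # Usage: python readdir_bench.py /path/to/directory
    count, nbytes, seconds = read_in_readdir_order(sys.argv[1])
    # Avoid dividing by zero on an empty or very fast run.
    rate = (nbytes / 1e6 / seconds) if seconds > 0 else float("nan")
    print(f"{count} files, {nbytes} bytes, {seconds:.2f} s, {rate:.1f} MB/s")
```

If the MB/s figure it prints is far below what the drive manages for one large sequential read, the filesystem is probably handing back names in an order that does not match the on-disk layout.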
I haven't tested it much for this, but I think EXT4 has overcome this
limitation. XFS was by far the fastest filesystem I tested for this
workload (that was before EXT4 was released). Reiser and EXT3 were
practically unusable.
--
ubuntu-users mailing list
ubuntu-users@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users