Old 02-13-2012, 03:08 PM
Steve Flynn
 
Best filesystem to use for a specific type of application

Afternoon all,

I'm looking at a situation where I need to read a lot of small files.
Roughly 40,000,000 files averaging around 35KB each. Some will be
larger and some will be smaller, as they are TIFF scans. I'm not sure
of the overall size of the dataset yet, but 3 TB feels about right (I
haven't got the data yet, so I can't tell exactly).

As you're probably aware, very small files are a nightmare for
throughput. Until now, we've been using encrypted external USB
drives to move this data between ourselves and my clients, but now that
the size of the dataset is increasing, it's time to move to something
a bit more robust. I've been looking at a couple of NAS drives to
press into action, some of which give us the option of changing the
filesystem to something more suitable.

Can anyone point me to some stats for how differing filesystems
(ReiserFS, XFS, JFS, ext3, ext4, Btrfs, etc.) stack up against each
other when dealing with a lot of very small files? I have a little bell
tinkling away at the back of my mind that ReiserFS was particularly
good for small files, but I could well be making that up... plus I
don't know how well that stacks up these days against the advances made
in other filesystems.

Anyone got any empirical evidence or horror/success stories with any
of the available filesystems?

--
Steve

When one person suffers from a delusion it is insanity. When many
people suffer from a delusion it is called religion.

--
ubuntu-users mailing list
ubuntu-users@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users
 
Old 02-13-2012, 03:20 PM
Rashkae
 
Best filesystem to use for a specific type of application

On 02/13/2012 11:08 AM, Steve Flynn wrote:

Can anyone point me to some stats for how differing filesystems
(Reiser, XFS, JFS, Ext3, Ext4, BTRfs, etc) stack up against each other
when dealing with a lot of very small files. I have a little bell
tinkling away at the back of my mind that Reiser was particularly good
for small files, but I could well be making that up... plus I don't
know how well that stacks up these days against the advances made in
other filesystems.



ReiserFS is deceptive. It was advertised as being very high performance
with small files, but that only took into account benchmarks that
favor it. In my experience with this kind of workload, one very
important 'benchmark' is the speed at which you can read all files in
the order they are returned by readdir() (the function that lists all
files in a directory). For small-file reads to be efficient, the files
have to be read in the order they are laid out on the hard drive.
Otherwise, the head has to thrash with lots of random seeks, and that
will drop your read performance to only a few MB/s.
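
As a rough illustration of why random seeks cap throughput at a few
MB/s, here is a back-of-the-envelope sketch. The 10 ms seek cost and the
100 MB/s sequential rate are assumed round figures for a spinning disk,
not measurements from this thread, and the function name is made up for
the example.

```python
# Estimate read throughput when every small file costs one random seek.
# ASSUMPTIONS (not from the thread): 10 ms per seek, 100 MB/s sequential.

def seek_bound_throughput(file_kb, seek_ms, seq_mb_s):
    """Return effective MB/s (decimal units) for one seek per file."""
    file_mb = file_kb / 1000
    transfer_s = file_mb / seq_mb_s        # time spent actually reading
    total_s = seek_ms / 1000 + transfer_s  # seek plus read, per file
    return file_mb / total_s

rate = seek_bound_throughput(file_kb=35, seek_ms=10, seq_mb_s=100)
print(f"{rate:.1f} MB/s")  # prints "3.4 MB/s"
```

Under those assumptions almost all of the time per file goes on the
seek rather than the read, so 40,000,000 files would take roughly
414,000 seconds, a little under five days.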


I haven't tested it much for this, but I think ext4 has overcome this
limitation. XFS was by far the fastest filesystem I tested for this
workload (before ext4 was released). ReiserFS and ext3 were practically
unusable.
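
One possible way to run the readdir-order test described above on a
candidate filesystem is sketched below. It reads every file in a
directory in the order the directory listing returns them and reports
the throughput. The optional sort by inode number is an addition that
is not from the post: it is only a rough stand-in for on-disk order and
may not match the real layout on every filesystem. The function names
and the --by-inode flag are invented for this example.

```python
import os
import sys
import time

def list_files(directory):
    """Return (name, inode) pairs in the order the directory yields them."""
    with os.scandir(directory) as it:
        return [(e.name, e.inode()) for e in it
                if e.is_file(follow_symlinks=False)]

def read_all(directory, names):
    """Read every named file in full; return (total bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    for name in names:
        with open(os.path.join(directory, name), "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start

def benchmark(directory, by_inode=False):
    """Time a full read in readdir order, or in inode order if asked."""
    entries = list_files(directory)
    if by_inode:
        # ASSUMPTION: inode order approximates on-disk order.
        entries.sort(key=lambda entry: entry[1])
    return read_all(directory, [name for name, _ in entries])

if __name__ == "__main__" and len(sys.argv) > 1:
    total, elapsed = benchmark(sys.argv[1], "--by-inode" in sys.argv[2:])
    mb = total / 1e6
    print(f"{mb:.1f} MB in {elapsed:.2f} s")
    if elapsed > 0:
        print(f"{mb / elapsed:.1f} MB/s")
```

For the numbers to mean anything, the page cache has to be cold before
each run (for example by remounting the filesystem), otherwise the
second run mostly measures RAM rather than the disk.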




