dynamic inode allocation
On Mon, Sep 01, 2008 at 01:18:31PM -0400, Mag Gam wrote:
> This may be a newbie question, but how come other file systems such as
> ReiserFS and Veritas' VxFS dynamically allocate inodes, while with filesystems
> such as ext2/ext3 and JFS we need to allocate them when creating the
> filesystem? Is there a performance or maintenance gain when
> preallocating?

Having a static inode table is definitely much simpler than a dynamic inode table, and that's why ext2 originally used a static inode allocation system. Ext2 drew much of its initial design inspiration from the BSD Fast Filesystem, and it (along with most traditional Unix filesystems) used a static inode table.

One of the advantages of having a static inode table is that you can always reliably find it. With a dynamic inode table, it can often be much more difficult to find it in the face of filesystem corruption, caused by either hardware or software failure. For example, with Reiserfs, the inodes are stored in a B-tree. If the root node, or a relatively high-level node of the B-tree, is lost, the only way to recover all of the inodes is by looking at each block and trying to determine if it "looks" like part of the filesystem B-tree or not. This is what the reiserfs fsck program will do if the filesystem is sufficiently damaged.

Unfortunately, this means that if you store a reiserfs filesystem image (for example, for use by vmware, or qemu, or kvm, or xen) in a reiserfs filesystem, and the filesystem gets damaged, the recovery procedure will take every single block that looks like it could have been part of a Reiserfs B-tree, and stitch them together into a new B-tree. The result, if you have Reiserfs filesystem images, is that those blocks will get treated as if they were part of the containing filesystem, and the result is not pretty.

These problems can be solved (although they were not for Reiserfs), but it means a lot more complexity.

- Ted

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
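To illustrate the "you can always reliably find it" point: with a static table, the position of an inode follows from plain arithmetic on the inode number. The sketch below assumes an ext2-style layout in which inode numbers start at 1 and every block group holds the same number of inodes. The function name and the example geometry are hypothetical; on a real filesystem the values come from the superblock, and the starting block of each group's inode table comes from the block group descriptors.

```python
def locate_inode(ino, inodes_per_group, inode_size, block_size):
    """Locate an inode in an ext2-style static inode table.

    Returns (group, block_in_table, byte_offset):
      group          - block group whose inode table holds the inode
      block_in_table - block index relative to the start of that table
      byte_offset    - offset of the inode inside that block
    """
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    index_in_fs = ino - 1                       # zero-based index over the whole filesystem
    group = index_in_fs // inodes_per_group     # which block group
    index_in_group = index_in_fs % inodes_per_group
    byte_in_table = index_in_group * inode_size
    return group, byte_in_table // block_size, byte_in_table % block_size


# Example geometry (assumed values, not read from a real superblock).
print(locate_inode(34, inodes_per_group=8192, inode_size=128, block_size=4096))
```

No tree has to be walked and no block has to be "recognized": as long as the superblock and group descriptors (or one of their backup copies) are readable, every inode slot can be found this way.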
dynamic inode allocation
On Mon, Sep 1, 2008 at 2:37 PM, Theodore Tso <tytso@mit.edu> wrote:
> - Ted

Ted,

Thanks for the explanation and for dumbing it down for me :-)

So, if a ReiserFS filesystem is damaged, it naturally runs an fsck. Does the fsck basically recreate the B-tree by scanning from block 1 to the end of the filesystem?
dynamic inode allocation
On Mon, Sep 01, 2008 at 04:29:06PM -0400, Mag Gam wrote:
> So, if a ReiserFS filesystem is damaged, it naturally runs an fsck.
> Does the fsck basically recreate the B-tree by scanning from block 1 to
> the end of the filesystem?

If the filesystem is sufficiently damaged such that portions of the B-tree can't be found, then yes. Otherwise, the data would be totally lost. As you can imagine, scanning every single block on the disk to see if it looks like filesystem metadata is quite slow, so naturally the reiserfs fsck will avoid doing it if at all possible. But if the root or top-level nodes of the B-tree are damaged, it doesn't have much choice.

- Ted
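A toy model of why the full scan is the slow path, and why it is still the only option once the root is gone. This is only a sketch under invented assumptions: a "disk" is a Python list, a tree node is a list of child block numbers, and anything else is None. It is not the ReiserFS on-disk format, and the function names are hypothetical.

```python
def walk_from_root(blocks, root):
    """Follow child pointers from the root; return how many blocks were read."""
    reads = 0
    stack = [root]
    while stack:
        n = stack.pop()
        reads += 1                      # one read per reachable tree node
        stack.extend(blocks[n] or [])   # leaf nodes have no children
    return reads


def full_scan(blocks):
    """Read every block on the disk; return (reads, blocks that look like tree nodes)."""
    looks_like_node = [n for n, b in enumerate(blocks) if b is not None]
    return len(blocks), looks_like_node


# A 1000-block disk holding a three-node tree: root 0 with leaves 1 and 2.
disk = [[1, 2], [], []] + [None] * 997
print(walk_from_root(disk, 0))   # touches only the tree nodes
print(full_scan(disk)[0])        # touches every block

# If the root is lost, nothing points at the leaves any more,
# so only the full scan can still find them.
disk[0] = None
print(full_scan(disk)[1])
```

The walk costs in proportion to the size of the tree, while the scan costs in proportion to the size of the disk, which is the trade-off described above.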
dynamic inode allocation
On Mon, Sep 1, 2008 at 4:39 PM, Theodore Tso <tytso@mit.edu> wrote:
> On Mon, Sep 01, 2008 at 04:29:06PM -0400, Mag Gam wrote:
>>
>> So, if a ReiserFS filesystem is damaged, it naturally runs an fsck.
>> Does the fsck basically recreate the B-tree by scanning from block 1 to
>> the end of the filesystem?
>
> If the filesystem is sufficiently damaged such that portions of the
> B-tree can't be found, then yes. Otherwise, the data would be totally
> lost. As you can imagine, scanning every single block on the disk to
> see if it looks like filesystem metadata is quite slow, so naturally
> the reiserfs fsck will avoid doing it if at all possible. But if
> the root or top-level nodes of the B-tree are damaged, it doesn't have
> much choice.
>
> - Ted

But if that's the last and worst-case scenario, why don't they do the full scan? Sure, it's going to take a long time if it's a big filesystem (there should be no changes, since it would be unmounted), but it's better than not having any data at all...
dynamic inode allocation
On Mon, Sep 01, 2008 at 05:16:01PM -0400, Mag Gam wrote:
> > If the filesystem is sufficiently damaged such that portions of the
> > B-tree can't be found, then yes. Otherwise, the data would be totally
> > lost. As you can imagine, scanning every single block on the disk to
> > see if it looks like filesystem metadata is quite slow, so naturally
> > the reiserfs fsck will avoid doing it if at all possible. But if
> > the root or top-level nodes of the B-tree are damaged, it doesn't have
> > much choice.
>
> But if that's the last and worst-case scenario, why don't they do the
> full scan? Sure, it's going to take a long time if it's a big filesystem
> (there should be no changes, since it would be unmounted), but it's
> better than not having any data at all...

As I said, in the worst case, it will do a full scan. But (a) it takes a long time, and (b) if the filesystem has any files that contain images of reiserfs filesystems, it will be totally scrambled. So it makes sense that the reiserfs fsck would try to avoid this if it can (i.e., if the B-tree is only mildly corrupted).

With that said, this is really going out of scope for this mailing list. And I am not an expert on reiserfs's filesystem checker, although I have had people confirm to me that indeed, you can lose really big if your reiserfs filesystem contains files that are images of other reiserfs filesystems, for things like virtualization. This problem is apparently solved in reiser4; it is NOT solved in reiserfs (i.e., version 3). As far as I am concerned, that's ample reason not to use reiserfs, but obviously I'm biased. :-)

- Ted
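Point (b) can be shown with a toy example. It assumes a made-up format in which a tree node is any 16-byte block that starts with the marker b"NODE"; the marker, block size, and helper names are all hypothetical and have nothing to do with the real ReiserFS layout. What the sketch shows is only the general hazard: a scan that judges blocks by appearance cannot tell the containing filesystem's nodes from nodes inside a stored image.

```python
BLOCK_SIZE = 16
MAGIC = b"NODE"  # hypothetical tree-node marker, not a real on-disk signature


def make_block(payload, tree_node):
    """Build one fixed-size block; tree nodes start with MAGIC."""
    prefix = MAGIC if tree_node else b""
    return (prefix + payload).ljust(BLOCK_SIZE, b"\0")


def rebuild_candidates(disk):
    """Return the numbers of all blocks that look like tree nodes."""
    blocks = [disk[i:i + BLOCK_SIZE] for i in range(0, len(disk), BLOCK_SIZE)]
    return [n for n, block in enumerate(blocks) if block.startswith(MAGIC)]


# An image of another filesystem in the same format, stored as an ordinary file.
inner_image = make_block(b"inner-root", True) + make_block(b"inner-data", False)

# The containing filesystem: one real tree node, one data block, then the image file.
outer_disk = make_block(b"outer-root", True) + make_block(b"plain-data", False) + inner_image

genuine = [0]  # the only block that truly belongs to the outer tree
candidates = rebuild_candidates(outer_disk)
false_positives = [n for n in candidates if n not in genuine]
print(candidates, false_positives)
```

Block 2 is file data as far as the outer filesystem is concerned, yet the scan reports it as a node, so a rebuild based on this scan would graft the image's metadata into the outer tree.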
dynamic inode allocation
Thanks!
This has cured my curiosity (for now...)

On Mon, Sep 1, 2008 at 5:23 PM, Theodore Tso <tytso@mit.edu> wrote:
> As I said, in the worst case, it will do a full scan. But (a) it
> takes a long time, and (b) if the filesystem has any files that
> contain images of reiserfs filesystems, it will be totally scrambled.
> So it makes sense that the reiserfs fsck would try to avoid this if it
> can (i.e., if the B-tree is only mildly corrupted).
>
> With that said, this is really going out of scope for this mailing
> list. And I am not an expert on reiserfs's filesystem checker,
> although I have had people confirm to me that indeed, you can lose
> really big if your reiserfs filesystem contains files that are
> images of other reiserfs filesystems, for things like virtualization.
> This problem is apparently solved in reiser4; it is NOT solved in
> reiserfs (i.e., version 3). As far as I am concerned, that's ample
> reason not to use reiserfs, but obviously I'm biased. :-)
>
> - Ted