02-16-2012, 12:46 PM
Jan Kara

Push file_update_time() into .page_mkwrite

Hello,

to provide reliable support for filesystem freezing, filesystems need to have
complete control over when metadata is changed. In particular,
file_update_time() calls from page fault code make it impossible for
filesystems to prevent inodes from being dirtied while the filesystem is
frozen.

To fix the issue, this patch set changes the page fault code to call
file_update_time() only when no ->page_mkwrite() callback is provided. If the
callback is provided, it is the responsibility of the filesystem to update
i_mtime / i_ctime if needed. We also push the file_update_time() call into
all existing ->page_mkwrite() implementations where the time update does not
obviously happen by other means. If you know your filesystem does not need
to update modification times in its ->page_mkwrite() handler, please speak up
and I'll drop the patch for your filesystem.
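
For readers less familiar with the fault path, the dispatch rule described
above can be modelled in plain userspace C. This is only a sketch, not kernel
code: the struct layouts, handle_write_fault(), example_page_mkwrite() and the
fs_frozen flag are hypothetical stand-ins, and file_update_time() here is a
simple counter rather than the real kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; fields are heavily simplified. */
struct file {
    int mtime_updates;              /* counts calls to file_update_time() */
};

struct vm_area;

struct vm_ops {
    /* NULL when the filesystem provides no ->page_mkwrite() callback */
    int (*page_mkwrite)(struct vm_area *vma);
};

struct vm_area {
    struct file *file;
    const struct vm_ops *ops;
};

/* Stand-in for the kernel's file_update_time(): just count the call. */
static void file_update_time(struct file *f)
{
    f->mtime_updates++;
}

/* Model of the write-fault path as the cover letter describes it. */
static int handle_write_fault(struct vm_area *vma)
{
    assert(vma != NULL && vma->file != NULL);

    /* Callback present: the filesystem owns the timestamp update. */
    if (vma->ops && vma->ops->page_mkwrite)
        return vma->ops->page_mkwrite(vma);

    /* No callback: the generic code updates the times itself. */
    file_update_time(vma->file);
    return 0;
}

/* Hypothetical freeze state of the example filesystem. */
static bool fs_frozen;

/* Hypothetical filesystem callback that never dirties a frozen inode. */
static int example_page_mkwrite(struct vm_area *vma)
{
    if (fs_frozen)
        return -1;  /* simplification: a real filesystem would wait for thaw */
    file_update_time(vma->file);
    return 0;
}
```

With no callback the generic path bumps the timestamp; with a callback, a
frozen filesystem can leave the inode untouched, which is the control the
patch set is after.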

As a side note, an alternative would be to remove the call to file_update_time()
from the page fault code altogether and require all filesystems needing it to
make the call in their ->page_mkwrite() implementation. That is certainly
possible, although maybe slightly inefficient, and it would require auditing
100+ vm_operations_structs *shake*.

Andrew, if I get acks on these patches, would you be willing to take them?

Honza

CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Ingo Molnar <mingo@elte.hu>
CC: Paul Mackerras <paulus@samba.org>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Jaya Kumar <jayalk@intworks.biz>
CC: Sage Weil <sage@newdream.net>
CC: ceph-devel@vger.kernel.org
CC: Steve French <sfrench@samba.org>
CC: linux-cifs@vger.kernel.org
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Ron Minnich <rminnich@sandia.gov>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: v9fs-developer@lists.sourceforge.net
CC: Miklos Szeredi <miklos@szeredi.hu>
CC: fuse-devel@lists.sourceforge.net
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: linux-nfs@vger.kernel.org
 
03-01-2012, 10:41 AM
Jan Kara

Push file_update_time() into .page_mkwrite

Hello,

to provide reliable support for filesystem freezing, filesystems need to have
complete control over when metadata is changed. In particular,
file_update_time() calls from page fault code make it impossible for
filesystems to prevent inodes from being dirtied while the filesystem is
frozen.

To fix the issue, this patch set changes the page fault code to call
file_update_time() only when no ->page_mkwrite() callback is provided. If the
callback is provided, it is the responsibility of the filesystem to update
i_mtime / i_ctime if needed. We also push the file_update_time() call into
all existing ->page_mkwrite() implementations where the time update does not
obviously happen by other means. If you know your filesystem does not need
to update modification times in its ->page_mkwrite() handler, please speak up
and I'll drop the patch for your filesystem.

As a side note, an alternative would be to remove the call to file_update_time()
from the page fault code altogether and require all filesystems needing it to
make the call in their ->page_mkwrite() implementation. That is certainly
possible, although maybe slightly inefficient, and it would require auditing
100+ vm_operations_structs *shiver*.

Changes since v1:
* Dropped patches for filesystems which don't need them
* Added some acks
* Improved the sysfs patch per Alex Elder's suggestion

Andrew, would you be willing to merge these patches via your tree?

Honza

CC: Jaya Kumar <jayalk@intworks.biz>
CC: Sage Weil <sage@newdream.net>
CC: ceph-devel@vger.kernel.org
CC: Steve French <sfrench@samba.org>
CC: linux-cifs@vger.kernel.org
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Ron Minnich <rminnich@sandia.gov>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: v9fs-developer@lists.sourceforge.net
CC: Miklos Szeredi <miklos@szeredi.hu>
CC: fuse-devel@lists.sourceforge.net
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 
03-01-2012, 11:23 AM
Jan Kara

Push file_update_time() into .page_mkwrite

Bah, the subject should have been 0/9... Sorry.

Honza
On Thu 01-03-12 12:41:34, Jan Kara wrote:
> Hello,
>
> to provide reliable support for filesystem freezing, filesystems need to have
> complete control over when metadata is changed. In particular,
> file_update_time() calls from page fault code make it impossible for
> filesystems to prevent inodes from being dirtied while the filesystem is
> frozen.
>
> To fix the issue, this patch set changes page fault code to call
> file_update_time() only when ->page_mkwrite() callback is not provided. If the
> callback is provided, it is the responsibility of the filesystem to perform
> update of i_mtime / i_ctime if needed. We also push file_update_time() call
> to all existing ->page_mkwrite() implementations if the time update does not
> obviously happen by other means. If you know your filesystem does not need
> update of modification times in ->page_mkwrite() handler, please speak up and
> I'll drop the patch for your filesystem.
>
> As a side note, an alternative would be to remove call of file_update_time()
> from page fault code altogether and require all filesystems needing it to do
> that in their ->page_mkwrite() implementation. That is certainly possible
> although maybe slightly inefficient and would require auditing 100+
> vm_operations_structs *shiver*.
>
> Changes since v1:
> * Dropped patches for filesystems which don't need them
> * Added some acks
> * Improved sysfs patch by Alex Elder's suggestion
>
> Andrew, would you be willing to merge these patches via your tree?
>
> Honza
>
> CC: Jaya Kumar <jayalk@intworks.biz>
> CC: Sage Weil <sage@newdream.net>
> CC: ceph-devel@vger.kernel.org
> CC: Steve French <sfrench@samba.org>
> CC: linux-cifs@vger.kernel.org
> CC: Eric Van Hensbergen <ericvh@gmail.com>
> CC: Ron Minnich <rminnich@sandia.gov>
> CC: Latchesar Ionkov <lucho@ionkov.net>
> CC: v9fs-developer@lists.sourceforge.net
> CC: Miklos Szeredi <miklos@szeredi.hu>
> CC: fuse-devel@lists.sourceforge.net
> CC: Steven Whitehouse <swhiteho@redhat.com>
> CC: cluster-devel@redhat.com
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
 
03-02-2012, 08:41 AM
Jan Kara

Push file_update_time() into .page_mkwrite

On Thu 01-03-12 18:29:42, Ted Tso wrote:
> On Thu, Mar 01, 2012 at 12:41:34PM +0100, Jan Kara wrote:
> >
> > To fix the issue, this patch set changes page fault code to call
> > file_update_time() only when ->page_mkwrite() callback is not provided. If the
> > callback is provided, it is the responsibility of the filesystem to perform
> > update of i_mtime / i_ctime if needed. We also push file_update_time() call
> > to all existing ->page_mkwrite() implementations if the time update does not
> > obviously happen by other means. If you know your filesystem does not need
> > update of modification times in ->page_mkwrite() handler, please speak up and
> > I'll drop the patch for your filesystem.
>
> I don't know if this introductory text is going to be saved anywhere
> permanent, such as the merge commit (since git now has the ability to
> have much more informative merge descriptions). But if it is going to
> be preserved, it might be worth mentioning that if the filesystem uses
> block_page_mkwrite(), it will be handled automatically for them since the
> patch series does push the call to file_update_time() into
> __block_page_mkwrite().
Good point, added to description.

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
 
03-05-2012, 01:54 PM
Jan Kara

Push file_update_time() into .page_mkwrite

Hello,

to provide reliable support for filesystem freezing, filesystems need to have
complete control over when metadata is changed. In particular,
file_update_time() calls from page fault code make it impossible for
filesystems to prevent inodes from being dirtied while the filesystem is
frozen.

To fix the issue, this patch set changes the page fault code to call
file_update_time() only when no ->page_mkwrite() callback is provided. If the
callback is provided, it is the responsibility of the filesystem to update
i_mtime / i_ctime if needed. We also push the file_update_time() call into
all existing ->page_mkwrite() implementations where the time update does not
obviously happen by other means. This includes __block_page_mkwrite(), so
filesystems using it are handled. If you know your filesystem does not need
to update modification times in its ->page_mkwrite() handler, please speak up
and I'll drop the patch for your filesystem.

As a side note, an alternative would be to remove the calls to file_update_time()
from the page fault code altogether and require all filesystems needing it to
make the call in their ->page_mkwrite() implementation. That is certainly
possible, although maybe slightly inefficient, and it would require auditing
100+ vm_operations_structs *shiver*.

Changes since v1:
* Dropped patches for filesystems which don't need them
* Added some acks
* Improved the sysfs patch per Alex Elder's suggestion

Changes since v2:
* Dropped patches for more filesystems

Andrew, would you be willing to merge these patches via your tree? This seems
to be a final version.

Honza

CC: Jaya Kumar <jayalk@intworks.biz>
CC: Sage Weil <sage@newdream.net>
CC: ceph-devel@vger.kernel.org
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Ron Minnich <rminnich@sandia.gov>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: v9fs-developer@lists.sourceforge.net
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 
03-09-2012, 07:19 AM
Jan Kara

Push file_update_time() into .page_mkwrite

Hello,

On Thu 08-03-12 15:12:56, Andy Lutomirski wrote:
> On 03/01/2012 03:41 AM, Jan Kara wrote:
> > Hello,
> >
> > to provide reliable support for filesystem freezing, filesystems need to have
> > complete control over when metadata is changed. In particular,
> > file_update_time() calls from page fault code make it impossible for
> > filesystems to prevent inodes from being dirtied while the filesystem is
> > frozen.
> >
> > To fix the issue, this patch set changes page fault code to call
> > file_update_time() only when ->page_mkwrite() callback is not provided. If the
> > callback is provided, it is the responsibility of the filesystem to perform
> > update of i_mtime / i_ctime if needed. We also push file_update_time() call
> > to all existing ->page_mkwrite() implementations if the time update does not
> > obviously happen by other means. If you know your filesystem does not need
> > update of modification times in ->page_mkwrite() handler, please speak up and
> > I'll drop the patch for your filesystem.
> >
> > As a side note, an alternative would be to remove call of file_update_time()
> > from page fault code altogether and require all filesystems needing it to do
> > that in their ->page_mkwrite() implementation. That is certainly possible
> > although maybe slightly inefficient and would require auditing 100+
> > vm_operations_structs *shiver*.
>
>
>
> IMO updating file times should happen when changes get written out, not
> when a page is made writable, for two reasons:
>
> 1. Correctness. With the current approach, it's very easy for files to
> be changed after the last mtime update -- any changes between mkwrite
> and actual writeback won't affect mtime.
>
> 2. Performance. I have an application (presumably guessable from my
> email address) for which blocking in page_mkwrite is an absolute
> show-stopper. (In fact it's so bad that we reverted back to running on
> Windows until I hacked up a kernel to not do this.) I have an incorrect
> patch [1] to fix it, but I haven't gotten around to a real fix. (I also
> have stable pages reverted in my kernel. Some day I'll submit a patch
> to make it a filesystem option. Or maybe it should even be a block
> device / queue property like the alignment offset and optimal io size --
> there are plenty of block device and file combinations which don't
> benefit at all from stable pages.)
>
> I'd prefer if file_update_time in page_mkwrite didn't proliferate. A
> better fix is probably to introduce a new inode flag, update it when a
> page is undirtied, and then dirty and write the inode from the writeback
> path. (Kind of like my patch, but with an inode flag instead of a page
> flag, and with the file_update_time done from the fs.)
>
> [1] http://patchwork.ozlabs.org/patch/122516/
Andy, I'm aware of your problems. Firstly, though, I would rather not
complicate the filesystem freezing patch set even more by improving unrelated
things. Secondly, I think these changes won't make fixing your problem
harder. I'd even argue they will make it easier, because you can do the
conversion filesystem by filesystem. Getting lock ordering and other things
right for all filesystems at once is much harder.

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
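
Andy's suggestion of deferring the timestamp update to writeback can be
illustrated with a small userspace model. It is a sketch under assumptions,
not his patch and not kernel code: struct inode, the cmtime_pending flag and
the model_* helpers are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical inode; not the kernel's struct inode. */
struct inode {
    bool cmtime_pending;  /* assumed flag: a time update is owed */
    long mtime;           /* simplified timestamp */
    int dirty_pages;      /* pages written through a mapping */
};

/* Write fault: only account for the dirty page, no timestamp work. */
static void model_page_mkwrite(struct inode *inode)
{
    assert(inode != NULL);
    inode->dirty_pages++;
}

/* Writeback cleans one page and records that a time update is owed. */
static void model_clear_page_dirty(struct inode *inode)
{
    assert(inode != NULL);
    if (inode->dirty_pages > 0) {
        inode->dirty_pages--;
        inode->cmtime_pending = true;
    }
}

/*
 * Later in the writeback path: apply the pending update, if any.
 * Returns true when the inode times were changed.
 */
static bool model_writeback_inode(struct inode *inode, long now)
{
    assert(inode != NULL);
    if (!inode->cmtime_pending)
        return false;
    inode->cmtime_pending = false;
    inode->mtime = now;
    return true;
}
```

In this model the timestamp reflects the moment the data is written out, so
changes made between the fault and writeback are still covered, and the fault
itself does no timestamp work.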
 
03-12-2012, 07:22 AM
Jan Kara

Push file_update_time() into .page_mkwrite

On Sun 11-03-12 13:23:17, Kamal Mostafa wrote:
> On Mon, 2012-03-05 at 15:54 +0100, Jan Kara wrote:
> > Hello,
> >
> > to provide reliable support for filesystem freezing, filesystems need to have
> > complete control over when metadata is changed. [...]
>
> This patch set has been tested at Canonical along with the testing for
> "[PATCH 00/19] Fix filesystem freezing deadlocks".
>
> Please add the following endorsements for these patches (those actually
> exercised by our test case): 1, 2, 6, 7
>
> Tested-by: Kamal Mostafa <kamal@canonical.com>
> Tested-by: Peter M. Petrakis <peter.petrakis@canonical.com>
> Tested-by: Dann Frazier <dann.frazier@canonical.com>
> Tested-by: Massimo Morana <massimo.morana@canonical.com>
Thanks for testing, guys!

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
 
