05-08-2012, 06:42 PM
Andrew Gray

Please stop apps going into state D uninterrupted sleep !!

Hi

Either give us a way to kill a hung cp or rsync when the VPN goes down and they end up in state D uninterruptible sleep, or stop apps being able to go into uninterruptible sleep !!

It is unacceptable for a Linux system to have to be CRASH rebooted because the mounted CIFS mount can't be unmounted while it is in use by cp or rsync in state D uninterruptible sleep !!

There should be NO uninterruptible sleep !!

--

Andrew Gray <andrewg@linnetsol.co.uk>

Linnet Solutions Ltd





--
users mailing list
users@lists.fedoraproject.org
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org
 
05-08-2012, 07:39 PM
Patrick O'Callaghan

On Tue, 2012-05-08 at 19:42 +0100, Andrew Gray wrote:
> Hi
>
> Either give us a way to kill a hung cp or rsync when the VPN goes down
> and they end up in state D uninterruptible sleep, or stop apps being able
> to go into uninterruptible sleep !!

It is *not possible* to kill a process in D state. D state can be
defined as "the state which cannot be interrupted". It originally
applied to fast operations which were guaranteed to succeed, e.g. a DMA
transfer. If the DMA interrupt didn't happen, you had problems much more
serious than a hanging app, and there was no reasonable way for the
system to recover automatically without user intervention.

With the introduction of networked filesystems, the desire for
transparency at the app level disguised the fact that networks actually
do fail from time to time, and not in nice ways. I think it was Lamport
who said that you can't tell 'down' from 'disconnected' (did the remote
server crash, or is there a network disconnection? maybe the network is
just congested, there's no way to tell).

Apps which access resources in the real world, including networked
devices, can be written to allow for these suddenly disappearing, or
just not bother. In the former case, the app becomes very much more
complex without ever completely solving the problem, just reducing the
probability of it happening. Note that "the app" doesn't just mean cp or
rsync, it means anything which accesses the filesystem, which can mean
virtually any program under Linux or similar systems. Making resource
failures completely transparent is a seriously hard problem, and doing
it in such a way that the applications programmer never needs to worry
about it is probably unsolvable, given that the right thing to do in each
circumstance depends on the semantics that the app is trying to
preserve.

Take a look at the literature on fault-tolerant computing to see how
complex and expensive it is to even approximate this level of
reliability. General purpose systems such as Linux take the view that
transparency and a clean file access model are easier for programmers to
deal with, and in any case many such problems are better resolved by
direct user intervention. If that means a reboot, then so be it. I don't
like it either, but there it is.

> It is unacceptable for a Linux system to have to be CRASH rebooted because
> the mounted CIFS mount can't be unmounted while it is in use by cp or rsync
> in state D uninterruptible sleep !!
>
> There should be NO uninterruptible sleep !!

I agree. Also, it shouldn't rain on public holidays.

poc

 
05-08-2012, 07:42 PM
Tom Horsley

On Tue, 08 May 2012 15:09:31 -0430
Patrick O'Callaghan wrote:

> I think it was Lamport
> who said that you can't tell 'down' from 'disconnected' (did the remote
> server crash, or is there a network disconnection? maybe the network is
> just congested, there's no way to tell).

I can tell: If I've been getting responses in under 20 ms and now I
haven't gotten a response in 2 minutes, then the dadgum thing
is down :-).
 
05-08-2012, 08:16 PM
Joe Zeff

On 05/08/2012 12:39 PM, Patrick O'Callaghan wrote:

> On Tue, 2012-05-08 at 19:42 +0100, Andrew Gray wrote:
> > Hi
> >
> > Either give us a way to kill a hung cp or rsync when the VPN goes down
> > and they end up in state D uninterruptible sleep, or stop apps being able
> > to go into uninterruptible sleep !!
>
> It is *not possible* to kill a process in D state. D state can be
> defined as "the state which cannot be interrupted".


I think it's fairly clear that Mr. O'Callaghan knows that. He's
complaining about the consequences of there being an uninterruptible
sleep. If I read him right, he's saying that it should always be
possible for the user to force a hung app to die when it's clear to the
user that something has happened that makes it impossible for the app to
continue, such as rsync completing when the remote server's known to
have crashed. At this point, probably the best way to proceed is to
request that whoever maintains the programs in question modify them so
that they don't enter this state when accessing a remote file system or
that there's some way to get the app's attention and force it to abort.
On the surface, at least, the request sounds reasonable, although I'll
be the first to admit that things like this are often much more
difficult than they sound.

 
05-08-2012, 08:32 PM
Patrick O'Callaghan

On Tue, 2012-05-08 at 13:16 -0700, Joe Zeff wrote:
> On 05/08/2012 12:39 PM, Patrick O'Callaghan wrote:
> > On Tue, 2012-05-08 at 19:42 +0100, Andrew Gray wrote:
> > > Hi
> > >
> > > Either give us a way to kill a hung cp or rsync when the VPN goes down
> > > and they end up in state D uninterruptible sleep, or stop apps being able
> > > to go into uninterruptible sleep !!
> > It is *not possible* to kill a process in D state. D state can be
> > defined as "the state which cannot be interrupted".
>
> I think it's fairly clear that Mr. O'Callaghan knows that.

I think you mean Mr. Gray.

> He's
> complaining about the consequences of there being an uninterruptible
> sleep. If I read him right, he's saying that it should always be
> possible for the user to force a hung app to die when it's clear to the
> user that something has happened that makes it impossible for the app to
> continue, such as rsync completing when the remote server's known to
> have crashed. At this point, probably the best way to proceed is to
> request that whoever maintains the programs in question modify them so
> that they don't enter this state when accessing a remote file system or
> that there's some way to get the app's attention and force it to abort.
> On the surface, at least, the request sounds reasonable, although I'll
> be the first to admit that things like this are often much more
> difficult than they sound.

As I tried to explain, rewriting a couple of apps is not going to hack
it. The apps don't *know* they're using a networked filesystem, they're
just accessing files. They could find out and try to take measures, but
then what about all the other apps that also write files? Rewrite tar,
cpio, dd, cat, ...?

The price of treating a networked fs as equivalent to a local one is
that you get screwed when it doesn't behave like a local one. Dealing
with this in a coherent and consistent way is hard. See the literature
on distributed filesystems. The semantics of an NFS system are *not* the
same as a local system. We brush this under the carpet most of the time
because it usually works, but sometimes the differences bite.
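
To make "they could find out" concrete, here is a minimal sketch of how a program might look up the filesystem type behind a path. It assumes the text it is given is in the format of Linux's /proc/mounts, and that the path is absolute and already normalized. The function name `fs_type_for` and the `NETWORK_FS` set are hypothetical and the set is not exhaustive. It also ignores the octal escapes /proc/mounts uses for spaces in mount points.

```python
# Illustrative, non-exhaustive set of filesystem types treated as "networked".
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3", "fuse.sshfs"}

def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point that contains path."""
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # not a "device mountpoint fstype ..." line
        mount_point, fs_type = parts[1], parts[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # ">=" lets a later mount stacked on the same point win.
        if inside and len(mount_point) >= best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type
```

A caller on Linux could pass in the contents of /proc/mounts and test whether the result is in `NETWORK_FS`. Knowing the answer is the easy part; deciding what to do when the server stops responding is still up to each program, which is the point being made above.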

poc

 
05-08-2012, 08:38 PM
Patrick O'Callaghan

On Tue, 2012-05-08 at 15:42 -0400, Tom Horsley wrote:
> On Tue, 08 May 2012 15:09:31 -0430
> Patrick O'Callaghan wrote:
>
> > I think it was Lamport
> > who said that you can't tell 'down' from 'disconnected' (did the remote
> > server crash, or is there a network disconnection? maybe the network is
> > just congested, there's no way to tell).
>
> I can tell: If I've been getting responses in under 20 ms and now I
> haven't gotten a response in 2 minutes, then the dadgum thing
> is down :-).

Or some router along the way is down, or congested. You simply don't
know. Often you don't care, but sometimes you do because it makes a real
difference (e.g. distributed database transactions) so you have to
program defensively. However reliability is not cheap and is never
perfect.
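
The "no reply in 2 minutes" rule from the quoted message is a timeout-based failure detector. A minimal sketch of one, with a hypothetical class name and time passed in explicitly so it is easy to test, shows why the most it can honestly report is "suspected":

```python
from dataclasses import dataclass

@dataclass
class TimeoutDetector:
    """Suspect a peer once no reply has been seen for timeout_s seconds."""
    timeout_s: float
    last_reply_at: float = 0.0

    def record_reply(self, now: float) -> None:
        # Call whenever any reply from the peer arrives.
        self.last_reply_at = now

    def status(self, now: float) -> str:
        if now - self.last_reply_at <= self.timeout_s:
            return "alive"
        # A crashed server, a dead router and plain congestion all
        # look the same from here: silence for longer than timeout_s.
        return "suspected"
```

With `timeout_s=120.0`, a peer that last replied two minutes and one second ago is "suspected", and it goes back to "alive" the moment a late reply arrives. Whether a suspicion is good enough to act on depends on what the program is trying to preserve.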

poc

 
05-08-2012, 08:41 PM
Alan Cox

On Tue, 08 May 2012 19:42:46 +0100
Andrew Gray <andrewg@linnetsol.co.uk> wrote:

> Hi
>
> Either give us a way to kill a hung cp or rsync when the VPN goes down
> and they end up in state D uninterruptible sleep, or stop apps being able
> to go into uninterruptible sleep !!
>
> It is unacceptable for a Linux system to have to be CRASH rebooted because
> the mounted CIFS mount can't be unmounted while it is in use by cp or rsync
> in state D uninterruptible sleep !!

Can we have fewer exclamation marks and more explanation? What VPN are you
using, in what circumstances does it occur, what does the longer ps data
show it blocked on, and are you using NFS or similar over your VPN?
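
For anyone wanting to gather that kind of detail, here is a rough sketch that lists tasks currently in state D together with the kernel wait channel they are blocked in. It assumes a Linux /proc layout with per-process `status` and `wchan` files. The function name `blocked_tasks` and the `proc_root` parameter are hypothetical; the parameter exists only so the function can be pointed at a test directory.

```python
import os

def blocked_tasks(proc_root="/proc"):
    """Return (pid, name, wchan) for every task whose State is D."""
    found = []
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "status")) as fh:
                fields = dict(line.split(":", 1) for line in fh if ":" in line)
        except OSError:
            continue  # process exited or is not readable
        if not fields.get("State", "").strip().startswith("D"):
            continue
        try:
            with open(os.path.join(base, "wchan")) as fh:
                wchan = fh.read().strip() or "-"
        except OSError:
            wchan = "?"  # wait channel not available to this user
        found.append((int(entry), fields.get("Name", "?").strip(), wchan))
    return found
```

Printing the result of `blocked_tasks()` while the copy is hung gives the process name and the kernel function it is waiting in, which is usually enough to tell whether the block is in the network filesystem code or somewhere else.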

Alan
 
05-08-2012, 09:20 PM
Joe Zeff

On 05/08/2012 01:32 PM, Patrick O'Callaghan wrote:

> As I tried to explain, rewriting a couple of apps is not going to hack
> it. The apps don't *know* they're using a networked filesystem, they're
> just accessing files.


I suppose there could be a command line switch to control this. Please
understand that I'm only discussing ways this *might* be fixed, on a
program by program basis. Actually getting it done would require
persuading the maintainers that it's worth the effort unless somebody's
willing to step up, do the work and then get them to accept their patch.

 
05-08-2012, 09:31 PM
Patrick O'Callaghan

On Tue, 2012-05-08 at 14:20 -0700, Joe Zeff wrote:
> On 05/08/2012 01:32 PM, Patrick O'Callaghan wrote:
> > As I tried to explain, rewriting a couple of apps is not going to hack
> > it. The apps don't *know* they're using a networked filesystem, they're
> > just accessing files.
>
> I suppose there could be a command line switch to control this. Please
> understand that I'm only discussing ways this *might* be fixed, on a
> program by program basis. Actually getting it done would require
> persuading the maintainers that it's worth the effort unless somebody's
> willing to step up, do the work and then get them to accept their patch.

I think you're underestimating the amount of extra effort this would
require for each program. It's not a simple patch by any means (even
assuming it's possible). Much more like a fork in fact. And as a general
rule, adding reliability code to any program makes it larger, slower
(because of extra network operations and checks) and a lot harder to
maintain. If this stuff was easy, it would be packaged and accessible to
the average programmer.

poc

 
05-08-2012, 09:41 PM
Joe Zeff

On 05/08/2012 02:31 PM, Patrick O'Callaghan wrote:

> I think you're underestimating the amount of extra effort this would
> require for each program.


Actually, I was thinking in terms of "if this switch is set, don't use
uninterruptible sleep." And, I'd be very astonished if any of the
maintainers would be willing to do something like that. Mostly, I was
thinking in theoretical terms about what could be done to prevent this
in the highly unlikely case that somebody was willing to do the work.
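
One thing a user-space program can do without any kernel change is keep the blocking I/O in a child process and stop waiting for it after a deadline. The sketch below assumes that approach; the function name `run_with_deadline` and its parameters are hypothetical. It does not un-stick the child or free the mount. It only lets the parent carry on.

```python
import subprocess

def run_with_deadline(argv, deadline_s, grace_s=1.0):
    """Run argv in a child; return its exit code, or None if abandoned."""
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=deadline_s)
    except subprocess.TimeoutExpired:
        # The kill signal stays pending while the child is in an
        # uninterruptible sleep, so do not wait for it indefinitely.
        proc.kill()
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            pass  # still stuck; give up waiting rather than hang with it
        return None
```

A wrapper script could run a copy command through this function with a deadline of a few minutes and report failure when it returns None. If the child really is in state D it keeps its files open, so the filesystem still cannot be unmounted cleanly, and every program that wanted this behaviour would need the same treatment.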

 
