Old 01-02-2008, 06:29 PM
Robert Collins
 
Faster shutdown and the ubuntu "multiuser" update-rc.d extension

On Wed, 2008-01-02 at 09:50 -0800, Russ Allbery wrote:
> Robert Collins <robertc@robertcollins.net> writes:
> > On Wed, 2008-01-02 at 00:29 +0000, Colin Watson wrote:
>
> >> Some packages actually do need to shut down cleanly; in the case of a
> >> database, for example, such a change could cause data loss.
>
> > Surely no more than a hard power failure(*), which databases (even such
> > lightweight ones as sqlite) are designed to handle.
>
> You can still lose data transactions that weren't complete, and you may
> have to go through an extended consistency check when the system comes
> back up.

Meh, two answers to my point that equate 'data loss' with 'incomplete
transactions are not committed'.

Incomplete transactions are *just that* - incomplete. It's why database
clients use transactions and understand that errors (such as the db
going away) mean their changes are not committed.

If a database is asked to shut down and a client has a 10 minute update
running, you still end up with 'incomplete transaction not being
completed' - unless of course you want to block indefinitely on each
service.

Your note about an extended consistency check is valid when a power loss
has occurred but doesn't apply to the proposed fast shutdown as a signal
is in fact sent to the database. (And BTW, if your database needs a huge
consistency check on startup after a hard-down situation - consider
changing database engine!).

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
 
Old 01-03-2008, 12:40 AM
Gabor Gombas
 

On Wed, Jan 02, 2008 at 12:47:12PM -0800, Russ Allbery wrote:

> Right. The only case where a shutdown script makes sense to me is if it's
> doing something other than sending signals or if it's waiting
> (intelligently, not just blindly for five seconds) for the process to shut
> down cleanly.

So the only question is how many scripts _should_ wait but currently get
away without waiting because the shutdown sequence takes so long. IMHO
if a daemon does not write anything to disk except maybe log messages
then it should be fine without a shutdown script, but everything else
should have one.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
 
Old 01-03-2008, 11:11 AM
Gabor Gombas
 

On Thu, Jan 03, 2008 at 02:45:40AM +0000, Colin Watson wrote:

> If this is a real problem for a given service, surely its init script
> should actually wait for the process to shut down cleanly? If so, it
> wouldn't be a candidate for this refactoring.

IMHO there can be many init scripts that currently do not wait for the
process to stop but they should if you want to do this refactoring. Some
random checks:

- samba: there is a sleep in "stop" but that may not be enough if
  there is heavy I/O. An explicit wait for process termination should be
  added.
- bind9: there is a sleep in "restart" but not in "stop". Killing
  named in the middle of say a zone transfer may not be nice, so waiting
  should be added.
- heimdal-kdc: waits in "restart" but not in "stop". Killing the KDC in
  the middle of a database update is not nice, so waiting should be added.
- squid: waits properly.
- acpid: there is a sleep in "restart" but not in "stop" and IMHO it
  does not need one - it can go without an explicit stop.
- chrony: there is a sleep in "restart" but not in "stop". AFAIK chrony
  writes the RTC status file on exit so it must not be killed before
  that's done.
- apache2: there is a (rather long) sleep in "restart" but not in
  "stop". Waiting for process termination should be added.

Also, in the proposed scheme sync() must be called _before_ sending the
final TERM signal, since sync() may take longer than 5 seconds and
heavy I/O may therefore prevent even simple processes from shutting down
properly between the final SIGTERM and SIGKILL.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------


 
Old 01-03-2008, 04:14 PM
Joey Hess
 

Gabor Gombas wrote:
> IMHO there can be many init scripts that currently do not wait for the
> process to stop but they should if you want to do this refactoring. Some
> random checks:

If a package only shuts down cleanly because the rest of the shutdown
process is slow, it is already buggy. Especially on systems where the
shutdown is much faster, either due to there being fewer shutdown
scripts than usual or the CPU being a lot faster than usual, or due to
its shutdown script being reordered to run later than usual.

> - acpid: there is a sleep in "restart" but not in "stop" and IMHO it
> does not need one - it can go without an explicit stop.
> - apache2: there is a (rather long) sleep in "restart" but not in
> "stop". Waiting for process termination should be added.

It's fairly common to add a sleep in restart to (try to) deal with
issues such as reopening a socket.

--
see shy jo
 
Old 01-03-2008, 05:40 PM
Steve Langasek
 

On Thu, Jan 03, 2008 at 01:11:57PM +0100, Gabor Gombas wrote:
> On Thu, Jan 03, 2008 at 02:45:40AM +0000, Colin Watson wrote:

> > If this is a real problem for a given service, surely its init script
> > should actually wait for the process to shut down cleanly? If so, it
> > wouldn't be a candidate for this refactoring.

> IMHO there can be many init scripts that currently do not wait for the
> process to stop but they should if you want to do this refactoring. Some
> random checks:

> - samba: there is a sleep in "stop" but that may not be enough if
> there is heavy I/O. An explicit wait for process termination should be
> added.

The sleep is only there to try to clean up a pid file that the daemon fails
to take care of on its own. Put /var/run on a tmpfs and it's a non-issue on
shutdown.

It's also, as commented already in the init script, recognized as a bug in
the associated daemon. Fixing that bug would drop the need for the sleep,
though if there's a possibility of SIGKILL coming before the daemon is done
shutting down then you still don't have a guaranteed cleanup, and there's no
good "wait for process termination" facility that we can use from init
scripts.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@ubuntu.com                                     vorlon@debian.org


 
Old 01-03-2008, 05:47 PM
Christian Perrier
 

Quoting Steve Langasek (vorlon@debian.org):

> It's also, as commented already in the init script, recognized as a bug in
                                                      ^^^^^^^^^^^^^^^^^^^^^^
> the associated daemon. Fixing that bug would drop the need for the sleep,
  ^^^^^^^^^^^^^^^^^^^^^^

Hmmm, have we reported this upstream?
 
Old 01-03-2008, 07:59 PM
Gabor Gombas
 

On Thu, Jan 03, 2008 at 12:14:15PM -0500, Joey Hess wrote:

> If a package only shuts down cleanly because the rest of the shutdown
> process is slow, it is already buggy. Especially on systems where the
> shutdown is much faster, either due to there being fewer shutdown
> scripts than usual or the CPU being a lot faster than usual, or due to
> its shutdown script being reordered to run later than usual.

That's what I wanted to say. Such bugs are really hard to trigger, and if
something gets corrupted during, say, a reboot then most people will blame
the HW or the kernel before thinking about the shutdown script.

I'm wondering if init could be modified to warn if it really has to kill
something with SIGKILL, but of course syslog is long dead by then, so
unless you have a serial console you'll likely miss that warning.

> > - apache2: there is a (rather long) sleep in "restart" but not in
> > "stop". Waiting for process termination should be added.
>
> It's fairly common to add a sleep in restart to (try to) deal with
> issues such as reopening a socket.

But if the listening socket is still open then some apache module may
still be doing disk I/O/database access/etc. as well, which means "stop"
should wait till apache really quits.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------


 
Old 01-03-2008, 08:48 PM
Gabor Gombas
 

On Thu, Jan 03, 2008 at 10:40:32AM -0800, Steve Langasek wrote:

> It's also, as commented already in the init script, recognized as a bug in
> the associated daemon. Fixing that bug would drop the need for the sleep,
> though if there's a possibility of SIGKILL coming before the daemon is done
> shutting down then you still don't have a guaranteed cleanup, and there's no
> good "wait for process termination" facility that we can use from init
> scripts.

Yep, "waiting for an unrelated process to exit" is surprisingly hard to
do correctly. I wonder if the process events connector support in recent
kernels could be used to create a "kill_and_wait" utility:

- start listening on netlink for process-related events
- send the signal to the process
- wait until we receive a notification that the process has died (or a
  timeout has occurred)
- from time to time do a kill(pid, 0) just to be sure we did not lose
  netlink messages

Non-Linux ports could fall back to sending kill(pid, 0) in a loop.
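
The fallback mentioned above, polling with signal 0, might look roughly like the following minimal sketch. Only the name kill_and_wait comes from the message; the argument order, the default timeout and the one-second poll interval are assumptions made for illustration. Polling by PID is also racy if the PID gets reused, which is part of why this is hard to do correctly.

```shell
# kill_and_wait PID [SIGNAL] [TIMEOUT_SECONDS]   (hypothetical interface)
# Send SIGNAL to PID, then poll with signal 0 until the process is gone
# or TIMEOUT_SECONDS have passed. Returns 0 if it exited, 1 on timeout.
kill_and_wait() {
    pid=$1
    sig=${2:-TERM}
    timeout=${3:-5}

    # If the signal cannot be delivered we treat the process as already
    # gone; a permission error would look the same in this sketch.
    kill -s "$sig" "$pid" 2>/dev/null || return 0

    waited=0
    # "kill -0" delivers nothing; it only checks that the PID exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

An init script could call something like "kill_and_wait "$pid" TERM 30" and only escalate to KILL when it returns non-zero.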

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------


 
Old 01-03-2008, 09:01 PM
Gabor Gombas
 

On Thu, Jan 03, 2008 at 09:24:59PM +0100, Petter Reinholdtsen wrote:

> Btw, if the 5 second wait isn't long enough for sendsigs, we can
> extend it. There is code there to make sure sendsigs terminates as
> soon as the last process it tries to kill is dead, so we could
> increase the timeout without affecting the normal shutdown times. It
> will wait from 0 to 5 seconds at the moment, depending on how long it
> take for the processes to die. It would not be a problem to let it
> wait from say 0 to 10 seconds, or 0 to 30 seconds.

That may be a good safety measure. I think it is really hard to hit the
5 second limit but when that happens it is very hard to diagnose later
what went wrong. So if we can increase the max. timeout without imposing
a real delay in the common case (i.e. when everything shuts down
properly) that's good.

Also, how about doing a sync before sending the signals? That way I/O
generated by the services that _do_ have a proper shutdown script won't
interfere with killing the "trivial" services. Sure, that sync can take
time, but then the final sync will be that much shorter.
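
A rough sketch of the ordering suggested here: sync first, then SIGTERM, then a wait that ends as soon as everything has exited, with SIGKILL only as a last resort. This is not the sendsigs code. The function term_then_kill, its arguments and the one-second poll are made up for illustration, and it works on an explicit PID list rather than through killall5.

```shell
# term_then_kill MAX_WAIT PID...   (hypothetical helper)
# Returns 0 if every PID exited within MAX_WAIT seconds, 1 if SIGKILL
# had to be sent to at least one of them.
term_then_kill() {
    max_wait=$1
    shift

    # Flush dirty buffers before the timeout clock starts, so pending
    # I/O does not eat into the grace period.
    sync

    for pid in "$@"; do
        kill -s TERM "$pid" 2>/dev/null || true
    done

    waited=0
    while :; do
        # Rebuild the list of PIDs that still exist
        alive=""
        for pid in "$@"; do
            if kill -0 "$pid" 2>/dev/null; then
                alive="$alive $pid"
            fi
        done
        # Common case: everything is gone, so return without further delay
        if [ -z "$alive" ]; then
            return 0
        fi
        if [ "$waited" -ge "$max_wait" ]; then
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Timeout reached: SIGKILL whatever is left
    for pid in $alive; do
        kill -s KILL "$pid" 2>/dev/null || true
    done
    return 1
}
```

Because the loop returns as soon as the list is empty, raising MAX_WAIT from 5 to 30 only costs time when something actually refuses to die.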

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------


 
Old 01-03-2008, 10:16 PM
Kurt Roeckx
 

On Fri, Jan 04, 2008 at 12:01:17AM +0100, Petter Reinholdtsen wrote:
>
> # use SIGCONT/signal 18 to check if there are
> # processes left. No need to check the exit code
> # value, because either killall5 work and it make
> # sense to wait for processes to die, or it fail and
> # there is nothing to wait for.
> - killall5 -18 $OMITPIDS || break
> +
> + if killall5 -18 $OMITPIDS ; then

Why is it using -18? Please change that to SIGCONT, since the numeric
value depends on the arch. See signal(7), which even notes that the
number varies by architecture (it mentions ppc/i386).
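
To illustrate the point with the shell's own kill rather than killall5 (whether killall5 accepts symbolic names is not shown in this thread), a probe by signal name could be sketched like this. The function name process_exists is hypothetical.

```shell
# process_exists PID   (hypothetical helper)
# Probe a process with SIGCONT, given by name so that the shell picks
# the right number for the current architecture. Note that SIGCONT will
# resume a stopped process, so it is not a completely neutral probe.
process_exists() {
    kill -s CONT "$1" 2>/dev/null
}
```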


> done
> - log_action_begin_msg "Killing all remaining processes"
> - killall5 -9 $OMITPIDS # SIGKILL
> - log_action_end_msg 0
> + if [ -z "$alldead" ] ; then
> + log_action_begin_msg "Killing all remaining processes"
> + killall5 -9 $OMITPIDS # SIGKILL
> + log_action_end_msg 0
> + fi
> }

I think it would be nice if, when it needs to send KILL, it could do the
following for debugging purposes:
- output which processes are still running
- have some sort of delay before it turns the box off/reboots.
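
The first suggestion could be sketched as below. This is only an illustration: report_survivors is a hypothetical name, it takes an explicit PID list instead of hooking into killall5, and the message format is invented.

```shell
# report_survivors PID...   (hypothetical helper)
# Print one line for each PID that still exists.
# Returns 0 if at least one survivor was found, 1 otherwise.
report_survivors() {
    found=1
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            # "comm=" asks ps for the command name without a header;
            # fall back to an empty name if ps cannot provide it
            name=$(ps -o comm= -p "$pid" 2>/dev/null) || name=""
            echo "still running: $pid ${name:-unknown}"
            found=0
        fi
    done
    return "$found"
}
```

A caller could run this just before the SIGKILL step and, if it returns 0, sleep for a few seconds so that the list stays readable on the console before the box goes down.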


Kurt


 
