02-26-2009, 04:14 AM
"John A. Sullivan III"

multipathd segfault and error calling out

On Wed, 2009-02-25 at 22:23 -0500, John A. Sullivan III wrote:
> On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
> > On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
> > > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
> > > and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > > mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
> > > out the path from /dev/disk/by-path and then echoes a priority based upon
> > > a lookup table. It works perfectly fine from the command line.
> > > multipath -ll shows the priorities assigned perfectly and exactly the
> > > right paths are active.
> > >
> > > However, when I start multipathd, it all goes down the tubes. The paths
> > > disappear and /var/log/messages is filled with:
> > > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> >
> > Keep in mind that the environment you have when multipathd calls is quite
> > limited. I believe there is no PATH set, nor any other "normal" values.
> >
> > Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
> <snip>
> Thank you. I was hopeful that might have been the problem, but
> alas not. Even with absolute pathnames and setting the PATH variable, it
> still gives the same error. In fact, I should have mentioned, I created
> a bogus file with the same pathname which did nothing but "echo hello"
> and it gave the same "error calling out" error. What next? - John
This is increasingly bizarre. I did an strace on the multipath command
and on the multipathd command.

Here is a portion of the strace for multipath:
close(1) = 0
dup(6) = 1
execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = 0
brk(0) = 0x8c3000

Here is the same call from multipathd:
close(1) = 0
dup(7) = 1
execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = -1 ENOENT (No such file or directory)
exit_group(-1) = ?

Is it my imagination, or is it exactly the same call, with one finding
the file and the other not? What could cause this? It is an explicit
pathname and the file exists! Thanks - John
--
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsullivan@opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
02-26-2009, 04:30 AM
"John A. Sullivan III"

I should also mention that the trace shows multipathd has no problem
opening the file. Two threads before the failure, we see this in the
strace:

stat("/var/cache/multipathd", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
open("/usr/local/sbin/mpath_prio_ssi", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=368, ...}) = 0
close(4) = 0

So the problem appears to be explicitly with the execve call. How does
one fix this? Thanks - John
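
One well-known way to get ENOENT from execve on a file that can be opened and stat'ed is a script whose "#!" interpreter cannot be found: the kernel reports the missing interpreter with the same error code. The sketch below reproduces that in a scratch directory. The interpreter path /nonexistent/bash and the function name are made up for the demonstration, and whether this is what happens inside multipathd is only an assumption at this point in the thread.

```shell
#!/bin/bash
# Demonstration only: a script that exists and is readable can still fail to
# execute with "No such file or directory" when its #! interpreter is missing.
# /nonexistent/bash is a deliberately bogus path (assumption for the demo).

demo_missing_interpreter() {
    local dir script status
    dir=$(mktemp -d) || return 1
    script="$dir/callout"

    printf '#!/nonexistent/bash\necho hello\n' > "$script"
    chmod 755 "$script"

    # Reading the file works, much like the open()/fstat() calls in the trace.
    if [ -r "$script" ]; then
        echo "readable: yes"
    fi

    # Executing it fails because the kernel cannot find the interpreter.
    if "$script" 2>/dev/null; then
        status=0
    else
        status=$?
    fi
    echo "exit status: $status"

    rm -rf "$dir"
    return 0
}

demo_missing_interpreter
```

The exact non-zero status depends on the invoking shell, so the sketch only prints it rather than assuming a particular value.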
 
02-26-2009, 06:16 AM
Hannes Reinecke

John A. Sullivan III wrote:
> On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
> > Make sure your code uses absolute paths. So "/bin/grep", "/bin/cut", etc.
> <snip>
> Thank you. I was hopeful that might have been the problem, but
> alas not. Even with absolute pathnames and setting the PATH variable, it
> still gives the same error. In fact, I should have mentioned, I created
> a bogus file with the same pathname which did nothing but "echo hello"
> and it gave the same "error calling out" error. What next? - John

Return an explicit exit code. It might be that, e.g., 'cut' returns a non-zero
value, which would then be interpreted as a failure.
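
To illustrate the point about exit codes: a shell function or script returns the status of the last command it ran, so a trailing test or filter that "fails" makes the whole callout look failed even though a priority was printed. The function names and the priority value 10 below are hypothetical; this is a minimal sketch, not the actual mpath_prio_ssi script.

```shell
#!/bin/bash
# A script's exit status is that of its last command unless stated otherwise.

prio_implicit() {
    echo 10
    # Last command is a comparison that is false for "sda", so the function
    # returns 1 even though a priority was already printed.
    [ "$1" = "sdzz" ]
}

prio_explicit() {
    echo 10
    [ "$1" = "sdzz" ]
    return 0    # explicit success, regardless of the test above
}

prio_implicit sda >/dev/null || echo "implicit: caller sees a failure"
prio_explicit sda >/dev/null && echo "explicit: caller sees success"
```

In a standalone script the equivalent of "return 0" is a final "exit 0".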

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

 
02-26-2009, 08:40 AM
"Bryn M. Reeves"

John A. Sullivan III wrote:
> Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with
> VServer and device-mapper-multipath-0.4.7-17.el5. I have a custom
> mpath_prio_ssi script which takes the device name (e.g., sdaa),
> pulls out the path from /dev/disk/by-path and then echoes a priority
> based upon a lookup table. It works perfectly fine from the
> command line. multipath -ll shows the priorities assigned perfectly
> and exactly the right paths are active.
>
> However, when I start multipathd, it all goes down the tubes. The
> paths disappear and /var/log/messages is filled with:
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdi
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdj
> Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdc


I think you'll need to modify the multipathd binary to achieve this.

To avoid deadlocking when file system access is interrupted due to
path failures multipathd forks into a new namespace and discards all
the device-backed file systems that are mounted.

It creates an in-memory file system (ramfs) and copies all the
binaries it will need into this. The file system is locked into memory
so that multipathd can continue to function even if the paths backing
the root file system have all failed.

For the callouts themselves (getuid and getprio binaries) the config
file processing takes care of this but this only works for stand-alone
binaries. If your script has other dependencies then you'll have to
add code to pull those into the ramfs volume.

See libmultipath/config.c:push_callout(),
libmultipath/config.c:store_hwe(),
multipathd/main.c:prepare_namespace() and other code that manipulates
the list of binaries stored in conf->binvec.
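
Since only stand-alone binaries are handled, it helps to know which commands a callout script uses are shell builtins and which are separate programs on disk. The following is a rough sketch of such a check; classify_command is a hypothetical helper, not part of multipath-tools, and it relies on "command -v" printing a path for programs on disk and a bare name for builtins.

```shell
#!/bin/bash
# Classify a command name as a shell builtin, an external program, or missing.
# Anything reported as "external" is a dependency that lives on disk.

classify_command() {
    local resolved
    resolved=$(command -v "$1") || { echo "missing"; return 0; }
    case "$resolved" in
        /*) echo "external" ;;   # resolved to a path on disk
        *)  echo "builtin"  ;;   # builtin, keyword, or shell function
    esac
}

for c in echo read ls grep cut; do
    printf '%s: %s\n' "$c" "$(classify_command "$c")"
done
```

Every name reported as external would either have to be made available inside the ramfs or be replaced by shell builtins.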

Regards,
Bryn.

 
02-26-2009, 12:33 PM
"John A. Sullivan III"

On Thu, 2009-02-26 at 09:40 +0000, Bryn M. Reeves wrote:
<snip>
Thank you very much, Bryn. That finally makes sense of it all.
Unfortunately, I am not a developer at all and hence approach this more
as a systems designer.

If I understand you correctly, the best approach would be to create my
script as a compiled binary rather than a bash script. Then the config
file processing will load it into memory. Is that correct? Does that
also imply that the file referenced as the list of iSCSI ids and
priorities needs to be embedded in the binary? Is that a non-issue if I
am not using multipathing for the devices containing the referenced
script?

As my skills are limited for converting this from bash to C, could I
achieve the same thing by calling bash rather than the script and
passing the script as an argument, e.g.,
prio_callout "/bin/bash /usr/local/sbin/mpath_prio_ssi %n"

Thanks again - John
 
02-27-2009, 02:06 AM
"John A. Sullivan III"

On Thu, 2009-02-26 at 09:40 +0000, Bryn M. Reeves wrote:
<snip>
You were exactly right (of course!). I changed prio_callout from
directly calling a bash script to /bin/bash scriptname %n and that
eliminated the callout errors. However, as expected, the internal calls
to /bin/ls, /bin/grep, etc. all failed. I then rewrote the script to use
nothing but bash internals (it took a little doing, such as getting the path
list from /dev/disk/by-path, but it seems to work).
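
For readers attempting the same rewrite, here is a minimal sketch of a builtin-only priority lookup. It is not the actual mpath_prio_ssi script: the function name, the BY_PATH_DIR and DEV_DIR variables, the IP addresses, and the priority values are all assumptions chosen for illustration. It uses a glob instead of ls, the -ef file test instead of readlink, and case instead of grep.

```shell
#!/bin/bash
# Sketch: print a priority for a device using shell builtins only.
# BY_PATH_DIR, DEV_DIR, the address patterns, and the priority values are
# hypothetical; substitute the real lookup table.

BY_PATH_DIR=${BY_PATH_DIR:-/dev/disk/by-path}
DEV_DIR=${DEV_DIR:-/dev}

path_priority() {
    local dev link
    dev=$1
    for link in "$BY_PATH_DIR"/*; do
        # -ef is true when both names resolve to the same file, so it
        # follows the by-path symlink without calling readlink or ls.
        if [ "$link" -ef "$DEV_DIR/$dev" ]; then
            case "$link" in
                *ip-10.0.0.1:*) echo 50 ;;
                *ip-10.0.0.2:*) echo 10 ;;
                *)              echo 1  ;;
            esac
            return 0
        fi
    done
    echo 1      # no by-path entry found: fall back to the lowest priority
    return 0
}

# A callout script would end with:  path_priority "$1"; exit 0
```

Keeping the directories in variables makes the function easy to try against a scratch directory of symlinks before pointing it at real devices.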

That said, in our initial testing of simply pulling the network cable (no
live data transfer yet), multipathd fails the devices and fails them
back on recovery but, after recovery, all the paths are shown as enabled;
none are active. We hope to start live data testing tomorrow. Thanks
again - John