Trouble with StorageTek 2530 (SAS) and RDAC
Hi!
I contacted the list almost half a year ago about this storage. I still haven't figured out how to set it up. I have 3 nodes connected to it, and 2 volumes shared across all 3 nodes. I'm using CentOS 5.4. Here is my multipath.conf:

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
        path_grouping_policy    multibus
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            /bin/true
        path_checker            readsector0
        rr_min_io               100
        max_fds                 8192
        rr_weight               priorities
        failback                immediate
        no_path_retry           fail
}
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z]"
        devnode "^sda"
}
multipaths {
        multipath {
                wwid    3600a0b80003abc5c000011504b52f919
                alias   sas-qd
        }
        multipath {
                wwid    3600a0b80002fcd1800001a374b52fa1e
                alias   sas-data
        }
}

devices {
        device {
                vendor                  "SUN"
                product                 "LCSM100_S"
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout            "/sbin/mpath_prio_rdac /dev/%n"
                features                "0"
                hardware_handler        "1 rdac"
                path_grouping_policy    group_by_prio
                failback                immediate
                path_checker            rdac
                rr_weight               uniform
                no_path_retry           300
                rr_min_io               1000
        }
}

And here is multipath -ll:

# multipath -ll sas-data
sas-data (3600a0b80002fcd1800001a374b52fa1e) dm-1 SUN,LCSM100_S
[size=2.7T][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][enabled]
 \_ 1:0:3:1 sde 8:64 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:0:1 sdc 8:32 [active][ghost]

On that volume, I have set up CLVM, and I have created one clustered logical volume. If I try to format it with ext3, here is what I finish with:

Jan 26 23:00:43 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector 1360267648
Jan 26 23:00:43 node01 kernel: device-mapper: multipath: Failing path 8:64.
Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector 1360269696
Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector 1360527744
Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector 1360528768
Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector 1360529792
Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:43 node01 multipathd: 8:64: mark as failed
Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector 1360530816
Jan 26 23:00:44 node01 multipathd: sas-data: remaining active paths: 1
Jan 26 23:00:44 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:44 node01 multipathd: dm-1: add map (uevent)
Jan 26 23:00:44 node01 kernel: end_request: I/O error, dev sde, sector 1360531840
Jan 26 23:00:44 node01 multipathd: dm-1: devmap already registered
Jan 26 23:00:44 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:44 node01 multipathd: sdd: remove path (uevent)
Jan 26 23:00:44 node01 kernel: end_request: I/O error, dev sde, sector 1360789888
Jan 26 23:00:44 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:44 node01 kernel: end_request: I/O error, dev sde, sector 1360790912
.
.
.
lot of similar messages
.
.
.
Jan 26 23:00:50 node01 kernel: sd 1:0:1:1: SCSI error: return code = 0x00010000
Jan 26 23:00:50 node01 kernel: end_request: I/O error, dev sde, sector 1358694784
Jan 26 23:00:50 node01 kernel: mptsas: ioc1: removing ssp device, channel 0, id 4, phy 7
Jan 26 23:00:50 node01 kernel: scsi 1:0:1:0: rdac Dettached
Jan 26 23:00:50 node01 kernel: scsi 1:0:1:1: rdac Dettached
Jan 26 23:00:50 node01 kernel: sd 1:0:0:1: queueing MODE_SELECT command.
Jan 26 23:00:50 node01 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jan 26 23:00:51 node01 kernel: sd 1:0:0:0: rdac Dettached
Jan 26 23:00:51 node01 multipathd: sas-qd: load table [0 204800 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:16 1000]
Jan 26 23:00:51 node01 multipathd: sde: remove path (uevent)
Jan 26 23:00:51 node01 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jan 26 23:00:52 node01 kernel: sd 1:0:0:1: rdac Dettached
Jan 26 23:00:52 node01 multipathd: sas-data: load table [0 5855165440 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:32 1000]
Jan 26 23:00:52 node01 multipathd: dm-0: add map (uevent)
Jan 26 23:00:52 node01 multipathd: dm-0: devmap already registered
Jan 26 23:00:52 node01 multipathd: dm-1: add map (uevent)
Jan 26 23:00:52 node01 multipathd: dm-1: devmap already registered
Jan 26 23:00:52 node01 kernel: device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.

Any ideas?

--
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
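Editor's note: the "load table [...]" lines that multipathd logs above are device-mapper multipath tables, and reading them shows what the map looks like after the failed path is dropped. The following is a minimal sketch of a parser for that line, assuming the usual layout (feature count, hardware-handler count, number of path groups, then per group a selector and its paths). The function name parse_multipath_table and the returned dictionary keys are made up for illustration and are not part of multipath-tools.

```python
def parse_multipath_table(table):
    """Parse one device-mapper multipath table line into a dict.

    Assumed layout (editor's assumption, matching the lines in the log):
      start length multipath
      <n_features> [features...]
      <n_hw_args> [hw handler args...]
      <n_groups> <next_group>
      then per group: <selector> <n_sel_args> [sel args...]
                      <n_paths> <n_path_args> (<dev> [path args...])*
    """
    tok = table.split()
    start, length, target = int(tok[0]), int(tok[1]), tok[2]
    if target != "multipath":
        raise ValueError("not a multipath target: %r" % target)

    def take_counted(pos):
        # Read a count followed by that many arguments.
        n = int(tok[pos])
        return tok[pos + 1:pos + 1 + n], pos + 1 + n

    features, pos = take_counted(3)
    hw_handler, pos = take_counted(pos)
    n_groups, next_group = int(tok[pos]), int(tok[pos + 1])
    pos += 2

    groups = []
    for _ in range(n_groups):
        selector = tok[pos]
        selector_args, pos = take_counted(pos + 1)
        n_paths, n_path_args = int(tok[pos]), int(tok[pos + 1])
        pos += 2
        paths = []
        for _ in range(n_paths):
            dev = tok[pos]
            args = tok[pos + 1:pos + 1 + n_path_args]
            paths.append((dev, args))
            pos += 1 + n_path_args
        groups.append({"selector": selector,
                       "selector_args": selector_args,
                       "paths": paths})

    return {"start": start,
            "sectors": length,
            "size_tib": length * 512 / 2.0 ** 40,  # 512-byte sectors
            "features": features,
            "hw_handler": hw_handler,
            "next_group": next_group,
            "groups": groups}


if __name__ == "__main__":
    # Table string copied from the sas-data "load table" log line above.
    info = parse_multipath_table(
        "0 5855165440 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:32 1000")
    print(info["hw_handler"], len(info["groups"]), info["groups"][0]["paths"])
    print("%.1f TiB" % info["size_tib"])
```

Read this way, the reloaded sas-data map has a single path group holding only 8:32, with the rdac hardware handler and the rr_min_io value of 1000 as the path argument, so nothing is left to fail over to if that path goes away as well.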
Trouble with StorageTek 2530 (SAS) and RDAC
On Tue, 2010-01-26 at 23:09 +0100, Jakov Sosic wrote:
> [...]
> And here is multipath -ll:
> # multipath -ll sas-data
> sas-data (3600a0b80002fcd1800001a374b52fa1e) dm-1 SUN,LCSM100_S
> [size=2.7T][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
> \_ round-robin 0 [prio=100][enabled]
>  \_ 1:0:3:1 sde 8:64 [active][ready]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 1:0:0:1 sdc 8:32 [active][ghost]
>
> On that volume, I have set up CLVM, and I have created one logical
> clustered volume. If I try to format it with ext3, here is what I finish
> with:
>
> Jan 26 23:00:43 node01 kernel: mptbase: ioc1: LogInfo(0x31140000):
> Originator={PL}, Code={IO Executed}, SubCode(0x0000)
> Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code =
> 0x00010000
> Jan 26 23:00:43 node01 kernel: end_request: I/O error, dev sde, sector
> 1360267648

Are these messages from the same node as the multipath -ll output? From these messages it looks like sde is 1:0:1:1, but from the multipath -ll output it looks like it is 1:0:3:1.

> Jan 26 23:00:43 node01 kernel: device-mapper: multipath: Failing path 8:64.
> Jan 26 23:00:43 node01 kernel: sd 1:0:1:1: SCSI error: return code =
> 0x00010000

This return code means that the host is returning DID_NO_CONNECT, which means that the host is not able to connect to the end point.

I would suggest you go step by step:
1. Try to access both paths of a LUN (on all nodes). One should succeed and the other should fail.
2. Try to access the multipath device and see if all is good.
3. Create an LVM volume on a single node (not clustered) and see if that works.
4. Create a clustered LVM volume on top of all the active (non-ghost) sd devices and see if it works.

When you send the results, include the output of "dmsetup table" and "dmsetup ls".
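Editor's note: the reading of "return code = 0x00010000" as DID_NO_CONNECT follows from how the Linux SCSI layer packs its 32-bit result word: status in the lowest byte, then the message byte, the host byte and the driver byte. Below is a small sketch of that decoding. The helper name decode_scsi_result is hypothetical, and the table lists only the classic host-byte codes; the status byte is returned raw rather than interpreted.

```python
# Host byte values as defined by the Linux SCSI midlayer.
HOST_BYTE = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
    0x0A: "DID_PASSTHROUGH",
    0x0B: "DID_SOFT_ERROR",
    0x0C: "DID_IMM_RETRY",
    0x0D: "DID_REQUEUE",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four bytes."""
    host = (result >> 16) & 0xFF
    return {
        "status": result & 0xFF,          # raw SCSI status byte
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": HOST_BYTE.get(host, "unknown (0x%02x)" % host),
        "driver": (result >> 24) & 0xFF,  # driver byte
    }


if __name__ == "__main__":
    # Value taken from the "SCSI error: return code" lines in the log.
    print(decode_scsi_result(0x00010000))
```

For the value in the logs, every byte is zero except the host byte, so the error is reported by the HBA driver side (here mptsas) rather than by the target as a SCSI status.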
Trouble with StorageTek 2530 (SAS) and RDAC
On 01/27/2010 03:23 AM, Chandra Seetharaman wrote:
> This return code means that the host is returning DID_NO_CONNECT. which
> means that the host is not able to connect to the end point.
>
> I would suggest you to go step-by-step.
> 1. Try to access both the paths of a lun (in all nodes).
>    one should succeed and other should fail.
> 2. Try to access the multipath device and see if all is good.
> 3. Create a LVM on a single node (not clusters) and see if that works.
> 4. Create a clustered LVM on top of all the Active (non-ghost) sd
>    devices and see if it works.
>
> When you send the results include o/p "dmsetup table" and "dmsetup ls"

Thank you! I've solved the multipath problems with a new kernel I built with my device added to scsi_dh_rdac.c. I added the "SUN" "LCSM100_S" entry, just as Charlie Brady suggested to me a few months back. That was the solution for the multipath problems. Now multipath is able to do its part.

But after the failover, the secondary path works for just a bit and then hangs. When I disconnect the active SAS cable from the server, multipath and scsi_dh_rdac do their thing, but if I have active read/write processes (for example, copying a file on the volume mounted from the storage to the exact same partition), everything hangs a few seconds after the multipath failover. Very strange behaviour indeed. This is what happens now:

Jan 28 20:26:12 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:26:12 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:26:12 node01 kernel: sd 1:0:0:1: SCSI error: return code = 0x00010000
Jan 28 20:26:12 node01 kernel: end_request: I/O error, dev sdc, sector 7012168
Jan 28 20:26:12 node01 kernel: device-mapper: multipath: Failing path 8:32.
Jan 28 20:26:12 node01 kernel: sd 1:0:0:1: SCSI error: return code = 0x00010000
Jan 28 20:26:12 node01 kernel: end_request: I/O error, dev sdc, sector 7012424

So, multipath activated... Lots of similar SCSI I/O error messages follow, and in between I see this:

Jan 28 20:26:12 node01 multipathd: dm-1: add map (uevent)
Jan 28 20:26:12 node01 multipathd: dm-1: devmap already registered
Jan 28 20:26:12 node01 multipathd: 8:32: mark as failed
Jan 28 20:26:12 node01 multipathd: sas-data: remaining active paths: 1
Jan 28 20:26:12 node01 multipathd: sdb: remove path (uevent)

and then

Jan 28 20:26:13 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:26:13 node01 last message repeated 61 times
Jan 28 20:26:18 node01 multipathd: sas-qd: load table [0 204800 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:80 3000]
Jan 28 20:26:18 node01 multipathd: sdc: remove path (uevent)
Jan 28 20:26:18 node01 multipathd: sas-data: load table [0 3774873600 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:96 1000]
Jan 28 20:26:18 node01 multipathd: sdd: remove path (uevent)
Jan 28 20:26:18 node01 kernel: mptsas: ioc1: removing ssp device, channel 0, id 1, phy 3
Jan 28 20:26:18 node01 multipathd: sas-os: load table [0 2080291840 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:112 3000]
Jan 28 20:26:18 node01 multipathd: sde: remove path (uevent)
Jan 28 20:26:18 node01 kernel: scsi 1:0:0:0: rdac Dettached
Jan 28 20:26:19 node01 multipathd: sde: spurious uevent, path not in pathvec
Jan 28 20:26:19 node01 kernel: scsi 1:0:0:1: rdac Dettached
Jan 28 20:26:19 node01 multipathd: uevent trigger error
Jan 28 20:26:19 node01 kernel: scsi 1:0:0:2: rdac Dettached
Jan 28 20:26:19 node01 multipathd: dm-0: add map (uevent)
Jan 28 20:26:19 node01 kernel: sd 1:0:3:1: queueing MODE_SELECT command.
Jan 28 20:26:19 node01 multipathd: dm-0: devmap already registered
Jan 28 20:26:19 node01 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jan 28 20:26:19 node01 multipathd: dm-1: add map (uevent)
Jan 28 20:26:19 node01 multipathd: dm-1: devmap already registered
Jan 28 20:26:19 node01 multipathd: dm-2: add map (uevent)
Jan 28 20:26:19 node01 kernel: scsi 1:0:0:1: rejecting I/O to dead device
Jan 28 20:26:19 node01 multipathd: dm-2: devmap already registered
Jan 28 20:26:19 node01 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jan 28 20:26:19 node01 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jan 28 20:26:20 node01 multipathd: 8:96: reinstated
Jan 28 20:27:08 node01 multipathd: dm-1: add map (uevent)
Jan 28 20:27:08 node01 multipathd: dm-1: devmap already registered
Jan 28 20:27:08 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29045144
Jan 28 20:27:08 node01 kernel: device-mapper: multipath: Failing path 8:96.
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29089224
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29090248
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29091272
Jan 28 20:27:08 node01 multipathd: 8:96: mark as failed
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 multipathd: sas-data: Entering recovery mode: max_retries=300
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29092296
Jan 28 20:27:08 node01 multipathd: sas-data: remaining active paths: 0
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 multipathd: sdf: remove path (uevent)
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29093320
Jan 28 20:27:08 node01 multipathd: sas-qd: stop event checker thread
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 multipathd: sdg: remove path (uevent)
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29094344
Jan 28 20:27:08 node01 multipathd: sas-data: map in use
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 multipathd: sas-data: can't flush
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29095368
Jan 28 20:27:08 node01 multipathd: sdh: remove path (uevent)
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 multipathd: sas-os: stop event checker thread
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29096400
Jan 28 20:27:08 node01 multipathd: sdi: remove path (uevent)
Jan 28 20:27:08 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:27:08 node01 multipathd: sdi: spurious uevent, path not in pathvec
Jan 28 20:27:08 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:27:08 node01 multipathd: uevent trigger error
Jan 28 20:27:08 node01 kernel: mptbase: ioc1: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 28 20:27:08 node01 last message repeated 60 times
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000
Jan 28 20:27:08 node01 kernel: end_request: I/O error, dev sdg, sector 29097424
Jan 28 20:27:08 node01 kernel: sd 1:0:3:1: SCSI error: return code = 0x00010000

lots of SCSI errors...

Jan 28 20:27:14 node01 kernel: mptsas: ioc1: removing ssp device, channel 0, id 4, phy 7
Jan 28 20:27:14 node01 kernel: scsi 1:0:3:0: rdac Dettached
Jan 28 20:27:14 node01 kernel: scsi 1:0:3:1: rdac Dettached
Jan 28 20:27:14 node01 kernel: scsi 1:0:3:2: rdac Dettached
Jan 28 20:27:14 node01 kernel: scsi 1:0:3:1: rejecting I/O to dead device
Jan 28 20:28:18 node01 kernel: scsi 1:0:3:1: rejecting I/O to dead device
Jan 28 20:28:18 node01 multipathd: sdg: rdac checker reports path is down
Jan 28 20:29:29 node01 kernel: scsi 1:0:3:1: rejecting I/O to dead device
Jan 28 20:29:29 node01 multipathd: sdg: rdac checker reports path is down
Jan 28 20:30:40 node01 kernel: scsi 1:0:3:1: rejecting I/O to dead device
Jan 28 20:30:40 node01 multipathd: sdg: rdac checker reports path is down

And that's it... all paths lost. The node is still alive, I can access it, read from it, write to it, but commands like "multipath -ll" just hang forever. And if I try to restart the server, it hangs too.

I do use a CLVM partition, but I'm willing to try a raw SAS volume if you think that would be a solution.

And about your suggestions:

> 1. Try to access both the paths of a lun (in all nodes).
>    one should succeed and other should fail.

This works OK. No problems noticed.

> 2. Try to access the multipath device and see if all is good.

This works too, if I don't disconnect one of the two cables :)

> 3. Create a LVM on a single node (not clusters) and see if that works.
> 4. Create a clustered LVM on top of all the Active (non-ghost) sd
>    devices and see if it works.

3 & 4 I did not try.

The problem is that after I get the errors, I lose all the volumes from the nodes. It is OK to lose one path, but on the secondary path I get something like

# # # # (failed)(failed)

in the multipath -ll output. Also, all other volumes are simply lost; there are no devices present. It seems to me like the controller itself, or maybe the mptsas driver, goes berserk in the process.

Any ideas? :)
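Editor's note: the kernel reports failed paths by major:minor number ("Failing path 8:96") while the I/O errors name the disk ("dev sdg"), which makes the log harder to follow. The sketch below maps one to the other for the first sd major number (8, with 16 minors per disk) and pulls the failing paths out of syslog lines. The names sd_name and failing_paths are hypothetical helpers, and the mapping deliberately ignores the additional sd majors used once a system has more than 16 disks.

```python
import re

SD_MAJOR = 8          # first major number used by the Linux sd driver
MINORS_PER_DISK = 16  # minor 0 of each block is the whole disk

FAILING = re.compile(r"Failing path (\d+):(\d+)")


def sd_name(major, minor):
    """Return the sd device name for a major:minor pair (major 8 only)."""
    if major != SD_MAJOR:
        raise ValueError("only major %d is handled in this sketch" % SD_MAJOR)
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name + str(part) if part else name


def failing_paths(log_lines):
    """List the sd names of every path the kernel reports as failing."""
    names = []
    for line in log_lines:
        match = FAILING.search(line)
        if match:
            names.append(sd_name(int(match.group(1)), int(match.group(2))))
    return names


if __name__ == "__main__":
    # Lines copied from the log in this message.
    sample = [
        "Jan 28 20:26:12 node01 kernel: device-mapper: multipath: Failing path 8:32.",
        "Jan 28 20:26:20 node01 multipathd: 8:96: reinstated",
        "Jan 28 20:27:08 node01 kernel: device-mapper: multipath: Failing path 8:96.",
    ]
    print(failing_paths(sample))
```

Applied to this log, the first failure is the path that was unplugged (8:32, sdc) and the second, about a minute later, is the path that had just been reinstated after failover (8:96, sdg), which matches the "dev sdg" I/O errors that follow.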