Old 06-13-2008, 10:14 AM
"Gianluca Cecchi"
 
two paths remain failed on DS6800 after code upgrade

Hello,
I have a test server connected to an IBM DS6800 storage array.
It is an HP BL480c blade with two QLogic HBAs, connected to two FC switches,
running RHEL 4.6 x86_64 (kernel 2.6.9-67.ELsmp) with:
device-mapper-1.02.21-1.el4
device-mapper-multipath-0.4.5-27.RHEL4

In the boot messages I have for the HBAs:

qla2400 0000:0c:00.0: Found an ISP2432, irq 185, iobase 0xffffff000001c000
QLogic Fibre Channel HBA Driver: 8.01.07-d4
QLogic QMH2462 - SBUS to 2Gb FC, Dual Channel
ISP2432: PCIe (2.5Gb/s x4) @ 0000:0c:00.0 hdma+, host#=0, fw=4.00.150 [IP]
Vendor: IBM       Model: 1750500           Rev: .155
Type:   Direct-Access                      ANSI SCSI revision: 05

On the storage I have access to two luns, so that in total I get 8 paths and disks from sda to sdh.

In multipath I'm using the default OS-install configuration for the DS6800 (product 1750500),
so it should be:
#       device {
#               vendor                  "IBM"
#               product                 "1750500"
#               path_grouping_policy    group_by_prio
#               getuid_callout          "/sbin/scsi_id -g -u -s"
#               prio_callout            "/sbin/mpath_prio_alua %d"
#               features                "1 queue_if_no_path"
#               path_checker            tur
#       }
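As a side note, the "%d" in prio_callout is a placeholder that multipathd substitutes with the path's device node before executing the prioritizer. A minimal illustration of that substitution (the expand_callout helper is mine, for illustration only, not part of multipath-tools):

```shell
#!/bin/sh
# Illustrative helper (not multipath-tools code): expand the "%d"
# placeholder in a callout template into a concrete command line,
# similar to what multipathd does before running the prioritizer.
expand_callout() {
    template=$1
    dev=$2
    echo "$template" | sed "s|%d|$dev|g"
}

expand_callout "/sbin/mpath_prio_alua %d" "/dev/sdd"
# prints: /sbin/mpath_prio_alua /dev/sdd
```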

In normal operation the command "multipath -ll" gives:

[root@test-rhel-p ~]# multipath -ll
mpath1 (3600507630efe05800000000000001700)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=100][active]
  \_ 0:0:1:1 sdd 8:48  [active][ready]
  \_ 1:0:1:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=20][enabled]
  \_ 0:0:0:1 sdb 8:16  [active][ready]
  \_ 1:0:0:1 sdf 8:80  [active][ready]

mpath0 (3600507630efe05800000000000001600)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=100][active]
  \_ 0:0:0:0 sda 8:0   [active][ready]
  \_ 1:0:0:0 sde 8:64  [active][ready]
\_ round-robin 0 [prio=20][enabled]
  \_ 0:0:1:0 sdc 8:32  [active][ready]
  \_ 1:0:1:0 sdg 8:96  [active][ready]



We had a code (firmware) update on the storage, done in concurrent mode, so I wanted to test the multipath behaviour.
I got a first path change without problems, probably when the first controller was updated:



mpath1:
\_ round-robin 0 [enabled]
  \_ 0:0:0:1 sdb 8:16  [failed]
  \_ 1:0:0:1 sdf 8:80  [failed]

and mpath0:
\_ round-robin 0 [enabled]
  \_ 0:0:0:0 sda 8:0   [failed]
  \_ 1:0:0:0 sde 8:64  [failed]


while the other two path groups remained active.
At the end of the upgrade, probably with the second controller update, I got the situation below.
While other servers with Windows and Linux (using SDD) came back with all paths, this server retains two paths in failed state:




[root@test-rhel-p RPMS]# multipath -l
mpath1 (3600507630efe05800000000000001700)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled]
  \_ 0:0:1:1 sdd 8:48  [failed][faulty]
  \_ 1:0:1:1 sdh 8:112 [active]
\_ round-robin 0 [enabled]
  \_ 0:0:0:1 sdb 8:16  [active]
  \_ 1:0:0:1 sdf 8:80  [active]

mpath0 (3600507630efe05800000000000001600)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
  \_ 0:0:0:0 sda 8:0   [active]
  \_ 1:0:0:0 sde 8:64  [active]
\_ round-robin 0 [enabled]
  \_ 0:0:1:0 sdc 8:32  [failed][faulty]
  \_ 1:0:1:0 sdg 8:96  [active]
with messages of this type every 5 seconds:

error calling out /sbin/mpath_prio_alua /dev/sdc
error calling out /sbin/mpath_prio_alua /dev/sdd

Other information:

[root@test-rhel-p ]# sg_inq /dev/sdc
sg_inq: error opening file: /dev/sdc: No such device or address

[root@test-rhel-p RPMS]# ll /dev/sdc
brw-rw----  1 root disk 8, 32 Jun 11 19:03 /dev/sdc

[root@test-rhel-p RPMS]# sg_inq /dev/sda
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
  [AERC=0]  [TrmTsk=0]  NormACA=1  HiSUP=1  Resp_data_format=2
  SCCS=0  ACC=0  TGPS=1  3PC=0  Protect=0  BQue=0
  EncServ=0  MultiP=1 (VS=0)  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=0  Sync=0  Linked=0  [TranDis=0]  CmdQue=1
  Clocking=0x0  QAS=0  IUS=0
    length=164 (0xa4)   Peripheral device type: disk
 Vendor identification: IBM
 Product identification: 1750500
 Product revision level: .441
 Unit serial number: 68778501600

Any help in getting the paths back up?
Would a SCSI rescan help? What would be the correct command in this case?
The system is operational and without interruption of disk access for the users, but I don't understand why the paths don't come up again...


Thanks in advance for help or suggestions.
Gianluca



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
Old 06-13-2008, 04:22 PM
"Gianluca Cecchi"
 
two paths remain failed on DS6800 after code upgrade

It seems I solved the problem by removing the disks and then re-adding them, with these commands:

For sdc:

remove:
echo "1" > /sys/class/scsi_host/host0/device/target0:0:1/0:0:1:0/delete

re-register the disk with the kernel:
echo "0 1 0" > /sys/class/scsi_host/host0/scan

For sdd:

remove:
echo "1" > /sys/class/scsi_host/host0/device/target0:0:1/0:0:1:1/delete

re-register the disk with the kernel:
echo "0 1 1" > /sys/class/scsi_host/host0/scan


In general, to deregister:
echo "1" > /sys/class/scsi_host/hostH/device/targetH:B:T/H:B:T:L/delete

and to re-register the disk with the kernel:
echo "B T L" > /sys/class/scsi_host/hostH/scan
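The delete/rescan pattern above can be wrapped in a small helper that builds and prints the two sysfs writes for a given H B T L tuple, so they can be reviewed before actually running them. A sketch (the scsi_replace_cmds name is mine):

```shell
#!/bin/sh
# Sketch (helper name is mine): print the two sysfs writes that
# deregister and re-register the SCSI device H:B:T:L, so they can
# be checked before echoing anything into /sys.
scsi_replace_cmds() {
    H=$1; B=$2; T=$3; L=$4
    echo "echo 1 > /sys/class/scsi_host/host$H/device/target$H:$B:$T/$H:$B:$T:$L/delete"
    echo "echo \"$B $T $L\" > /sys/class/scsi_host/host$H/scan"
}

scsi_replace_cmds 0 0 1 0
# prints the same two commands used above for sdc
```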

Device name persistence is not guaranteed. In my case:

sdc --> sdi
sdd --> sdd
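Since the sd name can change across a delete/rescan, anything scripted is better keyed on the WWID that multipath prints in parentheses rather than on /dev/sdX. A small sketch (wwid_of is a hypothetical helper) extracting it from a "multipath -ll" header line:

```shell
#!/bin/sh
# Hypothetical helper: pull the WWID out of a "multipath -ll" header
# line such as "mpath1 (3600507630efe05800000000000001700)", so later
# steps can address the map by its stable ID instead of an sd name.
wwid_of() {
    echo "$1" | sed -n 's/.*(\([0-9a-f]*\)).*/\1/p'
}

wwid_of "mpath1 (3600507630efe05800000000000001700)"
# prints: 3600507630efe05800000000000001700
```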

All operations were transparent and with no downtime.
I also removed and re-added other disks (even during I/O on them), and I have to say multipath seems very robust.

Thanks to the developers.
HTH for others.
Still, I don't understand why it was unable to reinstate the paths before.
Gianluca

