Linux Archive

Linux Archive (http://www.linux-archive.org/)
-   Device-mapper Development (http://www.linux-archive.org/device-mapper-development/)
-   -   scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn (http://www.linux-archive.org/device-mapper-development/410726-scsi_dh_rdac-retry-io-06-3f-03-rdac_check_sense-fn.html)

Chandra Seetharaman 08-10-2010 01:47 AM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
Vijay,

Since it has to be handled at SCSI (and it is a SCSI sense), shouldn't
we be retrying it at SCSI level. James, what is your opinion ?

It looks good to me as the effect is local to LSI rdac storage.

chandra

On Mon, 2010-08-02 at 13:07 +0530, Chauhan, Vijay wrote:
> Hi,
>
> This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer, as a reason momentarily path failure is noticed by DM multipath.
>
> Signed-off-by: Vijay Chauhan<vijay.chauhan@lsi.com>
> Reviewed-by: Babu Moger <babu.moger@lsi.com>
> Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
> ---
>
> diff -uprN linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c
> --- linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c 2010-07-22 15:13:38.000000000 -0400
> +++ linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c 2010-07-27 12:13:58.000000000 -0400
> @@ -738,6 +738,11 @@ static int rdac_check_sense(struct scsi_
> * Quiescence in progress , just retry.
> */
> return ADD_TO_MLQUEUE;
> + if (sense_hdr->asc == 0x3f && sense_hdr->ascq == 0x03)
> + /*
> + * INQUIRY DATA has changed, retry again.
> + */
> + return ADD_TO_MLQUEUE;
> break;
> }
> /* success just means we do not care what scsi-ml does */
>
> --


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

"Chauhan, Vijay" 09-29-2010 04:39 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
Hi James,

For 06/3f/03 scsi sense, SUCCESS status is returned by scsi_check_sense. scsi_io_completion fn should have retried this check condition but some how it is not happening. IMO, command should be either retried by scsi or scsi dh rather than failing it (for this check condition). Can you please comment on this patch.

Thanks,
Vijay

-----Original Message-----
From: Chandra Seetharaman [mailto:sekharan@us.ibm.com]
Sent: Tuesday, August 10, 2010 7:18 AM
To: Chauhan, Vijay; James Bottomley
Cc: device-mapper development; linux-scsi@vger.kernel.org
Subject: Re: [Patch] scsi_dh_rdac: retry IO for 06/3f/03 in
rdac_check_sense fn

Vijay,

Since it has to be handled at SCSI (and it is a SCSI sense), shouldn't
we be retrying it at SCSI level. James, what is your opinion ?

It looks good to me as the effect is local to LSI rdac storage.

chandra

On Mon, 2010-08-02 at 13:07 +0530, Chauhan, Vijay wrote:
> Hi,
>
> This patch adds retry for the IO returned with
06/3f/03((INQUIRY_DATA_CHANGED)) sense code in
rdac_check_sense(). IO returned with 06/3f/03 from controller
are currently failed by scsi mid layer, as a reason momentarily
path failure is noticed by DM multipath.
>
> Signed-off-by: Vijay Chauhan<vijay.chauhan@lsi.com>
> Reviewed-by: Babu Moger <babu.moger@lsi.com>
> Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
> ---
>
> diff -uprN
linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c
linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c
> ---
linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh
_rdac.c 2010-07-22 15:13:38.000000000 -0400
> +++
linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c
2010-07-27 12:13:58.000000000 -0400
> @@ -738,6 +738,11 @@ static int rdac_check_sense(struct scsi_
> * Quiescence in progress , just retry.
> */
> return ADD_TO_MLQUEUE;
> + if (sense_hdr->asc == 0x3f && sense_hdr->ascq == 0x03)
> + /*
> + * INQUIRY DATA has changed, retry again.
> + */
> + return ADD_TO_MLQUEUE;
> break;
> }
> /* success just means we do not care what scsi-ml does */
>
> --



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

"Chauhan, Vijay" 10-26-2010 01:53 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
Resubmitting this patch to get the attention.

This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer, as a reason momentarily path failure is noticed by DM multipath.

Signed-off-by: Vijay Chauhan<vijay.chauhan@lsi.com>
Reviewed-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
---

diff -uprN linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c
--- linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c 2010-07-22 15:13:38.000000000 -0400
+++ linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c 2010-07-27 12:13:58.000000000 -0400
@@ -738,6 +738,11 @@ static int rdac_check_sense(struct scsi_
* Quiescence in progress , just retry.
*/
return ADD_TO_MLQUEUE;
+ if (sense_hdr->asc == 0x3f && sense_hdr->ascq == 0x03)
+ /*
+ * INQUIRY DATA has changed, retry again.
+ */
+ return ADD_TO_MLQUEUE;
break;
}
/* success just means we do not care what scsi-ml does */

--

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Malahal Naineni 10-26-2010 06:55 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
Chauhan, Vijay [Vijay.Chauhan@lsi.com] wrote:
> Resubmitting this patch to get the attention.
>
> This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer, as a reason momentarily path failure is noticed by DM multipath.
>
> Signed-off-by: Vijay Chauhan<vijay.chauhan@lsi.com>
> Reviewed-by: Babu Moger <babu.moger@lsi.com>
> Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
> ---
>
> diff -uprN linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c
> --- linux-2.6.35-rc6-orig/drivers/scsi/device_handler/scsi_dh_rdac.c 2010-07-22 15:13:38.000000000 -0400
> +++ linux-2.6.35-rc6/drivers/scsi/device_handler/scsi_dh_rdac.c 2010-07-27 12:13:58.000000000 -0400
> @@ -738,6 +738,11 @@ static int rdac_check_sense(struct scsi_
> * Quiescence in progress , just retry.
> */
> return ADD_TO_MLQUEUE;
> + if (sense_hdr->asc == 0x3f && sense_hdr->ascq == 0x03)
> + /*
> + * INQUIRY DATA has changed, retry again.
> + */
> + return ADD_TO_MLQUEUE;

The code looks fine. Is this ASC/ASCQ code specific to RDAC devices? If
not, how about changing the scsi_error.c itself?

Thanks, Malahal.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Mike Christie 10-26-2010 07:18 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
On 10/26/2010 08:53 AM, Chauhan, Vijay wrote:

Resubmitting this patch to get the attention.

This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer, as a reason momentarily path failure is noticed by DM multipath.



Is it getting failed by accident? In scsi_io_completion we check for UAs
and will retry if the removable bit is not set. That check is after
scsi_end_request though (is the scsi_end_request call failing the IO).


Did you guys also want REPORTED_LUNS_DATA_HAS_CHANGED to be retried too.
I think scsi_dh_alua's REPORTED_LUNS_DATA_HAS_CHANGED maybe should be
genericly retried, because it seems for both errors we will want to
retry for all devices.


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

James Bottomley 10-26-2010 07:21 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
On Tue, 2010-10-26 at 14:18 -0500, Mike Christie wrote:
> On 10/26/2010 08:53 AM, Chauhan, Vijay wrote:
> > Resubmitting this patch to get the attention.
> >
> > This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer, as a reason momentarily path failure is noticed by DM multipath.
> >
>
> Is it getting failed by accident? In scsi_io_completion we check for UAs
> and will retry if the removable bit is not set. That check is after
> scsi_end_request though (is the scsi_end_request call failing the IO).
>
> Did you guys also want REPORTED_LUNS_DATA_HAS_CHANGED to be retried too.
> I think scsi_dh_alua's REPORTED_LUNS_DATA_HAS_CHANGED maybe should be
> genericly retried, because it seems for both errors we will want to
> retry for all devices.

So my primary worry about patches like this is that it eats AENs ...
this is fine because, as Mike says, we should just ignore them.

However, the moment we start processing AENs (as another set of dm
people promise they have in process) we'll lose them from rdac arrays
and people will get unhappy.

If the generic UA retry isn't working, let's fix it there rather than
these hacks that would be hard to spot and pull out when (if) we ever
get a generic AEN infrastructure.

James


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

10-26-2010 07:32 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
-----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of James
> Bottomley
> Sent: Tuesday, October 26, 2010 3:22 PM
> To: Mike Christie
> Cc: device-mapper development; Chauhan, Vijay; James Bottomley; linux-scsi@vger.kernel.org
> Subject: Re: [dm-devel] [RESUBMIT][Patch] scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
>
> On Tue, 2010-10-26 at 14:18 -0500, Mike Christie wrote:
> > On 10/26/2010 08:53 AM, Chauhan, Vijay wrote:
> > > Resubmitting this patch to get the attention.
> > >
> > > This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in
> rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer,
> as a reason momentarily path failure is noticed by DM multipath.
> > >
> >
> > Is it getting failed by accident? In scsi_io_completion we check for UAs
> > and will retry if the removable bit is not set. That check is after
> > scsi_end_request though (is the scsi_end_request call failing the IO).
> >
> > Did you guys also want REPORTED_LUNS_DATA_HAS_CHANGED to be retried too.
> > I think scsi_dh_alua's REPORTED_LUNS_DATA_HAS_CHANGED maybe should be
> > genericly retried, because it seems for both errors we will want to
> > retry for all devices.
>
> So my primary worry about patches like this is that it eats AENs ...
> this is fine because, as Mike says, we should just ignore them.
>
> However, the moment we start processing AENs (as another set of dm
> people promise they have in process) we'll lose them from rdac arrays
> and people will get unhappy.
>
> If the generic UA retry isn't working, let's fix it there rather than
> these hacks that would be hard to spot and pull out when (if) we ever
> get a generic AEN infrastructure.
>
> James

Sometimes the default way to handle a UA may be not the correct one. One arrays implementation to respond to the UA could be different from another array.

Example: A thin provisioning threshold exceed check condition. The device handler infrastructure can be a savior with such hacks..

-Shyam

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

James Bottomley 10-26-2010 07:58 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
On Wed, 2010-10-27 at 01:02 +0530, Shyam_Iyer@Dell.com wrote:
> -----Original Message-----
> > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of James
> > Bottomley
> > Sent: Tuesday, October 26, 2010 3:22 PM
> > To: Mike Christie
> > Cc: device-mapper development; Chauhan, Vijay; James Bottomley; linux-scsi@vger.kernel.org
> > Subject: Re: [dm-devel] [RESUBMIT][Patch] scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
> >
> > On Tue, 2010-10-26 at 14:18 -0500, Mike Christie wrote:
> > > On 10/26/2010 08:53 AM, Chauhan, Vijay wrote:
> > > > Resubmitting this patch to get the attention.
> > > >
> > > > This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in
> > rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid layer,
> > as a reason momentarily path failure is noticed by DM multipath.
> > > >
> > >
> > > Is it getting failed by accident? In scsi_io_completion we check for UAs
> > > and will retry if the removable bit is not set. That check is after
> > > scsi_end_request though (is the scsi_end_request call failing the IO).
> > >
> > > Did you guys also want REPORTED_LUNS_DATA_HAS_CHANGED to be retried too.
> > > I think scsi_dh_alua's REPORTED_LUNS_DATA_HAS_CHANGED maybe should be
> > > genericly retried, because it seems for both errors we will want to
> > > retry for all devices.
> >
> > So my primary worry about patches like this is that it eats AENs ...
> > this is fine because, as Mike says, we should just ignore them.
> >
> > However, the moment we start processing AENs (as another set of dm
> > people promise they have in process) we'll lose them from rdac arrays
> > and people will get unhappy.
> >
> > If the generic UA retry isn't working, let's fix it there rather than
> > these hacks that would be hard to spot and pull out when (if) we ever
> > get a generic AEN infrastructure.
> >
> > James
>
> Sometimes the default way to handle a UA may be not the correct one.
> One arrays implementation to respond to the UA could be different from
> another array.

I don't quite understand this observation in the context of this patch.
What the patch does is retry the command (i.e. ignore the UA) which
should pretty much be our default response.

> Example: A thin provisioning threshold exceed check condition. The
> device handler infrastructure can be a savior with such hacks..

I'm going to regret asking this (especially given all the noise there's
been on thin provisioning thresholds): Which arrays don't actually
issue them correctly?

James


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

10-26-2010 08:27 PM

scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
 
> -----Original Message-----
> From: James Bottomley [mailto:James.Bottomley@suse.de]
> Sent: Tuesday, October 26, 2010 3:59 PM
> To: Iyer, Shyam
> Cc: michaelc@cs.wisc.edu; dm-devel@redhat.com; Vijay.Chauhan@lsi.com; jejb@kernel.org; linux-
> scsi@vger.kernel.org
> Subject: RE: [dm-devel] [RESUBMIT][Patch] scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
>
> On Wed, 2010-10-27 at 01:02 +0530, Shyam_Iyer@Dell.com wrote:
> > -----Original Message-----
> > > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of
> James
> > > Bottomley
> > > Sent: Tuesday, October 26, 2010 3:22 PM
> > > To: Mike Christie
> > > Cc: device-mapper development; Chauhan, Vijay; James Bottomley; linux-scsi@vger.kernel.org
> > > Subject: Re: [dm-devel] [RESUBMIT][Patch] scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense
> fn
> > >
> > > On Tue, 2010-10-26 at 14:18 -0500, Mike Christie wrote:
> > > > On 10/26/2010 08:53 AM, Chauhan, Vijay wrote:
> > > > > Resubmitting this patch to get the attention.
> > > > >
> > > > > This patch adds retry for the IO returned with 06/3f/03((INQUIRY_DATA_CHANGED)) sense code in
> > > rdac_check_sense(). IO returned with 06/3f/03 from controller are currently failed by scsi mid
> layer,
> > > as a reason momentarily path failure is noticed by DM multipath.
> > > > >
> > > >
> > > > Is it getting failed by accident? In scsi_io_completion we check for UAs
> > > > and will retry if the removable bit is not set. That check is after
> > > > scsi_end_request though (is the scsi_end_request call failing the IO).
> > > >
> > > > Did you guys also want REPORTED_LUNS_DATA_HAS_CHANGED to be retried too.
> > > > I think scsi_dh_alua's REPORTED_LUNS_DATA_HAS_CHANGED maybe should be
> > > > genericly retried, because it seems for both errors we will want to
> > > > retry for all devices.
> > >
> > > So my primary worry about patches like this is that it eats AENs ...
> > > this is fine because, as Mike says, we should just ignore them.
> > >
> > > However, the moment we start processing AENs (as another set of dm
> > > people promise they have in process) we'll lose them from rdac arrays
> > > and people will get unhappy.
> > >
> > > If the generic UA retry isn't working, let's fix it there rather than
> > > these hacks that would be hard to spot and pull out when (if) we ever
> > > get a generic AEN infrastructure.
> > >
> > > James
> >
> > Sometimes the default way to handle a UA may be not the correct one.
> > One arrays implementation to respond to the UA could be different from
> > another array.
>
> I don't quite understand this observation in the context of this patch.
> What the patch does is retry the command (i.e. ignore the UA) which
> should pretty much be our default response.
>

Observation in this context is based on the move here to make UA handling generic.

> > Example: A thin provisioning threshold exceed check condition. The
> > device handler infrastructure can be a savior with such hacks..
>
> I'm going to regret asking this (especially given all the noise there's
> been on thin provisioning thresholds): Which arrays don't actually
> issue them correctly?
>


Its not a question of correct vs incorrect. The implementation detail of how a thin provisioning threshold UA should be handled is outside the scope of T10 and so there are different architectures which deal with this situation differently. Viewing them all with the same prism may cause issues.

The spec only says it should be retried.. and no more. What administrative actions need to be taken based on the event is outside the scope of T10...

It is more of a flexibility to handle the retries based on the array architecture..


Shyam

> James
>


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


All times are GMT. The time now is 08:38 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.