Linux Archive

Linux Archive (http://www.linux-archive.org/)
-   Device-mapper Development (http://www.linux-archive.org/device-mapper-development/)
-   -   training mpath to discern between SCSI errors (http://www.linux-archive.org/device-mapper-development/418274-training-mpath-discern-between-scsi-errors.html)

Mike Christie 08-25-2010 07:15 PM

training mpath to discern between SCSI errors
 
On 08/25/2010 10:59 AM, Mike Snitzer wrote:

On Wed, Aug 25 2010 at 4:00am -0400,
Kiyoshi Ueda<k-ueda@ct.jp.nec.com> wrote:


I'm not sure how to proceed here. How much work would
discerning between transport and IO errors take? If it can't be done
quickly enough the retry logic can be kept around to keep the old
behavior but that already was a broken behavior, so... :-(


I'm not sure how long will it take.


We first need to understand what direction we want to go with this. We
currently have 2 options. But any other ideas are obviously welcome.

1)
Mike Christie has a patchset that introduce more specific
target/transport/host error codes. Mike shared these pointers but he'd
have to put the work in to refresh them:
http://marc.info/?l=linux-scsi&m=112487427230642&w=2
http://marc.info/?l=linux-scsi&m=112487427306501&w=2
http://marc.info/?l=linux-scsi&m=112487431524436&w=2
http://marc.info/?l=linux-scsi&m=112487431524350&w=2

errno.h new EXYZ
http://marc.info/?l=linux-kernel&m=107715299008231&w=2

add block layer blkdev.h error values
http://marc.info/?l=linux-kernel&m=107961883915068&w=2

add block layer blkdev.h error values (v2 convert more drivers)
http://marc.info/?l=linux-scsi&m=112487427230642&w=2

I think that patchset's appoach is fairly disruptive just to be able to
train upper layers to differentiate (e.g. mpath). But in the end maybe
that change takes the code in a more desirable direction?


I think it is more disruptive, but is the cleaner approach in the end.

#2 looks hacky. In upper layers, we will have checks for dasd and other
AOE and other drivers. And then #2 does not even work for filesystems
(ext said they need this).






2)
Another option is Hannes' approach of having DM consume req->errors and
SCSI sense more directly.



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Hannes Reinecke 08-30-2010 11:38 AM

training mpath to discern between SCSI errors
 
Mike Snitzer wrote:
> On Wed, Aug 25 2010 at 4:00am -0400,
> Kiyoshi Ueda <k-ueda@ct.jp.nec.com> wrote:
>
>>> I'm not sure how to proceed here. How much work would
>>> discerning between transport and IO errors take? If it can't be done
>>> quickly enough the retry logic can be kept around to keep the old
>>> behavior but that already was a broken behavior, so... :-(
>> I'm not sure how long will it take.
>
> We first need to understand what direction we want to go with this. We
> currently have 2 options. But any other ideas are obviously welcome.
>
> 1)
> Mike Christie has a patchset that introduce more specific
> target/transport/host error codes. Mike shared these pointers but he'd
> have to put the work in to refresh them:
> http://marc.info/?l=linux-scsi&m=112487427230642&w=2
> http://marc.info/?l=linux-scsi&m=112487427306501&w=2
> http://marc.info/?l=linux-scsi&m=112487431524436&w=2
> http://marc.info/?l=linux-scsi&m=112487431524350&w=2
>
> errno.h new EXYZ
> http://marc.info/?l=linux-kernel&m=107715299008231&w=2
>
> add block layer blkdev.h error values
> http://marc.info/?l=linux-kernel&m=107961883915068&w=2
>
> add block layer blkdev.h error values (v2 convert more drivers)
> http://marc.info/?l=linux-scsi&m=112487427230642&w=2
>
> I think that patchset's appoach is fairly disruptive just to be able to
> train upper layers to differentiate (e.g. mpath). But in the end maybe
> that change takes the code in a more desirable direction?
>
> 2)
> Another option is Hannes' approach of having DM consume req->errors and
> SCSI sense more directly.
>

Actually, I think we have two separate issues here:
1) The need of having more detailed I/O errors even in the fs layer. This
we've already discussed at the LSF, consensus here is to allow other
errors than just 'EIO'.
Instead of Mike's approach I would rather use existing error codes here;
this will make the transition somewhat easier.
Initially I would propose to return 'ENOLINK' for a transport failure,
'EIO' for a non-retryable failure on the target, and 'ENODEV' for a
retryable failure on the target.

2) The need to differentiate the various error conditions on the multipath
layer. Multipath needs to distinguish the three error types as specified
in 1)

Mike has been trying to solve 1) and 2) by introducing separate/new error
codes, and I have been trying to use 2) by parsing the sense codes directly
from multipathing.

Given that the fs people have expressed their desire to know about these
error classes, too, it makes sense to have them exposed to the fs layer.

I see if I can come up with a patch.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Sergei Shtylyov 08-30-2010 12:07 PM

training mpath to discern between SCSI errors
 
Hello.

Hannes Reinecke wrote:


Actually, I think we have two separate issues here:
1) The need of having more detailed I/O errors even in the fs layer. This
we've already discussed at the LSF, consensus here is to allow other
errors than just 'EIO'.
Instead of Mike's approach I would rather use existing error codes here;
this will make the transition somewhat easier.
Initially I would propose to return 'ENOLINK' for a transport failure,
'EIO' for a non-retryable failure on the target, and 'ENODEV' for a
retryable failure on the target.


Are you sure it's not vice versa: EIO for retryable and ENODEV for
non-retryable failures. ENODEV looks more like permanent condition to me.


WBR, Sergei

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Hannes Reinecke 08-30-2010 12:39 PM

training mpath to discern between SCSI errors
 
Sergei Shtylyov wrote:
> Hello.
>
> Hannes Reinecke wrote:
>
>> Actually, I think we have two separate issues here:
>> 1) The need of having more detailed I/O errors even in the fs layer. This
>> we've already discussed at the LSF, consensus here is to allow other
>> errors than just 'EIO'.
>> Instead of Mike's approach I would rather use existing error codes
>> here;
>> this will make the transition somewhat easier.
>> Initially I would propose to return 'ENOLINK' for a transport failure,
>> 'EIO' for a non-retryable failure on the target, and 'ENODEV' for a
>> retryable failure on the target.
>
> Are you sure it's not vice versa: EIO for retryable and ENODEV for
> non-retryable failures. ENODEV looks more like permanent condition to me.
>
Ok, can do.
And looking a the error numbers again, maybe we should be using 'EREMOTEIO'
for non-retryable failures.

So we would be ending with:

ENOLINK: transport failure
EIO: retryable remote failure
EREMOTEIO: non-retryable remote failure

Does that look okay?

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Christophe Varoqui 08-30-2010 12:51 PM

training mpath to discern between SCSI errors
 
> So we would be ending with:
>
> ENOLINK: transport failure
> EIO: retryable remote failure
> EREMOTEIO: non-retryable remote failure
>
would scsi reservation conflict errors raise EREMOTEIO in this case ?

regards,
--
Christophe Varoqui <christophe.varoqui@opensvc.com>
OpenSVC

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Hannes Reinecke 08-30-2010 01:10 PM

training mpath to discern between SCSI errors
 
Christophe Varoqui wrote:
>> So we would be ending with:
>>
>> ENOLINK: transport failure
>> EIO: retryable remote failure
>> EREMOTEIO: non-retryable remote failure
>>
> would scsi reservation conflict errors raise EREMOTEIO in this case ?
>
Precisely.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Hannes Reinecke 08-30-2010 02:52 PM

training mpath to discern between SCSI errors
 
Hannes Reinecke wrote:
> Sergei Shtylyov wrote:
>> Hello.
>>
>> Hannes Reinecke wrote:
>>
>>> Actually, I think we have two separate issues here:
>>> 1) The need of having more detailed I/O errors even in the fs layer. This
>>> we've already discussed at the LSF, consensus here is to allow other
>>> errors than just 'EIO'.
>>> Instead of Mike's approach I would rather use existing error codes
>>> here;
>>> this will make the transition somewhat easier.
>>> Initially I would propose to return 'ENOLINK' for a transport failure,
>>> 'EIO' for a non-retryable failure on the target, and 'ENODEV' for a
>>> retryable failure on the target.
>> Are you sure it's not vice versa: EIO for retryable and ENODEV for
>> non-retryable failures. ENODEV looks more like permanent condition to me.
>>
> Ok, can do.
> And looking a the error numbers again, maybe we should be using 'EREMOTEIO'
> for non-retryable failures.
>
> So we would be ending with:
>
> ENOLINK: transport failure
> EIO: retryable remote failure
> EREMOTEIO: non-retryable remote failure
>
And here is the corresponding patch.
Compile tested only; just to give an idea of the possible implementation.

I have decided to pass the I/O failure information in-line:
- scsi_check_sense() might now return 'TARGET_ERROR' to signal
a permanent error
- scsi_decide_disposition() sets the driver byte of the result
field to 'DID_TARGET_FAILURE' if a return code of 'TARGET_ERROR'
is encountered.
- scsi_io_completion() sets the error to ENOLINK for DID_TRANSPORT_FAILFAST,
EREMOTEIO for DID_TARGET_FAILURE, and EIO for any other error. It also
resets DID_TARGET_FAILURE back to DID_OK once the error code is set.

I'm not 100% happy with this patch; DID_TARGET_FAILURE is really just
a communication vehicle to signal the permanent target failure.
I looked at passing this information directly via an explicit argument
to scsi_finish_command(), but this would include changing
scsi_io_completion(), too. As both of them are exported / public
interfaces I didn't like modifying them.

Another possibility would be to re-use / redefine the 'DRIVER_'
bits; they don't seem to be used a the moment. Eg 'DRIVER_HARD'
for permanent errors, DRIVER_SOFT for link failures.

Opinions welcome.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
>From f0835d92426cb3938f79f1b7a1e1208de63ca7bc Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Mon, 30 Aug 2010 16:21:10 +0200
Subject: [RFC][PATCH] scsi: Detailed I/O errors

Instead of just passing 'EIO' for any I/O errors we should be
notifying the upper layers with some more details about the cause
of this error.
This patch updates the possible I/O errors to:

- ENOLINK: Link failure between host and target
- EIO: Retryable I/O error
- EREMOTEIO: Non-retryable I/O error

'Retryable' in this context means that an I/O error _might_ be
restricted to the I_T_L nexus (vulgo: path), so retrying on another
nexus / path might succeed.

Signed-off-by: Hannes Reinecke <hare@suse.de>

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 487ecda..d49b375 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1270,7 +1270,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
if (!error && !clone->errors)
return 0; /* I/O complete */

- if (error == -EOPNOTSUPP)
+ if (error == -EOPNOTSUPP || error == -EREMOTEIO)
return error;

if (clone->cmd_flags & REQ_DISCARD)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ce089df..5da040b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -223,7 +223,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
* @scmd: Cmd to have sense checked.
*
* Return value:
- * SUCCESS or FAILED or NEEDS_RETRY
+ * SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR
*
* Notes:
* When a deferred error is detected the current command has
@@ -338,25 +338,25 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
case COPY_ABORTED:
case VOLUME_OVERFLOW:
case MISCOMPARE:
- return SUCCESS;
+ case DATA_PROTECT:
+ case BLANK_CHECK:
+ return TARGET_ERROR;

case MEDIUM_ERROR:
if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
sshdr.asc == 0x13 || /* AMNF DATA FIELD */
sshdr.asc == 0x14) { /* RECORD NOT FOUND */
- return SUCCESS;
+ return TARGET_ERROR;
}
return NEEDS_RETRY;

case HARDWARE_ERROR:
if (scmd->device->retry_hwerror)
return ADD_TO_MLQUEUE;
- else
- return SUCCESS;
-
+ else {
+ return TARGET_ERROR;
+ }
case ILLEGAL_REQUEST:
- case BLANK_CHECK:
- case DATA_PROTECT:
default:
return SUCCESS;
}
@@ -819,6 +819,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
case SUCCESS:
case NEEDS_RETRY:
case FAILED:
+ case TARGET_ERROR:
break;
case ADD_TO_MLQUEUE:
rtn = NEEDS_RETRY;
@@ -1512,6 +1513,12 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
rtn = scsi_check_sense(scmd);
if (rtn == NEEDS_RETRY)
goto maybe_retry;
+ if (rtn == TARGET_ERROR) {
+ /* Need to modify host byte to signal a
+ * permanent target failure */
+ scmd->result |= (DID_TARGET_FAILURE << 16);
+ rtn = SUCCESS;
+ }
/* if rtn == FAILED, we have no sense information;
* returning FAILED will wake the error handler thread
* to collect the sense and redo the decide
@@ -1529,6 +1536,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
case RESERVATION_CONFLICT:
SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, scmd->device,
"reservation conflict
"));
+ scmd->result |= (DID_TARGET_FAILURE << 16);
return SUCCESS; /* causes immediate i/o error */
default:
return FAILED;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9ade720..fb841e3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -736,8 +736,20 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
memcpy(req->sense, cmd->sense_buffer, len);
req->sense_len = len;
}
- if (!sense_deferred)
- error = -EIO;
+ if (!sense_deferred) {
+ switch(host_byte(result)) {
+ case DID_TRANSPORT_FAILFAST:
+ error = -ENOLINK;
+ break;
+ case DID_TARGET_FAILURE:
+ cmd->result |= (DID_OK << 16);
+ error = -EREMOTEIO;
+ break;
+ default:
+ error = -EIO;
+ break;
+ }
+ }
}

req->resid_len = scsi_get_resid(cmd);
@@ -796,7 +808,18 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
return;

- error = -EIO;
+ switch (host_byte(result)) {
+ case DID_TRANSPORT_FAILFAST:
+ error = -ENOLINK;
+ break;
+ case DID_TARGET_FAILURE:
+ cmd->result |= (DID_OK << 16);
+ error = -EREMOTEIO;
+ break;
+ default:
+ error = -EIO;
+ break;
+ }

if (host_byte(result) == DID_RESET) {
/* Third party bus reset or reset for error recovery
@@ -1418,7 +1441,6 @@ static void scsi_softirq_done(struct request *rq)
wait_for/HZ);
disposition = SUCCESS;
}
-
scsi_log_completion(cmd, disposition);

switch (disposition) {
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 8fcb6e0..abfee76 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -397,6 +397,8 @@ static inline int scsi_is_wlun(unsigned int lun)
* recover the link. Transport class will
* retry or fail IO */
#define DID_TRANSPORT_FAILFAST 0x0f /* Transport class fastfailed the io */
+#define DID_TARGET_FAILURE 0x10 /* Permanent target failure, do not retry on
+ * other paths */
#define DRIVER_OK 0x00 /* Driver status */

/*
@@ -426,6 +428,7 @@ static inline int scsi_is_wlun(unsigned int lun)
#define TIMEOUT_ERROR 0x2007
#define SCSI_RETURN_NOT_HANDLED 0x2008
#define FAST_IO_FAIL 0x2009
+#define TARGET_ERROR 0x200A

/*
* Midlevel queue return values.
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

"Jun'ichi Nomura" 10-18-2010 08:09 AM

training mpath to discern between SCSI errors
 
Hi Hannes,

Thank you for working on this issue and sorry for very late reply...

(08/30/10 23:52), Hannes Reinecke wrote:
> From: Hannes Reinecke <hare@suse.de>
> Date: Mon, 30 Aug 2010 16:21:10 +0200
> Subject: [RFC][PATCH] scsi: Detailed I/O errors
>
> Instead of just passing 'EIO' for any I/O errors we should be
> notifying the upper layers with some more details about the cause
> of this error.
> This patch updates the possible I/O errors to:
>
> - ENOLINK: Link failure between host and target
> - EIO: Retryable I/O error
> - EREMOTEIO: Non-retryable I/O error
>
> 'Retryable' in this context means that an I/O error _might_ be
> restricted to the I_T_L nexus (vulgo: path), so retrying on another
> nexus / path might succeed.

Does 'retryable' of EIO mean retryable in multipath layer?
If so, what is the difference between EIO and ENOLINK?

I've heard of a case where just retrying within path-group is
preferred to (relatively costly) switching group.
So, if EIO (or other error code) can be used to indicate such type
of errors, it's nice.


Also (although this might be a bit off topic from your patch),
can we expand such a distinction to what should be logged?
Currently, it's difficult to distinguish important SCSI/block errors
and less important ones in kernel log.
For example, when I get a link failure on sda, kernel prints something
like below, regardless of whether the I/O is recovered by multipathing or not:
end_request: I/O error, dev sda, sector XXXXX

Setting REQ_QUIET in dm-multipath could mask the message
but also other important ones in SCSI.


Thanks,
--
Jun'ichi Nomura, NEC Corporation

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Hannes Reinecke 10-18-2010 11:55 AM

training mpath to discern between SCSI errors
 
On 10/18/2010 10:09 AM, Jun'ichi Nomura wrote:
> Hi Hannes,
>
> Thank you for working on this issue and sorry for very late reply...
>
> (08/30/10 23:52), Hannes Reinecke wrote:
>> From: Hannes Reinecke <hare@suse.de>
>> Date: Mon, 30 Aug 2010 16:21:10 +0200
>> Subject: [RFC][PATCH] scsi: Detailed I/O errors
>>
>> Instead of just passing 'EIO' for any I/O errors we should be
>> notifying the upper layers with some more details about the cause
>> of this error.
>> This patch updates the possible I/O errors to:
>>
>> - ENOLINK: Link failure between host and target
>> - EIO: Retryable I/O error
>> - EREMOTEIO: Non-retryable I/O error
>>
>> 'Retryable' in this context means that an I/O error _might_ be
>> restricted to the I_T_L nexus (vulgo: path), so retrying on another
>> nexus / path might succeed.
>
> Does 'retryable' of EIO mean retryable in multipath layer?
> If so, what is the difference between EIO and ENOLINK?
>
Yes, EIO is intended for errors which should be retried at the
multipath layer. This does _not_ include transport errors, which are
signalled by ENOLINK.

Basically, ENOLINK is a transport error, and EIO just means
something is wrong and we weren't able to classify it properly.
If we were, it'd be either ENOLINK or EREMOTEIO.

> I've heard of a case where just retrying within path-group is
> preferred to (relatively costly) switching group.
> So, if EIO (or other error code) can be used to indicate such type
> of errors, it's nice.
>
Yes, that was one of the intention.

>
> Also (although this might be a bit off topic from your patch),
> can we expand such a distinction to what should be logged?
> Currently, it's difficult to distinguish important SCSI/block errors
> and less important ones in kernel log.
> For example, when I get a link failure on sda, kernel prints something
> like below, regardless of whether the I/O is recovered by multipathing or not:
> end_request: I/O error, dev sda, sector XXXXX
>
Indeed, when using the above we could be modifying the above
message, eg by

end_request: transport error, dev sda, sector XXXXX

or

end_request: target error, dev sda, sector XXXXX

which would improve the output noticeable.

> Setting REQ_QUIET in dm-multipath could mask the message
> but also other important ones in SCSI.
>
Hmm. Not sure about that, but I think the above modifications will
be useful already.

I'll be sending an updated patch.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

"Jun'ichi Nomura" 10-19-2010 04:03 AM

training mpath to discern between SCSI errors
 
Hi Hannes,

(10/18/10 20:55), Hannes Reinecke wrote:
>> Does 'retryable' of EIO mean retryable in multipath layer?
>> If so, what is the difference between EIO and ENOLINK?
>>
> Yes, EIO is intended for errors which should be retried at the
> multipath layer. This does _not_ include transport errors, which are
> signalled by ENOLINK.
>
> Basically, ENOLINK is a transport error, and EIO just means
> something is wrong and we weren't able to classify it properly.
> If we were, it'd be either ENOLINK or EREMOTEIO.
>
>> I've heard of a case where just retrying within path-group is
>> preferred to (relatively costly) switching group.
>> So, if EIO (or other error code) can be used to indicate such type
>> of errors, it's nice.
>>
> Yes, that was one of the intention.

Great to hear that.

And when it comes to retrying, the next problem is who controls it.
I don't think it's good to duplicate retry logic in multipath and
underlying device like SCSI (i.e. sd retries 5 times).
So perhaps we need a way to disable (or limit) retries in underlying
device at least.

>> Also (although this might be a bit off topic from your patch),
>> can we expand such a distinction to what should be logged?
>> Currently, it's difficult to distinguish important SCSI/block errors
>> and less important ones in kernel log.
>> For example, when I get a link failure on sda, kernel prints something
>> like below, regardless of whether the I/O is recovered by multipathing or not:
>> end_request: I/O error, dev sda, sector XXXXX
>>
> Indeed, when using the above we could be modifying the above
> message, eg by
>
> end_request: transport error, dev sda, sector XXXXX
>
> or
>
> end_request: target error, dev sda, sector XXXXX
>
> which would improve the output noticeable.

It improves but still they look like critical errors
even if multipath saves them.

When I see this:

end_request: target error, dev sda, sector XXXXX

I can't tell whether it's a real error visible to user space
or it's just recoverred by multipath retry/failover afterwards.


>> Setting REQ_QUIET in dm-multipath could mask the message
>> but also other important ones in SCSI.
>>
> Hmm. Not sure about that, but I think the above modifications will
> be useful already.
>
> I'll be sending an updated patch.

Thank you. I'm looking for that.

--
Jun'ichi Nomura, NEC Corporation

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


All times are GMT. The time now is 10:11 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.