[dm-devel] Bug in dm-stripe.c driver
Hello everyone,
* I believe that I’ve found a bug in the dm-stripe.c driver. I’ve been working on getting event alerting put into the driver so that it matches what is found in the dm-raid1.c driver. During my implementation efforts I ran into a problem where the kernel would oops out when I tried to call the queue_work() function. After putting some printk’s in the driver to monitor the pointer addresses of the stripe_c structure members I finally think I’ve got an answer. I found that in its declaration the struct stripe stripe[0] is only accounting for a stripe with one drive present; in my case I have 2 drives in the stripe volume. Here is the stripe struct (and as you can see it only allows for 1 drive): * struct stripe_c { ****** uint32_t stripes; * ****** /* The size of this target / num. stripes */ ****** sector_t stripe_width; * ****** /* stripe chunk size */ ****** uint32_t chunk_shift; ****** sector_t chunk_mask; * /***** Here’s the problem spot *****/ ****** struct stripe stripe[0]; ****** /***** New member elements below problem declaration *****/ ****** /* Needed for handling events */ ****** struct dm_target *ti; ****** ****** /* New work_queue setup for triggering events*/ ****** struct work_struct kstriped_ws; * }; * So, by adding the new declarations of member elements after that stripe[0] declaration there is the possibility of overwriting memory addresses when the processing code for the get_stripe() function is ran. Here’s a snippet of the printk code I used to figure this out and its resulting syslog data: * /* ** Construct a striped mapping. ** <number of stripes> <chunk size (2^^n)> [<dev_path> <offset>]+ **/ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv) { . . . ****** /* ****** ** Get the stripe destinations. ****** **/ ****** for (i = 0; i < stripes; i++) { ************* argv += 2; printk("EVENT: 7:%d Address of sc->kstriped_ws.entry=%p, next=%p, prev=%p ", i, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev); ************* r = get_stripe(ti, sc, i, argv); printk("EVENT: 8:%d Address of sc->kstriped_ws.entry=%p, next=%p, prev=%p ", i, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev); ************* if (r < 0) { ******************** ti->error = "Couldn't parse stripe destination"; ******************** while (i--) ************************* *** dm_put_device(ti, sc->stripe[i].dev); ******************** kfree(sc); ******************** return r; ************* } ****** } . . . } ****** /* ** Parse a single <dev> <sector> pair **/ static int get_stripe(struct dm_target *ti, struct stripe_c *sc, ************* ***** unsigned int stripe, char **argv) { ****** unsigned long long start; * ****** if (sscanf(argv[1], "%llu", &start) != 1) ************* return -EINVAL; * printk("EVENT: 7:%d%d.1 Address of sc->kstriped_ws.entry=%p, next=%p, prev=%p ", stripe, stripe, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev); printk("********** argv[0]=%s, stripe=%d, sc->stripe_width=%u ", argv[0], stripe, (uint32_t)sc->stripe_width);***** ****** if (dm_get_device(ti, argv[0], start, sc->stripe_width, ******************** * dm_table_get_mode(ti->table), ******************** * &sc->stripe[stripe].dev)) ************* return -ENXIO; printk("EVENT: 7:%d%d.2 Address of sc->kstriped_ws.entry=%p, next=%p, prev=%p ",stripe, stripe, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev); * ****** sc->stripe[stripe].physical_start = start; printk("EVENT: 7:%d%d.3 Address of sc->kstriped_ws.entry=%p, next=%p, prev=%p ",stripe, stripe, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev); * ****** return 0; } * * * Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:0 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.1 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel:** *********argv[0]=/dev/sdb, stripe=0, sc->stripe_width=488391936 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.2 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.3 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 8:0 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:1 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.1 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel:*********** argv[0]=/dev/sdc, stripe=1, sc->stripe_width=488391936 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.2 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0, prev=ffff81001d8ca5b0 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.3 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0, prev=0000000000000000 Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 8:1 Address of sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0, prev=0000000000000000 * As you can see from the output when it hits the second drive of the stripe it is overwriting the memory addresses for the work_struct list_head “prev” pointer; leading to my oops condition. * To verify this I decided to up the number in stripe[] to match the number of drives present: struct stripe stripe[1]; * After this change the memory corruption is gone: * Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:0 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.1 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel:*********** argv[0]=/dev/sdb, stripe=0, sc->stripe_width=488391936 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.2 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.3 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 8:0 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:1 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.1 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel:*********** argv[0]=/dev/sdc, stripe=1, sc->stripe_width=488391936 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.2 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.3 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 8:1 Address of sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040, prev=ffff81001f4d8040 * * My question now is how should this be patched? Do I just use some arbitrary number in the stripe[] declaration? Say something like 6 (since that is the maximum number of on-board SATA ports for an Intel based controller at the moment). (Probably not, since this is making it platform specific.) How about maybe changing this to an array-of-pointers to a “struct stripe” and setting this to something like “struct stripe *stripe[256]”? This would only be wasting memory for the unused pointers which is comparatively small. * I look forward to hearing everyone’s ideas on how this should be solved, thank you. * * Brian Wood Software Engineer Intel Corp., Manageability & Platform Software Division brian.j.wood@intel.com * -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel |
[dm-devel] Bug in dm-stripe.c driver
On Wed, Nov 21, 2007 at 11:15:52AM -0800, Wood, Brian J wrote:
> /***** Here's the problem spot *****/ > struct stripe stripe[0]; That one must remain the last element in the struct. Add any new fields above it. Alasdair -- agk@redhat.com -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel |
[dm-devel] Bug in dm-stripe.c driver
On Wed, Nov 21, 2007 at 07:39:32PM +0000, Alasdair G Kergon wrote:
> On Wed, Nov 21, 2007 at 11:15:52AM -0800, Wood, Brian J wrote: > > /***** Here's the problem spot *****/ > > struct stripe stripe[0]; > > That one must remain the last element in the struct. > Add any new fields above it. Here's how the real size is calculated before it's allocated: static inline struct stripe_c *alloc_context(unsigned int stripes) { size_t len; if (array_too_big(sizeof(struct stripe_c), sizeof(struct stripe), stripes)) return NULL; len = sizeof(struct stripe_c) + (sizeof(struct stripe) * stripes); return kmalloc(len, GFP_KERNEL); } Alasdair -- agk@redhat.com -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel |
[dm-devel] Bug in dm-stripe.c driver
>-----Original Message-----
>From: Alasdair G Kergon [mailto:agk@redhat.com] >Sent: Wednesday, November 21, 2007 11:42 AM >To: Wood, Brian J >Cc: device-mapper development; Ciechanowski, Ed; Healey, Douglas D >Subject: Re: [dm-devel] Re: Bug in dm-stripe.c driver > >On Wed, Nov 21, 2007 at 07:39:32PM +0000, Alasdair G Kergon wrote: >> On Wed, Nov 21, 2007 at 11:15:52AM -0800, Wood, Brian J wrote: >> > /***** Here's the problem spot *****/ >> > struct stripe stripe[0]; >> >> That one must remain the last element in the struct. >> Add any new fields above it. > >Here's how the real size is calculated before it's allocated: > >static inline struct stripe_c *alloc_context(unsigned int stripes) >{ > size_t len; > > if (array_too_big(sizeof(struct stripe_c), sizeof(struct stripe), > stripes)) > return NULL; > > len = sizeof(struct stripe_c) + (sizeof(struct stripe) * stripes); > > return kmalloc(len, GFP_KERNEL); >} Ok, I did see that call to alloc_context() and it looked like the kmalloc call was declaring enough memory to house all the drives in the stripe (so I wasn't worried that it would overwrite anything outside the driver's memory space). When I did my testing I noticed that if I put my declarations above this line it worked correctly, I just thought it might have been a logic error. Would it be ok to add a comment above (or below) the line "struct stripe stripe[0];" just to prevent developers down the road from seeing this strange behavior if they add a field? Thanks > > >Alasdair >-- >agk@redhat.com Brian Wood Software Engineer Intel Corp., Manageability & Platform Software Division brian.j.wood@intel.com -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel |
| All times are GMT. The time now is 02:47 AM. |
VBulletin, Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.