Linux Archive > Redhat > Device-mapper Development
Old 03-29-2011, 07:13 PM
Shyam Iyer
 
Default Preliminary Agenda and Activities for LSF

> -----Original Message-----
> From: Vivek Goyal [mailto:vgoyal@redhat.com]
> Sent: Tuesday, March 29, 2011 2:45 PM
> To: Iyer, Shyam
> Cc: rwheeler@redhat.com; James.Bottomley@hansenpartnership.com;
> lsf@lists.linux-foundation.org; linux-fsdevel@vger.kernel.org;
> dm-devel@redhat.com; linux-scsi@vger.kernel.org
> Subject: Re: [Lsf] Preliminary Agenda and Activities for LSF
>
> On Tue, Mar 29, 2011 at 11:10:18AM -0700, Shyam_Iyer@Dell.com wrote:
> >
> > > -----Original Message-----
> > > From: Vivek Goyal [mailto:vgoyal@redhat.com]
> > > Sent: Tuesday, March 29, 2011 1:34 PM
> > > To: Iyer, Shyam
> > > Cc: rwheeler@redhat.com; James.Bottomley@hansenpartnership.com;
> > > lsf@lists.linux-foundation.org; linux-fsdevel@vger.kernel.org;
> > > dm-devel@redhat.com; linux-scsi@vger.kernel.org
> > > Subject: Re: [Lsf] Preliminary Agenda and Activities for LSF
> > >
> > > On Tue, Mar 29, 2011 at 10:20:57AM -0700, Shyam_Iyer@dell.com wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: linux-scsi-owner@vger.kernel.org
> > > > > [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Ric Wheeler
> > > > > Sent: Tuesday, March 29, 2011 7:17 AM
> > > > > To: James Bottomley
> > > > > Cc: lsf@lists.linux-foundation.org; linux-fsdevel;
> > > > > linux-scsi@vger.kernel.org; device-mapper development
> > > > > Subject: Re: [Lsf] Preliminary Agenda and Activities for LSF
> > > > >
> > > > > On 03/29/2011 12:36 AM, James Bottomley wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > Since LSF is less than a week away, the programme committee put
> > > > > > together a just-in-time preliminary agenda for LSF. As you can
> > > > > > see there is still plenty of empty space, which you can make
> > > > > > suggestions (to this list with appropriate general list cc's)
> > > > > > for filling:
> > > > > >
> > > > > > https://spreadsheets.google.com/pub?hl=en&hl=en&key=0AiQMl7GcVa7OdFdNQzM5UDRXUnVEbHlYVmZUVHQ2amc&output=html
> > > > > >
> > > > > > If you don't make suggestions, the programme committee will feel
> > > > > > empowered to make arbitrary assignments based on your topic and
> > > > > > attendee email requests ...
> > > > > >
> > > > > > We're still not quite sure what rooms we will have at the Kabuki,
> > > > > > but we'll add them to the spreadsheet when we know (they should
> > > > > > be close to each other).
> > > > > >
> > > > > > The spreadsheet above also gives contact information for all the
> > > > > > attendees and the programme committee.
> > > > > >
> > > > > > Yours,
> > > > > >
> > > > > > James Bottomley
> > > > > > on behalf of LSF/MM Programme Committee
> > > > > >
> > > > >
> > > > > Here are a few topic ideas:
> > > > >
> > > > > (1) The first topic that might span IO & FS tracks (or just pull
> > > > > in device mapper people to an FS track) could be adding new
> > > > > commands that would allow users to grow/shrink/etc file systems
> > > > > in a generic way. The thought I had was that we have a reasonable
> > > > > model that we could reuse for these new commands, like mount and
> > > > > mount.fs or fsck and fsck.fs. With btrfs coming down the road, it
> > > > > could be nice to identify exactly what common operations users
> > > > > want to do and agree on how to implement them. Alasdair pointed
> > > > > out in the upstream thread that we had a prototype here in fsadm.
> > > > >
> > > > > (2) Very high speed, low latency SSD devices and testing. Have we
> > > > > settled on the need for these devices to all have block level
> > > > > drivers? For S-ATA or SAS devices, are there known performance
> > > > > issues that require enhancements somewhere in the stack?
> > > > >
> > > > > (3) The union mount versus overlayfs debate - pros and cons. What
> > > > > each does well, what needs doing. Do we want/need both upstream?
> > > > > (Maybe this can get 10 minutes in Al's VFS session?)
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Ric
> > > >
> > > > A few others that I think may span the I/O, block, and fs layers:
> > > >
> > > > 1) Dm-thinp target vs. file system thin profile vs. block-map-based
> > > > thin/trim profile. Facilitate I/O throttling for thin/trimmable
> > > > storage, with online and offline profiles.
> > >
> > > Is the above any different from the block IO throttling we have got
> > > for block devices?
> > >
> > Yes.. so the throttling would be capacity based.. when the storage
> > array wants us to throttle the I/O. Depending on the event we may keep
> > getting space-allocation write-protect check conditions for writes
> > until a user intervenes to stop I/O.
>
> Sounds like some user space daemon listening for these events and then
> modifying cgroup throttling limits dynamically?

But we have dm-targets on the horizon, like dm-thinp, setting soft limits on capacity.. we could extend the concept to H/W-imposed soft/hard limits.

The user space could throttle the I/O, but it would have to go about finding all processes running I/O on the LUN.. In some cases it could be an I/O process running within a VM..

That would require a passthrough interface to inform it.. I doubt we would be able to accomplish that any time soon with the multiple operating systems involved. Or we could require each application to register with the userland process. Doable, but cumbersome and buggy..

The dm-thinp target can help in this scenario by setting a blanket storage limit. We could then extend the limit dynamically based on hints/commands from the userland daemon listening for such events. (A sketch follows below.)

This approach will probably not take care of scenarios where the VM storage is over, say, NFS or a clustered filesystem..
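
For illustration, a rough sketch of that blanket limit using the thin-pool table syntax being proposed for the dm-thinp target (the target is not upstream yet; device names, sizes and the low-water-mark below are made up):

  # <start> <len> thin-pool <metadata dev> <data dev> <block size> <low water mark>
  dmsetup create pool --table "0 20971520 thin-pool /dev/vg/meta /dev/vg/data 128 32768"
  # create thin device id 0 in the pool, then map a 1GiB volume onto it;
  # a userland daemon could later grow the data device and reload the table
  dmsetup message /dev/mapper/pool 0 "create_thin 0"
  dmsetup create thin0 --table "0 2097152 thin /dev/mapper/pool 0"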
>
> > > > 2) Interfaces for SCSI, Ethernet/*transport configuration
> > > > parameters floating around in sysfs, procfs. Architecting
> > > > guidelines for accepting patches for hybrid devices.
> > > > 3) DM snapshots vs. FS snapshots vs. H/W snapshots. There is room
> > > > for all, and they have to help each other.
> >
> > For instance, if you took a DM snapshot and the storage sent a check
> > condition to the original dm device, I am not sure the DM snapshot
> > would get one too..
> >
> > If you took a H/W snapshot of an entire pool and then decided to
> > delete the individual DM snapshots, the H/W snapshot would be
> > inconsistent.
> >
> > The blocks being managed by a DM device would have moved (SCSI
> > referrals). I believe Hannes is working on the referrals piece..
> >
> > > > 4) B/W control - VM->DM->Block->Ethernet->Switch->Storage. Pick
> > > > your subsystem and there are many non-cooperating B/W control
> > > > constructs in each subsystem.
> > >
> > > Above is pretty generic. Do you have specific needs/ideas/concerns?
> > >
> > > Thanks
> > > Vivek
> >
> > Yes.. if I limited my Ethernet b/w to 40% I don't need to limit I/O
> > b/w via cgroups. Such bandwidth manipulations are network-switch
> > driven, and cgroups never take care of these events from the Ethernet
> > driver.
>
> So if IO is going over the network and actual bandwidth control is
> taking place by throttling ethernet traffic, then one does not have to
> specify a block cgroup throttling policy, and hence there is no need
> for cgroups to be worried about ethernet driver events?
>
> I think I am missing something here.
>
> Vivek
Well.. here is the catch.. an example scenario:

- Two iSCSI I/O sessions emanating from Ethernet ports eth0 and eth1 are multipathed together. Let us say with a round-robin policy.

- The cgroup profile limits I/O bandwidth to 40% of the multipathed I/O bandwidth. But the switch may have limited the I/O bandwidth to 40% for the vlan associated with one of the eth interfaces, say eth1.

The computation that the configured bandwidth is 40% of the available bandwidth is false in this case. What we need to do is possibly push more I/O through eth0, as the switch allows it to run at 100% of its bandwidth.

Now this is a dynamic decision, and the multipathing layer should take care of it.. but it would need a hint..

Policies are usually decided at different levels; SLAs and sometimes logistics determine these decisions. Sometimes the bandwidth lowering by the switch is traffic dependent, but the user-level policies remain intact. A typical case of the network administrator not talking to the system administrator.
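
To put hypothetical numbers on this: with two 1Gb/s ports the nominal aggregate is 2Gb/s, so the 40% cgroup cap aims at 800Mb/s. If the switch then caps eth1's vlan at 40%, what is actually available is 1Gb/s (eth0) + 400Mb/s (eth1) = 1.4Gb/s, so the configured cap is really ~57% of the deliverable bandwidth, and a 1:1 round-robin spread saturates eth1 while leaving eth0 underused. Weighting I/O toward eth0 (roughly 5:2 in this example) would restore the intended split.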

-Shyam



 
Old 03-29-2011, 07:47 PM
"Nicholas A. Bellinger"
 
Default Preliminary Agenda and Activities for LSF

On Tue, 2011-03-29 at 07:16 -0400, Ric Wheeler wrote:
> On 03/29/2011 12:36 AM, James Bottomley wrote:
> > Hi All,
> >
> > Since LSF is less than a week away, the programme committee put together
> > a just in time preliminary agenda for LSF. As you can see there is
> > still plenty of empty space, which you can make suggestions (to this
> > list with appropriate general list cc's) for filling:
> >
> > https://spreadsheets.google.com/pub?hl=en&hl=en&key=0AiQMl7GcVa7OdFdNQzM5UDRXUnVEbHlYVmZUVHQ2amc&output=html
> >
> > If you don't make suggestions, the programme committee will feel
> > empowered to make arbitrary assignments based on your topic and attendee
> > email requests ...
> >
> > We're still not quite sure what rooms we will have at the Kabuki, but
> > we'll add them to the spreadsheet when we know (they should be close to
> > each other).
> >
> > The spreadsheet above also gives contact information for all the
> > attendees and the programme committee.
> >
> > Yours,
> >
> > James Bottomley
> > on behalf of LSF/MM Programme Committee
> >
>
> Here are a few topic ideas:
>
> (1) The first topic that might span IO & FS tracks (or just pull in device
> mapper people to an FS track) could be adding new commands that would allow
> users to grow/shrink/etc file systems in a generic way. The thought I had was
> that we have a reasonable model that we could reuse for these new commands like
> mount and mount.fs or fsck and fsck.fs. With btrfs coming down the road, it
> could be nice to identify exactly what common operations users want to do and
> agree on how to implement them. Alasdair pointed out in the upstream thread that
> we had a prototype here in fsadm.
>
> (2) Very high speed, low latency SSD devices and testing. Have we settled on the
> need for these devices to all have block level drivers? For S-ATA or SAS
> devices, are there known performance issues that require enhancements
> somewhere in the stack?
>
> (3) The union mount versus overlayfs debate - pros and cons. What each does well,
> what needs doing. Do we want/need both upstream? (Maybe this can get 10 minutes
> in Al's VFS session?)
>

Hi Ric, James and LSF-PC chairs,

Beyond my original LSF topic proposal for the next-generation QEMU/KVM
Virtio-SCSI target driver here:

http://marc.info/?l=linux-scsi&m=129706545408966&w=2

The following target-mode-related topics would be useful for the current
attendees with an interest in the drivers/target/ code, if there is extra
room available within the IO/storage track.

(4) Enabling mixed target/initiator mode in existing mainline SCSI LLDs
that support HW target mode, and coming to a consensus on how best to make
the SCSI LLD / target fabric driver split when enabling mainline target
infrastructure support in existing SCSI LLDs. This code is currently in
flight for qla2xxx / tcm_qla2xxx for .40. (Hannes, Christoph, Mike, QLogic
and other LLD maintainers)

(5) Driving target configfs group creation from kernel space via a
userspace passthrough, using some form of portable / acceptable mainline
interface. This topic has been raised on the scsi list for the ibmvscsis
target driver for .40, and is going to be useful for other in-flight HW
target drivers as well. (Tomo-san, Hannes, Mike, James, Joel)
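
For context, a rough sketch of the userspace configfs flow that such
kernel-driven group creation would mirror (the backstore path, fabric name
and WWPN below are illustrative):

  # backstore: create an iblock object and point it at a block device
  mkdir -p /sys/kernel/config/target/core/iblock_0/my_dev
  echo "udev_path=/dev/sdb" > /sys/kernel/config/target/core/iblock_0/my_dev/control
  echo 1 > /sys/kernel/config/target/core/iblock_0/my_dev/enable
  # fabric: create a target portal group under a qla2xxx port's WWPN
  mkdir -p /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:31:4c:48/tpgt_1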

Thank you!

--nab

 
Old 03-29-2011, 07:57 PM
Vivek Goyal
 
Default Preliminary Agenda and Activities for LSF

On Tue, Mar 29, 2011 at 12:13:41PM -0700, Shyam_Iyer@Dell.com wrote:

[..]
> >
> > Sounds like some user space daemon listening for these events and then
> > modifying cgroup throttling limits dynamically?
>
> But we have dm-targets on the horizon, like dm-thinp, setting soft limits on capacity.. we could extend the concept to H/W-imposed soft/hard limits.
>
> The user space could throttle the I/O, but it would have to go about finding all processes running I/O on the LUN.. In some cases it could be an I/O process running within a VM..

Well, if there is only one cgroup (the root cgroup), then the daemon does
not have to find anything. That is one global space, and there is provision
to set per-device limits. So the daemon can just go and adjust the device
limits dynamically, and that applies to all processes.

The problem will happen if more cgroups are created and limits are per
cgroup, per device (for creating service differentiation). I would say in
that case the daemon needs to be more sophisticated and reduce the limit in
each group by the same percentage as required by the thinly provisioned target.

That way a higher-rate group will still get a higher IO rate on a thinly
provisioned device which is imposing its own throttling. Otherwise we
again run into issues where there is no service differentiation between
faster groups and slower groups.

IOW, if we are throttling thinly provisioned devices, I think throttling
them using a user space daemon might be better, as it will reuse the
kernel throttling infrastructure and the throttling will be cgroup
aware.
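
As a minimal sketch (assuming the blkio controller is mounted at
/sys/fs/cgroup/blkio and the thin LUN is device 8:16; both are
illustrative), such a daemon would only need to rewrite a per-device limit:

  # cap writes to the thin LUN at 10MB/s for this cgroup; the daemon
  # recomputes and rewrites the value on each array back-pressure event
  echo "8:16 10485760" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device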

>
> That would require a passthrough interface to inform it.. I doubt we would be able to accomplish that any time soon with the multiple operating systems involved. Or we could require each application to register with the userland process. Doable, but cumbersome and buggy..
>
> The dm-thinp target can help in this scenario by setting a blanket storage limit. We could then extend the limit dynamically based on hints/commands from the userland daemon listening for such events.
>
> This approach will probably not take care of scenarios where the VM storage is over, say, NFS or a clustered filesystem..

Even current blkio throttling does not work over NFS. This is one of the
issues I wanted to discuss at LSF.

[..]
> Well.. here is the catch.. an example scenario:
>
> - Two iSCSI I/O sessions emanating from Ethernet ports eth0 and eth1 are multipathed together. Let us say with a round-robin policy.
>
> - The cgroup profile limits I/O bandwidth to 40% of the multipathed I/O bandwidth. But the switch may have limited the I/O bandwidth to 40% for the vlan associated with one of the eth interfaces, say eth1.
>
> The computation that the configured bandwidth is 40% of the available bandwidth is false in this case. What we need to do is possibly push more I/O through eth0, as the switch allows it to run at 100% of its bandwidth.
>
> Now this is a dynamic decision, and the multipathing layer should take care of it.. but it would need a hint..
>

So we have multipathed two paths in a round-robin manner, and one path is
faster while the other is slower. I am not sure what multipath does in
those scenarios, but trying to send more IO down the faster path sounds
like the right thing to do.

Thanks
Vivek

 
Old 03-29-2011, 07:59 PM
Mike Snitzer
 
Default Preliminary Agenda and Activities for LSF

On Tue, Mar 29 2011 at 3:13pm -0400,
Shyam_Iyer@dell.com <Shyam_Iyer@dell.com> wrote:

> > > > Above is pretty generic. Do you have specific needs/ideas/concerns?
> > > >
> > > > Thanks
> > > > Vivek
> > >
> > > Yes.. if I limited my Ethernet b/w to 40% I don't need to limit I/O
> > > b/w via cgroups. Such bandwidth manipulations are network-switch
> > > driven, and cgroups never take care of these events from the Ethernet
> > > driver.
> >
> > So if IO is going over the network and actual bandwidth control is
> > taking place by throttling ethernet traffic, then one does not have to
> > specify a block cgroup throttling policy, and hence there is no need
> > for cgroups to be worried about ethernet driver events?
> >
> > I think I am missing something here.
> >
> > Vivek
>
> Well.. here is the catch.. an example scenario:
>
> - Two iSCSI I/O sessions emanating from Ethernet ports eth0 and eth1 are
> multipathed together. Let us say with a round-robin policy.
>
> - The cgroup profile limits I/O bandwidth to 40% of the multipathed I/O
> bandwidth. But the switch may have limited the I/O bandwidth to 40% for
> the vlan associated with one of the eth interfaces, say eth1.
>
> The computation that the configured bandwidth is 40% of the available
> bandwidth is false in this case. What we need to do is possibly push more
> I/O through eth0, as the switch allows it to run at 100% of its bandwidth.
>
> Now this is a dynamic decision, and the multipathing layer should take
> care of it.. but it would need a hint..

No hint should be needed. Just use one of the newer multipath path
selectors that are dynamic by design: "queue-length" or "service-time".

This scenario is exactly what those path selectors are meant to address.
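
For reference, a sketch of the one-line multipath.conf change that picks
such a selector (the stanza placement is illustrative; per-device sections
work too):

  defaults {
          # "queue-length 0" weighs paths by outstanding I/O count;
          # "service-time 0" also factors in each path's throughput
          path_selector "service-time 0"
  }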

Mike

 
Old 03-29-2011, 08:12 PM
Shyam Iyer
 
Default Preliminary Agenda and Activities for LSF

> -----Original Message-----
> From: Mike Snitzer [mailto:snitzer@redhat.com]
> Sent: Tuesday, March 29, 2011 4:00 PM
> To: Iyer, Shyam
> Cc: vgoyal@redhat.com; lsf@lists.linux-foundation.org;
> linux-scsi@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> rwheeler@redhat.com; device-mapper development
> Subject: Re: Preliminary Agenda and Activities for LSF
>
> On Tue, Mar 29 2011 at 3:13pm -0400,
> Shyam_Iyer@dell.com <Shyam_Iyer@dell.com> wrote:
>
> > > > > Above is pretty generic. Do you have specific needs/ideas/concerns?
> > > > >
> > > > > Thanks
> > > > > Vivek
> > > >
> > > > Yes.. if I limited my Ethernet b/w to 40% I don't need to limit I/O
> > > > b/w via cgroups. Such bandwidth manipulations are network-switch
> > > > driven, and cgroups never take care of these events from the
> > > > Ethernet driver.
> > >
> > > So if IO is going over the network and actual bandwidth control is
> > > taking place by throttling ethernet traffic, then one does not have
> > > to specify a block cgroup throttling policy, and hence there is no
> > > need for cgroups to be worried about ethernet driver events?
> > >
> > > I think I am missing something here.
> > >
> > > Vivek
> >
> > Well.. here is the catch.. an example scenario:
> >
> > - Two iSCSI I/O sessions emanating from Ethernet ports eth0 and eth1
> > are multipathed together. Let us say with a round-robin policy.
> >
> > - The cgroup profile limits I/O bandwidth to 40% of the multipathed
> > I/O bandwidth. But the switch may have limited the I/O bandwidth to
> > 40% for the vlan associated with one of the eth interfaces, say eth1.
> >
> > The computation that the configured bandwidth is 40% of the available
> > bandwidth is false in this case. What we need to do is possibly push
> > more I/O through eth0, as the switch allows it to run at 100% of its
> > bandwidth.
> >
> > Now this is a dynamic decision, and the multipathing layer should take
> > care of it.. but it would need a hint..
>
> No hint should be needed. Just use one of the newer multipath path
> selectors that are dynamic by design: "queue-length" or "service-time".
>
> This scenario is exactly what those path selectors are meant to address.
>
> Mike

Since iSCSI multipaths are essentially sessions, one could configure more than one session through the same ethX interface. The sessions need not be going to the same LUN, and hence are not governed by the same multipath selector, but the bandwidth policy group would be for a group of resources.
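
A sketch of that with open-iscsi (target names and portals are made up):
two sessions bound to the same interface but backed by unrelated LUNs:

  # bind an iface definition to eth0, then log in two unrelated targets over it
  iscsiadm -m iface -I iface0 --op=new
  iscsiadm -m iface -I iface0 --op=update -n iface.net_ifacename -v eth0
  iscsiadm -m node -T iqn.2011-03.com.example:lun-a -p 192.168.0.10 -I iface0 --login
  iscsiadm -m node -T iqn.2011-03.com.example:lun-b -p 192.168.0.11 -I iface0 --login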

-Shyam





 
Old 03-29-2011, 08:23 PM
Mike Snitzer
 
Default Preliminary Agenda and Activities for LSF

On Tue, Mar 29 2011 at 4:12pm -0400,
Shyam_Iyer@dell.com <Shyam_Iyer@dell.com> wrote:

>
> > -----Original Message-----
> > From: Mike Snitzer [mailto:snitzer@redhat.com]
> > Sent: Tuesday, March 29, 2011 4:00 PM
> > To: Iyer, Shyam
> > Cc: vgoyal@redhat.com; lsf@lists.linux-foundation.org;
> > linux-scsi@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> > rwheeler@redhat.com; device-mapper development
> > Subject: Re: Preliminary Agenda and Activities for LSF
> >
> > On Tue, Mar 29 2011 at 3:13pm -0400,
> > Shyam_Iyer@dell.com <Shyam_Iyer@dell.com> wrote:
> >
> > > > > > Above is pretty generic. Do you have specific
> > > > > > needs/ideas/concerns?
> > > > > >
> > > > > > Thanks
> > > > > > Vivek
> > > > >
> > > > > Yes.. if I limited my Ethernet b/w to 40% I don't need to limit
> > > > > I/O b/w via cgroups. Such bandwidth manipulations are
> > > > > network-switch driven, and cgroups never take care of these
> > > > > events from the Ethernet driver.
> > > >
> > > > So if IO is going over the network and actual bandwidth control is
> > > > taking place by throttling ethernet traffic, then one does not have
> > > > to specify a block cgroup throttling policy, and hence there is no
> > > > need for cgroups to be worried about ethernet driver events?
> > > >
> > > > I think I am missing something here.
> > > >
> > > > Vivek
> > >
> > > Well.. here is the catch.. an example scenario:
> > >
> > > - Two iSCSI I/O sessions emanating from Ethernet ports eth0 and eth1
> > > are multipathed together. Let us say with a round-robin policy.
> > >
> > > - The cgroup profile limits I/O bandwidth to 40% of the multipathed
> > > I/O bandwidth. But the switch may have limited the I/O bandwidth to
> > > 40% for the vlan associated with one of the eth interfaces, say eth1.
> > >
> > > The computation that the configured bandwidth is 40% of the available
> > > bandwidth is false in this case. What we need to do is possibly push
> > > more I/O through eth0, as the switch allows it to run at 100% of its
> > > bandwidth.
> > >
> > > Now this is a dynamic decision, and the multipathing layer should
> > > take care of it.. but it would need a hint..
> >
> > No hint should be needed. Just use one of the newer multipath path
> > selectors that are dynamic by design: "queue-length" or "service-time".
> >
> > This scenario is exactly what those path selectors are meant to address.
> >
> > Mike
>
> Since iSCSI multipaths are essentially sessions, one could configure
> more than one session through the same ethX interface. The sessions
> need not be going to the same LUN, and hence are not governed by the
> same multipath selector, but the bandwidth policy group would be for a
> group of resources.

Then the sessions don't correspond to the same backend LUN (and by
definition aren't part of the same mpath device). You're really all
over the map with your talking points.

I'm having a hard time following you.

Mike

 
Old 03-29-2011, 08:29 PM
Jan Kara
 
Default Preliminary Agenda and Activities for LSF

On Tue 29-03-11 07:16:32, Ric Wheeler wrote:
> On 03/29/2011 12:36 AM, James Bottomley wrote:
> (3) The union mount versus overlayfs debate - pros and cons. What each
> does well, what needs doing. Do we want/need both upstream? (Maybe this
> can get 10 minutes in Al's VFS session?)

It might be interesting, but neither Miklos nor Val seems to be attending,
so I'm not sure how deep a discussion we can have.

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

 
Old 03-29-2011, 08:31 PM
Ric Wheeler
 
Default Preliminary Agenda and Activities for LSF

On 03/29/2011 04:29 PM, Jan Kara wrote:
> On Tue 29-03-11 07:16:32, Ric Wheeler wrote:
> > On 03/29/2011 12:36 AM, James Bottomley wrote:
> > (3) The union mount versus overlayfs debate - pros and cons. What each
> > does well, what needs doing. Do we want/need both upstream? (Maybe this
> > can get 10 minutes in Al's VFS session?)
>
> It might be interesting, but neither Miklos nor Val seems to be attending,
> so I'm not sure how deep a discussion we can have.
>
> Honza


Very true - probably best to keep that discussion focused upstream (but that
seems to have quieted down as well)...


Ric

 
Old 03-29-2011, 11:09 PM
Shyam Iyer
 
Default Preliminary Agenda and Activities for LSF

> -----Original Message-----
> From: Mike Snitzer [mailto:snitzer@redhat.com]
> Sent: Tuesday, March 29, 2011 4:24 PM
> To: Iyer, Shyam
> Cc: linux-scsi@vger.kernel.org; lsf@lists.linux-foundation.org;
> linux-fsdevel@vger.kernel.org; rwheeler@redhat.com; vgoyal@redhat.com;
> device-mapper development
> Subject: Re: Preliminary Agenda and Activities for LSF
>
> On Tue, Mar 29 2011 at 4:12pm -0400,
> Shyam_Iyer@dell.com <Shyam_Iyer@dell.com> wrote:
>
> >
> > > -----Original Message-----
> > > From: Mike Snitzer [mailto:snitzer@redhat.com]
> > > Sent: Tuesday, March 29, 2011 4:00 PM
> > > To: Iyer, Shyam
> > > Cc: vgoyal@redhat.com; lsf@lists.linux-foundation.org;
> > > linux-scsi@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> > > rwheeler@redhat.com; device-mapper development
> > > Subject: Re: Preliminary Agenda and Activities for LSF
> > >
> > > On Tue, Mar 29 2011 at 3:13pm -0400,
> > > Shyam_Iyer@dell.com <Shyam_Iyer@dell.com> wrote:
> > >
> > > > > > > Above is pretty generic. Do you have specific
> > > > > > > needs/ideas/concerns?
> > > > > > >
> > > > > > > Thanks
> > > > > > > Vivek
> > > > > >
> > > > > > Yes.. if I limited my Ethernet b/w to 40% I don't need to limit
> > > > > > I/O b/w via cgroups. Such bandwidth manipulations are
> > > > > > network-switch driven, and cgroups never take care of these
> > > > > > events from the Ethernet driver.
> > > > >
> > > > > So if IO is going over the network and actual bandwidth control
> > > > > is taking place by throttling ethernet traffic, then one does not
> > > > > have to specify a block cgroup throttling policy, and hence there
> > > > > is no need for cgroups to be worried about ethernet driver
> > > > > events?
> > > > >
> > > > > I think I am missing something here.
> > > > >
> > > > > Vivek
> > > >
> > > > Well.. here is the catch.. an example scenario:
> > > >
> > > > - Two iSCSI I/O sessions emanating from Ethernet ports eth0 and
> > > > eth1 are multipathed together. Let us say with a round-robin
> > > > policy.
> > > >
> > > > - The cgroup profile limits I/O bandwidth to 40% of the multipathed
> > > > I/O bandwidth. But the switch may have limited the I/O bandwidth to
> > > > 40% for the vlan associated with one of the eth interfaces, say
> > > > eth1.
> > > >
> > > > The computation that the configured bandwidth is 40% of the
> > > > available bandwidth is false in this case. What we need to do is
> > > > possibly push more I/O through eth0, as the switch allows it to run
> > > > at 100% of its bandwidth.
> > > >
> > > > Now this is a dynamic decision, and the multipathing layer should
> > > > take care of it.. but it would need a hint..
> > >
> > > No hint should be needed. Just use one of the newer multipath path
> > > selectors that are dynamic by design: "queue-length" or
> > > "service-time".
> > >
> > > This scenario is exactly what those path selectors are meant to
> > > address.
> > >
> > > Mike
> >
> > Since iSCSI multipaths are essentially sessions, one could configure
> > more than one session through the same ethX interface. The sessions
> > need not be going to the same LUN, and hence are not governed by the
> > same multipath selector, but the bandwidth policy group would be for a
> > group of resources.
>
> Then the sessions don't correspond to the same backend LUN (and by
> definition aren't part of the same mpath device). You're really all
> over the map with your talking points.
>
> I'm having a hard time following you.
>
> Mike

Let me back up here.. this has to be thought of not only in the traditional Ethernet sense but also in a Data Centre Bridged (DCB) environment. I shouldn't have wandered into the multipath constructs..

The statement about not going to the same LUN was a little erroneous. I meant different /dev/sdXs.. and hence different block I/O queues.

Each I/O queue could be thought of as a bandwidth queue class being serviced through a corresponding network adapter queue (assuming a multiqueue-capable adapter).

Let us say /dev/sda (through eth0) and /dev/sdb (through eth1) form a cgroup bandwidth group with a weight of 20% of the I/O bandwidth; the user has configured this weight thinking it will correspond to, say, 200Mb of bandwidth.

Now let us say the network bandwidth on the corresponding network queues was reduced by the DCB-capable switch...
We still need an SLA of 200Mb of I/O bandwidth, but the underlying dynamics have changed.

In such a scenario the option is to move I/O to a different bandwidth-priority queue in the network adapter. This could mean moving I/O to a new network queue in eth0, or to another queue in eth1..

This requires mapping the block queue to the new network queue.

One way of solving this is what is getting into the open-iscsi world, i.e. creating a session tagged with the relevant DCB priority, so that the session gets mapped to the relevant tc queue, which ultimately maps to one of the network adapter's multiple queues..

But when multipath fails over to a different session path, the DCB bandwidth priority will not move with it..

One could argue that it is a user mistake to have configured the bandwidth priorities differently, but it may so happen that the bandwidth priority was just dynamically changed by the switch for that particular queue.

Although I gave the example of a DCB environment, we could definitely look at doing a 1:n mapping of block queues to network adapter queues for non-DCB environments too..
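
As a sketch of that non-DCB 1:n idea (queue counts and the priority map
are illustrative), the mqprio qdisc merged in 2.6.38 already exposes a
multiqueue adapter's tx queues as priority classes that block queues could
be steered into:

  # expose 8 tx queues as 4 traffic classes; skb priorities 0-3 select a class
  tc qdisc add dev eth0 root mqprio num_tc 4 \
      map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 \
      queues 2@0 2@2 2@4 2@6 hw 0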


-Shyam


 
Old 03-30-2011, 12:33 AM
Mingming Cao
 
Default Preliminary Agenda and Activities for LSF

On Tue, 2011-03-29 at 07:16 -0400, Ric Wheeler wrote:
> On 03/29/2011 12:36 AM, James Bottomley wrote:
> > Hi All,
> >
> > Since LSF is less than a week away, the programme committee put together
> > a just in time preliminary agenda for LSF. As you can see there is
> > still plenty of empty space, which you can make suggestions (to this
> > list with appropriate general list cc's) for filling:
> >
> > https://spreadsheets.google.com/pub?hl=en&hl=en&key=0AiQMl7GcVa7OdFdNQzM5UDRXUnVEbHlYVmZUVHQ2amc&output=html
> >
> > If you don't make suggestions, the programme committee will feel
> > empowered to make arbitrary assignments based on your topic and attendee
> > email requests ...
> >
> > We're still not quite sure what rooms we will have at the Kabuki, but
> > we'll add them to the spreadsheet when we know (they should be close to
> > each other).
> >
> > The spreadsheet above also gives contact information for all the
> > attendees and the programme committee.
> >
> > Yours,
> >
> > James Bottomley
> > on behalf of LSF/MM Programme Committee
> >
>
> Here are a few topic ideas:
>
> (1) The first topic that might span IO & FS tracks (or just pull in device
> mapper people to an FS track) could be adding new commands that would allow
> users to grow/shrink/etc file systems in a generic way. The thought I had was
> that we have a reasonable model that we could reuse for these new commands like
> mount and mount.fs or fsck and fsck.fs. With btrfs coming down the road, it
> could be nice to identify exactly what common operations users want to do and
> agree on how to implement them. Alasdair pointed out in the upstream thread that
> we had a prototype here in fsadm.
>
> (2) Very high speed, low latency SSD devices and testing. Have we settled on the
> need for these devices to all have block level drivers? For S-ATA or SAS
> devices, are there known performance issues that require enhancements
> somewhere in the stack?
>
> (3) The union mount versus overlayfs debate - pros and cons. What each does well,
> what needs doing. Do we want/need both upstream? (Maybe this can get 10 minutes
> in Al's VFS session?)
>

Ric,

May I propose some discussion about concurrent direct IO support for
ext4?

Direct IO writes are serialized by the single i_mutex lock. This lock
contention becomes significant when running a database or other
direct-IO-heavy workload in a guest, where the host passes a file image to
the guest as a block device. All the parallel IOs in the guests are then
serialized by the i_mutex lock on the host disk image file. This greatly
penalizes database application performance in KVM.

I am looking for some discussion about removing the i_mutex lock in the
direct IO write code path for ext4 when multiple threads direct-write to
different offsets of the same file. This would require some way to track
the in-flight DIO ranges, done either at the ext4 level or above, in the
VFS layer.
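
As an illustration of the contention (the image path and sizes are
hypothetical), a fio job like this - several threads direct-writing to
disjoint ranges of a single image file - still ends up serialized on that
one i_mutex:

  fio --name=dio-contention --filename=/var/lib/libvirt/images/guest.img \
      --rw=randwrite --direct=1 --bs=64k --numjobs=4 \
      --size=1g --offset_increment=1g --group_reporting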


Thanks,


 
