10-11-2012, 02:51 AM
Kent Overstreet

aio: Refactor aio_read_evt, use cmxchg(), fix bug

On Wed, Oct 10, 2012 at 02:43:15PM -0700, Zach Brown wrote:
> > True. But that could be solved with a separate interface that either
> > doesn't use a context to submit a call synchronously, or uses an
> > implicit per thread context.
>
> Sure, but why bother if we can make the one submission interface fast
> enough to satisfy quick callers? Less is more, and all that.

Very true, if it's possible. I'm just still skeptical.

> > I don't have a _strong_ opinion there, but my intuition is that we
> > shouldn't be creating new types of handles without a good reason. I
> > don't think the annoyances are for the most part particular to file
> > descriptors, I think they tend to be applicable to handles in general and
> > at least with file descriptors they're known and solved.
>
> I strongly disagree. That descriptors are an expensive, limited
> resource is a perfectly good reason to not make them required to access
> the ring.

What's so special about aio vs. epoll, and now signalfd/eventfd/timerfd
etc.?

> > That would be awesome, though for it to be worthwhile there couldn't be
> > any kernel notion of a context at all and I'm not sure if that's
> > practical. But the idea hadn't occurred to me before and I'm sure you've
> > thought about it more than I have... hrm.
> >
> > Oh hey, that's what acall does :P
>
>
>
> > For completions though you really want the ringbuffer pinned... what do
> > you do about that?
>
> I don't think the kernel has to mandate that, no. The code has to deal
> with completions faulting, but they probably won't. In acall it
> happened that completions always came from threads that could block so
> its coping mechanism was to just use put_user().

Yeah, but that means the completion has to be delivered from process
context. That's not what aio does today, and it'd be a real performance
regression.

I don't know of a way around that myself.

> If userspace wants the rings locked, it can mlock() the memory.
>
> Think about it from another angle: the current mechanism of creating an
> aio ring is a way to allocate pinned memory outside of the usual mlock
> accounting. This could be abused, so aio grew an additional tunable to
> limit the number of total entries in rings in the system.
>
> By putting the ring in normal user memory we avoid that problem
> entirely.

No different from any other place the kernel allocates memory on behalf
of userspace... it needs a general solution, not a bunch of special case
solutions (though since the general solution is memcg you might argue
the cure is worse than the disease... :P)
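
The mlock() option from the quoted text can be sketched from the userspace side roughly as follows. This is a minimal illustration, not the kernel's aio ring: `demo_ring`, `demo_event`, `demo_ring_alloc()` and `demo_ring_free()` are hypothetical names, and the layout is made up. The ring lives in ordinary anonymous memory, and pinning is an opt-in step that is charged to the usual RLIMIT_MEMLOCK accounting; if mlock() fails, the sketch simply carries on unpinned.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical event and ring layout, for illustration only. */
struct demo_event {
    unsigned long long data;
    long long res;
};

struct demo_ring {
    unsigned head;
    unsigned tail;
    unsigned nr;
    struct demo_event events[];
};

/*
 * Allocate a ring of nr events in normal user memory. If want_pin is
 * nonzero, try to mlock() it; *pinned_out reports whether that worked.
 * Returns NULL if the mapping itself could not be created.
 */
static struct demo_ring *demo_ring_alloc(unsigned nr, int want_pin,
                                         int *pinned_out, size_t *len_out)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    size_t len;
    struct demo_ring *ring;

    assert(nr > 0);
    if (pagesz <= 0)
        return NULL;

    /* Round the allocation up to whole pages. */
    len = sizeof(*ring) + (size_t)nr * sizeof(struct demo_event);
    len = (len + (size_t)pagesz - 1) / (size_t)pagesz * (size_t)pagesz;

    ring = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ring == MAP_FAILED)
        return NULL;

    *pinned_out = 0;
    if (want_pin) {
        if (mlock(ring, len) == 0)
            *pinned_out = 1;
        else
            perror("mlock"); /* e.g. RLIMIT_MEMLOCK too low; stay unpinned */
    }

    ring->head = 0;
    ring->tail = 0;
    ring->nr = nr;
    *len_out = len;
    return ring;
}

/* Unmapping also drops any lock held on the range. */
static void demo_ring_free(struct demo_ring *ring, size_t len)
{
    munmap(ring, len);
}
```

A caller that cannot tolerate a page fault on the ring would pass a nonzero `want_pin` and check `*pinned_out`; everyone else skips the pinning cost.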

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
10-11-2012, 04:43 PM
Zach Brown

aio: Refactor aio_read_evt, use cmxchg(), fix bug

> Yeah, but that means the completion has to be delivered from process
> context. That's not what aio does today, and it'd be a real performance
> regression.

It'd only have to complete from process context if it faults. The
cheapest possible delivery mechanism is simple CPU stores. In the vast
majority of cases the ring will be resident and it'll be done. In rare
cases it could fall back to a deferred completion. If apps can't
stomach that latency and want to pay the overhead of pinning to remove
that risk, they're welcome to do so.

That's my hope, anyway.

- z
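
The fast-path/fallback idea above can be modelled with a small userspace toy. This is only a sketch under loud assumptions: nothing here is kernel code, all `toy_*` names are hypothetical, and the `resident` flag merely stands in for "the ring page is in memory, so a plain store will not fault". The toy also ignores ring overflow and consumers.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SLOTS  8
#define TOY_DEFER_SLOTS 8

struct toy_event {
    long res;
};

struct toy_ring {
    bool resident;   /* stand-in for "a store to the ring will not fault" */
    unsigned tail;
    struct toy_event slots[TOY_RING_SLOTS];
};

/* Completions that could not be stored inline wait here. */
struct toy_deferred {
    unsigned count;
    struct toy_event pending[TOY_DEFER_SLOTS];
};

enum toy_result { TOY_INLINE, TOY_DEFERRED, TOY_DROPPED };

/* The cheap path: a plain store into the ring. */
static void toy_store(struct toy_ring *ring, struct toy_event ev)
{
    ring->slots[ring->tail % TOY_RING_SLOTS] = ev;
    ring->tail++;
}

/*
 * Deliver one completion. If the ring is resident, store it directly;
 * otherwise queue it for later delivery from a context that may block.
 */
static enum toy_result toy_complete(struct toy_ring *ring,
                                    struct toy_deferred *dq,
                                    struct toy_event ev)
{
    if (ring->resident) {
        toy_store(ring, ev);
        return TOY_INLINE;
    }
    if (dq->count == TOY_DEFER_SLOTS)
        return TOY_DROPPED;   /* toy model only: queue is full */
    dq->pending[dq->count++] = ev;
    return TOY_DEFERRED;
}

/*
 * Models the deferred path: the "fault" is serviced (the ring becomes
 * resident) and queued completions are stored in their original order.
 * Returns the number of completions delivered.
 */
static unsigned toy_flush_deferred(struct toy_ring *ring,
                                   struct toy_deferred *dq)
{
    unsigned n = dq->count;

    assert(n <= TOY_DEFER_SLOTS);
    ring->resident = true;
    for (unsigned i = 0; i < n; i++)
        toy_store(ring, dq->pending[i]);
    dq->count = 0;
    return n;
}
```

In this model, an application that pins its ring never leaves the `TOY_INLINE` path, while one that does not pin accepts the occasional extra latency of the deferred path.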

All times are GMT.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org