On Wed, Nov 10, 2010 at 09:56:39PM -0800, David Miller wrote:
> In the normal regime where an application uses non-blocking I/O
> writes on a socket, they will handle -EAGAIN and use poll() to
> wait for send space.
> They don't actually sleep on the socket I/O write.
> But kernel level RPC layers that do socket I/O operations directly
> and key off of -EAGAIN on the write() to "try again later" don't
> use poll(), they instead have their own sleeping mechanism and
> rely upon ->sk_write_space() to trigger the wakeup.
> So they do effectively sleep on the write(), but this mechanism
> alone does not let the socket layers know what's going on.
> Therefore they must emulate what would have happened, otherwise
> TCP cannot possibly see that the connection is application window
> size limited.
> Handle this, therefore, like SUNRPC by setting SOCK_NOSPACE and
> bumping the ->sk_write_count as needed when we hit the send buffer
> limits.
> This should make TCP send buffer size auto-tuning and the
> ->sk_write_space() callback invocations actually happen.
Thanks, pushed to