01-10-2011, 03:49 PM
Jazcek Braden

Bug related to file descriptors

I am having an issue running 389 server on Fedora 14 using
the default packages (389 1.2.7.5-1).

Every once in a while a client gets disconnected with a T2 error.
After that happens, no new connections can be made to the server until
it is restarted, although all existing connections continue to work as
expected. Going only by the access log messages, it looks like it might
have something to do with the fd handle. I say this only because once
the first disconnect occurs, the conn number is incremented but the fd
number isn't, and every connection that tries to use that fd fails. I
was wondering if anybody can suggest logging options to diagnose this
better, or whether this is a known bug in this version of 389 server.
Below is an excerpt from the access logs.
[06/Jan/2011:21:29:11 -0700] conn=473 op=-1 fd=81 closed - T2
[06/Jan/2011:21:38:36 -0700] conn=490 fd=81 slot=81 connection from 10.128.0.142 to 10.128.0.129
[06/Jan/2011:21:38:37 -0700] conn=490 op=-1 fd=81 closed - T2
[06/Jan/2011:21:38:38 -0700] conn=491 fd=81 slot=81 connection from 10.128.0.142 to 10.128.0.129
[06/Jan/2011:21:38:38 -0700] conn=491 op=-1 fd=81 closed - T2
...repeats until server is restarted...
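
The pattern in this excerpt (the conn number keeps climbing while the same fd is closed with T2 again and again) can be checked for mechanically. The following is a minimal Python sketch, not part of the original report: the function name find_wedged_fds, the threshold parameter, and the regular expressions are illustrative assumptions based only on the line shapes quoted in this excerpt.

```python
import re
from collections import defaultdict

# Line shapes assumed from the excerpt in this thread.
CLOSE_RE = re.compile(r"conn=(\d+) op=-1 fd=(\d+) closed - (\w+)")
OP_RE = re.compile(r"conn=(\d+) op=(\d+) ")  # a real operation (op >= 0)

SAMPLE = [
    "[06/Jan/2011:21:29:11 -0700] conn=473 op=-1 fd=81 closed - T2",
    "[06/Jan/2011:21:38:36 -0700] conn=490 fd=81 slot=81 connection from 10.128.0.142 to 10.128.0.129",
    "[06/Jan/2011:21:38:37 -0700] conn=490 op=-1 fd=81 closed - T2",
    "[06/Jan/2011:21:38:38 -0700] conn=491 fd=81 slot=81 connection from 10.128.0.142 to 10.128.0.129",
    "[06/Jan/2011:21:38:38 -0700] conn=491 op=-1 fd=81 closed - T2",
]


def find_wedged_fds(lines, threshold=2):
    """Return {fd: [conn, ...]} for fds with a run of T2 closes and no operations.

    A run is broken by any close on the same fd that either has a code
    other than T2 or belongs to a connection that logged an operation.
    """
    had_ops = set()             # conn numbers that logged at least one operation
    streak = defaultdict(list)  # fd -> conns in the current run of bare T2 closes
    for line in lines:
        m = OP_RE.search(line)
        if m:
            had_ops.add(int(m.group(1)))
            continue
        m = CLOSE_RE.search(line)
        if not m:
            continue
        conn, fd, code = int(m.group(1)), int(m.group(2)), m.group(3)
        if code == "T2" and conn not in had_ops:
            streak[fd].append(conn)
        else:
            streak[fd].clear()
    return {fd: conns for fd, conns in streak.items() if len(conns) >= threshold}


if __name__ == "__main__":
    print(find_wedged_fds(SAMPLE))
```

Run against the excerpt, this reports fd 81 with connections 473, 490 and 491. On a real log the threshold would probably need to be higher to avoid flagging ordinary timeouts.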
--
Jazcek Braden
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
 
01-10-2011, 05:16 PM
Rich Megginson

Bug related to file descriptors

Thanks. Please file a bug.
 
01-10-2011, 05:21 PM
crashingdaily

Bug related to file descriptors

I have a similar/same problem here after upgrading from
389-ds-base-1.2.6.1-2.el5.x86_64 to 389-ds-base-1.2.7.5-1.el5.x86_64,
but only on one of my 3 replicas.

 
01-10-2011, 05:24 PM
"Hendricks, Todd"

Bug related to file descriptors

I too am experiencing the same issue (389-ds 1.2.1-1.el5):

[10/Jan/2011:12:04:26 -0600] conn=187 fd=69 slot=69 connection from 172.16.13.21 to 172.16.16.23
[10/Jan/2011:12:04:26 -0600] conn=187 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:26 -0600] conn=188 fd=69 slot=69 connection from 172.16.13.21 to 172.16.16.23
[10/Jan/2011:12:04:26 -0600] conn=188 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:26 -0600] conn=189 fd=69 slot=69 connection from 172.16.13.21 to 172.16.16.23
[10/Jan/2011:12:04:26 -0600] conn=189 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:27 -0600] conn=190 fd=69 slot=69 connection from 172.16.13.22 to 172.16.16.23
[10/Jan/2011:12:04:27 -0600] conn=190 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:27 -0600] conn=191 fd=69 slot=69 connection from 172.16.9.31 to 172.16.16.23
[10/Jan/2011:12:04:27 -0600] conn=191 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:30 -0600] conn=192 fd=69 slot=69 connection from 172.16.16.254 to 172.16.16.23
[10/Jan/2011:12:04:30 -0600] conn=192 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:31 -0600] conn=193 fd=69 slot=69 connection from 172.16.4.21 to 172.16.16.23
[10/Jan/2011:12:04:31 -0600] conn=193 op=-1 fd=69 closed - T2

I'll happily contribute to the bug report as well when it is filed. Thanks!

- Todd


 
01-10-2011, 05:25 PM
"Hendricks, Todd"

Bug related to file descriptors

Correction: version is 1.2.7.5-1.el5 (looked at the wrong package!)

- Todd


 
01-10-2011, 05:37 PM
"Jeremy A. Mates"

Bug related to file descriptors


I've seen a similar bug for the same 389 package versions running on
RHEL 5.5 systems, though only when the LDAP servers are exposed to
production traffic. Re-issuing the same search queries via a Net::LDAP
perl script has never reproduced the crash, even when the test script
is scaled up to or beyond production traffic levels and connection
numbers. Next up is replaying actual production traffic from a tcpdump
via new TCP connections...

The error message from an issue-a-query-every-minute monitoring script is one of:

IO::Socket::INET: connect: Connection refused - (at tcp connect time)
I/O Error Connection reset by peer - (at LDAP bind() time)
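
The two failure points listed here (refusal at TCP connect time, reset at bind time) can be told apart with a small probe. This is a hedged Python sketch, not the Net::LDAP monitoring script mentioned in this message, which was not posted: the function name probe, the returned labels, and the timeout value are assumptions. It uses only the standard socket module and sends a hand-encoded anonymous LDAPv3 simple bind request rather than relying on an LDAP library.

```python
import socket

# BER encoding of an anonymous LDAPv3 simple bind request, message ID 1:
# SEQUENCE { INTEGER 1, [APPLICATION 0] { INTEGER 3, "" (name), [0] "" (simple) } }
ANON_BIND = bytes.fromhex("300c020101600702010304008000")


def probe(host, port=389, timeout=5.0):
    """Return a short label describing how far one connection attempt got.

    "refused" - the TCP connect was refused
    "reset"   - the peer reset the connection during connect, send or receive
    "timeout" - the connect or the wait for a reply timed out
    "closed"  - the peer closed the connection without sending anything
    "reply"   - some bytes came back (the reply is not parsed here)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(ANON_BIND)
            reply = sock.recv(4096)
    except ConnectionRefusedError:
        return "refused"
    except ConnectionResetError:
        return "reset"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    return "reply" if reply else "closed"


if __name__ == "__main__":
    # Hypothetical host name; substitute the directory server under test.
    print(probe("ldap.example.com"))
```

Logging the label with a timestamp once a minute would show whether a server moves from "reply" to "closed", "reset" or "refused" at the same moment the T2 lines start repeating in the access log.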

Jeremy
 
01-10-2011, 05:59 PM
Rich Megginson

Bug related to file descriptors

On 01/10/2011 11:37 AM, Jeremy A. Mates wrote:
> I've seen a similar bug for the same 389 package versions running on
> RHEL 5.5 systems, though only when the LDAP servers are exposed to
> production traffic. Re-issuing the same search queries via a Net::LDAP
> perl script has never reproduced the crash,
What crash? Is there also a crash?
> even when the test script
> is scaled up to or beyond production traffic levels and connection
> numbers. Next up is replaying actual production traffic from a tcpdump
> via new TCP connections...
 
01-10-2011, 09:23 PM
"Jeremy A. Mates"

Bug related to file descriptors

Got slapd to wedge while under strace:

https://bugzilla.redhat.com/show_bug.cgi?id=668619

Jeremy
 
