
"Alther, Nicholas" 03-14-2012 10:58 PM

Troublesome Connection reset by peer errors
 
I am seeking help isolating a recurring problem with our 389 DS version 1.2.8.3 installation. We use the directory server to authenticate users of our website. Every couple of days, our website login application starts reporting that user authentications timed out, and the timeouts become more frequent as time passes. Our current workaround is to restart the directory server, which clears the problem for a couple of days before the timeouts return. I traced one application timeout back to the DS access logs and found the following entry at the same time:

[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed error 104 (Connection reset by peer) - TCP connection reset by peer.

I looked through the older logs and the only time this conn/fd was used was two days ago. Here are the access log entries:

[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 connection from 10.1.xx.xx to 10.1.xx.xx

[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 BIND dn="uid,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"

[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 SRCH base="ou=users,ou=external,dc=domain,dc=com" scope=2 filter="(&(uid=xxxxx)(objectClass=inetUser))" attrs="1.1"

[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 RESULT err=0 tag=101 nentries=1 etime=0

[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 BIND dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com"

[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 BIND dn="uid,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"

[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 SRCH base="ou=groups,ou=external,dc=domain,dc=com" scope=2 filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))" attrs="1.1"

[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 RESULT err=0 tag=101 nentries=1 etime=0

[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 SRCH base="ou=groups,ou=external,dc=domain,dc=com" scope=2 filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))" attrs="cn"

[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 RESULT err=0 tag=101 nentries=0 etime=0

[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 SRCH base="ou=users,ou=external,dc=domain,dc=com" scope=2 filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"

[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 RESULT err=0 tag=101 nentries=1 etime=0

[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 BIND dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"

[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 BIND dn="uid=theManager,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"

[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 SRCH base="ou=groups,ou=external,dc=domain,dc=com" scope=2 filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))" attrs="1.1"

[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 RESULT err=0 tag=101 nentries=1 etime=0

[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 SRCH base="ou=groups,ou=external,dc=domain,dc=com" scope=2 filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))" attrs="cn"

[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 RESULT err=0 tag=101 nentries=0 etime=0

[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 SRCH base="ou=users,ou=external,dc=domain,dc=com" scope=2 filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"

[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 RESULT err=0 tag=101 nentries=1 etime=0

[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 BIND dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"

[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 BIND dn="uid=theManager,dc=domain,dc=com" method=128 version=3

[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"

[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 SRCH base="ou=groups,ou=external,dc=domain,dc=com" scope=2 filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))" attrs="1.1"

[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 RESULT err=0 tag=101 nentries=1 etime=0

[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 SRCH base="ou=groups,ou=external,dc=domain,dc=com" scope=2 filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))" attrs="cn"

[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 tag=101 nentries=0 etime=0
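
To look for other connections that follow the same pattern (activity, a long silence, then a reset), a small script can scan the access log. The following is only a rough Python sketch: the function name `find_stale_resets` and the one-hour threshold are made up for illustration, and it assumes every line starts with the `[timestamp] conn=N` prefix seen in the entries above.

```python
import re
from datetime import datetime

# Timestamp and connection number at the start of an access log line.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) (?P<rest>.*)$")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 14/Mar/2012:10:23:01 -0500

def find_stale_resets(lines, min_idle_seconds=3600):
    """Return (conn, idle_seconds) for connections closed with error 104
    after at least min_idle_seconds without any logged activity.
    Hypothetical helper, not part of 389 DS."""
    last_seen = {}  # conn number -> timestamp of its most recent entry
    stale = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), TS_FORMAT)
        conn = int(m.group("conn"))
        if "closed error 104" in m.group("rest") and conn in last_seen:
            idle = (ts - last_seen[conn]).total_seconds()
            if idle >= min_idle_seconds:
                stale.append((conn, idle))
        last_seen[conn] = ts
    return stale

# Three of the entries quoted in this post.
sample = [
    "[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 connection from 10.1.xx.xx to 10.1.xx.xx",
    "[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 tag=101 nentries=0 etime=0",
    "[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed error 104 (Connection reset by peer) - TCP connection reset by peer.",
]

for conn, idle in find_stale_resets(sample):
    print(f"conn={conn} was idle for {idle / 3600:.1f} hours before the reset")
```

If connection numbers are reused across restarts, the log would need to be split at each restart before running something like this.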

The scenario seems to be that the DS works fine after a restart until it runs out of unused connections and/or file descriptors (max FDs = 8192). Once it starts recycling connections and/or file descriptors, the 104 errors appear more often in the access logs and we get more authentication errors. We suspect that the original connection was never terminated correctly, but we do not know whether the application or a DS setting is at fault.


Our servers have been tuned according to the wiki doc at
http://directory.fedoraproject.org/wiki/Performance_Tuning#Linux

We have set our idle “timeout” to 60 seconds and search “timelimit” to 120 seconds with no change in behavior.


Watching netstat -nap | grep slapd shows established connections that do not drop off, just continually grow.
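
To put numbers on that growth, the netstat output can be tallied per remote host and TCP state. This is a minimal Python sketch, assuming the usual Linux `netstat -nap` column layout (proto, recv-q, send-q, local address, foreign address, state, PID/program); the function name `count_connections` and the sample addresses and PID are hypothetical.

```python
from collections import Counter

def count_connections(netstat_output, program="slapd"):
    """Count TCP connections per (remote host, state) for lines whose
    PID/program column contains `program`. Hypothetical helper."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP and UNIX socket lines, and short lines.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if program not in fields[6]:
            continue
        # Foreign address is host:port; keep only the host part.
        remote_host = fields[4].rsplit(":", 1)[0]
        counts[(remote_host, fields[5])] += 1
    return counts

# Made-up sample output for illustration only.
sample_output = """\
Proto Recv-Q Send-Q Local Address   Foreign Address   State       PID/Program name
tcp        0      0 0.0.0.0:389     0.0.0.0:*         LISTEN      1234/ns-slapd
tcp        0      0 10.1.0.10:389   10.1.0.20:51000   ESTABLISHED 1234/ns-slapd
tcp        0      0 10.1.0.10:389   10.1.0.20:51001   ESTABLISHED 1234/ns-slapd
tcp        0      0 10.1.0.10:22    10.1.0.30:40000   ESTABLISHED 999/sshd
"""

for (host, state), n in sorted(count_connections(sample_output).items()):
    print(host, state, n)
```

Running it periodically on real output (for example captured with `subprocess.run(["netstat", "-nap"], capture_output=True, text=True).stdout`) and comparing the counts would show which client host is accumulating established connections.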


Any help would be greatly appreciated.

Nicholas J Alther

Sr. Software Developer/Analyst

Black Hills Corporation

Phone: 605.721.2158

Cell: 605.593.1899












--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Rich Megginson 03-14-2012 11:07 PM

Troublesome Connection reset by peer errors
 
We fixed some of these sorts of connection issues in 1.2.9.9. I
suggest upgrading to that release.



In the meantime, you could try lowering nsslapd-idletimeout
and/or nsslapd-ioblocktimeout:

http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#cnconfig-nsslapd_idletimeout_Default_Idle_Timeout
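
As a rough illustration of that change, the Python sketch below builds an LDIF modify record for `cn=config`. The function name `build_timeout_ldif` and the values passed to it are placeholders rather than recommendations, and it assumes nsslapd-idletimeout is given in seconds and nsslapd-ioblocktimeout in milliseconds, which is worth confirming against the reference linked above.

```python
def build_timeout_ldif(idle_timeout_s, ioblock_timeout_ms):
    """Build an LDIF change record that replaces both timeout
    attributes on cn=config. Hypothetical helper."""
    return "\n".join([
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-idletimeout",
        f"nsslapd-idletimeout: {idle_timeout_s}",  # assumed unit: seconds
        "-",
        "replace: nsslapd-ioblocktimeout",
        f"nsslapd-ioblocktimeout: {ioblock_timeout_ms}",  # assumed unit: milliseconds
        "",
    ])

# Placeholder values for illustration only.
print(build_timeout_ldif(60, 30000))
```

The resulting text could be saved to a file and applied with ldapmodify while bound as the directory administrator.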




