Old 06-20-2008, 02:50 PM
"debu"
 
Default MMR issue

* * *

Hi Guys,

I am stuck on a critical FDS (Fedora Directory Server) issue and would be grateful for any help.

We are upgrading Fedora Directory Server from 1.0.4 to 1.1.0-3.

We have one existing server running 1.0.4.

We initialized the database on one new server and were able to load the full DB. But when we start replication we see the following errors, and incremental updates do not happen.

We are setting up multi-master replication.

Here are the errors.



On the supplier (FDS version 1.0.4), OS: Red Hat Enterprise Linux ES release 4 (Nahant):

[17/Jun/2008:11:23:35 +051800] NSMMReplicationPlugin - agmt="cn=Replication_to_10.91.X.Y" (10:8888): Unable to acquire replica: Excessive clock skew between the supplier and the consumer. Replication is aborting.

[17/Jun/2008:11:23:35 +051800] NSMMReplicationPlugin - agmt="cn=Replication_to_10.91.X.Y" (10:8888): Incremental update failed and requires administrator action

On the consumer (FDS version 1.1.0-3), OS: Red Hat Enterprise Linux Server release 5.1 (Tikanga):

[17/Jun/2008:11:12:59 +051800] NSMMReplicationPlugin - conn=46251 op=1975 replica="o=TejaUsers": Unable to acquire replica: error: excessive clock skew

[17/Jun/2008:11:23:34 +051800] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400

[17/Jun/2008:11:23:34 +051800] NSMMReplicationPlugin - conn=46461 op=792 replica="o=TejaUsers": Unable to acquire replica: error: excessive clock skew
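
For readers comparing the numbers in the csngen_adjust_time message: a minimal Python sketch that pulls the two values out of the log line and shows how far the requested adjustment is over the limit. It assumes both values are in seconds (86400 seconds is one day); the function name parse_skew is made up for this illustration and is not part of the server.

```python
import re

# Pattern for the consumer-side csngen_adjust_time message quoted above.
ADJUST_RE = re.compile(
    r"csngen_adjust_time: adjustment limit exceeded; "
    r"value - (?P<value>\d+), limit - (?P<limit>\d+)"
)

def parse_skew(line):
    """Return (value, limit) as integers, or None if the line does not match."""
    m = ADJUST_RE.search(line)
    if m is None:
        return None
    return int(m.group("value")), int(m.group("limit"))

line = ("[17/Jun/2008:11:23:34 +051800] - csngen_adjust_time: "
        "adjustment limit exceeded; value - 86401, limit - 86400")
value, limit = parse_skew(line)
# Assumption: both numbers are seconds, so the limit is 24 hours.
print(f"requested adjustment: {value} s, limit: {limit} s "
      f"({limit / 3600:.0f} h), over by {value - limit} s")
```

Under that assumption the consumer is being asked to move its CSN clock by just over one day, which would explain why it refuses even though the system clocks are in sync.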





What puzzles me is that the same procedure succeeded in our test environment. The only difference there was that both servers ran the same OS; everything else was identical. Our servers are also properly synchronized via NTP.

Please help with this scenario.



Regards

~Debajit





--
Fedora-directory-users mailing list
Fedora-directory-users@redhat.com
https://www.redhat.com/mailman/listinfo/fedora-directory-users
 
Old 06-20-2008, 04:06 PM
"Chris St. Pierre"
 
Default MMR issue

Did you try the workaround in the bug report I sent to you on the
Red Hat list? What were your results?

For reference, that bug is https://bugzilla.redhat.com/show_bug.cgi?id=233642

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University

 
Old 06-21-2008, 06:33 PM
"debu"
 
Default MMR issue

Hi Chris,

Thanks for the workaround.

As you suggested, we removed everything. We then reinstalled the latest version on two machines and kept one machine as is (we have 3 machines in total).

We configured the two new servers in multi-master mode and initialized one of them from the old machine. All the data was pushed to the new servers, and the two new machines are replicating properly.

However, the replication agreement between the old server and the new servers keeps breaking, so we used the console interface to push the delta of updates. The process is very slow, maybe because we did not use db2ldif to dump the data.

We are planning to push the delta of updates from the old server to the 2 new servers (using the console interface) and then remove the old server from the system.

These two servers will then become the primary point of live interaction for reads and writes.

We did it this way because we cannot afford downtime.

So far replication is working fine. We hope this continues.

Thank you very much for your help.

Regards,
-Debu,vivek





 
Old 06-23-2008, 03:11 PM
Rich Megginson
 
Default MMR issue

We are working on a fix for the time skew issue. However, we need your
help. The bug https://bugzilla.redhat.com/show_bug.cgi?id=233642 has a
script attached to it which will provide us with some much-needed
data. You run it on your masters like this:

    readNsState.py /etc/dirsrv/slapd-yourinstance/dse.ldif

The data that it prints out is very useful for debugging this
problem. You can either attach the output to the bug, or just email the
output to me.

Anyone else interested in helping? Anyone have MMR running? Please run
the script and either attach the output to the bug or just send it to me.
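
For anyone curious what such a script has to start from, here is a rough Python sketch. It is NOT the readNsState.py attached to the bug, and it does not interpret the state. It only assumes (from the script's name) that replica entries in dse.ldif carry an nsState attribute stored base64-encoded, and it prints the raw bytes. The function name read_ns_state and the sample DN and value are hypothetical.

```python
import base64

def read_ns_state(lines):
    """Yield (entry DN, raw nsState bytes) for each nsState value in LDIF lines."""
    # Unfold LDIF continuation lines: a line that starts with a single
    # space continues the previous line.
    logical = []
    for raw in lines:
        raw = raw.rstrip("\r\n")
        if raw.startswith(" ") and logical:
            logical[-1] += raw[1:]
        else:
            logical.append(raw)

    dn = None
    for line in logical:
        lower = line.lower()
        if lower.startswith("dn::"):
            # "dn::" means the DN itself is base64-encoded.
            dn = base64.b64decode(line[4:].strip()).decode("utf-8", "replace")
        elif lower.startswith("dn:"):
            dn = line[3:].strip()
        elif lower.startswith("nsstate::"):
            # "attr:: value" marks a base64-encoded value in LDIF.
            yield dn, base64.b64decode(line.split("::", 1)[1].strip())

# Hypothetical sample entry; the DN and the state bytes are made up.
sample = [
    "dn: cn=replica,cn=example,cn=mapping tree,cn=config",
    "nsState:: " + base64.b64encode(b"\x01\x00dummy").decode("ascii"),
]
for dn, state in read_ns_state(sample):
    print(dn)
    print("  nsState bytes (hex):", state.hex())
```

To run it against a real file, pass an open file object, for example read_ns_state(open("/etc/dirsrv/slapd-yourinstance/dse.ldif")). Decoding the bytes into meaningful fields is what the script attached to the bug is for.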



 
Old 06-23-2008, 04:35 PM
Gary Windham
 
Default MMR issue

We have a downtime scheduled for our production FDS instance (used for
our campus authentication service) this Friday, in order to
reestablish MMR. After reestablishing MMR we will be monitoring with
the script, so we should be able to provide some data shortly.


Thanks,
--Gary

--
Gary Windham
Senior Enterprise Systems Architect
The University of Arizona, UITS
+1 520 626 5981

 
Old 06-23-2008, 04:48 PM
Rich Megginson
 
Default MMR issue

Gary Windham wrote:
>We have a downtime scheduled for our production FDS instance (used for
>our campus authentication service) this Friday, in order to
>reestablish MMR. After reestablishing MMR we will be monitoring with
>the script, so we should be able to provide some data shortly.

Excellent. Thanks!


 
Old 08-03-2012, 06:51 PM
Reinhard Nappert
 
Default MMR issue

Hi,

I have the following 389 DS version deployed: 389-Directory/1.2.8.2 B2011.130.190

I have a 3-box multi-master replication setup in a ring:

    ... C ----- A ----- B ----- C ----- A ...

The replication agreements between “A” and “C” and between “B” and “C” work fine, but I have an issue with the agreements between “A” and “B”.

I see the following in the errors file:

Server A:
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=base": Begin incremental protocol
[19/Jul/2012:07:28:50 -0300] - csngen_adjust_time: gen state before 5007e1610000:1342693727:0:2
[19/Jul/2012:07:28:50 -0300] - _csngen_adjust_local_time: gen state before 5007e1610000:1342693727:0:2
[19/Jul/2012:07:28:50 -0300] - _csngen_adjust_local_time: gen state after 5007e1640000:1342693730:0:2
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=BASE": Replica in use locking_purl=conn=7831 id=3
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 replica="o=BASE": Unable to acquire replica: error: replica busy locked by conn=7831 id=3 for incremental update
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=umc": StartNSDS90ReplicationRequest: response=1 rc=0

This kind of error is logged at intervals of about 1 second; only the local time in the generator state (5007e1610000:1342693727:0:2 above) differs between entries.
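
To make the "gen state" values easier to compare, here is a small Python sketch that splits one of them into numbers. The field names are my own reading of the values, not something stated in the log: the first 8 hex digits look like a Unix timestamp, the next 4 like a sequence number, and the three trailing decimals like a sampled time plus a local and a remote offset. For both values logged above, the hex timestamp equals the sum of the three trailing numbers, which is what the sketch checks.

```python
from datetime import datetime, timezone

def parse_gen_state(state):
    """Split a 'gen state' string such as '5007e1610000:1342693727:0:2'.

    The key names below are assumptions made for this illustration.
    """
    csn_part, sampled, local_off, remote_off = state.split(":")
    return {
        "csn_time": int(csn_part[:8], 16),  # first 8 hex digits
        "seq": int(csn_part[8:], 16),       # remaining 4 hex digits
        "sampled_time": int(sampled),
        "local_offset": int(local_off),
        "remote_offset": int(remote_off),
    }

for raw in ("5007e1610000:1342693727:0:2", "5007e1640000:1342693730:0:2"):
    s = parse_gen_state(raw)
    when = datetime.fromtimestamp(s["sampled_time"], tz=timezone.utc)
    total = s["sampled_time"] + s["local_offset"] + s["remote_offset"]
    print(raw, "->", when.isoformat(),
          "| csn_time equals sum of trailing fields:", s["csn_time"] == total)
```

The second sampled time, 1342693730, is 10:28:50 UTC, which matches the 07:28:50 -0300 timestamp of the log lines.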

Server B:
[19/Jul/2012:13:28:48 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Timed out). Will retry later.
[19/Jul/2012:13:34:17 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Can't contact LDAP server). Will retry later.
[19/Jul/2012:13:44:25 -0300] slapi_ldap_bind - Error: timeout after [0.0] seconds reading bind response for [cn=replication,cn=config] mech [SIMPLE]
[19/Jul/2012:13:44:25 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 85 (Timed out) ((null))
[19/Jul/2012:13:44:25 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth resumed

Sometimes I also see the following errors:
[20/Jul/2012:11:28:39 -0300] slapi_ldap_bind - Error: could not send bind request for id [cn= replication,cn=config] mech [SIMPLE]: error 91 (Can't connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 (Operation now in progress)
[20/Jul/2012:11:28:39 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the LDAP server) ((null))
[20/Jul/2012:11:30:30 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth resumed
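
When many such lines accumulate, it can help to tally the bind failures per agreement and LDAP error code. The following Python sketch does that for lines in the format shown above; the regular expression and the function name tally_bind_failures are written for this illustration only and may need adjusting for other message variants.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 85 (Timed out) ((null))
BIND_FAIL_RE = re.compile(
    r'agmt="(?P<agmt>[^"]+)".*?auth failed: '
    r'LDAP error (?P<code>-?\d+) \((?P<text>[^)]*)\)'
)

def tally_bind_failures(lines):
    """Count bind failures per (agreement, LDAP error code, error text)."""
    counts = Counter()
    for line in lines:
        m = BIND_FAIL_RE.search(line)
        if m:
            key = (m.group("agmt"), int(m.group("code")), m.group("text"))
            counts[key] += 1
    return counts

# Sample lines copied from the log excerpt above.
SAMPLE = """\
[19/Jul/2012:13:44:25 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 85 (Timed out) ((null))
[19/Jul/2012:13:44:25 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth resumed
[20/Jul/2012:11:28:39 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the LDAP server) ((null))
""".splitlines()

for (agmt, code, text), n in sorted(tally_bind_failures(SAMPLE).items()):
    print(f"{agmt}: LDAP error {code} ({text}) x{n}")
```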

I don’t see any indication that Server B was down at that time.

I did see Bug 571677 (https://bugzilla.redhat.com/show_bug.cgi?id=571677), but in my case there was no deletion of a replicaconflict object.

Has anybody encountered this kind of issue? My next question would be how to recover the MMR environment.

Thanks,
-Reinhard

--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
 
Old 08-07-2012, 04:47 PM
Reinhard Nappert
 
Default MMR issue

Has somebody seen this problem as well?

-Reinhard
 
Old 08-07-2012, 05:11 PM
Grzegorz Dwornicki
 
Default MMR issue

Hi,

I must say these LDAP replication connections look quite unusual. Can you provide more information about:
- The types of the replication servers? I guess some servers are masters and some may be slaves?
- Do the errors occur when you try to initiate replication manually?

Some errors suggest that other replication/LDAP operations may be in progress, in which case the target server reports that the replica is locked:
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=BASE": Replica in use locking_purl=conn=7831 id=3
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 replica="o=BASE": Unable to acquire replica: error: replica busy locked by conn=7831 id=3 for incremental update
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=base": StartNSDS90ReplicationRequest: response=1 rc=0

Other errors suggest that there may be no connection between the servers. Maybe the target server is too busy to respond, or there may be a network/firewall problem:

[19/Jul/2012:13:28:48 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Timed out). Will retry later.
[19/Jul/2012:13:34:17 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Can't contact LDAP server). Will retry later.
(...)

Please provide information about the replication types. Try initiating replication manually and monitor the logs carefully; this may provide more information. If you want to push updates from one server to others, then please consider using multi-master connections and a hub server (see the Red Hat docs for more details).
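
One quick way to rule a plain network/firewall problem in or out is to test whether a TCP connection to the peer's LDAP port can be opened at all. Below is a minimal Python sketch of such a check; the function name can_connect is made up for this example, and a successful connection only shows that the port is reachable, not that the directory server is answering LDAP requests in time.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For the setup in this thread one would run, from server B, something like can_connect("A", 389), with "A" replaced by the real host name of server A, and the same check in the opposite direction.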


Greg.

2012/8/7 Reinhard Nappert <rnappert@juniper.net>

Has somebody seen this problem as well?
*

-Reinhard
*

From: 389-users-bounces@lists.fedoraproject.org [mailto:389-users-bounces@lists.fedoraproject.org] On Behalf Of Reinhard Nappert

Sent: Friday, August 03, 2012 2:51 PM
To: 389-users@lists.fedoraproject.org
Subject: [389-users] MMR issue

Hi,

I have the following 389 DS version deployed: 389-Directory/1.2.8.2 B2011.130.190

I have a 3 box multi-master replication setup in a ring:

… C ----- A ----- B ----- C ----- A …

The replication agreements for “A” and “C” and for “B” and “C” work fine, but I have an issue for the agreements for the “A” and “B” connection.

I see the following in the errors file:

Server A:

[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=base": Begin incremental protocol
[19/Jul/2012:07:28:50 -0300] - csngen_adjust_time: gen state before 5007e1610000:1342693727:0:2

[19/Jul/2012:07:28:50 -0300] - _csngen_adjust_local_time: gen state before 5007e1610000:1342693727:0:2
[19/Jul/2012:07:28:50 -0300] - _csngen_adjust_local_time: gen state after 5007e1640000:1342693730:0:2

[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=BASE": Replica in use locking_purl=conn=7831 id=3
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 replica="o=BASE": Unable to acquire replica: error: replica busy locked by conn=7831 id=3 for incremental update

[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=base": StartNSDS90ReplicationRequest: response=1 rc=0


This kind of error is logged about once per second, with a different local_time each time (for example 5007e1610000:1342693727:0:2).

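For anyone trying to read those "gen state" values, here is a small sketch that splits one apart. The field layout is an assumption made for illustration (8 hex digits of time, 4 hex digits of sequence number, then a sampled time and two offsets); the thread itself does not confirm what each field means, and the dictionary keys are made-up names.

```python
from datetime import datetime, timezone


def parse_gen_state(state):
    """Split a 'gen state' string such as '5007e1610000:1342693727:0:2'.

    Assumed layout (not confirmed in this thread):
    <8 hex digits time><4 hex digits seq>:<sampled time>:<local offset>:<remote offset>
    """
    csn, sampled, local_offset, remote_offset = state.split(":")
    return {
        "csn_time": int(csn[:8], 16),    # hex seconds since the epoch
        "seq_num": int(csn[8:12], 16),   # hex sequence number
        "sampled_time": int(sampled),
        "local_offset": int(local_offset),
        "remote_offset": int(remote_offset),
    }


def as_utc(seconds):
    """Render epoch seconds as an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()
```

For the value above, the hex part decodes to 1342693729, which is 2 seconds ahead of the second field (1342693727), the same as the last field. Under the assumed layout that would be a remote offset of 2 seconds, which is tiny compared with the kind of skew that makes replication abort.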
Server B:
[19/Jul/2012:13:28:48 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Timed out). Will retry later.

[19/Jul/2012:13:34:17 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Can't contact LDAP server). Will retry later.

[19/Jul/2012:13:44:25 -0300] slapi_ldap_bind - Error: timeout after [0.0] seconds reading bind response for [cn=replication,cn=config] mech [SIMPLE]
[19/Jul/2012:13:44:25 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 85 (Timed out) ((null))

[19/Jul/2012:13:44:25 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth resumed

Sometimes, I also see the following error:
[20/Jul/2012:11:28:39 -0300] slapi_ldap_bind - Error: could not send bind request for id [cn= replication,cn=config] mech [SIMPLE]: error 91 (Can't connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 (Operation now in progress)

[20/Jul/2012:11:28:39 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the LDAP server) ((null))

[20/Jul/2012:11:30:30 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Replication bind with SIMPLE auth resumed

I don’t see any indication that Server B was down at that time.

I did see Bug 571677 (https://bugzilla.redhat.com/show_bug.cgi?id=571677), but there was no deletion of a replicaconflict object.

Has anybody encountered this kind of issue? The next question would be how to recover the MMR environment.

Thanks,
-Reinhard


--

389 users mailing list

389-users@lists.fedoraproject.org

https://admin.fedoraproject.org/mailman/listinfo/389-users


 
Old 08-07-2012, 05:17 PM
Reinhard Nappert
 
Default MMR issue

All of those servers are masters. This is a multi-master environment.

Your point about a firewall between the servers is a good one! I don’t have any access to the deployment, though. It is worth investigating.
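For whoever does have access to the boxes, a quick way to test the firewall theory is to check whether a plain TCP connection to the LDAP port can be opened from each master to the other. This is a minimal sketch: `can_connect` is a made-up helper name, port 389 is taken from the "(A:389)" in the agreement logs, and the host names are placeholders. It only shows that the port is reachable, not that a replication bind would succeed.

```python
import socket


def can_connect(host, port=389, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running `can_connect("server-b.example.com")` on A and the equivalent on B (both names hypothetical) should return True in both directions. A False in only one direction would be consistent with a firewall rule between the two servers.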

Thanks
-Reinhard

From: 389-users-bounces@lists.fedoraproject.org [mailto:389-users-bounces@lists.fedoraproject.org] On Behalf Of Grzegorz Dwornicki
Sent: Tuesday, August 07, 2012 1:12 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] MMR issue

Hi

I must say these LDAP replication connections look quite unusual. Can you provide more information about:
- The type of replication servers? I guess some servers are masters and some are maybe slaves?
- Do errors occur when you try to initiate replication manually?

Some errors suggest that there may be other replication/LDAP operations in progress, so the target server reports that the replica is locked:
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=BASE": Replica in use locking_purl=conn=7831 id=3
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 replica="o=BASE": Unable to acquire replica: error: replica busy locked by conn=7831 id=3 for incremental update
[19/Jul/2012:07:28:50 -0300] NSMMReplicationPlugin - conn=7835 op=160267 repl="o=base": StartNSDS90ReplicationRequest: response=1 rc=0

Other errors suggest that there may be no connection between the servers. Maybe the target server is too busy to respond, or maybe there is a network/firewall problem:
[19/Jul/2012:13:28:48 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Timed out). Will retry later.
[19/Jul/2012:13:34:17 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (A:389): Unable to receive the response for a startReplication extended operation to consumer (Can't contact LDAP server). Will retry later.
(...)

Please provide information about the replication types. Try manually initiating replication and monitor the logs carefully; this may provide more information. If you want to push updates from one server to the others, please consider using multi-master connections and a hub server (see the Red Hat docs for more details).

Greg.
 

Copyright 2007 - 2008, www.linux-archive.org