FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Fedora Directory

 
 
LinkBack Thread Tools
 
Old 06-15-2010, 05:04 PM
Rich Megginson
 
Default how to quickly recover from a corrupt database in multiple master configuration

mark benschop wrote:
> Hi All,
>
> I'm having a problem on a CentOs Directory Server 8.1 multiple master
> setup.
> The database of one of the servers has been marked as corrupt and has
> been brought offline by the Directory Server.
Can you post any relevant error messages from the error log of the server?
> Ldapclients querying the ldapserver for e.g. loggin in of users get an
> errormessage, effectively disabling users to log in.
What error message?
>
> I'm wondering what the best method is to recover from this situation.
> I can think of a few :
> 1) Starting the ldapserver, deleting the database, recreating it and
> restoring a backup.

> 2) Starting the ldapserver, deleting the database and reinitialising
> the server from the other master.
If you reinitialize the problem server from another server, you don't
need to delete the database, reinit will do that for you.
>
> Can anyone give me some hints if this wil work or would another
> approach be better ?
>
> Thanks for your advise,
> Mark
> ------------------------------------------------------------------------
>
> --
> 389 users mailing list
> 389-users@lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users

--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
 
Old 06-16-2010, 07:39 PM
Rich Megginson
 
Default how to quickly recover from a corrupt database in multiple master configuration

mark benschop wrote:
> Hi Rich,
> Thanks for your reply.
> Please find the logging from the problems below.
> The serverb55 is one of 2 servers in a multiple masters configuration
> that consists of serverb55 and serverb05.
>
> The problem I inititially had was that I had 2 entries that could not
> be deleted serverb55.
>
> Here's logging from the access file.
> ================================================== =====================
> access.20100614-092820:[15/Jun/2010:09:20:49 +0200] conn=342177 op=7
> SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=0
> filter="(objectClass=*)" attrs=ALL
> access.20100614-092820:[15/Jun/2010:09:20:49 +0200] conn=342177 op=7
> RESULT err=0 tag=101 nentries=1 etime=0
> access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=8
> SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=1
> filter="(objectClass=*)" attrs="objectClass"
> access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=8
> RESULT err=0 tag=101 nentries=0 etime=0 notes=U
> access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9
> DEL dn="uid=dbeijk, ou=people, dc=directory,dc=intern"
> access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9
> RESULT err=1 tag=107 nentries=0 etime=0 csn=4c172a21000000370000
> access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=10
> SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=1
> filter="(objectClass=*)" attrs="objectClass"
> access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=10
> RESULT err=0 tag=101 nentries=0 etime=0 notes=U
> ================================================== =====================
>
> LDAP error 1 i found means 'unwiling to perform'. First I thought
> something might be wrong with the entry itself.
> The error log found in the error log from the serverb55 I've added
> below seemed to point in that direction.
>
>
> When I logged on the the other ldapserver, serverb05, I tried to
> delete the same entry to see if this slapd had the same issue but here
> it worked.
> Replicating the delete didn't. The following error was logged to the
> errorlog of this :
>
> ================================================== ======================
> [15/Jun/2010:09:35:17 +0200] NSMMReplicationPlugin -
> agmt="cn=serverb55" (serverb55:636): Consumer failed to replay change
> (uniqueid a276337c-5dc511df-852cfef8-667fa4d4, CSN
> 4c172d36000000050000): Operations error. Will retry later.
> ================================================== =====================
>
> So there seemed to be a problem with the serverb55 only.
> Since I assumed the database got somehow corrupt or inconsistent I've
> tried the following steps to try and recreate the database or had it
> checked in order to get it right again.
> First there's the errors from the account that could not be deleted.
> I 'reinitialised the consumer' from the working serverb05 to the
> problematic serverb55.
> Then I restarted the slapd.
> Made an export of the database and imported that.
> Slapd stopped the database.
>
> Please find the logging from /var/log/dirsrv/slapd-serverb55/errors
> from the actions leading to the problem of the fatal server stop.
> ================================================== ====================
> CentOS-Directory/8.1.0
> B2009.134.1334
>
> serverb55:636
> (/etc/dirsrv/slapd-serverb55)
>
>
>
> [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "uidNumber" required by
> object class "posixAccount"
> [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "gidNumber" required by
> object class "posixAccount"
> [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" -- attribute "uidNumber" not allowed
> [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "uid" required by object
> class "posixAccount"
> [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "cn" required by object
> class "posixAccount"
> [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "homeDirectory" required by
> object class "posixAccount"
> [15/Jun/2010:09:23:04 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "uidNumber" required by
> object class "posixAccount"
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "uidNumber" required by
> object class "posixAccount"
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "gidNumber" required by
> object class "posixAccount"
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" required attribute "objectclass" missing
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" required attribute "objectclass" missing
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "uid" required by object
> class "posixAccount"
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "cn" required by object
> class "posixAccount"
> [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People,
> dc=directory,dc=intern" missing attribute "homeDirectory" required by
> object class "posixAccount"
> [15/Jun/2010:09:24:56 +0200] - Entry "uid=DEL *.*, ou=People,
> dc=directory,dc=intern" missing attribute "homeDirectory" required by
> object class "posixAccount"
> [15/Jun/2010:09:50:20 +0200] - Entry "cn=wchiman, ou=people,
> dc=directory,dc=intern" -- attribute "uidNumber" not allowed
> [15/Jun/2010:10:12:43 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=directory,dc=intern is going
> offline; disabling replication
> [15/Jun/2010:10:12:43 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:10:12:43 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:10:12:43 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:10:12:43 +0200] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to
> access the database
> [15/Jun/2010:10:12:48 +0200] - import userRoot: Workers finished;
> cleaning up...
> [15/Jun/2010:10:12:49 +0200] - import userRoot: Workers cleaned up.
> [15/Jun/2010:10:12:49 +0200] - import userRoot: Indexing complete.
> Post-processing...
> [15/Jun/2010:10:12:49 +0200] - import userRoot: Flushing caches...
> [15/Jun/2010:10:12:49 +0200] - import userRoot: Closing files...
> [15/Jun/2010:10:12:49 +0200] - import userRoot: Import complete.
> Processed 4849 entries in 6 seconds. (808.17 entries/sec)
> [15/Jun/2010:10:12:49 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:10:12:49 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:10:12:49 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:10:12:49 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=directory,dc=intern is coming
> online; enabling replication
> [15/Jun/2010:10:12:49 +0200] NSMMReplicationPlugin -
> replica_reload_ruv: Warning: new data for replica
> dc=directory,dc=intern does not match the data in the changelog.
> Recreating the changelog file. This could affect replication with
> replica's consumers in which case the consumers should be reinitialized.
> [15/Jun/2010:10:12:49 +0200] - skipping cos definition
> cn=nsAccountInactivation_cos,dc=directory,dc=inter n--no templates found
> [15/Jun/2010:10:55:43 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=directory,dc=intern is going
> offline; disabling replication
> [15/Jun/2010:10:55:43 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:10:55:43 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:10:55:43 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:10:55:43 +0200] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to
> access the database
> [15/Jun/2010:10:55:49 +0200] - import userRoot: Workers finished;
> cleaning up...
> [15/Jun/2010:10:55:49 +0200] - import userRoot: Workers cleaned up.
> [15/Jun/2010:10:55:49 +0200] - import userRoot: Indexing complete.
> Post-processing...
> [15/Jun/2010:10:55:49 +0200] - import userRoot: Flushing caches...
> [15/Jun/2010:10:55:49 +0200] - import userRoot: Closing files...
> [15/Jun/2010:10:55:49 +0200] - import userRoot: Import complete.
> Processed 4850 entries in 5 seconds. (970.00 entries/sec)
> [15/Jun/2010:10:55:49 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:10:55:49 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:10:55:49 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:10:55:49 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=directory,dc=intern is coming
> online; enabling replication
> [15/Jun/2010:10:55:49 +0200] NSMMReplicationPlugin -
> replica_reload_ruv: Warning: new data for replica
> dc=directory,dc=intern does not match the data in the changelog.
> Recreating the changelog file. This could affect replication with
> replica's consumers in which case the consumers should be reinitialized.
> [15/Jun/2010:10:55:49 +0200] - skipping cos definition
> cn=nsAccountInactivation_cos,dc=directory,dc=inter n--no templates found
> [15/Jun/2010:10:59:57 +0200] - slapd shutting down - signaling
> operation threads
> [15/Jun/2010:10:59:57 +0200] - slapd shutting down - waiting for 26
> threads to terminate
> [15/Jun/2010:10:59:57 +0200] - slapd shutting down - closing down
> internal subsystems and plugins
> [15/Jun/2010:10:59:58 +0200] - Waiting for 4 database threads to stop
> [15/Jun/2010:10:59:59 +0200] - All database threads now stopped
> [15/Jun/2010:10:59:59 +0200] - slapd stopped.
> CentOS-Directory/8.1.0 B2009.134.1334
> <host>:<port> (/etc/dirsrv/slapd-serverb55)
>
> [15/Jun/2010:11:00:01 +0200] - Entry "cn=schema" single-valued
> attribute "modifyTimestamp" has multiple values
> CentOS-Directory/8.1.0 B2009.134.1334
> serverb55:636 (/etc/dirsrv/slapd-serverb55)
>
> [15/Jun/2010:11:00:01 +0200] - CentOS-Directory/8.1.0 B2009.134.1334
> starting up
> [15/Jun/2010:11:00:01 +0200] - I'm resizing my cache now...cache was
> 20000000 and is now 8000000
> [15/Jun/2010:11:00:01 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:11:00:01 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:11:00:01 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:11:00:01 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:11:00:01 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:11:00:01 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:11:00:01 +0200] - skipping cos definition
> cn=nsAccountInactivation_cos,dc=directory,dc=inter n--no templates found
> [15/Jun/2010:11:00:01 +0200] NSMMReplicationPlugin -
> replica_check_for_data_reload: Warning: data for replica
> dc=directory,dc=intern was reloaded and it no longer matches the data
> in the changelog (replica data > changelog). Recreating the changelog
> file. This could affect replication with replica's consumers in which
> case the consumers should be reinitialized.
> [15/Jun/2010:11:00:01 +0200] - skipping cos definition
> cn=nsAccountInactivation_cos,dc=directory,dc=inter n--no templates found
> [15/Jun/2010:11:00:01 +0200] - slapd started. Listening on All
> Interfaces port 389 for LDAP requests
> [15/Jun/2010:11:00:01 +0200] - Listening on All Interfaces port 636
> for LDAPS requests
> [15/Jun/2010:11:29:59 +0200] - slapd shutting down - signaling
> operation threads
> [15/Jun/2010:11:29:59 +0200] - slapd shutting down - closing down
> internal subsystems and plugins
> [15/Jun/2010:11:30:00 +0200] - Waiting for 4 database threads to stop
> [15/Jun/2010:11:30:00 +0200] - All database threads now stopped
> [15/Jun/2010:11:30:00 +0200] - slapd stopped.
> CentOS-Directory/8.1.0 B2009.134.1334
> <host>:<port> (/etc/dirsrv/slapd-serverb55)
>
> [15/Jun/2010:11:30:03 +0200] - Entry "cn=schema" single-valued
> attribute "modifyTimestamp" has multiple values
> CentOS-Directory/8.1.0 B2009.134.1334
> serverb55:636 (/etc/dirsrv/slapd-serverb55)
>
> [15/Jun/2010:11:30:03 +0200] - CentOS-Directory/8.1.0 B2009.134.1334
> starting up
> [15/Jun/2010:11:30:03 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:11:30:03 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:11:30:03 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:11:30:03 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:11:30:03 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:11:30:03 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:11:30:03 +0200] - skipping cos definition
> cn=nsAccountInactivation_cos,dc=directory,dc=inter n--no templates found
> [15/Jun/2010:11:30:03 +0200] - skipping cos definition
> cn=nsAccountInactivation_cos,dc=directory,dc=inter n--no templates found
> [15/Jun/2010:11:30:03 +0200] - slapd started. Listening on All
> Interfaces port 389 for LDAP requests
> [15/Jun/2010:11:30:03 +0200] - Listening on All Interfaces port 636
> for LDAPS requests
> [15/Jun/2010:11:40:44 +0200] - Beginning export of 'userroot'
> [15/Jun/2010:11:40:44 +0200] - export userRoot: Processed 139 entries
> (100%).
> [15/Jun/2010:11:40:44 +0200] - Export finished.
> [15/Jun/2010:11:46:12 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=directory,dc=intern is going
> offline; disabling replication
> [15/Jun/2010:11:46:12 +0200] - attrcrypt_unwrap_key: failed to unwrap
> key for cipher AES
> [15/Jun/2010:11:46:12 +0200] - Failed to retrieve key for cipher AES
> in attrcrypt_cipher_init
> [15/Jun/2010:11:46:12 +0200] - Failed to initialize cipher AES in
> attrcrypt_init
> [15/Jun/2010:11:46:12 +0200] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to
> access the database
> [15/Jun/2010:11:46:14 +0200] - libdb: page 1: illegal page type or format
> [15/Jun/2010:11:46:14 +0200] - libdb: PANIC: Invalid argument
> [15/Jun/2010:11:46:14 +0200] - FATAL ERROR at by MCC ou=people
> dc=directory dc=intern (77); server stopping as database recovery needed.
> CentOS-Directory/8.1.0 B2009.134.1334
> <host>:<port> (/etc/dirsrv/slapd-serverb55
> ================================================== ====================
>
> Finally my questions :
> What could be the cause of the problem ?
Not sure - never seen this particular error before - any disk errors in
/var/log/messages?

Can you try doing a reinit of this server from the other server?
> What would be the best procedure to get the serverb55 up and running
> again ?
Doing a reinit of this server from the other server.
>
> Thanks for any advise.
>
> Regards,
> Mark
>
>
>
> =======
>
>
> On Tue, Jun 15, 2010 at 7:04 PM, Rich Megginson <rmeggins@redhat.com
> <mailto:rmeggins@redhat.com>> wrote:
>
> mark benschop wrote:
> > Hi All,
> >
> > I'm having a problem on a CentOs Directory Server 8.1 multiple
> master
> > setup.
> > The database of one of the servers has been marked as corrupt
> and has
> > been brought offline by the Directory Server.
> Can you post any relevant error messages from the error log of the
> server?
> > Ldapclients querying the ldapserver for e.g. loggin in of users
> get an
> > errormessage, effectively disabling users to log in.
> What error message?
> >
> > I'm wondering what the best method is to recover from this
> situation.
> > I can think of a few :
> > 1) Starting the ldapserver, deleting the database, recreating it and
> > restoring a backup.
>
> > 2) Starting the ldapserver, deleting the database and reinitialising
> > the server from the other master.
> If you reinitialize the problem server from another server, you don't
> need to delete the database, reinit will do that for you.
> >
> > Can anyone give me some hints if this wil work or would another
> > approach be better ?
> >
> > Thanks for your advise,
> > Mark
> >
> ------------------------------------------------------------------------
> >
> > --
> > 389 users mailing list
> > 389-users@lists.fedoraproject.org
> <mailto:389-users@lists.fedoraproject.org>
> > https://admin.fedoraproject.org/mailman/listinfo/389-users
>
> --
> 389 users mailing list
> 389-users@lists.fedoraproject.org
> <mailto:389-users@lists.fedoraproject.org>
> https://admin.fedoraproject.org/mailman/listinfo/389-users
>
>
> ------------------------------------------------------------------------
>
> --
> 389 users mailing list
> 389-users@lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users

--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
 

Thread Tools




All times are GMT. The time now is 08:23 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org