04-16-2012, 05:59 PM
Russell Beall

memory consumption

Hi,
I have been working with 389 as a potential replacement for Sun DS and I have found it to be an excellent choice in every aspect except the final tests I have been running.
I am running with the current version for RedHat 6, but not the latest from the rmeggins repo:

Name    : 389-ds
Arch    : noarch
Version : 1.2.2

Name    : 389-ds-base
Arch    : x86_64
Version : 1.2.9.14
I have searched through all the release notes and part of the 389-users list archive for clues about possible memory leaks and/or patches in the latest releases, but no information has been forthcoming.
The behavior I am seeing is total memory consumption occurring over a large quantity of ldapmodify operations. To test this, I reduced the size of the directory from 10GB down to about 1.7GB. Then I set up a loop that would run an ldapmodify that would delete most of those entries, followed by an ldapmodify that would add all those deleted entries back in (and then repeat indefinitely).
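
In outline, the loop is something like this (a sketch only; delete.ldif and add.ldif are stand-in names for the changetype: delete records and the corresponding db2ldif export of the same entries):

#!/bin/sh
# Repeatedly delete most of the entries, then add them all back in.
while true; do
    ldapmodify -x -D "cn=directory manager" -w secret -f delete.ldif
    ldapmodify -a -x -D "cn=directory manager" -w secret -f add.ldif
done
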
The directory starts out by loading all the entries into the cache and using a few GB of RAM to hold everything. Eventually, this loop causes the entire 32GB of RAM to be consumed even though the total size of the directory does not change (e.g. currententrycachesize: 1791096898).
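
(The cache numbers come from the backend monitor entry; a base search along these lines shows them, assuming a backend named userRoot:)

ldapsearch -x -D "cn=directory manager" -w secret \
    -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" \
    -s base "(objectClass=*)" currententrycachesize maxentrycachesize currententrycachecount
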
Replication is enabled to a consumer and the change log is up to 7.4 GB, and it doesn't seem to want to clean out the change log even with short purge times and short entry lifetimes configured, so perhaps something is at issue here. The consumer has processed all updates and also seems to exhibit overconsumption of memory.
Are there any pointers related to this?
If there is no information about this, is there a documentation page that might instruct me in the correct way to attach valgrind to the ns-slapd process so I can see if there is some kind of huge leak?
Thanks very much,
Russ.

==============================
Russell Beall
Programmer Analyst IV
Enterprise Identity Management
University of Southern California
beall@usc.edu
==============================
 
04-16-2012, 08:01 PM
Rich Megginson

memory consumption

On 04/16/2012 11:59 AM, Russell Beall wrote:
> [...]
>
> Replication is enabled to a consumer and the change log is up to 7.4 GB, and it doesn't seem to want to clean out the change log even with short purge times and short entry lifetimes configured, so perhaps something is at issue here.

Can you post your purge/trimming configuration parameters?

> The consumer has processed all updates and also seems to exhibit overconsumption of memory.

Do you see any issue if you don't use replication at all? That is, is this issue related to replication?

> Are there any pointers related to this?

Are you seeing https://fedorahosted.org/389/ticket/51 ?

When you start to see memory growth, are you using all of your cache?

> If there is no information about this, is there a documentation page that might instruct me in the correct way to attach valgrind to the ns-slapd process so I can see if there is some kind of huge leak?

AFAIK you can't attach valgrind to a running process.

try this:

service dirsrv stop

( . /etc/sysconfig/dirsrv ; . /etc/sysconfig/dirsrv-INSTANCENAME ; \
  valgrind -q --tool=memcheck --leak-check=yes --leak-resolution=high \
  --num-callers=50 \
  --log-file=/var/log/dirsrv/slapd-INSTANCENAME/valgrind.log \
  /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-INSTANCENAME \
  -i /var/run/dirsrv/slapd-INSTANCENAME.pid \
  -w /var/run/dirsrv/slapd-INSTANCENAME.startpid -d 0 ) &

valgrind will log to /var/log/dirsrv/slapd-INSTANCENAME/valgrind.log



Note that running your server with valgrind will really cripple performance - it may be unusable in a production environment - and you may also run afoul of selinux.

valgrind will not report memory leaks until you shut down the server (just kill -15 <pid of ns-slapd or valgrind>).
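
Once the server exits, the leak summary is easy to pull out of the log, for example (a sketch against the log path above):

grep -A 8 "LEAK SUMMARY" /var/log/dirsrv/slapd-INSTANCENAME/valgrind.log
grep "definitely lost" /var/log/dirsrv/slapd-INSTANCENAME/valgrind.log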





 
04-16-2012, 09:22 PM
Russell Beall

memory consumption

On Apr 16, 2012, at 1:50 PM, Rich Megginson wrote:

> I would still like to know which parameters you set and the values you used.

When I first tried this, the change log was set to unlimited (the default), and the purge delay was set to 7 days, I think. I reduced the purge delay to about 15 minutes, and about 10 minutes for the change log entry age. These were changed using the console and then the server was restarted. I'm not sure of this exactly because those settings were deleted when I deleted the replication agreement and stopped the changelogging.

>> The consumer has processed all updates and also seems to exhibit overconsumption of memory.
>
> Do you see any issue if you don't use replication at all? That is, is this issue related to replication?

Yes, I haven't seen it jump up all the way yet, but it is up to 13GB after only a few loops.

This was without replication enabled.

When I re-enabled replication just now to see if the excess entries would be replicated to the consumer, I see that thousands of entries are being sent over during consumer initialization. It just finished replicating and sent 7639 entries. There are only about 800 valid entries. Searching from the base suffix for all available current DN values on the master results in only "# numEntries: 855".

> I wonder if they are tombstone entries? Are you doing ldapdelete operations at all?

Could be. Yes, the entries are deleted using ldapdelete over a file of DN values. I was incorrect when I mentioned that I was using ldapmodify in this case.

I'm not very familiar with the concept of "tombstone", but I used dbscan to look over the id2entry file, and it contains all 7639 entries that were being sent over. If the tombstone issue is causing the server to hold onto duplicate entries, then maybe I just have to run deletions in a different way...
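
(For reference, the dbscan pass was along these lines - a sketch, with the instance name and database path as assumptions for a stock layout:)

dbscan -f /var/lib/dirsrv/slapd-INSTANCENAME/db/userRoot/id2entry.db4 | grep -c nsTombstone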

After restarting the master and doing another consumer initialization, there seems to remain a record of these excess entries, and the consumer is being initialized with them even though the change log is currently empty at 16K.

> Consumer initialization does not use the changelog - it reads the entries directly from the database and sends them to the consumer.

Perhaps there is a problem with the way I am deleting and adding entries and they are being duplicated behind the scenes somehow??? I'm just using a db2ldif export of those entries and running ldapadd over that file...

>> Are there any pointers related to this?
>
> Are you seeing https://fedorahosted.org/389/ticket/51 ?
>
> When you start to see memory growth, are you using all of your cache?

I am not using MMR (pretty sure, unless that came enabled by default), nor am I using GSSAPI.

The memory usage pushes well beyond the cache size.

> The thing about ticket 51 (not sure if it is in the ticket, perhaps it is in the linked bugzilla bug) is that the memory growth is only seen _if the entry cache is maxed out_. That is, if you are able to keep the entry cache max size well above the actual amount of data used, you do not see the memory growth. We have also run valgrind but have not seen any "real" memory leaks. Note that entries stored in the entry cache will be reported as "leaks" because we do not free the entry cache at shutdown. Our best guess for the memory growth issues in ticket 51 is that either there is a very subtle memory leak related to entry "churn" as entries are evicted and stored in the entry cache, or the memory fragmentation that results from that churn.

Initially, the entry cache was set to 12G, far in excess of the database in its reduced form.

>> If there is no information about this, is there a documentation page that might instruct me in the correct way to attach valgrind to the ns-slapd process so I can see if there is some kind of huge leak?
>
> AFAIK you can't attach valgrind to a running process. try this: [valgrind instructions as above]

I'll give this a shot and see what happens.

Looks like we already have some kind of handle to the situation since the excess entries are already being reported by the entry caches.

> ? Not sure what you mean here.

This means that the server is creating and holding onto excess entries, and the server is reporting this fact. This would be a different type of leak than lost memory, which valgrind would be needed to see.

Regards,
Russ.

Thanks so much for your advice!
Russ.
 
04-16-2012, 09:51 PM
Rich Megginson

memory consumption

On 04/16/2012 03:22 PM, Russell Beall wrote:
> When I first tried this, the change log was set to unlimited (the default), and the purge delay was set to 7 days, I think. I reduced the purge delay to about 15 minutes, and about 10 minutes for the change log entry age. These were changed using the console and then the server was restarted. I'm not sure of this exactly because those settings were deleted when I deleted the replication agreement and stopped the changelogging.

ok - should be trimming something - not sure what's going on here - could be a bug

> Could be. Yes, the entries are deleted using ldapdelete over a file of DN values. I was incorrect when I mentioned that I was using ldapmodify in this case.

Ok, then they are definitely tombstone entries.

The purge setting controls how long tombstone entries are kept before they are cleaned up. There is a thread inside the directory server that runs every hour (by default) that cleans up tombstones that are older than the purge delay.
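
Tombstones can be listed directly by naming the nsTombstone objectclass in the filter, along these lines (a sketch; the suffix is an assumption, and 1.1 just suppresses attributes so only the DNs print):

ldapsearch -x -D "cn=directory manager" -w secret -b "dc=example,dc=com" \
    "(objectClass=nsTombstone)" 1.1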



http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html-single/Administration_Guide/index.html#Multi_Master_Replication-Configuring_the_Read_Write_Replicas_on_the_Supplier_Servers

http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#Replication_Attributes_under_cnreplica_cnsuffixName_cnmapping_tree_cnconfig-nsDS5ReplicaPurgeDelay

http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#Replication_Attributes_under_cnreplica_cnsuffixName_cnmapping_tree_cnconfig-nsDS5ReplicaTombstonePurgeInterval

Note that these parameters have nothing to do with the changelog - the changelog has its own separate trimming attributes:

http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#cnchangelog5-nsslapd_changelogmaxage_Max_Changelog_Age

http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#cnchangelog5-nsslapd_changelogmaxentries_Max_Changelog_Records
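
As an illustration of how the two kinds of trimming are set on different entries (a sketch; the suffix and the 7-day/1-day values are arbitrary):

# Tombstone/state purging is a replica attribute (value in seconds):
ldapmodify -x -D "cn=directory manager" -w secret <<EOF
dn: cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
changetype: modify
replace: nsDS5ReplicaPurgeDelay
nsDS5ReplicaPurgeDelay: 604800
EOF

# Changelog trimming lives on the changelog entry itself:
ldapmodify -x -D "cn=directory manager" -w secret <<EOF
dn: cn=changelog5,cn=config
changetype: modify
replace: nsslapd-changelogmaxage
nsslapd-changelogmaxage: 1d
EOF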






> Initially, the entry cache was set to 12G, far in excess of the database in its reduced form.

But do you eventually see the cache usage grow to at or near the max cache size?

> This means that the server is creating and holding onto excess entries, and the server is reporting this fact. This would be a different type of leak than lost memory, which valgrind would be needed to see.

The excess entries are probably tombstone entries - you should see roughly a tombstone entry being added every time you delete an entry.
 
04-16-2012, 10:11 PM
Russell Beall

memory consumption

Thanks so much for your detailed information and responsiveness.

It is encouraging to find this kind of support available to the user community, and it adds significantly to the likelihood of selecting this product as a replacement.

I'm going to rerun some tests at length, with and without the replication settings that cause tombstone creation. Now that I understand this concept, I can vary my tests to space things out a bit and give time for cleanup to happen, as well as see what happens to the cache. I'm not sure of the answer to your question of whether the entry cache was filled, though I believe it was. I'll have to hit it again over time and see what happens.

Thanks again,
Russ.