04-27-2012, 09:10 PM
Konstantin Svist

Memory loss after long uptime

Hi all,

I have a strange recurring problem on some of my servers; maybe someone
can help me figure out what might be causing it.


After running a MySQL machine under fairly high load for a while
(about a month), RAM usage in top stops making sense. Normally, the RES
column accounts for everything currently resident in RAM (not swap) and
corresponds pretty well with mem_used - buffers - cache. This is always the
case after a fresh boot and seems to be the case on most other servers.

But that's not the case here:

59098540k used - 1633832k cached - 25824k buffers = 57438884k (54.7G)
supposedly used by all processes

sum(RES column from top) ~= 35127296k (33.5G)

So where's the other 21G??

I'm pretty sure I'm not nitpicking here: this is >30% of total system
RAM unaccounted for. I've tried stopping all non-OS-specific processes
(and restarting some services, such as irqbalance, that seemed to eat more
RAM than they should have), and the memory is not reclaimed.


Over the few years that I've seen this problem, I've already replaced
all the hardware and upgraded Fedora from 8 to 14 (currently running
2.6.35.14-95.fc14.x86_64) and MySQL and other code without any sign of
improvement.


Interestingly, another machine with the same hardware & OS runs MySQL in
slave mode to replicate the DB. That machine has an uptime of 134 days
and does not exhibit the same symptoms. In fact, here is its memory footprint:


63166496k used - 19786032k cached - 2881484k buffers = 40498980k (38.6G)
sum(RES column from top) ~= 45.5G (which makes sense since a few
RAM-hungry processes share memory)
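
The subtraction in the two comparisons above can be reproduced with a short Python sketch. The figures are the ones quoted in this post; the function name `memory_gap` is made up for illustration, and the replica's ~45.5G RES total is converted to KiB on the assumption that "G" here means GiB.

```python
KIB_PER_GIB = 1024 * 1024  # top reports memory in KiB ("k")


def memory_gap(used_k, cached_k, buffers_k, res_sum_k):
    """Return (apparent process memory, part not covered by RES), both in KiB."""
    apparent_k = used_k - cached_k - buffers_k
    return apparent_k, apparent_k - res_sum_k


# Problem machine: figures from the top header and the RES sum quoted above.
problem_apparent_k, problem_gap_k = memory_gap(59098540, 1633832, 25824, 35127296)

# Replica: the RES sum was given as ~45.5G; converting it to KiB is an assumption.
replica_res_k = int(45.5 * KIB_PER_GIB)
replica_apparent_k, replica_gap_k = memory_gap(
    63166496, 19786032, 2881484, replica_res_k
)

for name, apparent_k, gap_k in (
    ("problem machine", problem_apparent_k, problem_gap_k),
    ("replica", replica_apparent_k, replica_gap_k),
):
    print(
        f"{name}: apparent {apparent_k / KIB_PER_GIB:.1f}G, "
        f"not covered by RES {gap_k / KIB_PER_GIB:.1f}G"
    )
```

A positive gap is memory that no process's RES explains. A negative gap is what one would expect when shared pages are counted in the RES of several processes.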


Please help!

top - 13:19:00 up 41 days, 2:29, 25 users, load average: 3.44, 4.32, 3.97
Tasks: 349 total, 2 running, 347 sleeping, 0 stopped, 0 zombie
Cpu0 : 76.5%us, 0.0%sy, 0.0%ni, 23.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 5.9%sy, 0.0%ni, 29.4%id, 64.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 5.9%us, 0.0%sy, 0.0%ni, 88.2%id, 5.9%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 5.6%us, 0.0%sy, 0.0%ni, 94.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 5.3%us, 5.3%sy, 15.8%ni, 73.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 66110332k total, 59098540k used, 7011792k free, 25824k buffers
Swap: 8388604k total, 4009604k used, 4379000k free, 1633832k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3953 mysql 20 0 34.7g 30g 4000 S 80.0 48.5 57931:46 /usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/var --user=mysql --log-error=/usr/loc
3132 wrkr 20 0 2931m 2.4g 1476 R 5.7 3.8 1219:32 /usr/local/bin/python ...
10921 wrkr 20 0 9184m 483m 3080 S 0.0 0.7 43:03.02 /usr/local/bin/python ...
26682 wrkr 30 10 261m 118m 1048 S 11.4 0.2 11:47.18 mytop
2548 wrkr 20 0 520m 94m 1388 S 0.0 0.1 116:57.50 /usr/local/bin/python ...
4046 wrkr 20 0 181m 59m 228 S 0.0 0.1 17:30.62 SCREEN -a -A
3009 redis 20 0 132m 45m 380 S 0.0 0.1 974:29.29 /usr/local/redis/bin/redis-server ...conf
4228 nobody 20 0 302m 32m 132 S 0.0 0.1 0:14.33 nginx: worker process
4232 nobody 20 0 302m 32m 44 S 0.0 0.1 0:10.04 nginx: worker process
4230 nobody 20 0 302m 32m 0 S 0.0 0.1 0:14.73 nginx: worker process
4226 nobody 20 0 302m 32m 0 S 0.0 0.1 0:14.49 nginx: worker process
4231 nobody 20 0 302m 32m 56 S 0.0 0.1 0:13.01 nginx: worker process
4233 nobody 20 0 302m 32m 60 S 0.0 0.1 0:14.36 nginx: worker process
4235 nobody 20 0 302m 32m 32 S 0.0 0.1 0:14.35 nginx: worker process
4229 nobody 20 0 298m 32m 0 S 0.0 0.1 0:15.37 nginx: worker process
4227 nobody 20 0 302m 32m 12 S 0.0 0.1 0:15.49 nginx: worker process
4234 nobody 20 0 302m 32m 0 S 0.0 0.1 0:15.46 nginx: worker process
10300 root 20 0 298m 32m 0 S 0.0 0.1 0:00.28 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
3185 wrkr 20 0 255m 17m 416 S 0.0 0.0 18:12.48 /usr/local/bin/python ...
3067 wrkr 20 0 247m 12m 0 S 0.0 0.0 12:23.26 /usr/local/bin/python ...
5186 gdm 20 0 443m 9644 28 S 0.0 0.0 1:26.96 /usr/libexec/gnome-settings-daemon --gconf-prefix=/apps/gdm/simple-greeter/settings-manager-plugins
4031 root 20 0 133m 9564 684 S 0.0 0.0 2:32.52 /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-205vfs/database -nolisten tcp vt1
5280 gdm 20 0 547m 5728 2724 S 0.0 0.0 2:38.00 /usr/libexec/gdm-simple-greeter
30374 root 25 5 130m 5476 1716 S 0.0 0.0 0:00.21 /usr/local/mysql/bin/mysql --port=3306 ...
3740 root 20 0 29888 3228 0 S 0.0 0.0 0:00.05 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/usr/local/mysql/var --pid-file=
2019 root 20 0 200m 3028 672 S 0.0 0.0 20:43.13 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
27868 root 20 0 22632 2816 0 S 0.0 0.0 0:00.03 tail -F /usr/local/mysql/var/log-error.err
23248 root 20 0 22632 2704 0 S 0.0 0.0 0:00.00 tail -F /usr/local/mysql/var/log-error.err
30435 root 25 5 127m 2560 1716 S 0.0 0.0 0:00.02 /usr/local/mysql/bin/mysql --port=3306 ...
5116 gdm 20 0 139m 2320 244 S 0.0 0.0 0:14.86 /usr/libexec/gconfd-2
26225 root 20 0 129m 2164 980 S 0.0 0.0 0:00.18 zsh
4357 root 20 0 130m 2140 680 S 0.0 0.0 0:01.55 /bin/zsh
5266 gdm 20 0 390m 1912 392 S 0.0 0.0 0:21.57 metacity
15765 root 20 0 130m 1732 532 S 0.0 0.0 0:00.34 /bin/zsh
28228 root 20 0 94312 1592 524 S 0.0 0.0 0:01.02 sendmail: accepting connections
4312 root 20 0 130m 1396 60 S 0.0 0.0 0:00.86 /bin/zsh
30437 root 20 0 15272 1332 828 R 11.4 0.0 0:00.10 top
30370 root 25 5 107m 1320 1124 S 0.0 0.0 0:00.00 /bin/bash /opt/...sh
30431 root 25 5 107m 1320 1124 S 0.0 0.0 0:00.00 /bin/bash /opt/...sh
11212 gdm 20 0 343m 1300 240 S 0.0 0.0 0:00.91 /usr/bin/gnome-screensaver --no-daemon
4270 root 20 0 130m 1252 4 S 0.0 0.0 0:03.06 /bin/zsh
27698 smmsp 20 0 75960 1236 372 S 0.0 0.0 0:00.00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
27407 root 20 0 114m 1172 864 S 0.0 0.0 0:14.99 /bin/zsh /opt/...sh
27408 root 20 0 114m 1144 864 S 0.0 0.0 0:07.28 /bin/zsh /opt/...sh
--
users mailing list
users@lists.fedoraproject.org
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org
 
04-27-2012, 10:52 PM
Rick Stevens

We've seen this before in some of our MySQL platforms. It appears that
some of the older MySQL servers hemorrhage shared memory segments which
will not be reclaimed when the process is terminated.

You might try having a look at the output of ipcs after stopping MySQL
and see if your missing memory is in one or more of the shm segments.
If so, you can reclaim it by using "ipcrm -m <shmid>". You'd be
surprised at how many programs don't release IPC resources--especially
if they are rudely terminated (e.g. SIGSEGV or SIGKILL).
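
As a rough way to total what ipcs would report, here is a minimal Python sketch that sums SysV shared memory segments from text in the format of /proc/sysvipc/shm. It assumes the file's header row names the columns `shmid`, `size`, and `nattch`; the sample text and the function name `shm_segments` are hypothetical, not output from the machines discussed in this thread.

```python
def shm_segments(text):
    """Parse /proc/sysvipc/shm-style text into (shmid, size_bytes, nattch) tuples."""
    lines = text.strip().splitlines()
    if not lines:
        return []
    header = lines[0].split()
    # Look columns up by name, since the set of columns can vary between kernels.
    id_col, size_col, nattch_col = (
        header.index(name) for name in ("shmid", "size", "nattch")
    )
    segments = []
    for line in lines[1:]:
        fields = line.split()
        segments.append(
            (int(fields[id_col]), int(fields[size_col]), int(fields[nattch_col]))
        )
    return segments


# Hypothetical sample; on a real system, read the text from /proc/sysvipc/shm.
SAMPLE = """\
key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime
0 32768 1600 393216 4031 5280 2 0 0 0 0 0 0 0
0 65537 1600 1048576 3953 3953 0 27 27 27 27 0 0 0
"""

segments = shm_segments(SAMPLE)
total_bytes = sum(size for _, size, _ in segments)
# Segments with no attached process are the likely candidates for "ipcrm -m <shmid>".
unattached = [shmid for shmid, _, nattch in segments if nattch == 0]
print(f"{len(segments)} segments, {total_bytes} bytes total, unattached: {unattached}")
```

If the total is far smaller than the missing memory, leaked shm segments are probably not the explanation.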

Also have a serious look at updating your MySQL platform if possible.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital ricks@alldigital.com -
- AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 -
- -
- Try to look unimportant. The bad guys may be low on ammo. -
----------------------------------------------------------------------
 
04-28-2012, 01:20 AM
Konstantin Svist

Thanks Rick,

I tried ipcs on a few machines with the same symptoms, but it doesn't look
like there's a lot of shared memory listed there.
The MySQL machine had already been rebooted, so I'll keep checking it...
but it doesn't look promising so far.



 
04-28-2012, 05:02 AM
Chris Adams

Once upon a time, Rick Stevens <ricks@alldigital.com> said:
> You might try having a look at the output of ipcs after stopping MySQL
> and see if your missing memory is in one or more of the shm segments.
> If so, you can reclaim it by using "ipcrm -m <shmid>". You'd be
> surprised at how many programs don't release IPC resources--especially
> if they are rudely terminated (e.g. SIGSEGV or SIGKILL).

Are you sure? I've been running MySQL for a long time, and I don't
remember it ever using SysV IPC. I certainly don't see it using that
now, even on an old version I still have running (much older than the
OP's F14).

To the OP: if you think you are seeing RAM in use that isn't reflected
when comparing the output of "free" to "ps"/"top" process usage, it
could be in other kernel buffers. Check out "slabtop" (has to run as
root); there are other kernel caches that "free" doesn't know about,
especially the dentry and inode caches. These will show up as just more
kernel RAM in use, but really they are caches that should be discarded
as needed (just like the old buffers/cache lines in "free").
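
To see how much memory the kernel's slab caches hold without running slabtop, the Slab, SReclaimable, and SUnreclaim lines of /proc/meminfo can be read directly. A minimal Python sketch follows; the sample values and the function name `parse_meminfo` are hypothetical, chosen only to illustrate the parsing, and are not taken from the OP's machine.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.split():
            continue  # skip blank or malformed lines
        values[name.strip()] = int(rest.split()[0])
    return values


# Hypothetical sample; on a real system, read the text from /proc/meminfo.
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:         1000000 kB
Buffers:          100000 kB
Cached:           900000 kB
Slab:             600000 kB
SReclaimable:     150000 kB
SUnreclaim:       450000 kB
"""

info = parse_meminfo(SAMPLE)
print(f"slab total:    {info['Slab']} kB")
print(f"reclaimable:   {info['SReclaimable']} kB")
print(f"unreclaimable: {info['SUnreclaim']} kB")
```

A large SReclaimable figure would point at caches such as dentry and inode that the kernel can shrink under memory pressure; a large SUnreclaim figure would not.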
--
Chris Adams <cmadams@hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.
 
04-30-2012, 05:55 PM
Konstantin Svist

Here's slabtop from one machine that's using too much RAM:

Active / Total Objects (% used) : 8516783 / 8967394 (95.0%)
Active / Total Slabs (% used) : 270701 / 270701 (100.0%)
Active / Total Caches (% used) : 68 / 101 (67.3%)
Active / Total Size (% used) : 5113995.32K / 5341810.93K (95.7%)
Minimum / Average / Maximum Object : 0.01K / 0.59K / 8.00K

   OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
3836868 3691157  96%    1.00K 119903       32   3836896K kmalloc-1024
3624368 3573469  98%    0.25K 113262       32    906096K kmalloc-256
 570560  553390  96%    0.50K  17830       32    285280K kmalloc-512
 197526  196494  99%    0.55K   3468       57    110976K radix_tree_node
 140992  114638  81%    0.12K   4406       32     17624K kmalloc-128
 113856   29876  26%    0.06K   1779       64      7116K kmalloc-64
  40448   40445  99%    0.01K     79      512       316K kmalloc-8
  35904   35485  98%    0.08K    704       51      2816K sysfs_dir_cache
  32724   26778  81%    1.69K   1818       18     58176K TCP
  29190   16929  57%    0.19K    695       42      5560K dentry
  26112   16386  62%    0.02K    102      256       408K ext4_io_page
...


Looks like the kmalloc-##s are taking up most of the space.
Question is, what (if anything) can be done about this on a running
system? Any way to reclaim that memory?
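
To put a number on that observation, the rows shown above can be totaled per cache family. This Python sketch parses the slabtop rows as pasted in this post (only the rows shown, since the list was cut off at "..."); the function name `cache_sizes` is made up for illustration.

```python
# Data rows copied from the slabtop output above (the listing was truncated).
ROWS = """\
3836868 3691157 96% 1.00K 119903 32 3836896K kmalloc-1024
3624368 3573469 98% 0.25K 113262 32 906096K kmalloc-256
570560 553390 96% 0.50K 17830 32 285280K kmalloc-512
197526 196494 99% 0.55K 3468 57 110976K radix_tree_node
140992 114638 81% 0.12K 4406 32 17624K kmalloc-128
113856 29876 26% 0.06K 1779 64 7116K kmalloc-64
40448 40445 99% 0.01K 79 512 316K kmalloc-8
35904 35485 98% 0.08K 704 51 2816K sysfs_dir_cache
32724 26778 81% 1.69K 1818 18 58176K TCP
29190 16929 57% 0.19K 695 42 5560K dentry
26112 16386 62% 0.02K 102 256 408K ext4_io_page
"""


def cache_sizes(rows):
    """Return {cache name: cache size in KiB} from slabtop data rows."""
    sizes = {}
    for line in rows.splitlines():
        fields = line.split()
        # A data row has 8 columns, with CACHE SIZE (e.g. "906096K") in the 7th.
        if len(fields) != 8 or not fields[6].endswith("K"):
            continue
        sizes[fields[7]] = int(fields[6][:-1])
    return sizes


sizes = cache_sizes(ROWS)
kmalloc_k = sum(size for name, size in sizes.items() if name.startswith("kmalloc-"))
listed_k = sum(sizes.values())
print(f"kmalloc-* caches: {kmalloc_k}K of {listed_k}K listed "
      f"({100 * kmalloc_k / listed_k:.1f}%)")
```

By this count the kmalloc-* caches hold about 4.8G of the roughly 5.1G total slab size reported in the summary lines.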



 
