06-09-2008, 05:33 PM

2GB memory limit running fsck on a +6TB device

Dear Sirs,

Here is the scenario: a +6TB device on a 3ware 9550SX RAID controller, running
32-bit Debian Etch with a 2.6.25.4 kernel and the default e2fsprogs version,
"1.39+1.40-WIP-2006.11.14+dfsg-2etch1".

Running "tune2fs" shows that the filesystem is in the EXT3_ERROR_FS state, "clean
with errors":

# tune2fs -l /dev/sda4
tune2fs 1.40.10 (21-May-2008)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 7701b70e-f776-417b-bf31-3693dba56f86
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal dir_index filetype needs_recovery sparse_super large_file
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 792576000
Block count: 1585146848

It's a backup storage server with more than 113 million files; this is the
output of "df -i":

# df -i /backup/
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda4 792576000 113385959 679190041 15% /backup


Running fsck.ext3 or fsck.ext2 I get:

# fsck.ext3 /dev/sda4
e2fsck 1.40.10 (21-May-2008)
Adding dirhash hint to filesystem.

/dev/sda4 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error allocating directory block array: Memory allocation failed
e2fsck: aborted

Some strace output:

================================================================================
gettimeofday({1213032482, 940738}, NULL) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 0}, ru_stime={0, 16001}, ...}) = 0
write(1, "Pass 1: Checking ", 17) = 17
write(1, "inode", 5) = 5
write(1, "s, ", 3) = 3
write(1, "block", 5) = 5
write(1, "s, and sizes\n", 13) = 13
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x404fa000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x46376000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4c1f2000
mmap2(NULL, 198148096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x5206e000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x5dd66000
mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x63be2000
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x77488000) = 0x80ab000
mmap2(NULL, 1866375168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x90615000
munmap(0x90615000, 962560) = 0
munmap(0x90800000, 86016) = 0
mprotect(0x90700000, 135168, PROT_READ|PROT_WRITE) = 0
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
================================================================================

It appears that fsck is trying to use more than 2GB of memory to store the
inode table relationships. The system has 4GB of physical RAM and 4GB of swap.
Is there any way to limit the memory used by fsck, or any other way to check
this filesystem? Would running fsck from a 64-bit LiveCD solve the problem?

I also tried the latest stable e2fsprogs release, 1.40.10, and got the same
error :-/
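
As a rough check on the "more than 2GB" reading, the sizes in the strace output can simply be added up. This is a minimal sketch in Python that uses only the byte counts shown in the mmap2() calls; it ignores the program text, libraries, heap and stack, so it understates what the process really needs.

```python
GIB = 2**30

# Sizes in bytes of the anonymous mmap2() calls that succeeded in the strace output.
successful_maps = [99074048, 99074048, 99074048, 198148096, 99074048, 748892160]
# Size in bytes of the request that failed with ENOMEM.
failed_request = 1866240000

already_mapped = sum(successful_maps)
combined = already_mapped + failed_request

print(f"already mapped: {already_mapped} bytes ({already_mapped / GIB:.2f} GiB)")
print(f"failed request: {failed_request} bytes ({failed_request / GIB:.2f} GiB)")
print(f"combined:       {combined} bytes ({combined / GIB:.2f} GiB)")
```

The failing request on its own is under 2 GiB; it is the combination with what is already mapped that goes past that mark.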

Regards,

--
Santi Saez

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
 
06-09-2008, 09:33 PM
Theodore Tso

On Mon, Jun 09, 2008 at 07:33:48PM +0200, santi@usansolo.net wrote:
> It's a backup storage server, with more than 113 million files, this's the
> output of "df -i":
>
> Appears that fsck is trying to use more than 2GB memory to store inode
> table relationship. System has 4GB of physical RAM and 4GB of swap, is
> there anyway to limit the memory used by fsck or any solution to check this
> filesystem? Running fsck with a 64bit LiveCD will solve the problem?

Yes, running with a 64-bit Live CD is one way to solve the problem.

If you are using e2fsprogs 1.40.10, there is another solution that may
help. Create an /etc/e2fsck.conf file with the following contents:

[scratch_files]
directory = /var/cache/e2fsck

...and then make sure /var/cache/e2fsck exists by running the command
"mkdir /var/cache/e2fsck".

This will cause e2fsck to store certain data structures which grow
large with backup servers that have a vast number of hard-linked files
in /var/cache/e2fsck instead of in memory. This will slow down e2fsck
by approximately 25%, but for large filesystems where you couldn't
otherwise get e2fsck to complete because you're exhausting the 2GB VM
per-process limitation for 32-bit systems, it should allow you to run
through to completion.
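
The two steps above (write the config file, create the directory) could be scripted roughly as follows. This is only a sketch in Python: the helper name write_scratch_config is made up for illustration, and the dry run writes into a temporary directory rather than /etc and /var/cache, which would need root.

```python
import os
import tempfile

# Contents of e2fsck.conf as given in the message above.
CONFIG_TEXT = "[scratch_files]\ndirectory = {scratch_dir}\n"


def write_scratch_config(conf_path, scratch_dir):
    """Create scratch_dir and write an e2fsck.conf that points at it.

    Hypothetical helper, not part of e2fsprogs.
    """
    os.makedirs(scratch_dir, exist_ok=True)
    with open(conf_path, "w") as conf:
        conf.write(CONFIG_TEXT.format(scratch_dir=scratch_dir))
    return conf_path


if __name__ == "__main__":
    # Dry run under a temporary root. On a real system the paths would be
    # /etc/e2fsck.conf and /var/cache/e2fsck.
    root = tempfile.mkdtemp()
    path = write_scratch_config(
        os.path.join(root, "e2fsck.conf"),
        os.path.join(root, "cache", "e2fsck"),
    )
    with open(path) as conf:
        print(conf.read())
```

Putting the scratch directory on a different disk from the filesystem being checked is worth considering, since e2fsck will be reading one and writing the other at the same time.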

- Ted

 
06-09-2008, 09:50 PM
Andreas Dilger

On Jun 09, 2008 19:33 +0200, santi@usansolo.net wrote:
> That's the scenario: +6TB device on a 3ware 9550SX RAID controller, running
> Debian Etch 32bits, with 2.6.25.4 kernel, and defaults e2fsprogs version,
> "1.39+1.40-WIP-2006.11.14+dfsg-2etch1".
>
> Running "tune2fs" returns that filesystem is in EXT3_ERROR_FS state, "clean
> with errors":
>
> # tune2fs -l /dev/sda4
> tune2fs 1.40.10 (21-May-2008)
> Filesystem volume name: <none>
> Last mounted on: <not available>
> Filesystem UUID: 7701b70e-f776-417b-bf31-3693dba56f86
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal dir_index filetype needs_recovery
> sparse_super large_file
> Default mount options: (none)
> Filesystem state: clean with errors
> Errors behavior: Continue
> Filesystem OS type: Linux
> Inode count: 792576000
> Block count: 1585146848
>
> It's a backup storage server, with more than 113 million files, this's the
> output of "df -i":
>
> # df -i /backup/
> Filesystem Inodes IUsed IFree IUse% Mounted on
> /dev/sda4 792576000 113385959 679190041 15% /backup
>
>
> Running fsck.ext3 or fsck.ext2 I get:
>
> # fsck.ext3 /dev/sda4
> e2fsck 1.40.10 (21-May-2008)
> Adding dirhash hint to filesystem.
>
> /dev/sda4 contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes

I recall that e2fsck allocates on the order of 3 * block_count / 8 bytes,
and 5 * inode_count / 8 bytes, so in your case this is about:

(5 * 1585146848 + 3 * 792576000) / 8 = 1287932780 bytes = 1.2GB

at a minimum, but my estimates might be incorrect.
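
Note that the sentence above puts the factor 3 on the block count and 5 on the inode count, while the worked line puts 5 on the block count and 3 on the inode count. The sketch below, in Python, evaluates both readings with the counts from the tune2fs output; it does not claim which one matches what e2fsck really allocates.

```python
GIB = 2**30

block_count = 1585146848  # "Block count" from tune2fs -l
inode_count = 792576000   # "Inode count" from tune2fs -l


def bitmap_bytes(bits_per_block, bits_per_inode):
    """Bytes needed for the given number of bits per block and per inode."""
    return (bits_per_block * block_count + bits_per_inode * inode_count) // 8


as_described = bitmap_bytes(3, 5)  # coefficients as written in the sentence
as_computed = bitmap_bytes(5, 3)   # coefficients as written in the worked line

print(f"3 bits/block, 5 bits/inode: {as_described} bytes ({as_described / GIB:.2f} GiB)")
print(f"5 bits/block, 3 bits/inode: {as_computed} bytes ({as_computed / GIB:.2f} GiB)")
```

Either way the estimate lands a little above 1 GiB, so the conclusion drawn from it does not depend on which reading is right.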

> mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x404fa000

Judging by the return values of these functions, this is a 32-bit system,
and it is entirely possible that you are exceeding the per-process memory
allocation limit.

> mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x63be2000
> mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> -1, 0) = -1 ENOMEM (Cannot allocate memory)

Hmm, it seems a bit excessive to allocate 1.8GB in a single chunk.

> Error allocating directory block array: Memory allocation failed
> e2fsck: aborted

This message is a bit tricky to nail down because it doesn't exist anywhere
in the code directly. It is encoded into "e2fsck abbreviations", and
the expansion that is normally in the corresponding comment is different.
It is PR_1_ALLOCATE_DBCOUNT returned from the call chain:
ext2fs_init_dblist->
make_dblist->
ext2fs_get_num_dirs()

which is counting the number of directories in the filesystem, and allocating
two 12-byte array elements for each one. This implies you have 77M directories
in your filesystem, or an average of only 10 files per directory?
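
The 77M figure can be reproduced from the failing mmap2() size. The sketch below, in Python, takes the 24 bytes per directory from the description above as given and only does the division; it says nothing about how e2fsck arrives at its directory count.

```python
failed_request = 1866240000   # bytes, from the failing mmap2() call
bytes_per_directory = 2 * 12  # two 12-byte array elements per directory

directories = failed_request // bytes_per_directory

inode_count = 792576000       # total inodes, from tune2fs -l
inodes_in_use = 113385959     # IUsed, from df -i

print(f"implied directories:         {directories}")
print(f"total inodes per directory:  {inode_count / directories:.1f}")
print(f"in-use inodes per directory: {inodes_in_use / directories:.1f}")
```

The "about 10" ratio comes out when dividing the total inode count by the implied directory count; dividing the in-use inode count from df -i gives a much smaller number.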

> Appears that fsck is trying to use more than 2GB memory to store inode
> table relationship. System has 4GB of physical RAM and 4GB of swap, is
> there anyway to limit the memory used by fsck or any solution to check this
> filesystem?

I don't know offhand how important the dblist structure is, so I'm not
sure if there is a way to reduce the memory usage for it. I believe
that in low-memory situations it is possible to use tdb in newer versions
of e2fsck for the dblist, but I don't know much of the details.

> Running fsck with a 64bit LiveCD will solve the problem?

Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
for e2fsck and be able to check the filesystem.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

 
06-09-2008, 10:08 PM
Carlo Wood

On Mon, Jun 09, 2008 at 03:50:32PM -0600, Andreas Dilger wrote:
> > Running fsck with a 64bit LiveCD will solve the problem?
>
> Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
> for e2fsck and be able to check the filesystem.

We had a similar problem with ext3grep.
You have to realize that every mmap uses memory
address space, even if it's a map to disk.
Therefore, on a 32-bit machine, if the total
of all normal allocations plus all simultaneous
mmaps exceeds 4GB then you "run out of memory",
even if, say, only 1GB is really allocated
and >3GB of the disk is mmap-ed.

In that case a 64-bit machine would solve the
problem because then all RAM (2GB, I read in
the Subject) can be used for normal allocations
while any disk mmap has a cazillion bytes of
address space left for itself.

--
Carlo Wood <carlo@alinoe.com>

 
06-09-2008, 10:37 PM
Theodore Tso

On Mon, Jun 09, 2008 at 03:50:32PM -0600, Andreas Dilger wrote:
> This message is a bit tricky to nail down because it doesn't exist anywhere
> in the code directly. It is encoded into "e2fsck abbreviations", and
> the expansion that is normally in the corresponding comment is different.
> It is PR_1_ALLOCATE_DBCOUNT returned from the call chain:
> ext2fs_init_dblist->
> make_dblist->
> ext2fs_get_num_dirs()
>
> which is counting the number of directories in the filesystem, and allocating
> two 12-byte array element for each one. This implies you have 77M directories
> in your filesystem, or an average of only 10 files per directory?

There are a number of backup solutions that use hard links to conserve
space between incremental snapshots. So yeah, with these workloads
you'll see something like 80-85M inodes, of which 77M-odd will be
directories. e2fsck tries to optimize memory use by observing that on
most normal filesystems, most files have an n_link count of 1, which is
NOT true on filesystems used for backups. Combine that with the vast
number of directories on these filesystems, and e2fsck's tricks to
optimize for speed by caching information to avoid re-reading it from
disk end up costing a large amount of memory.

> I don't know offhand how important the dblist structure is, so I'm not
> sure if there is a way to reduce the memory usage for it. I believe
> that in low-memory situations it is possible to use tdb in newer versions
> of e2fsck for the dblist, but I don't know much of the details.

Yep, please see the [scratch_files] section in e2fsck.conf. It is
described in the e2fsck.conf(5) man page.

- Ted

 
06-09-2008, 10:57 PM
Andreas Dilger

On Jun 09, 2008 18:37 -0400, Theodore Ts'o wrote:
> On Mon, Jun 09, 2008 at 03:50:32PM -0600, Andreas Dilger wrote:
> > I don't know offhand how important the dblist structure is, so I'm not
> > sure if there is a way to reduce the memory usage for it. I believe
> > that in low-memory situations it is possible to use tdb in newer versions
> > of e2fsck for the dblist, but I don't know much of the details.
>
> Yep, please see [scratch_files] section in e2fsck.conf. It is
> described in the e2fsck.conf(5) man page.

Hmm, maybe if the ext2fs_init_dblist() function returns PR_1_ALLOCATE_DBCOUNT
this should be a user-fixable problem that asks if the user wants to use
an on-disk tdb file in /var/tmp, and if that is a "no" then point them at
the right section in /etc/e2fsck.conf?

I don't think it is reasonable to default to using /tmp, because it might
be a RAM-backed filesystem, and I suspect in most cases the root filesystem
will not run out of memory in this way... Even if it fails because /var/tmp
is read-only, or too small, it is no worse off than it is today.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

 
06-10-2008, 03:36 AM
Greg Trounson

Andreas Dilger wrote:
> On Jun 09, 2008 19:33 +0200, santi@usansolo.net wrote:
...
>> Running fsck with a 64bit LiveCD will solve the problem?
> Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
> for e2fsck and be able to check the filesystem.

Couldn't you achieve the same thing just by enabling PAE on your 32-bit kernel?

Greg

 
06-10-2008, 01:18 PM
Theodore Tso

On Tue, Jun 10, 2008 at 03:36:52PM +1200, Greg Trounson wrote:
> Andreas Dilger wrote:
>> On Jun 09, 2008 19:33 +0200, santi@usansolo.net wrote:
> ...
>>> Running fsck with a 64bit LiveCD will solve the problem?
>> Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
>> for e2fsck and be able to check the filesystem.
>
> Couldn't you achieve the same thing just by enabling PAE on your 32-bit
> kernel?

No, that doesn't increase the amount of address space available to the
user process, which is the limitation here. You can have 16 GB of
physical memory, but 2**32 is still 4GB, and the kernel needs address
space, so that means userspace will have at most 3GB of space to
itself.
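
The arithmetic behind this can be written out as a small Python sketch. Two things here are assumptions rather than statements from this thread: the exact 3 GiB user / 1 GiB kernel split (consistent with the "at most 3GB" above, though kernels can be configured differently), and the 36-bit physical address width usually associated with x86 PAE.

```python
GIB = 2**30


def user_address_space(address_bits, kernel_reserved):
    """Virtual address space left for a single process, in bytes."""
    return 2**address_bits - kernel_reserved


# Assumption: 1 GiB of the 32-bit virtual address space is reserved for the kernel.
user_32bit = user_address_space(32, 1 * GIB)

# Assumption: PAE widens physical addresses to 36 bits. The virtual addresses
# a process uses stay 32 bits wide.
pae_physical_limit = 2**36

for physical_ram_gib in (4, 16):
    # The per-process figure does not depend on how much RAM is installed.
    print(f"{physical_ram_gib} GiB RAM installed: {user_32bit / GIB:.0f} GiB usable per process")

print(f"PAE physical limit: {pae_physical_limit // GIB} GiB")
```

The installed RAM never enters the per-process calculation, which is why enabling PAE does not help a single e2fsck process.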

- Ted

 
06-10-2008, 03:34 PM

On Mon, 9 Jun 2008 17:33:20 -0400, Theodore Tso <tytso@mit.edu> wrote:

> If you are using e2fsprogs 1.40.10, there is another solution that may
> help. Create an /etc/e2fsck.conf file with the following contents:
>
> [scratch_files]
> directory = /var/cache/e2fsck

(..)

> This will cause e2fsck to store certain data structures which grow
> large with backup servers that have a vast number of hard-linked files
> in /var/cache/e2fsck instead of in memory. This will slow down e2fsck
> by approximately 25%, but for large filesystems where you couldn't
> otherwise get e2fsck to complete because you're exhausting the 2GB VM
> per-process limitation for 32-bit systems, it should allow you to run
> through to completion.

I'm trying with fsck.ext3 v1.40.8, backported from Lenny's package to Etch,
instead of v1.40.10, because we have the same scenario on all our backup
servers running BackupPC and the package must be distributed. If needed, we
can test with the latest version ;-)

fsck.ext3 started 4 hours ago and is still in "Pass 1: Checking inodes,
blocks, and sizes". Is that normal, given that the filesystem has more than
113 million inodes?

I will send more info, as Ted requested in "Call for testers w/ using
BackupPC" [1], but for now this is the scenario:

- fsck.ext3 is using more than 2GB of memory and no swap; the server has 4GB
of physical RAM + 2GB of swap. This is the output of "pmap -d" with the
memory map:

# pmap -d 7014
7014: fsck.ext3 -y /dev/sda4
Address Kbytes Mode Offset Device Mapping
(..)
242fd000 1834768 rw--- 00000000242fd000 000:00000 [ anon ]
942c2000 582604 rw--- 00000000942c2000 000:00000 [ anon ]
(..)

All the output is available at: http://pastebin.com/f67115de2


- The files in "/var/cache/e2fsck" appear to grow very slowly, approximately
300KB per hour I think. These are the current sizes:

# ls -lh /var/cache/e2fsck/
total 170M
-rw------- 1 root root 76M 2008-06-10 17:24 7701b70e-f776-417b-bf31-3693dba56f86-dirinfo-VkmFXP
-rw------- 1 root root 95M 2008-06-10 17:24 7701b70e-f776-417b-bf31-3693dba56f86-icount-YO08bu


- fsck is using 100% of one CPU (it's a dual-processor motherboard); strace
output is available at:

http://pastebin.com/f68389cce


- More info:
* Kernel 2.6.25.4, i686 arch on a Debian Etch box.
* Storage: 3ware 9550SXU-16ML, 5.91 TB in a RAID-5 with 14 500GB SATA
disks (ST3500630AS), 64kB stripe size (array is in optimal state)


Thanks to all for the advice :-)

[1] http://www.redhat.com/archives/ext3-users/2007-April/msg00017.html

--
Santi Saez

 
06-10-2008, 06:38 PM
Theodore Tso

On Tue, Jun 10, 2008 at 05:34:35PM +0200, santi@usansolo.net wrote:
>
> fsck.ext3 started 4 hours ago, and still is in "Pass 1: Checking inodes,
> blocks, and sizes", that's normal knowing that the filesystem has +113
> million inodes?
>

It depends on a lot of things: how big your files are on average, the
speed of your hard drive, and whether /var/cache/e2fsck is on the same
disk as the partition which you are checking, or on a separate spindle
(guess which is better :-).

It's always a good idea when running e2fsck (aka fsck.ext3) directly
and/or on a terminal/console to include as command-line options "-C
0". This will display a progress bar, so you can gauge how it is
doing. (0 through 70% is pass 1, which requires scanning the inode
table and following all of the indirect blocks.)

- Ted
