2GB memory limit running fsck on a +6TB device
Dear Sirs,

This is the scenario: a +6TB device on a 3ware 9550SX RAID controller, running 32-bit Debian Etch with a 2.6.25.4 kernel and the default e2fsprogs version, "1.39+1.40-WIP-2006.11.14+dfsg-2etch1".

Running "tune2fs" reports that the filesystem is in the EXT3_ERROR_FS state, "clean with errors":

# tune2fs -l /dev/sda4
tune2fs 1.40.10 (21-May-2008)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          7701b70e-f776-417b-bf31-3693dba56f86
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal dir_index filetype needs_recovery sparse_super large_file
Default mount options:    (none)
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              792576000
Block count:              1585146848

It's a backup storage server with more than 113 million files; this is the output of "df -i":

# df -i /backup/
Filesystem            Inodes     IUsed     IFree IUse% Mounted on
/dev/sda4          792576000 113385959 679190041   15% /backup

Running fsck.ext3 or fsck.ext2 I get:

# fsck.ext3 /dev/sda4
e2fsck 1.40.10 (21-May-2008)
Adding dirhash hint to filesystem.

/dev/sda4 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error allocating directory block array: Memory allocation failed
e2fsck: aborted

With some straces:

================================================================================
gettimeofday({1213032482, 940738}, NULL) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 0}, ru_stime={0, 16001}, ...}) = 0
write(1, "Pass 1: Checking ", 17Pass 1: Checking ) = 17
write(1, "inode", 5inode) = 5
write(1, "s, ", 3s, ) = 3
write(1, "block", 5block) = 5
write(1, "s, and sizes ", 13s, and sizes ) = 13
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x404fa000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x46376000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4c1f2000
mmap2(NULL, 198148096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x5206e000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x5dd66000
mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x63be2000
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x77488000) = 0x80ab000
mmap2(NULL, 1866375168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x90615000
munmap(0x90615000, 962560) = 0
munmap(0x90800000, 86016) = 0
mprotect(0x90700000, 135168, PROT_READ|PROT_WRITE) = 0
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
================================================================================

It appears that fsck is trying to use more than 2GB of memory to store the inode table relationship. The system has 4GB of physical RAM and 4GB of swap. Is there any way to limit the memory used by fsck, or any other solution to check this filesystem? Will running fsck with a 64-bit LiveCD solve the problem?

I also tried the latest stable e2fsprogs release, 1.40.10, and got the same error :-/

Regards,

--
Santi Saez

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
2GB memory limit running fsck on a +6TB device
On Mon, Jun 09, 2008 at 07:33:48PM +0200, santi@usansolo.net wrote:
> It's a backup storage server, with more than 113 million files, this's the
> output of "df -i":
>
> Appears that fsck is trying to use more than 2GB memory to store inode
> table relationship. System has 4GB of physical RAM and 4GB of swap, is
> there anyway to limit the memory used by fsck or any solution to check this
> filesystem? Running fsck with a 64bit LiveCD will solve the problem?

Yes, running with a 64-bit Live CD is one way to solve the problem.

If you are using e2fsprogs 1.40.10, there is another solution that may help. Create an /etc/e2fsck.conf file with the following contents:

[scratch_files]
directory = /var/cache/e2fsck

...and then make sure /var/cache/e2fsck exists by running the command "mkdir /var/cache/e2fsck".

This will cause e2fsck to store certain data structures, which grow large on backup servers that have a vast number of hard-linked files, in /var/cache/e2fsck instead of in memory. This will slow down e2fsck by approximately 25%, but for large filesystems where you couldn't otherwise get e2fsck to complete because you're exhausting the 2GB VM per-process limitation for 32-bit systems, it should allow you to run through to completion.

- Ted
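Since the same setup has to be rolled out to several backup servers, the two steps Ted describes (write the config file, create the scratch directory) can be scripted. The following is only a minimal sketch: the function name `setup_scratch_files` and its `root` parameter are hypothetical, introduced here so the files can be staged under a packaging directory; the config contents and the /var/cache/e2fsck path are the ones given in the message above.

```python
import os
import tempfile

# Contents exactly as given in the message above.
CONFIG_TEXT = "[scratch_files]\ndirectory = /var/cache/e2fsck\n"


def setup_scratch_files(root="/"):
    """Write etc/e2fsck.conf and create var/cache/e2fsck under `root`.

    `root` is a hypothetical staging prefix (for example a package build
    directory). The path written into the config file stays absolute,
    because that is where e2fsck will look on the installed system.
    Returns (conf_path, scratch_dir).
    """
    conf_path = os.path.join(root, "etc", "e2fsck.conf")
    scratch_dir = os.path.join(root, "var", "cache", "e2fsck")

    os.makedirs(os.path.dirname(conf_path), exist_ok=True)
    os.makedirs(scratch_dir, exist_ok=True)

    # Refuse to overwrite an existing config rather than silently replace it.
    if os.path.exists(conf_path):
        raise FileExistsError(conf_path)

    with open(conf_path, "w") as conf:
        conf.write(CONFIG_TEXT)
    return conf_path, scratch_dir


if __name__ == "__main__":
    # Dry run in a throwaway directory so nothing under /etc is touched.
    with tempfile.TemporaryDirectory() as staging:
        conf_path, scratch_dir = setup_scratch_files(staging)
        with open(conf_path) as conf:
            print(conf.read(), end="")
        print("scratch dir exists:", os.path.isdir(scratch_dir))
```

On a real server this would be called as root with `root="/"`. As noted later in the thread, it helps if the scratch directory sits on a different disk from the filesystem being checked.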
2GB memory limit running fsck on a +6TB device
On Jun 09, 2008 19:33 +0200, santi@usansolo.net wrote:
> That's the scenario: +6TB device on a 3ware 9550SX RAID controller, running
> Debian Etch 32bits, with 2.6.25.4 kernel, and defaults e2fsprogs version,
> "1.39+1.40-WIP-2006.11.14+dfsg-2etch1".
>
> Running "tune2fs" returns that filesystem is in EXT3_ERROR_FS state, "clean
> with errors":
>
> # tune2fs -l /dev/sda4
> tune2fs 1.40.10 (21-May-2008)
> Filesystem volume name: <none>
> Last mounted on: <not available>
> Filesystem UUID: 7701b70e-f776-417b-bf31-3693dba56f86
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal dir_index filetype needs_recovery
> sparse_super large_file
> Default mount options: (none)
> Filesystem state: clean with errors
> Errors behavior: Continue
> Filesystem OS type: Linux
> Inode count: 792576000
> Block count: 1585146848
>
> It's a backup storage server, with more than 113 million files, this's the
> output of "df -i":
>
> # df -i /backup/
> Filesystem Inodes IUsed IFree IUse% Mounted on
> /dev/sda4 792576000 113385959 679190041 15% /backup
>
>
> Running fsck.ext3 or fsck.ext2 I get:
>
> # fsck.ext3 /dev/sda4
> e2fsck 1.40.10 (21-May-2008)
> Adding dirhash hint to filesystem.
>
> /dev/sda4 contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes

I recall that e2fsck allocates on the order of 3 * block_count / 8 bytes, and 5 * inode_count / 8 bytes, so in your case this is about:

(5 * 1585146848 + 3 * 792576000) / 8 = 1287932780 bytes = 1.2GB

at a minimum, but my estimates might be incorrect.

> mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x404fa000

Judging by the return values of these functions, this is a 32-bit system, and it is entirely possible that you are exceeding the per-process memory allocation limit.

> mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x63be2000
> mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> -1, 0) = -1 ENOMEM (Cannot allocate memory)

Hmm, it seems a bit excessive to allocate 1.8GB in a single chunk.

> Error allocating directory block array: Memory allocation failed
> e2fsck: aborted

This message is a bit tricky to nail down because it doesn't exist anywhere in the code directly. It is encoded into "e2fsck abbreviations", and the expansion that is normally in the corresponding comment is different. It is PR_1_ALLOCATE_DBCOUNT returned from the call chain:

ext2fs_init_dblist->
make_dblist->
ext2fs_get_num_dirs()

which is counting the number of directories in the filesystem, and allocating two 12-byte array elements for each one. This implies you have 77M directories in your filesystem, or an average of only 10 files per directory?

> Appears that fsck is trying to use more than 2GB memory to store inode
> table relationship. System has 4GB of physical RAM and 4GB of swap, is
> there anyway to limit the memory used by fsck or any solution to check this
> filesystem?

I don't know offhand how important the dblist structure is, so I'm not sure if there is a way to reduce the memory usage for it. I believe that in low-memory situations it is possible to use tdb in newer versions of e2fsck for the dblist, but I don't know much of the details.

> Running fsck with a 64bit LiveCD will solve the problem?

Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM for e2fsck and be able to check the filesystem.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
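The arithmetic in this message can be reproduced from the tune2fs figures and the failed mmap2 call in the original post. The sketch below is illustrative only. The wording of the estimate pairs the factor 3 with blocks and 5 with inodes, while the worked sum pairs 5 with blocks and 3 with inodes, so both variants are computed without assuming which one was intended. The helper name `bitmap_estimate` is hypothetical, and the 24 bytes per directory is simply the "two 12-byte array elements" described above.

```python
# Figures copied from the tune2fs and strace output in the original post.
INODE_COUNT = 792_576_000
BLOCK_COUNT = 1_585_146_848
FAILED_MMAP_BYTES = 1_866_240_000   # the request that returned ENOMEM

# "two 12-byte array elements" per directory, as described in the message.
DBLIST_BYTES_PER_DIR = 2 * 12


def bitmap_estimate(bits_per_block, bits_per_inode):
    """Bytes needed if e2fsck keeps this many bits per block and per inode."""
    return (bits_per_block * BLOCK_COUNT + bits_per_inode * INODE_COUNT) // 8


# Variant matching the worked sum in the message: 5 per block, 3 per inode.
as_computed = bitmap_estimate(5, 3)
# Variant matching the wording in the message: 3 per block, 5 per inode.
as_worded = bitmap_estimate(3, 5)

# Directory count implied by the size of the failed dblist allocation.
implied_dirs = FAILED_MMAP_BYTES // DBLIST_BYTES_PER_DIR
# Total inodes (not inodes in use) divided by that directory count.
inodes_per_dir = INODE_COUNT / implied_dirs

print(f"estimate as computed: {as_computed} bytes ({as_computed / 2**30:.2f} GiB)")
print(f"estimate as worded:   {as_worded} bytes ({as_worded / 2**30:.2f} GiB)")
print(f"implied directories:  {implied_dirs}")
print(f"inodes per directory: {inodes_per_dir:.1f}")
```

Either variant lands a little above 1 GiB, and the failed request corresponds to roughly 77.8 million directories, which is where the "77M" figure comes from. The "10 files per directory" figure matches the total inode count divided by that number, not the 113 million inodes actually in use.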
2GB memory limit running fsck on a +6TB device
On Mon, Jun 09, 2008 at 03:50:32PM -0600, Andreas Dilger wrote:
> > Running fsck with a 64bit LiveCD will solve the problem?
>
> Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
> for e2fsck and be able to check the filesystem.

We had a similar problem with ext3grep.

You have to realize that every mmap uses memory address space, even if it's a map to disk. Therefore, on a 32-bit machine, if the total of all normal allocations plus all simultaneous mmaps exceeds 4GB then you "run out of memory", even if, say, only 1GB is really allocated and >3GB of the disk is mmap-ed.

In that case a 64-bit machine would solve the problem, because then all RAM (2GB, I read in the Subject) can be used for normal allocations while any disk mmap has a gazillion bytes of address space left for itself.

--
Carlo Wood <carlo@alinoe.com>
2GB memory limit running fsck on a +6TB device
On Mon, Jun 09, 2008 at 03:50:32PM -0600, Andreas Dilger wrote:
> This message is a bit tricky to nail down because it doesn't exist anywhere
> in the code directly. It is encoded into "e2fsck abbreviations", and
> the expansion that is normally in the corresponding comment is different.
> It is PR_1_ALLOCATE_DBCOUNT returned from the call chain:
> ext2fs_init_dblist->
> make_dblist->
> ext2fs_get_num_dirs()
>
> which is counting the number of directories in the filesystem, and allocating
> two 12-byte array element for each one. This implies you have 77M directories
> in your filesystem, or an average of only 10 files per directory?

There are a number of backup solutions that use hardlinks to conserve space between incremental snapshots. So yeah, with these workloads you'll see something like 80-85M inodes, of which 77M-odd will be directories.

When you combine the vast number of directories used by these filesystems with the fact that e2fsck tries to optimize memory use by observing that on most normal filesystems most files have an n_link count of 1, which is NOT true on these filesystems used for backups, e2fsck's tricks to optimize for speed by caching information to avoid re-reading it from disk end up costing a large amount of memory.

> I don't know offhand how important the dblist structure is, so I'm not
> sure if there is a way to reduce the memory usage for it. I believe
> that in low-memory situations it is possible to use tdb in newer versions
> of e2fsck for the dblist, but I don't know much of the details.

Yep, please see the [scratch_files] section in e2fsck.conf. It is described in the e2fsck.conf(5) man page.

- Ted
2GB memory limit running fsck on a +6TB device
On Jun 09, 2008 18:37 -0400, Theodore Ts'o wrote:
> On Mon, Jun 09, 2008 at 03:50:32PM -0600, Andreas Dilger wrote:
> > I don't know offhand how important the dblist structure is, so I'm not
> > sure if there is a way to reduce the memory usage for it. I believe
> > that in low-memory situations it is possible to use tdb in newer versions
> > of e2fsck for the dblist, but I don't know much of the details.
>
> Yep, please see [scratch_files] section in e2fsck.conf. It is
> described in the e2fsck.conf(5) man page.

Hmm, maybe if the ext2fs_init_dblist() function returns PR_1_ALLOCATE_DBCOUNT this should be a user-fixable problem that asks if the user wants to use an on-disk tdb file in /var/tmp, and if that is a "no" then point them at the right section in /etc/e2fsck.conf?

I don't think it is reasonable to default to using /tmp, because it might be a RAM-backed filesystem, and I suspect in most cases the root filesystem will not run out of memory in this way...

Even if it fails because /var/tmp is read-only, or too small, it is no worse off than it is today.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
2GB memory limit running fsck on a +6TB device
Andreas Dilger wrote:
> On Jun 09, 2008 19:33 +0200, santi@usansolo.net wrote:
...
>> Running fsck with a 64bit LiveCD will solve the problem?
> Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
> for e2fsck and be able to check the filesystem.

Couldn't you achieve the same thing just by enabling PAE on your 32-bit kernel?

Greg
2GB memory limit running fsck on a +6TB device
On Tue, Jun 10, 2008 at 03:36:52PM +1200, Greg Trounson wrote:
> Andreas Dilger wrote:
>> On Jun 09, 2008 19:33 +0200, santi@usansolo.net wrote:
> ...
>>> Running fsck with a 64bit LiveCD will solve the problem?
>> Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
>> for e2fsck and be able to check the filesystem.
>
> Couldn't you achieve the same thing just by enabling PAE on your 32-bit
> kernel?

No, that doesn't increase the amount of address space available to the user process, which is the limitation here. You can have 16 GB of physical memory, but 2**32 is still 4GB, and the kernel needs address space, so that means userspace will have at most 3GB of space to itself.

- Ted
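To see how close the failing run was to that ceiling, the mmap2 sizes from the strace in the original post can be added up and compared with the "at most 3GB" figure. This is a rough sketch, not an exact account of the process's address space: it ignores the binary, shared libraries, stack and heap, and it assumes the 3GB figure means 3 GiB.

```python
GIB = 1024 ** 3

# 32-bit pointers give 2**32 bytes of address space, however much RAM exists.
ADDRESS_SPACE = 2 ** 32
# "at most 3GB" for userspace, per the message; taken here as 3 GiB.
USER_SPACE_MAX = 3 * GIB

# Sizes of the mmap2 calls that succeeded, in the order shown in the strace.
SUCCESSFUL_MMAPS = [
    99_074_048,
    99_074_048,
    99_074_048,
    198_148_096,
    99_074_048,
    748_892_160,
]
# The request that came back with ENOMEM.
FAILED_REQUEST = 1_866_240_000

already_mapped = sum(SUCCESSFUL_MMAPS)
needed = already_mapped + FAILED_REQUEST
headroom = USER_SPACE_MAX - needed

print(f"address space:        {ADDRESS_SPACE / GIB:.0f} GiB")
print(f"already mapped:       {already_mapped / GIB:.2f} GiB")
print(f"needed after request: {needed / GIB:.2f} GiB")
print(f"left under 3 GiB:     {headroom / 2**20:.1f} MiB")
```

The large anonymous mappings alone would have come within about 11 MiB of 3 GiB, and the 1.8GB request also has to fit in one contiguous hole. More physical RAM, swap or PAE cannot help with that, which is the point of the reply above.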
2GB memory limit running fsck on a +6TB device
On Mon, 9 Jun 2008 17:33:20 -0400, Theodore Tso <tytso@mit.edu> wrote:
> If you are using e2fsprogs 1.40.10, there is another solution that may
> help. Create an /etc/e2fsck.conf file with the following contents:
>
> [scratch_files]
> directory = /var/cache/e2fsck

(..)

> This will cause e2fsck to store certain data structures which grow
> large with backup servers that have a vast number of hard-linked files
> in /var/cache/e2fsck instead of in memory. This will slow down e2fsck
> by approximately 25%, but for large filesystems where you couldn't
> otherwise get e2fsck to complete because you're exhausting the 2GB VM
> per-process limitation for 32-bit systems, it should allow you to run
> through to completion.

I'm trying with fsck.ext3 v1.40.8, backported from Lenny's package to Etch, instead of v1.40.10, because we have the same scenario on all backup servers running BackupPC and the package must be distributed. If needed, we can test with the latest version ;-)

fsck.ext3 started 4 hours ago and is still in "Pass 1: Checking inodes, blocks, and sizes". Is that normal, knowing that the filesystem has +113 million inodes?

I will send more info as Ted requested in "Call for testers w/ using BackupPC" [1], but for now this is the scenario:

- fsck.ext3 is using more than 2GB of memory and no swap. The server has 4GB of physical RAM + 2GB of swap. This is the output of "pmap -d" with the memory map:

# pmap -d 7014
7014: fsck.ext3 -y /dev/sda4
Address   Kbytes Mode  Offset           Device    Mapping
(..)
242fd000 1834768 rw--- 00000000242fd000 000:00000 [ anon ]
942c2000  582604 rw--- 00000000942c2000 000:00000 [ anon ]
(..)

All the output is available at: http://pastebin.com/f67115de2

- The files in "/var/cache/e2fsck" appear to grow very slowly, I think about 300Kb per hour. This is their current size:

# ls -lh /var/cache/e2fsck/
total 170M
-rw------- 1 root root 76M 2008-06-10 17:24 7701b70e-f776-417b-bf31-3693dba56f86-dirinfo-VkmFXP
-rw------- 1 root root 95M 2008-06-10 17:24 7701b70e-f776-417b-bf31-3693dba56f86-icount-YO08bu

- fsck is using 100% of one CPU (it's a dual-processor motherboard). The strace output is available at: http://pastebin.com/f68389cce

- More info:

* Kernel 2.6.25.4, i686 arch on a Debian Etch box.
* Storage: 3ware 9550SXU-16ML, 5.91 TB in a RAID-5 with 14 500GB SATA disks (ST3500630AS), 64kB stripe size (array is in optimal state)

Thanks to all for the advice :-)

[1] http://www.redhat.com/archives/ext3-users/2007-April/msg00017.html

--
Santi Saez
2GB memory limit running fsck on a +6TB device
On Tue, Jun 10, 2008 at 05:34:35PM +0200, santi@usansolo.net wrote:
> fsck.ext3 started 4 hours ago, and still is in "Pass 1: Checking inodes,
> blocks, and sizes", that's normal knowing that the filesystem has +113
> million inodes?

It depends on a lot of things: how big your files are on average, the speed of your hard drive, and whether /var/cache/e2fsck is on the same disk as the partition which you are checking, or on a separate spindle (guess which is better :-).

It's always a good idea when running e2fsck (aka fsck.ext3) directly and/or on a terminal/console to include as command-line options "-C 0". This will display a progress bar, so you can gauge how it is doing. (0 through 70% is pass 1, which requires scanning the inode table and following all of the indirect blocks.)

- Ted