Linux Archive


Alex Bligh 10-31-2010 09:12 AM

How to generate a large file allocating space
 
I want to generate or extend a large file in an ext4 filesystem allocating
space (i.e. not creating a sparse file) but not actually writing any data.
I realise that this will result in the file containing the contents of
whatever was there on the disk before, which is a possible security problem
in some circumstances, but it isn't a problem here.

Ideally what I'd like is a "make unsparse" bit of code. I'm happy for this
to use the libraries, and work on an unmounted fs (indeed that is probably
better).

Supplementary question: can I assume that if a non-sparse file is on disk
and never opened, and never unlinked, then the sectors used to store
that file's data will never change irrespective of other operations on the
ext4 filesystem? I.e. nothing is shuffling where ext4 files are stored.

--
Alex Bligh


Alex Bligh 10-31-2010 02:05 PM

How to generate a large file allocating space
 
--On 31 October 2010 10:23:51 -0500 Bruno Wolff III <bruno@wolff.to> wrote:


> On Sun, Oct 31, 2010 at 11:12:41 +0100,
> Alex Bligh <alex@alex.org.uk> wrote:
>> I want to generate or extend a large file in an ext4 filesystem
>> allocating space (i.e. not creating a sparse file) but not actually
>> writing any data. I realise that this will result in the file containing
>> the contents of whatever was there on the disk before, which is a
>> possible security problem in some circumstances, but it isn't a problem
>> here.
>
> There isn't going to be a way to do that through the file system, because
> as you note it is a security problem.
>
> What is the high level thing you are trying to accomplish here? Modifying
> the filesystem offline seems risky and maybe there is a safer way to
> accomplish your goals.


I am trying to allocate huge files on ext4. I will then read the extents
within the file and write to the disk at a block level rather than using
ext4 (the FS will not be mounted at this point). This will allow me to
have several iSCSI clients hitting the same LUN r/w safely. And at
some point when I know the relevant iSCSI stuff has stopped and been
flushed to disk, I may unlink the file.

As I have total control of what's on the disk, I don't really care
if previous content is exposed. If I write many gigabytes of zeroes,
that's going to take a long time, and be totally unnecessary, since
I already have my own internal map of the data I will write into these
huge files.

Yes, I know this is deep scary voodoo, but that's ok. I can get the
extent list the same way as "filefrag -v" gets it. What I can't
currently work out (using either the library, or doing it with the
volume mounted) is how to extend a file AND allocate the extents
(as opposed to doing it sparse).
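
For reference, this is roughly how I read the extent list on a mounted
fs, via the same FIEMAP ioctl that filefrag -v uses. A minimal sketch;
it assumes one ioctl call with room for 32 extents is enough, where a
real tool would loop until it sees FIEMAP_EXTENT_LAST:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>   /* struct fiemap, struct fiemap_extent */

int main(int argc, char **argv)
{
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* One call with room for 32 extents -- plenty for a fallocated file. */
    struct fiemap *fm = calloc(1, sizeof(*fm) +
                               32 * sizeof(struct fiemap_extent));
    if (!fm) { perror("calloc"); return 1; }
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;            /* map the whole file */
    fm->fm_extent_count = 32;
    fm->fm_flags = FIEMAP_FLAG_SYNC;  /* flush delayed allocation first */

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { perror("FIEMAP"); return 1; }

    /* FIEMAP reports byte offsets; filefrag divides by the block size. */
    for (unsigned i = 0; i < fm->fm_mapped_extents; i++) {
        struct fiemap_extent *e = &fm->fm_extents[i];
        printf("logical %llu physical %llu length %llu%s\n",
               (unsigned long long)e->fe_logical,
               (unsigned long long)e->fe_physical,
               (unsigned long long)e->fe_length,
               (e->fe_flags & FIEMAP_EXTENT_UNWRITTEN) ? " unwritten" : "");
    }
    free(fm);
    close(fd);
    return 0;
}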


>> Supplementary question: can I assume that if a non-sparse file is on disk
>> and never opened, and never unlinked, then the sectors used to store
>> that file's data will never change irrespective of other operations on
>> the ext4 filesystem? I.e. nothing is shuffling where ext4 files are stored.
>
> I think SSDs will move stuff around at a very low level. They would look
> like they are at the same place to stuff accessing the device like a disk,
> but physically would be stored in a different hardware location.
>
> With normal disks, you'd only see this if the device got a read error, but
> was able to successfully read a marginal sector and remap it to a spare
> sector. But again, stuff talking to the disk will see it at the same
> address.


Sure, that's no problem because the offset into the block device stays
the same, even if physically the file is in a different place. So the
extent list will stay the same for the file.

--
Alex Bligh


Bruno Wolff III 10-31-2010 02:23 PM

How to generate a large file allocating space
 
On Sun, Oct 31, 2010 at 11:12:41 +0100,
Alex Bligh <alex@alex.org.uk> wrote:
> I want to generate or extend a large file in an ext4 filesystem allocating
> space (i.e. not creating a sparse file) but not actually writing any data.
> I realise that this will result in the file containing the contents of
> whatever was there on the disk before, which is a possible security problem
> in some circumstances, but it isn't a problem here.

There isn't going to be a way to do that through the file system, because
as you note it is a security problem.

What is the high level thing you are trying to accomplish here? Modifying
the filesystem offline seems risky and maybe there is a safer way to
accomplish your goals.

> Supplementary question: can I assume that if a non-sparse file is on disk
> and never opened, and never unlinked, then the sectors used to store
> that file's data will never change irrespective of other operations on the
> ext4 filesystem? I.e. nothing is shuffling where ext4 files are stored.

I think SSDs will move stuff around at a very low level. They would look
like they are at the same place to stuff accessing the device like a disk,
but physically would be stored in a different hardware location.

With normal disks, you'd only see this if the device got a read error, but
was able to successfully read a marginal sector and remap it to a spare
sector. But again, stuff talking to the disk will see it at the same address.


Alex Bligh 10-31-2010 02:34 PM

How to generate a large file allocating space
 
Matija,

--On 31 October 2010 17:19:49 +0100 Matija Nalis <mnalis-ml@voyager.hr>
wrote:



> Well, some metadata will have to be written, but not the data.
> Shouldn't posix_fallocate(3) and/or fallocate(2) do that?
>
> I haven't got ext4 around ATM, but IIRC it should work on it too.
> It certainly seems to work on XFS:


That's /almost/ perfect:

$ fallocate -l 1073741824 testfile
$ filefrag -v testfile
Filesystem type is: ef53
File size of testfile is 1073741824 (262144 blocks, blocksize 4096)
 ext  logical   physical  expected  length  flags
   0        0   14819328             30720  unwritten
   1    30720   14850048             30720  unwritten
   2    61440   14880768             30720  unwritten
   3    92160   14911488             30720  unwritten
   4   122880   14942208              2048  unwritten
   5   124928   14946304  14944255   30720  unwritten
   6   155648   14977024             30720  unwritten
   7   186368   15007744             30720  unwritten
   8   217088   15038464             30720  unwritten
   9   247808   15069184             14336  unwritten,eof
testfile: 2 extents found

I think all I need to do is clear the unwritten flag in each of the extents.
Otherwise, I think that if I read the file through ext4 later (i.e. after
I've written directly to the sectors concerned) it will appear to be
empty. Any idea how I do that?

--
Alex Bligh


Matija Nalis 10-31-2010 03:19 PM

How to generate a large file allocating space
 
On Sun, Oct 31, 2010 at 11:12:41AM +0100, Alex Bligh wrote:
> I want to generate or extend a large file in an ext4 filesystem allocating
> space (i.e. not creating a sparse file) but not actually writing any data.

Well, some metadata will have to be written, but not the data.
Shouldn't posix_fallocate(3) and/or fallocate(2) do that?

I haven't got ext4 around ATM, but IIRC it should work on it too.
It certainly seems to work on XFS:

# time fallocate -l 3000000000 /stuff/tmp/bla
fallocate -l 3000000000 /stuff/tmp/bla 0,00s user 0,00s system 0% cpu 0,402 total
# du -h /stuff/tmp/bla
2,8G /stuff/tmp/bla
# du -bh /stuff/tmp/bla
2,8G /stuff/tmp/bla
# rm -f /stuff/tmp/bla

fallocate(1) is from util-linux on my Debian Squeeze

Compare that to the dramatically slower dd(1), which writes the zeros explicitly:
# time dd if=/dev/zero of=/stuff/tmp/bla count=30000 bs=100000
30000+0 records in
30000+0 records out
3000000000 bytes (3,0 GB) copied, 31,2581 s, 96,0 MB/s
dd if=/dev/zero of=/stuff/tmp/bla count=30000 bs=100000 0,00s user 3,41s system 10% cpu 31,341 total
# du -h /stuff/tmp/bla
2,8G /stuff/tmp/bla
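
And from C it is just one call. A minimal sketch (a hypothetical little
test program, nothing more; fallocate(2) needs _GNU_SOURCE and a glibc
of 2.10 or later for the wrapper):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Usage: ./prealloc <file> <bytes> */
int main(int argc, char **argv)
{
    int fd = open(argv[1], O_WRONLY | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* mode 0 allocates the blocks and extends i_size; the new extents
     * are flagged "unwritten", so reads return zeros, not stale data. */
    if (fallocate(fd, 0, 0, atoll(argv[2])) != 0)
        perror("fallocate");  /* e.g. EOPNOTSUPP if the fs lacks support */

    close(fd);
    return 0;
}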


--
Opinions above are GNU-copylefted.


Alex Bligh 10-31-2010 05:09 PM

How to generate a large file allocating space
 
--On 31 October 2010 19:46:09 +0100 Matija Nalis <mnalis-ml@voyager.hr>
wrote:



> Sorry, I don't. debugfs(8) only appears to have read-only support for
> reading extents, and not for (re-)writing them, so I guess you'll have to
> find some function in libext2fs if it exists (or write your own if it
> doesn't) to use on an unmounted fs.


Yes. I need to iterate through the extents. debugfs does that but I don't
know how to change the flag or (more relevantly) whether it is safe to do
so.

--
Alex Bligh


Matija Nalis 10-31-2010 05:46 PM

How to generate a large file allocating space
 
On Sun, Oct 31, 2010 at 04:34:44PM +0100, Alex Bligh wrote:
> That's /almost/ perfect:
>
> 9 247808 15069184 14336 unwritten,eof
> testfile: 2 extents found
>
> I think all I need do is clear the unwritten flag in each of the extents.
> Else I think if I read the file using ext4 later (i.e. after I've
> written directly to the sectors concerned) it will appear to be
> empty.

Yes, it would appear empty. That's due to the security concerns others
mentioned too.

> Any idea how I do that?

Sorry, I don't. debugfs(8) only appears to have read-only support for
reading extents, and not for (re-)writing them, so I guess you'll have to
find some function in libext2fs if it exists (or write your own if it
doesn't) to use on an unmounted fs.
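
If the extent API in recent e2fsprogs is what I think it is, the walk
might look something like this untested sketch (the EXT2_EXTENT_* names
are from the libext2fs headers as I remember them -- verify against
your headers before trusting this, and of course clearing the flag
exposes whatever stale data those blocks hold):

#include <stdio.h>
#include <et/com_err.h>
#include <ext2fs/ext2fs.h>

/* Clear the "uninitialized" flag on every leaf extent of one file,
 * on an UNMOUNTED filesystem. argv[1] = device, argv[2] = path. */
int main(int argc, char **argv)
{
    ext2_filsys fs;
    ext2_ino_t ino;
    ext2_extent_handle_t h;
    struct ext2fs_extent ext;
    errcode_t err;

    err = ext2fs_open(argv[1], EXT2_FLAG_RW, 0, 0, unix_io_manager, &fs);
    if (err) { com_err(argv[0], err, "opening %s", argv[1]); return 1; }

    err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, argv[2], &ino);
    if (err) { com_err(argv[0], err, "resolving %s", argv[2]); return 1; }

    err = ext2fs_extent_open(fs, ino, &h);
    if (err) { com_err(argv[0], err, "opening extent tree"); return 1; }

    /* Walk leaf extents only; NEXT_LEAF skips interior index nodes. */
    err = ext2fs_extent_get(h, EXT2_EXTENT_ROOT, &ext);
    while (!err) {
        if ((ext.e_flags & EXT2_EXTENT_FLAGS_LEAF) &&
            (ext.e_flags & EXT2_EXTENT_FLAGS_UNINIT)) {
            ext.e_flags &= ~EXT2_EXTENT_FLAGS_UNINIT;
            err = ext2fs_extent_replace(h, 0, &ext);
            if (err) { com_err(argv[0], err, "rewriting extent"); break; }
        }
        err = ext2fs_extent_get(h, EXT2_EXTENT_NEXT_LEAF, &ext);
    }

    ext2fs_extent_free(h);
    ext2fs_close(fs);   /* flushes the modified metadata */
    return 0;
}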

--
Opinions above are GNU-copylefted.


Andreas Dilger 11-01-2010 05:13 AM

How to generate a large file allocating space
 
On 2010-10-31, at 09:05, Alex Bligh wrote:
> I am trying to allocate huge files on ext4. I will then read the extents
> within the file and write to the disk at a block level rather than using
> ext4 (the FS will not be mounted at this point). This will allow me to
> have several iSCSI clients hitting the same LUN r/w safely. And at
> some point when I know the relevant iSCSI stuff has stopped and been
> flushed to disk, I may unlink the file.

Hmm, why not simply use a cluster filesystem to do this?

GFS and OCFS both handle shared writers to the same SAN disk (AFAIK). Lustre uses ext4 as the underlying filesystem; although it doesn't allow direct client writes to the same disk, it will allow writing at 95% of the underlying raw disk performance from multiple clients.

Cheers, Andreas

Alex Bligh 11-01-2010 05:14 AM

How to generate a large file allocating space
 
--On 1 November 2010 00:13:33 -0600 Andreas Dilger <adilger@dilger.ca>
wrote:



> Hmm, why not simply use a cluster filesystem to do this?
>
> GFS and OCFS both handle shared writers to the same SAN disk (AFAIK).
> Lustre uses ext4 as the underlying filesystem; although it doesn't allow
> direct client writes to the same disk, it will allow writing at 95% of
> the underlying raw disk performance from multiple clients.


Essentially because none of them do exactly what I need them to do,
so I am reinventing a slightly different wheel...

--
Alex Bligh


Andreas Dilger 11-01-2010 08:45 PM

How to generate a large file allocating space
 
On 2010-11-01, at 00:14, Alex Bligh wrote:
> --On 1 November 2010 00:13:33 -0600 Andreas Dilger <adilger@dilger.ca> wrote:
>> Hmm, why not simply use a cluster filesystem to do this?
>>
>> GFS and OCFS both handle shared writers to the same SAN disk (AFAIK).
>> Lustre uses ext4 as the underlying filesystem; although it doesn't allow
>> direct client writes to the same disk, it will allow writing at 95% of
>> the underlying raw disk performance from multiple clients.
>
> Essentially because none of them do exactly what I need them to do,
> so I am reinventing a slightly different wheel...

Personally, I hate re-inventing things rather than improving something existing to do what you want, since it (probably) means your code will be used by you alone, while improving an existing cluster filesystem would both meet your needs and let others benefit as well.

What is it you really want to do in the end? Shared concurrent writers to the same file? High-bandwidth IO to the underlying disk?

Cheers, Andreas

