FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > CentOS > CentOS

 
 
LinkBack Thread Tools
 
Old 02-13-2012, 08:45 PM
Boris Epstein
 
Default MySQL/file system question

Hello listmates,

I got a rather strange situation that I can't quite make sense of. OK, I've
got a very large data file to sort (hundreds of millions of lines) and I
decided to use MySQL for the purpose. I inserted the stuff into a table
easily enough. Then I decided to sort it and got stuck as it turned out
that MySQL, unless specifically configured to do otherwise, puts temporary
files in /tmp which simply was not sufficiently large. Then I changed that
directory to a partition that had more space (let's call the new temp
driectory /home/big-temp) and now as I am running the query aimed at
sorting the data it seems like space, according to the df, is no longer
being used up under / (which was there /tmp was) but is now being used up
in the right partition ( /home, the large one). Yet /home/big-temp is still
empty!

So how is that possible? I know there are all kinds of file locking
possible depending on the circumstances, file system type, etc - but how
can it be that the files, if they exist, are not even visible to ls, even
"ls -a", even run by root?

Thanks in advance for any clarification.

Cheers,

Boris.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 02-13-2012, 08:58 PM
Adam Tauno Williams
 
Default MySQL/file system question

On Mon, 2012-02-13 at 16:45 -0500, Boris Epstein wrote:
> Hello listmates,
> I got a rather strange situation that I can't quite make sense of. OK, I've
> got a very large data file to sort (hundreds of millions of lines) and I
> decided to use MySQL for the purpose. I inserted the stuff into a table
> easily enough. Then I decided to sort it and got stuck as it turned out
> that MySQL, unless specifically configured to do otherwise, puts temporary
> files in /tmp which simply was not sufficiently large. Then I changed that
> directory to a partition that had more space (let's call the new temp
> driectory /home/big-temp) and now as I am running the query aimed at
> sorting the data it seems like space, according to the df, is no longer
> being used up under / (which was there /tmp was) but is now being used up
> in the right partition ( /home, the large one). Yet /home/big-temp is still
> empty!
> So how is that possible?

Easy. It is using temporary files the *correct* way.

1. Open file
2. Unlink file
3. Use file
4. Close file

This means (a) even if the process abends the resources allocated to the
file are released and (b) an external process can't see [or modify] the
temporary file.

When a file is unlinked it remains 'active' until all references to the
file are released - but the daemon is still holding a reference [because
it is using the file].

There is a file there, but nobody, not even root, can see it.

Actually you can; if you look in /proc/{pid#}/fd ...

--
System & Network Administrator [ LPI & NCLA ]
<http://www.whitemiceconsulting.com>
OpenGroupware Developer <http://www.opengroupware.us>
Adam Tauno Williams

_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 02-13-2012, 09:06 PM
Boris Epstein
 
Default MySQL/file system question

On Mon, Feb 13, 2012 at 4:58 PM, Adam Tauno Williams <awilliam@whitemice.org
> wrote:

> On Mon, 2012-02-13 at 16:45 -0500, Boris Epstein wrote:
> > Hello listmates,
> > I got a rather strange situation that I can't quite make sense of. OK,
> I've
> > got a very large data file to sort (hundreds of millions of lines) and I
> > decided to use MySQL for the purpose. I inserted the stuff into a table
> > easily enough. Then I decided to sort it and got stuck as it turned out
> > that MySQL, unless specifically configured to do otherwise, puts
> temporary
> > files in /tmp which simply was not sufficiently large. Then I changed
> that
> > directory to a partition that had more space (let's call the new temp
> > driectory /home/big-temp) and now as I am running the query aimed at
> > sorting the data it seems like space, according to the df, is no longer
> > being used up under / (which was there /tmp was) but is now being used up
> > in the right partition ( /home, the large one). Yet /home/big-temp is
> still
> > empty!
> > So how is that possible?
>
> Easy. It is using temporary files the *correct* way.
>
> 1. Open file
> 2. Unlink file
> 3. Use file
> 4. Close file
>
> This means (a) even if the process abends the resources allocated to the
> file are released and (b) an external process can't see [or modify] the
> temporary file.
>
> When a file is unlinked it remains 'active' until all references to the
> file are released - but the daemon is still holding a reference [because
> it is using the file].
>
> There is a file there, but nobody, not even root, can see it.
>
> Actually you can; if you look in /proc/{pid#}/fd ...
>
> --
> System & Network Administrator [ LPI & NCLA ]
> <http://www.whitemiceconsulting.com>
> OpenGroupware Developer <http://www.opengroupware.us>
> Adam Tauno Williams
>
> _______________________________________________
> CentOS mailing list
> CentOS@centos.org
> http://lists.centos.org/mailman/listinfo/centos


Adam, thanks!

I haven't thought of it this way. In fact it matches with this description:

http://dev.mysql.com/doc/refman/5.1/en/temporary-files.html

I can kind of see the advantages; the disadvantages, of course, are that a
major transaction can not be resumed in case the mysqld process is stopped.

Boris.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 02-13-2012, 09:31 PM
Boris Epstein
 
Default MySQL/file system question

>
>
>
> Easy. It is using temporary files the *correct* way.
>
> 1. Open file
> 2. Unlink file
> 3. Use file
> 4. Close file
>
> This means (a) even if the process abends the resources allocated to the
> file are released and (b) an external process can't see [or modify] the
> temporary file.
>
> When a file is unlinked it remains 'active' until all references to the
> file are released - but the daemon is still holding a reference [because
> it is using the file].
>
> There is a file there, but nobody, not even root, can see it.
>
> Actually you can; if you look in /proc/{pid#}/fd ...
>
> --
> System & Network Administrator [ LPI & NCLA ]
> <http://www.whitemiceconsulting.com>
> OpenGroupware Developer <http://www.opengroupware.us>
> Adam Tauno Williams
>
>
In fact, I did look in /proc/{pid#}/fd and found the file names, thanks!
Coud quite figure out the size of those invisible files - but no matter,
hopefully I've got enough room.

Thanks.

Boris.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 02-13-2012, 09:34 PM
Adam Tauno Williams
 
Default MySQL/file system question

On Mon, 2012-02-13 at 17:06 -0500, Boris Epstein wrote:
> On Mon, Feb 13, 2012 at 4:58 PM, Adam Tauno Williams <awilliam@whitemice.org
> > wrote:
> > On Mon, 2012-02-13 at 16:45 -0500, Boris Epstein wrote:
> > > Hello listmates,
> > > I got a rather strange situation that I can't quite make sense of. OK,
> > I've
> > > got a very large data file to sort (hundreds of millions of lines) and I
> > > decided to use MySQL for the purpose. I inserted the stuff into a table
> > > easily enough. Then I decided to sort it and got stuck as it turned out
> > > that MySQL, unless specifically configured to do otherwise, puts
> > temporary
> > > files in /tmp which simply was not sufficiently large. Then I changed
> > that
> > > directory to a partition that had more space (let's call the new temp
> > > driectory /home/big-temp) and now as I am running the query aimed at
> > > sorting the data it seems like space, according to the df, is no longer
> > > being used up under / (which was there /tmp was) but is now being used up
> > > in the right partition ( /home, the large one). Yet /home/big-temp is
> > still
> > > empty!
> > > So how is that possible?
> > Easy. It is using temporary files the *correct* way.
> > 1. Open file
> > 2. Unlink file
> > 3. Use file
> > 4. Close file
> > This means (a) even if the process abends the resources allocated to the
> > file are released and (b) an external process can't see [or modify] the
> > temporary file.
> > When a file is unlinked it remains 'active' until all references to the
> > file are released - but the daemon is still holding a reference [because
> > it is using the file].
> > There is a file there, but nobody, not even root, can see it.
> > Actually you can; if you look in /proc/{pid#}/fd ...
> I haven't thought of it this way. In fact it matches with this description:
> http://dev.mysql.com/doc/refman/5.1/en/temporary-files.html
> I can kind of see the advantages; the disadvantages, of course, are that a
> major transaction can not be resumed in case the mysqld process is stopped.

If the process is stopped the transaction cannot be resumed for a myriad
reasons; loss of the temporary file is a trivial concern.

--
System & Network Administrator [ LPI & NCLA ]
<http://www.whitemiceconsulting.com>
OpenGroupware Developer <http://www.opengroupware.us>
Adam Tauno Williams

_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 02-13-2012, 09:39 PM
Boris Epstein
 
Default MySQL/file system question

>
>
>
> If the process is stopped the transaction cannot be resumed for a myriad
> reasons; loss of the temporary file is a trivial concern.
>
> --
> System & Network Administrator [ LPI & NCLA ]
> <http://www.whitemiceconsulting.com>
> OpenGroupware Developer <http://www.opengroupware.us>
> Adam Tauno Williams
>
> _______________________________________________
> CentOS mailing list
> CentOS@centos.org
> http://lists.centos.org/mailman/listinfo/centos
>

True - although if you somehow track your progress within the transaction
making it resumable could be possible.

Boris.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 02-13-2012, 10:01 PM
Les Mikesell
 
Default MySQL/file system question

On Mon, Feb 13, 2012 at 4:06 PM, Boris Epstein <borepstein@gmail.com> wrote:

> > >
> I haven't thought of it this way. In fact it matches with this description:
>
> http://dev.mysql.com/doc/refman/5.1/en/temporary-files.html
>
> I can kind of see the advantages; the disadvantages, of course, are that a
> major transaction can not be resumed in case the mysqld process is stopped.
>
>
I don't think it would make much sense to try to finish a query after a
restart since at that point the temp table might not reflect the real data
anyway. Have you tried adding an index on the fields in your query so it
doesn't have to sort everything when you make the query? Mysql isn't too
bright about optimizing 3-table joins, but with one or 2 and pre-indexed
fields it shouldn't need much time or extra space.

--
Les Mikesell
lesmikesell@gmail.com
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 

Thread Tools




All times are GMT. The time now is 03:59 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org