Mark Knecht 02-07-2010 03:27 PM

1-Terabyte drives - 4K sector sizes? -> bar performance so far
 
Hi,
I got a WD 1T drive to use in a new machine for my dad. I didn't
pay a huge amount of attention to the technical details when I
purchased it other than it was SATA2, big, and the price was good.
Here's the NewEgg link:

http://www.newegg.com/Product/Product.aspx?Item=N82E16822136490

I installed the drive, created some partitions and set off to put
ext3 on it using just mke2fs -j /dev/sda3. The partitions get written
and everything works but when I started installing Gentoo on it I was
getting some HUGE delays at times, such as when unpacking
portage-latest.tar.bz2. Basically the tar step would be rolling along
and then the drive would literally appear to stop for 1 minute before
proceeding. No CPU usage, the machine is alive in other terminals, but
anything directed at the disk just seems dead. Sticking my ear on the
drive it doesn't sound like the drive is doing anything.

I was trying to determine what to do - i.e., is this a bad drive, how
to return it, etc. - and started reading the reviews at NewEgg. One
guy using it with Linux had this to say:

<QUOTE>
4KB physical sectors: KNOW WHAT YOU'RE DOING!

Pros: Quiet, cool-running, big cache

Cons: The 4KB physical sectors are a problem waiting to happen. If you
misalign your partitions, disk performance can suffer. I ran
benchmarks in Linux using a number of filesystems, and I found that
with most filesystems, read performance and write performance with
large files didn't suffer with misaligned partitions, but writes of
many small files (unpacking a Linux kernel archive) could take several
times as long with misaligned partitions as with aligned partitions.
WD's advice about who needs to be concerned is overly simplistic,
IMHO, and it's flat-out wrong for Linux, although it's probably
accurate for 90% of buyers (those who run Windows or Mac OS and use
their standard partitioning tools). If you're not part of that 90%,
though, and if you don't fully understand this new technology and how
to handle it, buy a drive with conventional 512-byte sectors!
</QUOTE>

Now, I don't mind getting a bit dirty learning to use this
correctly but I'm wondering what that means in a practical sense.
Reading the mke2fs man page the word 'sector' doesn't come up. It's my
understanding the Linux 'blocks' are groups of sectors. True? If the
disk must use 4K sectors then what - the smallest block has to be 4K
and I'm using 1 sector per block? It seems that ext3 doesn't support
anything larger than 4K?

As a test I blew away all the partitions and made one huge 1
terabyte partition using ext3. I then tried untarring the portage
snapshot and then deleting the directory where I put it a bunch of
times. I get very different times each time I do this. Untarring
varies from 6 minutes 24 seconds to 10 minutes 25 seconds. Removing
the directory varies from 3 seconds to 1 minute 22 seconds.

Every time there is an apparent delay I just see the hard drive
light turned on solid. That said as far as I know if I wait for things
to complete the data is there but I haven't tested it extensively.

Is this a bad drive or am I somehow using it incorrectly?

Thanks,
Mark


gandalf TestMount # time tar xjf /mnt/TestMount/portage-latest.tar.bz2 -C /mnt/TestMount/usr

real 6m24.736s
user 0m9.969s
sys 0m3.537s
gandalf TestMount # time rm -rf /mnt/TestMount/usr/

real 0m3.229s
user 0m0.110s
sys 0m1.809s
gandalf TestMount # mkdir usr
gandalf TestMount # time tar xjf /mnt/TestMount/portage-latest.tar.bz2 -C /mnt/TestMount/usr

real 7m50.193s
user 0m8.647s
sys 0m2.811s
gandalf TestMount # time rm -rf /mnt/TestMount/usr/

real 0m3.234s
user 0m0.119s
sys 0m1.792s
gandalf TestMount # mkdir usr
gandalf TestMount # time tar xjf /mnt/TestMount/portage-latest.tar.bz2 -C /mnt/TestMount/usr

real 10m25.926s
user 0m8.645s
sys 0m2.765s
gandalf TestMount # time rm -rf /mnt/TestMount/usr/

real 1m22.330s
user 0m0.124s
sys 0m1.810s
gandalf TestMount # mkdir usr
gandalf TestMount # time tar xjf /mnt/TestMount/portage-latest.tar.bz2 -C /mnt/TestMount/usr

real 8m12.307s
user 0m8.463s
sys 0m2.708s
gandalf TestMount # time rm -rf /mnt/TestMount/usr/

real 0m29.517s
user 0m0.114s
sys 0m1.810s
gandalf TestMount #




gandalf ~ # hdparm -tT /dev/sdb

/dev/sdb:
Timing cached reads: 11362 MB in 2.00 seconds = 5684.46 MB/sec
Timing buffered disk reads: 314 MB in 3.00 seconds = 104.64 MB/sec
gandalf ~ #

Alexander 02-07-2010 04:30 PM

On Sunday 07 February 2010 19:27:46 Mark Knecht wrote:

> Every time there is an apparent delay I just see the hard drive
> light turned on solid. That said as far as I know if I wait for things
> to complete the data is there but I haven't tested it extensively.
>
> Is this a bad drive or am I somehow using it incorrectly?
>

Is there any related info in dmesg?

Volker Armin Hemmann 02-07-2010 05:19 PM

On Sunday 07 February 2010, Alexander wrote:
> On Sunday 07 February 2010 19:27:46 Mark Knecht wrote:
> > Every time there is an apparent delay I just see the hard drive
> > light turned on solid. That said as far as I know if I wait for things
> > to complete the data is there but I haven't tested it extensively.
> >
> > Is this a bad drive or am I somehow using it incorrectly?
>
> Is there any related info in dmesg?

Or maybe too much is cached and seeking is not the drive's strong point
...

Volker Armin Hemmann 02-07-2010 06:16 PM

On Sunday 07 February 2010, Mark Knecht wrote:
> On Sun, Feb 7, 2010 at 9:30 AM, Alexander <b3nder@yandex.ru> wrote:
> > On Sunday 07 February 2010 19:27:46 Mark Knecht wrote:
> >> Every time there is an apparent delay I just see the hard drive
> >> light turned on solid. That said as far as I know if I wait for things
> >> to complete the data is there but I haven't tested it extensively.
> >>
> >> Is this a bad drive or am I somehow using it incorrectly?
> >
> > Is there any related info in dmesg?
>
> No, nothing in dmesg at all.
>
> Here are two tests this morning. The first is to the 1T drive, the
> second is to a 120GB drive I'm currently using as a system drive until
> I work this out:
>
> gandalf TestMount # time tar xjf /mnt/TestMount/portage-latest.tar.bz2 -C /mnt/TestMount/usr
>
> real 8m13.077s
> user 0m8.184s
> sys 0m2.561s
> gandalf TestMount #
>
>
> mark@gandalf ~ $ time tar xjf /mnt/TestMount/portage-latest.tar.bz2 -C /home/mark/Test_usr/
>
> real 0m39.213s
> user 0m8.243s
> sys 0m2.135s
> mark@gandalf ~ $
>
> 8 minutes vs 39 seconds!
>
> The amount of data written appears to be the same:
>
> gandalf ~ # du -shc /mnt/TestMount/usr/
> 583M /mnt/TestMount/usr/
> 583M total
> gandalf ~ #
>
>
> mark@gandalf ~ $ du -shc /home/mark/Test_usr/
> 583M /home/mark/Test_usr/
> 583M total
> mark@gandalf ~ $
>
>
> I did some reading at the WD site and it seems this drive does use the
> 4K sector size. The way it's done is the addressing on cable is still
> 512 byte 'user sectors', but they are packed into 4K physical sectors
> and internal hardware does the mapping.
>
> I suspect the performance issue is figuring out how to get the file
> system to keep things on 4K boundaries. I assume that's what the 4K
> block size is for when building the file system but I need to go find
> out more about that. I did not select it specifically. Maybe I need
> to.
>
> Thanks,
> Mark

No. A 4k block size is already the default for Linux filesystems. But you
might have 'misaligned' the partitions. There is a lot of text to read about
'eraseblocks' on SSDs and how important it is to align the partitions. You
might want to read up on that to learn how to align partitions.

Willie Wong 02-07-2010 06:39 PM

On Sun, Feb 07, 2010 at 08:27:46AM -0800, Mark Knecht wrote:
> <QUOTE>
> 4KB physical sectors: KNOW WHAT YOU'RE DOING!
>
> Pros: Quiet, cool-running, big cache
>
> Cons: The 4KB physical sectors are a problem waiting to happen. If you
> misalign your partitions, disk performance can suffer. I ran
> benchmarks in Linux using a number of filesystems, and I found that
> with most filesystems, read performance and write performance with
> large files didn't suffer with misaligned partitions, but writes of
> many small files (unpacking a Linux kernel archive) could take several
> times as long with misaligned partitions as with aligned partitions.
> WD's advice about who needs to be concerned is overly simplistic,
> IMHO, and it's flat-out wrong for Linux, although it's probably
> accurate for 90% of buyers (those who run Windows or Mac OS and use
> their standard partitioning tools). If you're not part of that 90%,
> though, and if you don't fully understand this new technology and how
> to handle it, buy a drive with conventional 512-byte sectors!
> </QUOTE>
>
> Now, I don't mind getting a bit dirty learning to use this
> correctly but I'm wondering what that means in a practical sense.
> Reading the mke2fs man page the word 'sector' doesn't come up. It's my
> understanding the Linux 'blocks' are groups of sectors. True? If the
> disk must use 4K sectors then what - the smallest block has to be 4K
> and I'm using 1 sector per block? It seems that ext3 doesn't support
> anything larger than 4K?

The problem is not when you are making the filesystem with mke2fs, but
when you partitioned the disk using fdisk. I'm sure I am making some
small mistakes in the explanation below, but it goes something like
this:

a) The harddrive with 4K sectors allows the head to efficiently
read/write 4K sized blocks at a time.
b) However, to be compatible in hardware, the harddrive allows 512B
sized blocks to be addressed. In reality, this means that you can
individually address the 8 512B-sized chunks of the 4K sized blocks,
but each will count as a separate operation. To illustrate: say the
hardware has some sector X of size 4K. It has 8 addressable slots
inside X1 ... X8 each of size 512B. If your OS clusters read/writes on
the 512B level, it will send 8 commands to read the info in those 8
blocks separately. If your OS clusters in 4K, it will send one
command. So in the stupid analysis I give here, it will take 8 times
as long for the 512B addressing to read the same data, since it will
take 8 passes, and each time inefficiently reading only 1/8 of the
data required. Now in reality, drives are smarter than that: if all 8
of those are sent in sequence, sometimes the drives will cluster them
together in one read.
c) A problem occurs, however, when your OS deals with 4K clusters but
when you make the partition, the partition is offset! Imagine the
physical read sectors of your disk looking like

AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDD

but when you make your partitions, somehow you partitioned it

....YYYYYYYYZZZZZZZZWWWWWWWW....

This is possible because the drive allows addressing by 512B chunks.
So for some reason one of your partitions starts halfway inside a
physical sector. What is the problem with this? Now suppose your OS
sends data to be written to the ZZZZZZZZ block. If it were completely
aligned, the drive would just move the head to the block and
overwrite it with this information. But since half of the block is
over the BBBB physical sector, and half over CCCC, what the disk now
needs to do is to

pass 1) read BBBBBBBB
pass 2) modify the second half of BBBB to match the first half of ZZZZ
pass 3) write BBBBBBBB
pass 4) read CCCCCCCC
pass 5) modify the first half of CCCC to match the second half of ZZZZ
pass 6) write CCCCCCCC

Or what is known as a read-modify-write operation. Thus the disk
becomes a lot less efficient.
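The six passes above can be reproduced with a little arithmetic. The following is only a sketch, assuming 512-byte logical sectors, 4096-byte physical sectors and a 4096-byte filesystem block; the function name is made up for illustration and it models nothing about the drive's real firmware or caching.

```python
LOGICAL = 512     # bytes per addressable (logical) sector
PHYSICAL = 4096   # bytes per physical sector on the platter
FS_BLOCK = 4096   # filesystem block size (the 4K default discussed in this thread)

def physical_sectors_touched(partition_start_lba, block_index):
    """Return (first, last, needs_rmw) for a write of one filesystem block.

    first/last are physical sector indices; needs_rmw is True when the
    write only partially covers a physical sector, so the drive has to
    read it, patch it and write it back.
    """
    start_byte = partition_start_lba * LOGICAL + block_index * FS_BLOCK
    end_byte = start_byte + FS_BLOCK  # exclusive
    first = start_byte // PHYSICAL
    last = (end_byte - 1) // PHYSICAL
    needs_rmw = start_byte % PHYSICAL != 0 or end_byte % PHYSICAL != 0
    return first, last, needs_rmw

for start in (63, 64):
    first, last, rmw = physical_sectors_touched(start, 0)
    print(f"partition at LBA {start}: touches {last - first + 1} "
          f"physical sector(s), read-modify-write needed: {rmw}")
```

With the partition at LBA 63 every filesystem block straddles two physical sectors; at LBA 64 each block maps onto exactly one.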

----------

Now, I don't know if this is what is actually causing your
performance problems. But this may be it. When you use fdisk, it
defaults to aligning the partition to cylinder boundaries, and uses the
default (from ancient times) value of 63 x (512B sized) sectors per
track. Since 63 is not evenly divisible by 8, you see that quite
likely some of your partitions are not aligned to the physical sector
boundaries.
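A quick way to see the divisibility argument is to test a few start sectors directly. This is a rough sketch, assuming 512-byte logical and 4096-byte physical sectors and the 255-head, 63-sector geometry that fdisk reports elsewhere in this thread; `is_aligned` is an illustrative helper, not part of any tool.

```python
LOGICAL = 512     # bytes per logical (addressable) sector
PHYSICAL = 4096   # bytes per physical sector

def is_aligned(start_lba):
    """True if a partition starting at this LBA begins on a physical sector boundary."""
    return (start_lba * LOGICAL) % PHYSICAL == 0

# The classic fdisk default for the first partition versus one sector later.
for lba in (63, 64):
    print(lba, is_aligned(lba))

# Cylinder boundaries under a 255-head, 63-sectors-per-track geometry.
SECTORS_PER_CYLINDER = 255 * 63   # 16065 logical sectors, an odd number
aligned_cylinders = [c for c in range(16) if is_aligned(c * SECTORS_PER_CYLINDER)]
print(aligned_cylinders)
```

Because 16065 is odd, only every eighth cylinder boundary (0, 8, ...) lands on a 4K boundary, so partitions placed on arbitrary cylinder boundaries will usually be misaligned.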

If you use cfdisk, you can try to change the geometry with the command
g. Or you can use the command u to change the units used in the
partitioning to either sectors or megabytes, and make sure your
partition starts and sizes are a multiple of 8 in the former, or an
integer in the latter.
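When working in sector units, the arithmetic for picking a start sector is just rounding up. A minimal sketch, assuming 512-byte logical sectors (so 8 sectors per 4K physical sector and 2048 sectors per mebibyte); the helper name is hypothetical.

```python
def next_aligned_lba(lba, sectors_per_unit=8):
    """Round a start sector up to the next multiple of sectors_per_unit.

    sectors_per_unit=8 aligns to 4096-byte physical sectors;
    sectors_per_unit=2048 aligns to 1 MiB (with 512-byte logical sectors).
    """
    return -(-lba // sectors_per_unit) * sectors_per_unit

print(next_aligned_lba(63))        # smallest 4K-aligned start at or after 63
print(next_aligned_lba(63, 2048))  # smallest 1 MiB-aligned start at or after 63
```

Aligning to a whole mebibyte costs under a megabyte of space and also satisfies the multiple-of-8 rule, since 2048 is itself a multiple of 8.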

Again, take what I wrote with a grain of salt: this information came
from the research I did a little while back after reading the slashdot
article on this 4K switch. So being my own understanding, it may not
completely be correct.

HTH,

W
--
Willie W. Wong wwong@math.princeton.edu
Data aequatione quotcunque fluentes quantitae involvente fluxiones invenire
et vice versa ~~~ I. Newton

Willie Wong 02-08-2010 01:08 AM

On Sun, Feb 07, 2010 at 01:42:18PM -0800, Mark Knecht wrote:
> OK - it turns out if I start fdisk using the -u option it shows me
> sector numbers. Looking at the original partition put on just using
> default values, the starting sector was 63 - probably about the
> worst value it could be. As a test I blew away that partition and
> created a new one starting at 64 instead and the untar results are
> vastly improved - down to roughly 20 seconds from 8-10 minutes. That's
> roughly twice as fast as the old 120GB SATA2 drive I was using to test
> the system out while I debugged this issue.

That's good to hear.

> I'm still a little fuzzy about what happens to the extra sectors at
> the end of a track. Are they used and I pay for a little bit of
> overhead reading data off of them or are they ignored and I lose
> capacity? I think it must be the former as my partition isn't all that
> much less than 1TB.

As far as I know, you shouldn't worry about it. The
head/track/cylinder addressing is a relic of an older day. Almost all
modern drives should be accessed via LBA. If interested, take a look
at the wikipedia entry on Cylinder-Head-Sector and Logical Block
Addressing.

Basically, you are not losing anything.
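For anyone curious where the 63 comes from, the standard CHS-to-LBA conversion shows it. This is a sketch under the assumption that the first partition was traditionally placed at cylinder 0, head 1, sector 1 (leaving the first track for the MBR), with the 255-head, 63-sector geometry fdisk prints in this thread.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder=255, sectors_per_track=63):
    """Convert a cylinder/head/sector address to a logical block address.

    Cylinders and heads count from 0; CHS sectors count from 1.
    """
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)

print(chs_to_lba(0, 0, 1))  # the very first sector of the disk (the MBR)
print(chs_to_lba(0, 1, 1))  # start of the second track
print(chs_to_lba(1, 0, 1))  # start of the second cylinder
```

The second track begins at LBA 63, which is why partitions created with the old defaults start there regardless of how the platters are physically laid out.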

Cheers,

W
--
Willie W. Wong wwong@math.princeton.edu
Data aequatione quotcunque fluentes quantitae involvente fluxiones invenire
et vice versa ~~~ I. Newton

Valmor de Almeida 02-08-2010 04:25 AM

Mark Knecht wrote:
> On Sun, Feb 7, 2010 at 11:39 AM, Willie Wong <wwong@math.princeton.edu> wrote:
[snip]
> OK - it turns out if I start fdisk using the -u option it shows me
> sector numbers. Looking at the original partition put on just using
> default values, the starting sector was 63 - probably about the

I too was wondering why a Toshiba HDD 1.8" MK2431GAH (4kB-sector), 240
GB I've recently obtained was slow:

-> time tar xfj portage-latest.tar.bz2

real 16m5.500s
user 0m28.535s
sys 0m19.785s

Following your post I recreated a single partition (reiserfs 3.6)
starting at sector 64:

Disk /dev/sdb: 240.1 GB, 240057409536 bytes
255 heads, 63 sectors/track, 29185 cylinders, total 468862128 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0xe7bf4b8e

Device Boot Start End Blocks Id System
/dev/sdb1 64 468862127 234431032 83 Linux

and the time was improved

-> time tar xfj portage-latest.tar.bz2

real 2m15.600s
user 0m28.156s
sys 0m18.933s
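To put a number on the improvement, the two "real" values can be converted to seconds and divided. A small sketch, assuming the `XmY.YYYs` format that the shell's `time` prints above; `parse_time` is an illustrative helper.

```python
import re

def parse_time(text):
    """Convert a shell `time` value such as '16m5.500s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

before = parse_time("16m5.500s")   # partition starting at sector 63
after = parse_time("2m15.600s")    # partition starting at sector 64
print(f"{before:.1f}s -> {after:.1f}s, speedup {before / after:.1f}x")
```

That works out to a speedup of roughly seven times for this particular untar test.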


--
Valmor

Valmor de Almeida 02-08-2010 05:52 PM

Mark Knecht wrote:
[snip]
>
> This has been helpful for me. I'm glad Valmor is getting better
> results also.
[snip]

These 4k-sector drives can be problematic when upgrading older
computers. For instance, my laptop BIOS would not boot from the Toshiba
drive I mentioned earlier. However when used as an external USB drive, I
could boot Gentoo. Since I have been using this drive as backup storage
I did not investigate the reason for the lower speed. I am happy to get
a speedup of about a factor of 7 now after you did the research :)

Thanks for your postings.

--
Valmor

Stroller 02-08-2010 06:57 PM

On 8 Feb 2010, at 05:25, Valmor de Almeida wrote:


> Mark Knecht wrote:
>> On Sun, Feb 7, 2010 at 11:39 AM, Willie Wong <wwong@math.princeton.edu> wrote:
> [snip]
>> OK - it turns out if I start fdisk using the -u option it shows me
>> sector numbers. Looking at the original partition put on just using
>> default values, the starting sector was 63 - probably about the
>
> I too was wondering why a Toshiba HDD 1.8" MK2431GAH (4kB-sector), 240
> GB I've recently obtained was slow:
>
> -> time tar xfj portage-latest.tar.bz2
>
> real 16m5.500s
> user 0m28.535s
> sys 0m19.785s
>
> Following your post I recreated a single partition (reiserfs 3.6)
> starting at sector 64:
>
> Disk /dev/sdb: 240.1 GB, 240057409536 bytes
> 255 heads, 63 sectors/track, 29185 cylinders, total 468862128 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Disk identifier: 0xe7bf4b8e
>
> Device Boot Start End Blocks Id System
> /dev/sdb1 64 468862127 234431032 83 Linux
>
> and the time was improved
>
> -> time tar xfj portage-latest.tar.bz2
>
> real 2m15.600s
> user 0m28.156s
> sys 0m18.933s


Thanks to both you & Mark for posting this information about these
improved timings.


I have just checked, and I am getting 3.5 - 6 minutes (real) to untar
portage. I had blamed performance of this array on the fact that the
RAID controller is an older model PCI card I got cheap(ish) off eBay,
but I see it is also aligned beginning at sector 63.


I'm not quite sure if this is the cause of poor performance here, as the
drives in this array are not quite as modern as yours - I'm guessing
that at least a couple of the drives have been bought in the last 6
months, but they are only 500GB drives. However I guess it would only
require one drive in the array to have 4K sectors and it would cause
this kind of slowdown. I will try checking their spec now.
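Besides the spec sheet, the kernel may also know the sector sizes. The sketch below reads the `logical_block_size` and `physical_block_size` files that Linux exposes under `/sys/block/<device>/queue/`. Two assumptions to keep in mind: the kernel has to be recent enough to provide these files, and a drive can report 512 bytes for both values even when its physical sectors are 4K, so a 512/512 answer does not rule anything out. Drives hidden behind a hardware RAID controller may not be visible individually at all.

```python
from pathlib import Path

def block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes as reported by the kernel.

    `device` is a bare name such as "sda". `sysfs_root` is a parameter only
    so the function can be pointed at a test directory.
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical

# Example (needs a real device name on the machine it runs on):
#   logical, physical = block_sizes("sda")
#   print(logical, physical)
```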


This is the same server that caused me to post in relation to slow
Samba transfers 3 weeks ago ("How to determine if a NIC is playing
gigabit?"). I have still not yet tested thoroughly - there are always
chores getting in the way! - but it seems like I was able to transfer
the same files in about a third (or maybe even a quarter) the time at
100 Mbit, between my laptop & desktop Macs.


I am not immediately able to alter the partition layout, as I have
scads of data on this array. In order to test I think I will need to
create a second array, aligned optimally, and copy the data across.


I had been recently thinking that 2TB drives are now 40% cheaper per
gig than 500GB ones, so perhaps I will have to splash out on 3 of
them. This seems rather a lot of money, but I could probably use the
space. Hmmmn... actually 1TB are nearly as cheap per gig -
considering the eBaying of my current drives, those would make a lot
of sense.


Stroller.



$ time tar xfj portage-latest.tar.bz2

real 6m3.128s
user 0m37.810s
sys 0m39.614s
$ echo p | sudo fdisk -u /dev/sdb

The number of cylinders for this disk is set to 182360.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)

Command (m for help):
Disk /dev/sdb: 1500.0 GB, 1499968045056 bytes
255 heads, 63 sectors/track, 182360 cylinders, total 2929625088 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x27a827a7

Device Boot Start End Blocks Id System
/dev/sdb1 63 2929613399 1464806668+ 83 Linux

Command (m for help): Command (m for help): Command (m for help):
got EOF thrice - exiting..
$
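Checking a listing like this by eye gets tedious with several partitions. Here is a rough sketch that pulls the start sectors out of `fdisk -u` output and flags the ones that are not a multiple of 8. It assumes the column layout shown above (device, optional `*` boot flag, start, end, ...) and 512-byte units; other fdisk versions may print things differently, and the function name is made up.

```python
def partition_starts(fdisk_output):
    """Map each /dev/... partition line to its start sector."""
    starts = {}
    for line in fdisk_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("/dev/"):
            continue
        # Drop the optional boot flag so the start sector is always first.
        columns = [f for f in fields[1:] if f != "*"]
        starts[fields[0]] = int(columns[0])
    return starts

sample = """\
Disk /dev/sdb: 1500.0 GB, 1499968045056 bytes
255 heads, 63 sectors/track, 182360 cylinders, total 2929625088 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 63 2929613399 1464806668+ 83 Linux
"""

for device, start in partition_starts(sample).items():
    status = "aligned" if start % 8 == 0 else "NOT aligned to 4K"
    print(device, start, status)
```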

Frank Steinmetzger 02-08-2010 11:05 PM

On Sunday, 7 February 2010, Mark Knecht wrote:

> Hi Willie,
> OK - it turns out if I start fdisk using the -u option it shows me
> sector numbers. Looking at the original partition put on just using
> default values, the starting sector was 63

Same here.

> - probably about the worst value it could be.

Hm.... what about those first 62 sectors?
I bought this 500GB drive for my laptop recently and did a fresh partitioning
scheme on it, and then rsynced the filesystems of the old, smaller drive onto
it. The first two partitions are ntfs, but I believe they also use cluster
sizes of 4k by default. So technically I could repartition everything and
then restore the contents from my backup drive.

And indeed my system becomes very sluggish when I do some HDD shuffling.

> As a test I blew away that partition and
> created a new one starting at 64 instead and the untar results are
> vastly improved - down to roughly 20 seconds from 8-10 minutes. That's
> roughly twice as fast as the old 120GB SATA2 drive I was using to test
> the system out while I debugged this issue.

Though the result justifies your decision, I would have thought one has to
start at 65, unless the disk starts counting its sectors at 0.
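On the 64-versus-65 question: logical block addresses do count from 0, so sector number 64 is the 65th sector on the disk. A short sketch of the mapping, assuming 512-byte logical and 4096-byte physical sectors with physical sector 0 beginning at LBA 0 (the helper name is illustrative):

```python
def locate(lba, logical=512, physical=4096):
    """Return (physical_sector_index, slot_within_it) for a 0-based LBA."""
    slots_per_physical = physical // logical   # 8 logical sectors per physical one
    return divmod(lba, slots_per_physical)

for lba in (0, 62, 63, 64, 65):
    index, slot = locate(lba)
    print(f"LBA {lba}: physical sector {index}, slot {slot}")
```

LBA 63 is the last slot of physical sector 7, LBA 64 is the first slot of physical sector 8, and LBA 65 would again start one slot into a physical sector.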
--
Gruß | Greetings | Qapla'
Programmers don’t die, they GOSUB without RETURN.

