Linux Archive > Debian > Debian User

08-08-2011, 06:25 AM
Dion Kant

LVM write performance

Dear list,

When writing to a logical volume (/dev/sys/test) directly through the
device node, I get slow performance:

root@dom0-2:/dev/mapper# dd of=/dev/sys/test if=/dev/zero
4580305+0 records in
4580305+0 records out
2345116160 bytes (2.3 GB) copied, 119.327 s, 19.7 MB/s

Making a file system on top of the LV, mounting it and writing into a file
is fine:

root@dom0-2:/dev/mapper# mkfs.xfs /dev/sys/test
root@dom0-2:/mnt# mount /dev/sys/test /mnt/lv
root@dom0-2:/mnt# dd of=/mnt/lv/out if=/dev/zero
2647510+0 records in
2647510+0 records out
1355525120 bytes (1.4 GB) copied, 11.3235 s, 120 MB/s

Furthermore, by accident I noticed that writing directly to the block
device is fast when the LV is mounted (of course destroying the file
system on it):

root@dom0-2:/mnt# dd of=/dev/sys/test if=/dev/zero
3703375+0 records in
3703374+0 records out
1896127488 bytes (1.9 GB) copied, 15.4927 s, 122 MB/s

Does anyone know what is going on?

The configuration is as follows:

Debian 6.0.2
Kernel 2.6.32-5-xen-amd64
Tests are on a partition on one physical disk

Best regards,

Dion Kant


 
08-08-2011, 01:33 PM
Stan Hoeppner

LVM write performance

On 8/8/2011 1:25 AM, Dion Kant wrote:
>
> […]
>
> Does anyone know what is going on?
>
> The configuration is as follows:

Yes. You lack knowledge of the Linux storage stack and of the dd
utility. Your system is fine. You are simply running an improper test,
and interpreting the results from that test incorrectly.

Google for more information on the "slow" results you are seeing.

--
Stan


 
08-08-2011, 07:00 PM
Dion Kant

LVM write performance

On 08/08/2011 03:33 PM, Stan Hoeppner wrote:
> On 8/8/2011 1:25 AM, Dion Kant wrote:
>> […]
>>
> Yes. You lack knowledge of the Linux storage stack and of the dd
> utility. Your system is fine. You are simply running an improper test,
> and interpreting the results from that test incorrectly.
>
> Google for more information on the "slow" results you are seeing.
>
Hmm, interpreting your answer, this behaviour is what you expect.
However, I think it is a bit strange to find, with this "improper
test", about a factor of 10 difference between reading from and writing to
a logical volume using dd directly on the device file. Note that dd
if=/dev/sys/test of=/dev/null does give disk-I/O-limited results.

Apparently the Debian kernel behaves differently with respect to this
"issue" from, for example, an openSUSE kernel, which does give symmetric
(near disk-I/O-limited) results.

What is the proper way to copy a (large) raw disk image onto a logical
volume?

Thanks for your advice to try Google. I already found a couple of posts
from people describing a similar issue, but no proper explanation yet.

Dion.


 
08-09-2011, 04:03 AM
Stan Hoeppner

LVM write performance

On 8/8/2011 2:00 PM, Dion Kant wrote:
> On 08/08/2011 03:33 PM, Stan Hoeppner wrote:
>> […]
>>
> Hmm, interpreting your answer, this behaviour is what you expect.
> However, I think it is a bit strange to find, with this "improper
> test", about a factor of 10 difference between reading from and writing to
> a logical volume using dd directly on the device file. Note that dd
> if=/dev/sys/test of=/dev/null does give disk-I/O-limited results.

Apparently you are Google challenged as well. Here:
http://lmgtfy.com/?q=lvm+block+size

5th hit:
http://blog.famzah.net/2010/02/05/dd-sequential-write-performance-tests-on-a-raw-block-device-may-be-incorrect/

> What is the proper way to copy a (large) raw disk image onto a logical
> volume?

See above, and do additional research into dd and "block size". It also
wouldn't hurt for you to actually read and understand the dd man page.
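
For instance, something along these lines is a reasonable starting point
(just a sketch -- "diskimage.raw" is a placeholder name, and conv=fsync makes
dd flush the data before reporting the rate):

/# dd if=diskimage.raw of=/dev/sys/test bs=1M conv=fsync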

> Thanks for your advice to try Google. I already found a couple of posts
> from people describing a similar issue, but no proper explanation yet.

I already knew the answer, so maybe my search criteria are what allowed
me to "find" the answer for you in 20 seconds or less. I hate spoon
feeding people, as spoon feeding is antithetical to learning and
remembering. Hopefully you'll learn something from this thread, and
remember it.

--
Stan


 
08-09-2011, 04:30 AM
Stan Hoeppner

LVM write performance

On 8/8/2011 11:03 PM, Stan Hoeppner wrote:

> […]

BTW, you didn't mention what disk drive is in use in this test. Is it
an Advanced Format drive? If so, and your partitions are unaligned, this
in combination with no dd block size being specified will cause the 10x
drop in your dd "test". The wrong block size alone shouldn't yield a 10x
drop, more like 3-4x. Please state the model# of the disk drive, and the
partition table, using:

/# hdparm -I /dev/sdX
/# fdisk -l /dev/sdX

Lemme guess, this is one of those POS cheap WD Green drives, isn't it?
Just in case, read this too:

http://wdc.custhelp.com/app/answers/detail/a_id/5655/~/how-to-install-a-wd-advanced-format-drive-on-a-non-windows-operating-system

This document applies to *all* Advanced Format drives, not strictly
those sold by Western Digital.
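
For what it's worth, a quick way to check partition alignment (a rough
sketch; it only matters on drives with 4 KiB physical sectors):

/# fdisk -lu /dev/sdX

With -u the partition start values are printed in 512-byte sectors; any
start that is divisible by 8 sits on a 4 KiB boundary.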

--
Stan



 
08-09-2011, 02:12 PM
Dion Kant

LVM write performance

On 08/09/2011 06:30 AM, Stan Hoeppner wrote:

> […]
>
> BTW, you didn't mention what disk drive is in use in this test. Is it
> an Advanced Format drive? […] Please state the model# of the disk
> drive, and the partition table, using:
>
> /# hdparm -I /dev/sdX
> /# fdisk -l /dev/sdX
>
> […]
Hello Stan,

Thanks for your remarks. The disk info is given below. Writing to the
disk is fast when it is mounted, so I think it is not a hardware/alignment
issue. However, your remarks made me do some additional investigation:

1. dd of=/dev/sdb4 if=/dev/zero gives similar results, so it has nothing
to do with LVM;
2. My statement about writing like this on an openSUSE kernel was wrong.
With openSUSE and the same hardware I also get similar (slow) results
when writing to the disk with dd via the device file.

So now the issue has shifted to the asymmetric behaviour when
writing/reading with dd directly through the (block) device file.

Reading with dd if=/dev/sdb4 of=/dev/null gives disk-limited performance.
Writing with dd of=/dev/sdb4 if=/dev/zero gives about a factor of 10 less
performance.

However, after mounting a file system on sdb4 (read-only), I can use dd
of=/dev/sdb4 if=/dev/zero at (near) disk-limited performance.

Now I have used this trick to copy a large (raw) disk image onto an LVM
partition. I think this is odd. Can somebody explain why this is the way it is?

Here is the disk info:
Model Family: Seagate Barracuda ES
Device Model: ST3750640NS

root@dom0-2:~# fdisk -l /dev/sdb

Disk /dev/sdb: 750.2 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000eae95

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         244     1951744   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2             244         280      292864   fd  Linux raid autodetect
Partition 2 does not end on cylinder boundary.
/dev/sdb3             280        7575    58593280   fd  Linux raid autodetect
Partition 3 does not end on cylinder boundary.
/dev/sdb4            7575       91202   671734784   fd  Linux raid autodetect
Partition 4 does not end on cylinder boundary.

root@dom0-2:~# hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
Model Number: ST3750640NS
Serial Number: 5QD193MQ
Firmware Revision: 3.AEK
Standards:
Supported: 7 6 5 4
Likely used: 8
Configuration:
Logical         max     current
cylinders       16383   16383
heads           16      16
sectors/track   63      63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 1465149168
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
device size with M = 1024*1024: 715404 MBytes
device size with M = 1000*1000: 750156 MBytes (750 GB)
cache/buffer size = 16384 KBytes
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = ?
Advanced power management level: 254
Recommended acoustic management value: 254, current value: 0
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=240ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* DOWNLOAD_MICROCODE
* Advanced Power Management feature set
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
64-bit World wide name
Time Limited Commands (TLC) feature set
Command Completion Time Limit (CCTL)
* Gen1 signaling speed (1.5Gb/s)
* Native Command Queueing (NCQ)
* Phy event counters
Device-initiated interface power management
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Logical Unit WWN Device Identifier: 0000000000000000
NAA : 0
IEEE OUI : 000000
Unique ID : 000000000
Checksum: correct

Dion


 
08-09-2011, 05:13 PM
Stan Hoeppner

LVM write performance

On 8/9/2011 9:12 AM, Dion Kant wrote:

> […]
>
> Reading with dd if=/dev/sdb4 of=/dev/null gives disk-limited performance.
> Writing with dd of=/dev/sdb4 if=/dev/zero gives about a factor of 10 less
> performance.

Run:
/$ dd of=/dev/sdb4 if=/dev/zero bs=4096 count=500000

Then run again with bs=512 count=2000000

That will write 2GB in 4KB blocks and will prevent dd from trying to
buffer everything before writing it. You don't break out of this--it
finishes on its own due to 'count'. The second run will use a block
size of 512B, which is the native sector size of the Seagate disk.
Either of these should improve your actual dd performance dramatically.

When you don't specify a block size with dd, dd attempts to "buffer" the
entire input stream, or huge portions of it, into memory before writing
it out. If you look at RAM, swap usage, and disk IO while running your
'raw' dd test, you'll likely see both memory, and IO to the swap device,
are saturated, with little actual data being written to the target disk
partition.
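
For example, an easy way to watch memory and disk I/O while the dd runs
is simply, in a second terminal:

/$ vmstat 2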

I attempted to nudge you into finding this information on your own, but
you apparently did not. I explained all of this not long ago, either
here or on the linux-raid list. It should be in Google somewhere.

Never use dd without specifying the proper block size for the target
device--never. For a Linux filesystem this will be 4096, and for a raw
hard disk device it will be 512, optimally anyway. Other values may
give better performance, depending on the system, the disk controller,
the device driver, etc.
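
If in doubt, the logical and physical sector sizes (and the block size the
kernel currently uses for the device) can be queried directly -- a quick
sketch, assuming a reasonably recent util-linux:

/# blockdev --getss --getpbsz --getbsz /dev/sdb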

That Seagate isn't an AF model so sector alignment isn't the issue here,
just improper use of dd.

--
Stan


 
08-13-2011, 11:53 AM
Dion Kant

LVM write performance

On 08/09/2011 07:13 PM, Stan Hoeppner wrote:

> […]
>
> Run:
> /$ dd of=/dev/sdb4 if=/dev/zero bs=4096 count=500000
>
> Then run again with bs=512 count=2000000
>
> […]
Stan,

You are right: with bs=4096 the write performance improves
significantly. From the dd man page I concluded that not specifying
bs selects ibs=512 and obs=512, and bs=512 indeed gives similar
performance to not specifying bs at all.

When observing the system with vmstat I see the same (strange) behaviour
with no bs specified or with bs=512:

root@dom0-2:~# vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo    in    cs us sy  id wa
 1  0      0 6314620 125988  91612    0    0     0     3     5     5  0  0 100  0
 1  1      0 6265404 173744  91444    0    0 23868    13 18020 12290  0  0  86 14
 2  1      0 6214576 223076  91704    0    0 24666     1 18596 12417  0  0  90 10
 0  1      0 6163004 273172  91448    0    0 25046     0 18867 12614  0  0  89 11
 1  0      0 6111308 323252  91592    0    0 25042     0 18861 12608  0  0  92  8
 0  1      0 6059860 373220  91648    0    0 24984     0 18821 12578  0  0  85 14
 0  1      0 6008164 423304  91508    0    0 25040     0 18863 12611  0  0  95  5
 2  1      0 5956344 473468  91604    0    0 25084     0 18953 12630  0  0  95  5
 0  1      0 5904896 523548  91532    0    0 25038     0 18867 12607  0  0  87 13
 0  1      0 5896068 528680  91520    0    0  2558 99597  2431  1373  0  0  92  8
 0  2      0 5896088 528688  91520    0    0     0 73736   535   100  0  0  86 13
 0  1      0 5896128 528688  91520    0    0     0 73729   545    99  0  0  88 12
 1  0      0 6413920  28712  91612    0    0    54  2996   634   372  0  0  95  4
 0  0      0 6413940  28712  91520    0    0     0     0    78    80  0  0 100  0
 0  0      0 6413940  28712  91520    0    0     0     0    94    97  0  0 100  0

Remarkable behaviour, in the sense that there is a lot of bi (block input)
at the beginning, and only at the end do I see bo at 75 MB/s.
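
One way to see which device those reads are actually hitting would be to
watch the extended I/O statistics in another terminal (a sketch; iostat is
in the sysstat package, which may not be installed here):

root@dom0-2:~# iostat -x 2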

With obs=4096 it looks like this:

root@dom0-2:~# vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo    in    cs us sy  id wa
 1  0      0 6413600  28744  91540    0    0     0     3     5     5  0  0 100  0
 1  0      0 6413724  28744  91540    0    0     0     0   103    96  0  0 100  0
 1  0      0 6121616 312880  91208    0    0     0    18   457   133  1  2  97  0
 0  1      0 5895588 528756  91540    0    0     0 83216   587    88  1  3  90  6
 0  1      0 5895456 528756  91540    0    0     0 73728   539    98  0  0  92  8
 0  3      0 5895400 528760  91536    0    0     0 73735   535    93  0  0  86 14
 1  0      0 6413520  28788  91436    0    0    54 19359   783   376  0  0  93  6
 0  0      0 6413544  28788  91540    0    0     0     2   100    84  0  0 100  0
 0  0      0 6413544  28788  91540    0    0     0     0    86    87  0  0 100  0
 0  0      0 6413552  28796  91532    0    0     0    10   110   113  0  0 100  0

As soon as I select a bs which is not a whole multiple of 4096, I get a
lot of block input and bad performance when writing data to disk.
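
A rough sketch of how one could sweep block sizes to confirm this (it
overwrites /dev/sdb4, so only for a scratch partition; conv=fsync makes
the reported rate include the final flush):

for bs in 512 1024 4096 65536 1048576; do
    count=$((268435456 / bs))    # write 256 MiB per run
    echo "bs=$bs"
    dd if=/dev/zero of=/dev/sdb4 bs=$bs count=$count conv=fsync 2>&1 | tail -n 1
done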

I'll try to Google for the thread(s) you mentioned. I still don't feel
very satisfied with your explanation, though.

Thanks so far,

Dion


 
08-13-2011, 01:55 PM
Stan Hoeppner

LVM write performance

On 8/13/2011 6:53 AM, Dion Kant wrote:

> […]
>
> Remarkable behaviour, in the sense that there is a lot of bi (block input)
> at the beginning, and only at the end do I see bo at 75 MB/s.

That might be due to massive merges, but I'm not really a kernel hacker
so I can't say for sure.

> With obs=4096 it looks like this:
>
> […]
>
> As soon as I select a bs which is not a whole multiple of 4096, I get a
> lot of block input and bad performance when writing data to disk.

> I'll try to Google for the thread(s) you mentioned. I still don't feel
> very satisfied with your explanation, though.

My explanation to you wasn't fully correct. I confused specifying no
block size with specifying an insanely large block size. The other post
I was referring to dealt with people using a 1GB (or larger) block size
because it made the math easier for them when wanting to write a large
test file.

Instead of dividing their total file size by 4096 and using the result
for "bs=4096 count=X" (which is the proper method I described to you)
they were simply specifying, for example, "bs=2G count=1" to write a 2
GB test file. Doing this causes the massive buffering I described, and
consequently, horrible performance, typically by a factor of 10 or more,
depending on the specific system.
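
For illustration, the two approaches side by side (a sketch; "testfile" is
just a placeholder name):

/$ dd if=/dev/zero of=testfile bs=4096 count=500000   # ~2 GB in 4 KB blocks
/$ dd if=/dev/zero of=testfile bs=2G count=1          # a single 2 GiB block,
                                                      # so dd allocates a 2 GiB buffer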

The horrible performance with bs=512 is likely due to the LVM block size
being 4096, and forcing block writes that are 1/8th normal size, causing
lots of merging. If you divide 120MB/s by 8 you get 15MB/s, which, IIRC
from your original post, is approximately the write performance you were
seeing, which was 19MB/s.

If my explanation doesn't seem thorough enough, that's because I'm not a
kernel expert. I just have a little better than average knowledge and
understanding of some aspects of the kernel.

If you want a really good explanation of the reasons behind this dd
block size behavior while writing to a raw LVM device, try posting to
lkml proper or one of the sub lists dealing with LVM and the block
layer. Also, I'm sure some of the expert developers on the XFS list
could answer this as well, though it would be a little OT there, unless
of course your filesystem test yielding the 120MB/s was using XFS.

--
Stan


 
08-13-2011, 02:45 PM
Ivan Shmakov

LVM write performance

>>>>> Stan Hoeppner <stan@hardwarefreak.com> writes:

[…]

> The horrible performance with bs=512 is likely due to the LVM block
> size being 4096, and forcing block writes that are 1/8th normal size,
> causing lots of merging. If you divide 120MB/s by 8 you get 15MB/s,
> which, IIRC from your original post, is approximately the write
> performance you were seeing, which was 19MB/s.

I'm not an expert in that matter either, but I don't seem to
recall that LVM uses any “blocks”, other than, of course, the
LVM “extents.”

What's more important in my opinion is that 4096 is exactly the
platform's page size.

--cut: vgcreate(8) --
-s, --physicalextentsize PhysicalExtentSize[kKmMgGtT]
Sets the physical extent size on physical volumes of this volume
group. A size suffix (k for kilobytes up to t for terabytes) is
optional, megabytes is the default if no suffix is present. The
default is 4 MB and it must be at least 1 KB and a power of 2.
--cut: vgcreate(8) --

[…]
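
For what it's worth, both values are easy to check directly (a quick
sketch; this assumes the volume group from the original post is indeed
named "sys"):

getconf PAGESIZE                   # platform page size
vgdisplay sys | grep "PE Size"     # physical extent size of the "sys" VG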

--
FSF associate member #7257


 
