08-26-2010, 11:35 PM
Stephen John Smoogen

Compressing files (gz versus bz2 versus xz)

So I was asked to look at compression on log servers and to see if
changing to xz would save us some space. My test is not comprehensive
but showed what might happen.

Basic summary: XZ may save us up to 2% over what we are currently
saving, but its real advantage over bzip2 is in the speed of
uncompressing files. [Compression may also be faster for some files.]

File         | Size (KiB) | Gzip  | G%   | Bzip2 | B%   | XZ    | X%
messages.log |     644568 | 10992 | 98.3 |  4856 | 99.3 |  5940 | 99.1
mail.log     |     610816 | 65060 | 89.3 | 40836 | 93.3 | 35536 | 94.2
TOTAL        |    1255384 | 76052 | 93.9 | 45692 | 96.4 | 41476 | 96.7
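
For anyone re-checking the percentage columns, here is a minimal sketch of the arithmetic, assuming the sizes are the block counts printed by du -s. The function name pct_saved is made up for this example and is not part of any tool used in the thread.

```shell
#!/bin/sh
# pct_saved ORIGINAL COMPRESSED
# Print the space saved as a percentage with one decimal place.
# Both arguments must be in the same unit (here: du -s blocks).
pct_saved() {
    LC_ALL=C awk -v orig="$1" -v comp="$2" \
        'BEGIN { printf "%.1f\n", (1 - comp / orig) * 100 }'
}

pct_saved 644568 10992   # messages.log with gzip -9, prints 98.3
pct_saved 610816 65060   # mail.log with gzip -9, prints 89.3
```

The same call works for the TOTAL row if you pass the summed sizes rather than averaging the per-file percentages.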

Program | Compression Time | Uncompression Time
GZIP    | 00m43.416s       | 00m10.033s
BZIP2   | 10m42.296s       | 01m02.525s
XZ      | 10m15.937s       | 00m12.565s
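
To put a number on the decompression difference, the "real" times can be converted to seconds and divided. This is only a sketch. The helper names to_seconds and ratio are invented for the example, and it assumes the times look like the MMmSS.SSSs strings in the table.

```shell
#!/bin/sh
# to_seconds TIME
# Convert a "10m42.296s" style value (as printed by time) to seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# ratio A B
# Print A divided by B with one decimal place.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

bz=$(to_seconds 01m02.525s)   # bunzip2 real time, 62.525
xz=$(to_seconds 00m12.565s)   # unxz real time, 12.565
ratio "$bz" "$xz"             # prints 5.0
```

On these two files, unxz finished in roughly a fifth of the time bunzip2 needed.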


Raw data below

[root@log01 smooge-b]# du -s messages.log mail.log
644568 messages.log
610816 mail.log
[root@log01 smooge-b]# time gzip -v -9 messages.log mail.log
messages.log: 98.3% -- replaced with messages.log.gz
mail.log: 89.3% -- replaced with mail.log.gz

real 0m43.416s
user 0m41.335s
sys 0m1.736s
[root@log01 smooge-b]# du -s messages.log.gz mail.log.gz
10992 messages.log.gz
65060 mail.log.gz
[root@log01 smooge-b]# time gunzip -v messages.log.gz mail.log.gz
messages.log.gz: 98.3% -- replaced with messages.log
mail.log.gz: 89.3% -- replaced with mail.log

real 0m10.033s
user 0m6.948s
sys 0m3.004s

[root@log01 smooge-b]# time bzip2 -v -9 messages.log mail.log
messages.log: 133.043:1, 0.060 bits/byte, 99.25% saved, 659381328 in, 4956148 out.
mail.log: 14.961:1, 0.535 bits/byte, 93.32% saved, 624854215 in, 41766136 out.

real 10m42.296s
user 10m36.948s
sys 0m1.608s
[root@log01 smooge-b]# du -sc messages.log.bz2 mail.log.bz2
4856 messages.log.bz2
40836 mail.log.bz2
45692 total
[root@log01 smooge-b]# time bunzip2 -v messages.log.bz2 mail.log.bz2
messages.log.bz2: done
mail.log.bz2: done

real 1m2.525s
user 0m44.779s
sys 0m4.956s

[root@log01 smooge-b]# time xz -v -9 messages.log mail.log
messages.log (1/2)
100.0 % 5,923.6 KiB / 628.8 MiB = 0.009 3.1 MiB/s 3:21

mail.log (2/2)
100.0 % 34.7 MiB / 595.9 MiB = 0.058 1.4 MiB/s 6:53

real 10m15.937s
user 10m8.550s
sys 0m3.552s
[root@log01 smooge-b]# du -s messages.log.xz mail.log.xz
5940 messages.log.xz
35536 mail.log.xz
[root@log01 smooge-b]# time unxz -v messages.log.xz mail.log.xz
messages.log.xz (1/2)
100.0 % 5,923.6 KiB / 628.8 MiB = 0.009 140 MiB/s 0:04

mail.log.xz (2/2)
100.0 % 34.7 MiB / 595.9 MiB = 0.058 74 MiB/s 0:08

real 0m12.565s
user 0m8.709s
sys 0m3.636s
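
If someone wants to repeat the size comparison on other logs without replacing the originals, something like the following might do. It is a sketch, not the script used for the numbers above. The function names compare_one and compare_all are made up, and it relies on the -9 and -c options that gzip, bzip2 and xz all accept.

```shell
#!/bin/sh
# compare_one TOOL FILE
# Compress FILE with "TOOL -9" into a scratch file, leaving FILE untouched,
# and print "TOOL compressed-size-in-bytes".
compare_one() {
    tool=$1
    file=$2
    out=$(mktemp) || return 1
    if ! "$tool" -9 -c -- "$file" > "$out"; then
        rm -f "$out"
        return 1
    fi
    printf '%s %s\n' "$tool" "$(wc -c < "$out" | tr -d ' ')"
    rm -f "$out"
}

# compare_all FILE...
# Run every installed compressor over every file given.
compare_all() {
    for tool in gzip bzip2 xz; do
        # Skip compressors that are not installed on this host.
        command -v "$tool" >/dev/null 2>&1 || continue
        for f in "$@"; do
            compare_one "$tool" "$f"
        done
    done
}

# Usage: compare_all messages.log mail.log
```

Prefixing the call with time, as in the transcript above, gives the timing side of the comparison.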



--
Stephen J Smoogen.
“The core skill of innovators is error recovery, not failure avoidance.”
Randy Nelson, President of Pixar University.
"We have a strategic plan. It's called doing things."
— Herb Kelleher, founder Southwest Airlines
_______________________________________________
infrastructure mailing list
infrastructure@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/infrastructure
 
08-26-2010, 11:44 PM
Mike McGrath

Compressing files (gz versus bz2 versus xz)

On Thu, 26 Aug 2010, Stephen John Smoogen wrote:

> So I was asked to look at compression on log servers and to see if
> changing to xz would save us some space. My test is not comprehensive
> but showed what might happen.
>
> Basic summary. XZ may save us up to 2% over what we are currently
> saving but its real advantage is in speed of uncompressing files over
> bzip2. [compression may be faster for some files also.]

It does take a while to grep through the bzipped logs. If you want to
re-compress them all, I say have at it.
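
For reference, grepping a compressed log means decompressing it to a pipe first, which is why the decompression times above matter. Here is a minimal sketch that picks the decompressor from the file extension. The function name search_log is hypothetical. The bzip2 and xz packages typically also ship bzgrep and xzgrep wrappers that do much the same thing.

```shell
#!/bin/sh
# search_log PATTERN FILE
# Decompress FILE to stdout according to its extension and grep the stream.
# Uncompressed files are passed through cat.
search_log() {
    pattern=$1
    file=$2
    case "$file" in
        *.gz)  gzip  -dc -- "$file" ;;
        *.bz2) bzip2 -dc -- "$file" ;;
        *.xz)  xz    -dc -- "$file" ;;
        *)     cat -- "$file" ;;
    esac | grep -e "$pattern"
}

# Usage: search_log 'sshd' messages.log.xz
```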

-Mike
 
08-27-2010, 03:58 AM
Stephen John Smoogen

Compressing files (gz versus bz2 versus xz)

On Thu, Aug 26, 2010 at 17:44, Mike McGrath <mmcgrath@redhat.com> wrote:
> On Thu, 26 Aug 2010, Stephen John Smoogen wrote:
>
>> So I was asked to look at compression on log servers and to see if
>> changing to xz would save us some space. My test is not comprehensive
>> but showed what might happen.
>>
>> Basic summary. XZ may save us up to 2% over what we are currently
>> saving but its real advantage is in speed of uncompressing files over
>> bzip2. [compression may be faster for some files also.]
>>
>
> It does take a while to grep through the bzipped logs. If you want to
> re-compress them all, I say have at it.

OK, I will look at it after I get the hardware call in tomorrow.
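
A rough sketch of what re-compressing one existing .bz2 log to .xz could look like, in case it is useful. This is an assumption about the approach, not something that was run on the log servers, and the function name recompress_to_xz is made up. It checks the source archive, writes the .xz next to it, and removes the .bz2 only after the decompressed contents compare equal.

```shell
#!/bin/sh
# recompress_to_xz FILE.bz2
# Create FILE.xz from FILE.bz2 and delete the .bz2 only after verifying
# that both archives decompress to the same data.
recompress_to_xz() {
    src=$1
    dst=${src%.bz2}.xz
    if [ -e "$dst" ]; then
        echo "skip: $dst already exists" >&2
        return 1
    fi
    # Refuse to touch a source archive that fails its integrity test.
    bzip2 -t -- "$src" || return 1
    if ! bzip2 -dc -- "$src" | xz -9 > "$dst"; then
        rm -f -- "$dst"
        return 1
    fi
    # Compare checksums of the two decompressed streams before deleting.
    if [ "$(bzip2 -dc -- "$src" | cksum)" = "$(xz -dc -- "$dst" | cksum)" ]; then
        rm -f -- "$src"
    else
        rm -f -- "$dst"
        return 1
    fi
}

# Usage: for f in *.bz2; do recompress_to_xz "$f"; done
```

Each file is decompressed twice, so this trades some CPU time for not deleting anything that has not been verified.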




--
Stephen J Smoogen.
 
