FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > ArchLinux > ArchLinux Pacman Development

 
 
LinkBack Thread Tools
 
Old 12-25-2011, 09:37 AM
Allan McRae
 
Default makepkg: calculate exact total file size

The current calculation of the total file size for a package using "du"
suffers from issues in portability and correctness. Especially on btrfs,
this can result in clearly wrong package information such as:

Download Size : 14684.29 KiB
Installed Size : 7628.00 KiB

Use a slower but more accurate method involving "cat" and "wc" to
calculate total file size.

Signed-off-by: Allan McRae <allan@archlinux.org>
---
configure.ac | 1 -
scripts/Makefile.am | 1 -
scripts/makepkg.sh.in | 3 +--
3 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index fb1e0b3..6941054 100644
--- a/configure.ac
+++ b/configure.ac
@@ -245,7 +245,6 @@ esac

AM_CONDITIONAL([CYGWIN], test "x$host_os_cygwin" = "xyes")
AM_CONDITIONAL([DARWIN], test "x$host_os_darwin" = "xyes")
-AC_PATH_PROGS([DUPATH], [du], [du], [/usr/bin$PATH_SEPARATOR/bin] )
AC_SUBST(SIZECMD)
AC_SUBST(SEDINPLACE)
AC_SUBST(STRIP_BINARIES)
diff --git a/scripts/Makefile.am b/scripts/Makefile.am
index 328fbff..727de25 100644
--- a/scripts/Makefile.am
+++ b/scripts/Makefile.am
@@ -53,7 +53,6 @@ edit = sed
-e 's|@BUILDSCRIPT[@]|$(BUILDSCRIPT)|g'
-e 's|@SIZECMD[@]|$(SIZECMD)|g'
-e 's|@SEDINPLACE[@]|$(SEDINPLACE)|g'
- -e 's|@DUPATH[@]|$(DUPATH)|g'
-e 's|@SCRIPTNAME[@]|$@|g'
-e 's|@configure_input[@]|Generated from $@.sh.in; do not edit by hand.|g'

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index 8c6984d..c78db86 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -1132,8 +1132,7 @@ write_pkginfo() {
else
local packager="Unknown Packager"
fi
- local size="$(@DUPATH@ -sk)"
- size="$(( ${size%%[^0-9]*} * 1024 ))"
+ local size="$(find . | xargs cat 2>/dev/null | wc -c)"

msg2 "$(gettext "Generating %s file...")" ".PKGINFO"
echo "# Generated by makepkg $myver"
--
1.7.8.1
 
Old 12-25-2011, 02:06 PM
Dave Reisner
 
Default makepkg: calculate exact total file size

On Sun, Dec 25, 2011 at 08:37:24PM +1000, Allan McRae wrote:
> The current calculation of the total file size for a package using "du"
> suffers from issues in portability and correctness. Especially on btrfs,
> this can result in clearly wrong package information such as:
>
> Download Size : 14684.29 KiB
> Installed Size : 7628.00 KiB
>
> Use a slower but more accurate method involving "cat" and "wc" to
> calculate total file size.
>
> Signed-off-by: Allan McRae <allan@archlinux.org>
> ---
> configure.ac | 1 -
> scripts/Makefile.am | 1 -
> scripts/makepkg.sh.in | 3 +--
> 3 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index fb1e0b3..6941054 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -245,7 +245,6 @@ esac
>
> AM_CONDITIONAL([CYGWIN], test "x$host_os_cygwin" = "xyes")
> AM_CONDITIONAL([DARWIN], test "x$host_os_darwin" = "xyes")
> -AC_PATH_PROGS([DUPATH], [du], [du], [/usr/bin$PATH_SEPARATOR/bin] )
> AC_SUBST(SIZECMD)
> AC_SUBST(SEDINPLACE)
> AC_SUBST(STRIP_BINARIES)
> diff --git a/scripts/Makefile.am b/scripts/Makefile.am
> index 328fbff..727de25 100644
> --- a/scripts/Makefile.am
> +++ b/scripts/Makefile.am
> @@ -53,7 +53,6 @@ edit = sed
> -e 's|@BUILDSCRIPT[@]|$(BUILDSCRIPT)|g'
> -e 's|@SIZECMD[@]|$(SIZECMD)|g'
> -e 's|@SEDINPLACE[@]|$(SEDINPLACE)|g'
> - -e 's|@DUPATH[@]|$(DUPATH)|g'
> -e 's|@SCRIPTNAME[@]|$@|g'
> -e 's|@configure_input[@]|Generated from $@.sh.in; do not edit by hand.|g'
>
> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
> index 8c6984d..c78db86 100644
> --- a/scripts/makepkg.sh.in
> +++ b/scripts/makepkg.sh.in
> @@ -1132,8 +1132,7 @@ write_pkginfo() {
> else
> local packager="Unknown Packager"
> fi
> - local size="$(@DUPATH@ -sk)"
> - size="$(( ${size%%[^0-9]*} * 1024 ))"
> + local size="$(find . | xargs cat 2>/dev/null | wc -c)"

Unsafe xargs usage.

find . -print0 | xargs -0 2>/dev/null | wc -c

Why can't we use @SIZECMD@ here? Same issues as du?

>
> msg2 "$(gettext "Generating %s file...")" ".PKGINFO"
> echo "# Generated by makepkg $myver"
> --
> 1.7.8.1
>
>
 
Old 12-25-2011, 04:20 PM
Florian Pritz
 
Default makepkg: calculate exact total file size

On 25.12.2011 16:06, Dave Reisner wrote:
> On Sun, Dec 25, 2011 at 08:37:24PM +1000, Allan McRae wrote:
>> The current calculation of the total file size for a package using "du"
>> suffers from issues in portability and correctness. Especially on btrfs,
>> this can result in clearly wrong package information such as:
>>
>> Download Size : 14684.29 KiB
>> Installed Size : 7628.00 KiB
>>
>> Use a slower but more accurate method involving "cat" and "wc" to
>> calculate total file size.
>>
>> Signed-off-by: Allan McRae <allan@archlinux.org>
>> ---
>> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
>> index 8c6984d..c78db86 100644
>> --- a/scripts/makepkg.sh.in
>> +++ b/scripts/makepkg.sh.in
>> @@ -1132,8 +1132,7 @@ write_pkginfo() {
>> else
>> local packager="Unknown Packager"
>> fi
>> - local size="$(@DUPATH@ -sk)"
>> - size="$(( ${size%%[^0-9]*} * 1024 ))"
>> + local size="$(find . | xargs cat 2>/dev/null | wc -c)"
>
> Unsafe xargs usage.
>
> find . -print0 | xargs -0 2>/dev/null | wc -c

You forgot the cat.

find . -print0 | xargs -0 cat 2>/dev/null | wc -c

>
> Why can't we use @SIZECMD@ here? Same issues as du?
>

SIZECMD returns one file size per line so we'd also have to add them up.

Small test (1003 bytes PKGBUILD on btrfs with default mount options):
SIZECMD (stat -L -c %s) 1003
du -skh 512
du -sb 1003
wc -c 1003

If du -sb is portable that might be the easiest way.

--
Florian Pritz
 
Old 12-25-2011, 04:27 PM
Dave Reisner
 
Default makepkg: calculate exact total file size

On Sun, Dec 25, 2011 at 06:20:27PM +0100, Florian Pritz wrote:
> On 25.12.2011 16:06, Dave Reisner wrote:
> > On Sun, Dec 25, 2011 at 08:37:24PM +1000, Allan McRae wrote:
> >> The current calculation of the total file size for a package using "du"
> >> suffers from issues in portability and correctness. Especially on btrfs,
> >> this can result in clearly wrong package information such as:
> >>
> >> Download Size : 14684.29 KiB
> >> Installed Size : 7628.00 KiB
> >>
> >> Use a slower but more accurate method involving "cat" and "wc" to
> >> calculate total file size.
> >>
> >> Signed-off-by: Allan McRae <allan@archlinux.org>
> >> ---
> >> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
> >> index 8c6984d..c78db86 100644
> >> --- a/scripts/makepkg.sh.in
> >> +++ b/scripts/makepkg.sh.in
> >> @@ -1132,8 +1132,7 @@ write_pkginfo() {
> >> else
> >> local packager="Unknown Packager"
> >> fi
> >> - local size="$(@DUPATH@ -sk)"
> >> - size="$(( ${size%%[^0-9]*} * 1024 ))"
> >> + local size="$(find . | xargs cat 2>/dev/null | wc -c)"
> >
> > Unsafe xargs usage.
> >
> > find . -print0 | xargs -0 2>/dev/null | wc -c
>
> You forgot the cat.
>
> find . -print0 | xargs -0 cat 2>/dev/null | wc -c
>

meow. Sorry, mittens.

> >
> > Why can't we use @SIZECMD@ here? Same issues as du?
> >
>
> SIZECMD returns one file size per line so we'd also have to add them up.

Yup. I do this in paccache:

@SIZECMD@ "${candidates[@]}" | awk '{ sum += $1 } END { print sum }'

>
> Small test (1003 bytes PKGBUILD on btrfs with default mount options):
> SIZECMD (stat -L -c %s) 1003
> du -skh 512
> du -sb 1003
> wc -c 1003
>
> If du -sb is portable that might be the easiest way.
>
> --
> Florian Pritz
>



>
 
Old 12-25-2011, 09:09 PM
Allan McRae
 
Default makepkg: calculate exact total file size

On 26/12/11 03:27, Dave Reisner wrote:
> On Sun, Dec 25, 2011 at 06:20:27PM +0100, Florian Pritz wrote:
>> On 25.12.2011 16:06, Dave Reisner wrote:
>>> On Sun, Dec 25, 2011 at 08:37:24PM +1000, Allan McRae wrote:
>>>> The current calculation of the total file size for a package using "du"
>>>> suffers from issues in portability and correctness. Especially on btrfs,
>>>> this can result in clearly wrong package information such as:
>>>>
>>>> Download Size : 14684.29 KiB
>>>> Installed Size : 7628.00 KiB
>>>>
>>>> Use a slower but more accurate method involving "cat" and "wc" to
>>>> calculate total file size.
>>>>
>>>> Signed-off-by: Allan McRae <allan@archlinux.org>
>>>> ---
>>>> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
>>>> index 8c6984d..c78db86 100644
>>>> --- a/scripts/makepkg.sh.in
>>>> +++ b/scripts/makepkg.sh.in
>>>> @@ -1132,8 +1132,7 @@ write_pkginfo() {
>>>> else
>>>> local packager="Unknown Packager"
>>>> fi
>>>> - local size="$(@DUPATH@ -sk)"
>>>> - size="$(( ${size%%[^0-9]*} * 1024 ))"
>>>> + local size="$(find . | xargs cat 2>/dev/null | wc -c)"
>>>
>>> Unsafe xargs usage.
>>>
>>> find . -print0 | xargs -0 2>/dev/null | wc -c
>>
>> You forgot the cat.
>>
>> find . -print0 | xargs -0 cat 2>/dev/null | wc -c
>>
>
> meow. Sorry, mittens.
>
>>>
>>> Why can't we use @SIZECMD@ here? Same issues as du?
>>>
>>
>> SIZECMD returns one file size per line so we'd also have to add them up.
>
> Yup. I do this in paccache:
>
> @SIZECMD@ "${candidates[@]}" | awk '{ sum += $1 } END { print sum }'

I'm happy using @SIZECMD@:

allan@mugen ~/tmp/libreoffice
> find . -print0 | xargs -0 stat -L -c %s | awk '{sum += $1 } END {
print sum }'
196464312

allan@mugen ~/tmp/libreoffice
> find . -print0 | xargs -0 cat 2>/dev/null | wc -c
195276472

allan@mugen ~/tmp/libreoffice
> du -sb
196444140 .

Of course the numbers between the stat and wc approach are different
because stat adds a "block size" amount for each directory of which
there is 290 in the libreoffice package:

(196464312 - 195276472) / 4096 = 290

So the SIZECMD approach is filesystem dependent, but in a way that is
creates minimal difference, unlike the current approach which can wildly
vary. It is also about the same speed as the current du based approach.

>>
>> Small test (1003 bytes PKGBUILD on btrfs with default mount options):
>> SIZECMD (stat -L -c %s) 1003
>> du -skh 512
>> du -sb 1003
>> wc -c 1003
>>
>> If du -sb is portable that might be the easiest way.
>>

It is not... which is why currently -k is used and then we multiply by
1024. There are many messages on pacman-dev about this back when that
change was made to makepkg.
 
Old 12-25-2011, 09:50 PM
Florian Pritz
 
Default makepkg: calculate exact total file size

On 25.12.2011 23:09, Allan McRae wrote:
> On 26/12/11 03:27, Dave Reisner wrote:
>> On Sun, Dec 25, 2011 at 06:20:27PM +0100, Florian Pritz wrote:
>>> On 25.12.2011 16:06, Dave Reisner wrote:
>>>> Why can't we use @SIZECMD@ here? Same issues as du?
>>>>
>>>
>>> SIZECMD returns one file size per line so we'd also have to add them up.
>>
>> Yup. I do this in paccache:
>>
>> @SIZECMD@ "${candidates[@]}" | awk '{ sum += $1 } END { print sum }'
>
> I'm happy using @SIZECMD@:
>
> allan@mugen ~/tmp/libreoffice
>> find . -print0 | xargs -0 stat -L -c %s | awk '{sum += $1 } END {
> print sum }'
> 196464312
>
> allan@mugen ~/tmp/libreoffice
>> find . -print0 | xargs -0 cat 2>/dev/null | wc -c
> 195276472
>
> allan@mugen ~/tmp/libreoffice
>> du -sb
> 196444140 .
>
> Of course the numbers between the stat and wc approach are different
> because stat adds a "block size" amount for each directory of which
> there is 290 in the libreoffice package:
>
> (196464312 - 195276472) / 4096 = 290
>
> So the SIZECMD approach is filesystem dependent, but in a way that is
> creates minimal difference, unlike the current approach which can wildly
> vary. It is also about the same speed as the current du based approach.

find . ! -type -d -print0

--
Florian Pritz
 
Old 12-25-2011, 09:57 PM
Dave Reisner
 
Default makepkg: calculate exact total file size

On Sun, Dec 25, 2011 at 11:50:57PM +0100, Florian Pritz wrote:
> On 25.12.2011 23:09, Allan McRae wrote:
> > On 26/12/11 03:27, Dave Reisner wrote:
> >> On Sun, Dec 25, 2011 at 06:20:27PM +0100, Florian Pritz wrote:
> >>> On 25.12.2011 16:06, Dave Reisner wrote:
> >>>> Why can't we use @SIZECMD@ here? Same issues as du?
> >>>>
> >>>
> >>> SIZECMD returns one file size per line so we'd also have to add them up.
> >>
> >> Yup. I do this in paccache:
> >>
> >> @SIZECMD@ "${candidates[@]}" | awk '{ sum += $1 } END { print sum }'
> >
> > I'm happy using @SIZECMD@:
> >
> > allan@mugen ~/tmp/libreoffice
> >> find . -print0 | xargs -0 stat -L -c %s | awk '{sum += $1 } END {
> > print sum }'
> > 196464312
> >
> > allan@mugen ~/tmp/libreoffice
> >> find . -print0 | xargs -0 cat 2>/dev/null | wc -c
> > 195276472
> >
> > allan@mugen ~/tmp/libreoffice
> >> du -sb
> > 196444140 .
> >
> > Of course the numbers between the stat and wc approach are different
> > because stat adds a "block size" amount for each directory of which
> > there is 290 in the libreoffice package:
> >
> > (196464312 - 195276472) / 4096 = 290
> >
> > So the SIZECMD approach is filesystem dependent, but in a way that is
> > creates minimal difference, unlike the current approach which can wildly
> > vary. It is also about the same speed as the current du based approach.
>
> find . ! -type -d -print0
>

But directories add to the size of the package. IMO, it's correct to
just sum up everything.

d

P.S. history expansion doesn't happen in a non-interactive shell.
there's no need to escape the !.
 

Thread Tools




All times are GMT. The time now is 04:28 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org