I have implemented reading directly from the db tarball for sync
databases. See the patchset on my "backend" branch
(http://projects.archlinux.org/users/allan/pacman.git/log/?h=backend).
Some of the changes are quite big (e.g. 9 files changed, 2184
insertions, 1571 deletions), but that is mainly moving and duplicating
code in the process of splitting the handling of local and sync dbs.
There is still _a lot_ of clean-up to be done in the code, but it is
fully functional.
This breaks pactests... lots of pactests... but that is expected as the
pactest suite does not understand a tar based sync db. I would like
help fixing that as I get lost in the pactest code.
Anyway, here is the timing for some operations. The first time is with
dropped caches, the second is from running it again with the db cached.
echo "n" | pacman -Su (five packages to update)
old: 0m17.823s, 0m0.306s
new: 0m10.577s, 0m0.489s
So the tar backend is about 40% faster in the uncached case (roughly 7
seconds saved here), and is marginally slower (about 0.18s) when the db
is cached.
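For anyone who wants to redo the arithmetic on the numbers above, here is a small Python sketch. The helper name `parse_time` is made up for illustration; the only inputs are the four timings quoted in this post.

```python
def parse_time(text):
    """Convert a bash `time` style value such as '0m17.823s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# (uncached, cached) timings copied from the post
timings = {
    "old": ("0m17.823s", "0m0.306s"),
    "new": ("0m10.577s", "0m0.489s"),
}

old_cold, old_warm = (parse_time(t) for t in timings["old"])
new_cold, new_warm = (parse_time(t) for t in timings["new"])

# Fraction of wall time saved with dropped caches
cold_saving = (old_cold - new_cold) / old_cold
# Absolute extra time when the db is already cached
warm_penalty = new_warm - old_warm

print(f"uncached: {cold_saving:.0%} less time")
print(f"cached:   {warm_penalty:.3f}s slower")
```

This is a single run of a single operation, so treat the percentages as indicative only.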
Note that the entire sync db is read and parsed no matter the (sync)
operation. That explains the difference in the cases with the db
cached. This could be adjusted to read only the package list (pkgname,
pkgver-pkgrel) first, and then read the desc/depends files as needed,
either for the entire db or for a single package, but that is probably
distant future stuff...
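A rough sketch of that two-phase idea, in Python rather than pacman's C, and not what the branch actually does. It assumes the sync db tarball holds one `pkgname-pkgver-pkgrel/` directory per package containing `desc` and `depends` files, and that those files use `%FIELD%` headers followed by values and a blank line. All function names here are hypothetical.

```python
import tarfile


def _parts(member_name):
    """Split a tar member path into components, ignoring './' prefixes."""
    return [p for p in member_name.split("/") if p not in ("", ".")]


def split_entry_name(dirname):
    """Split 'pkgname-pkgver-pkgrel' into (pkgname, 'pkgver-pkgrel')."""
    name, pkgver, pkgrel = dirname.rsplit("-", 2)
    return name, f"{pkgver}-{pkgrel}"


def list_packages(db_path):
    """Phase 1: build {pkgname: version} from entry names only.

    Tar headers are still scanned, but no desc/depends content is parsed.
    """
    packages = {}
    with tarfile.open(db_path) as tar:
        for member in tar:
            parts = _parts(member.name)
            if not parts or parts[0].count("-") < 2:
                continue  # not a package directory under our assumption
            name, version = split_entry_name(parts[0])
            packages[name] = version
    return packages


def parse_fields(text):
    """Parse '%FIELD%' blocks into {field: [values]}."""
    fields = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            current = None  # a blank line ends the current block
        elif current is None and len(line) > 2 and line[0] == line[-1] == "%":
            current = line[1:-1]
            fields[current] = []
        elif current is not None:
            fields[current].append(line)
    return fields


def read_fields(db_path, pkgname, version, filename="desc"):
    """Phase 2: parse one file of one package, only when it is needed."""
    wanted = [f"{pkgname}-{version}", filename]
    with tarfile.open(db_path) as tar:
        for member in tar:
            if member.isfile() and _parts(member.name) == wanted:
                text = tar.extractfile(member).read().decode("utf-8")
                return parse_fields(text)
    raise KeyError("/".join(wanted))
```

Something like `list_packages(db)` would cover the version comparison for -Su, with `read_fields(db, name, version, "depends")` called only for the packages that end up in the transaction. Note that each `read_fields` call rescans the tarball, so a real implementation would want to read everything it needs in one pass.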
Allan
10-10-2010, 05:50 AM
Allan McRae
tar based sync backend implemented!
Whoops... supposed to go to pacman-dev.
Allan
10-11-2010, 05:17 AM
Allan McRae
tar based sync backend implemented!
On 10/10/10 15:50, Allan McRae wrote:
Hi,
I have implemented the reading directly from the db tarball for sync
databases. See the patchset on my "backend" branch
(http://projects.archlinux.org/users/allan/pacman.git/log/?h=backend).
Some of the changes are quite big (e.g. 9 files changed, 2184
insertions, 1571 deletions), but that is mainly moving and duplicating
code in the process of splitting the handling of local and sync dbs.
There is still _a lot_ of clean-up to be done in the code, but it is
fully functional.
I have now pushed the patches that do all the clean-up I thought was
needed. I also fixed a couple of small issues Xavier noticed and
restored parsing the "deltas" file in sync databases (rebased into
another patch). With Xavier's changes to the pactest suite, all pactests
pass (except known failures).
So I basically think this branch is good to go! It still needs to be
rebased on master (commit 821ff061 means that will need some manual
intervention...).
Any further comments would be much appreciated.
Allan
10-13-2010, 01:53 PM
Allan McRae
tar based sync backend implemented!
On 10/10/10 15:50, Allan McRae wrote:
Hi,
I have implemented the reading directly from the db tarball for sync
databases. See the patchset on my "backend" branch
(http://projects.archlinux.org/users/allan/pacman.git/log/?h=backend).
Some of the changes are quite big (e.g. 9 files changed, 2184
insertions, 1571 deletions),
<snip>
I did some rebasing today and split that big patch up as much as
possible. It should now be much easier to follow/review what I did.