As part of Google Summer of Code, together with two other students and
our mentors, I'm looking into supporting non-ebuild repository formats
in existing package managers. Currently, in Portage these are
supported by externally generating ebuilds from available metadata in
advance, and then using these in Portage by putting them in a local
overlay (see g-cpan for an example). In Paludis, there is some
"native" support for non-ebuild repository types (they have exheres,
just for starters), and I've been told it's fairly trivial to add new
repository types in Pkgcore, too, simply by inheriting from a base
class.

My point is of course not to bash on Portage, but to come up with an
elegant solution to support a list of new repository types.

Personally, Google is about to pay me for adding support for R package
repositories (CRAN and Bioconductor), and the two other students will
be doing PyPI and PEAR support, respectively. We aren't looking
forward to doing one task nine times: adding package manager support,
once per repository format for each of the three package managers.

See, our projects each have two sides: on one side, we have to
read and interpret package repositories; on the other side, we need to
communicate with all the different package managers. Reading
repositories is something we'll only have to implement once for a
given repository format, but passing the relevant data to Portage,
Pkgcore or Paludis is less trivial.

We, students and mentors, have thought of various plans to only have
to write repository "plugins" and tackle the package manager side
together once and for all, but we couldn't reach agreement. How can we
support the three existing package managers, and any future package
manager which only supports PMS? To accomplish PMS-only package
manager support, the repository code would at least have to be able to
generate ebuilds, but perhaps we could come up with something better
than simply translating one type of metadata into the other, for
slightly more advanced package managers? Could we perhaps come up with
a *standard* for non-ebuild repository type /definitions/? Ideally,
developers would only have to write one chunk of code to read a
repository format, and then all package managers would be able to read
repositories of that type.
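
To make the idea a bit more concrete, here is a minimal Python sketch
of what such a common interface could look like. Everything in it is
hypothetical: the names PackageMetadata, RepositoryFormat,
StaticCranLikeRepository and to_ebuild, the metadata fields, and the
example package are all made up for illustration, and none of this is
an existing API of Portage, Pkgcore or Paludis. A repository "plugin"
only yields format-neutral metadata; a generic fallback renders that
metadata as ebuild text for PMS-only package managers.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class PackageMetadata:
    """Format-neutral package description (field names are hypothetical)."""
    name: str
    version: str
    description: str = ""
    homepage: str = ""
    license: str = ""
    depends: List[str] = field(default_factory=list)


class RepositoryFormat(ABC):
    """Hypothetical base class: one subclass per upstream repository format."""

    @abstractmethod
    def packages(self) -> Iterator[PackageMetadata]:
        """Yield metadata for every package the repository offers."""


class StaticCranLikeRepository(RepositoryFormat):
    """Toy stand-in for a CRAN reader; a real one would parse upstream data."""

    def packages(self) -> Iterator[PackageMetadata]:
        # Hard-coded, made-up package so the sketch runs without network access.
        yield PackageMetadata(
            name="examplepkg",
            version="1.0",
            description="An example R package",
            homepage="https://example.org/examplepkg",
            license="GPL-2",
            depends=["dev-lang/R"],
        )


def to_ebuild(pkg: PackageMetadata, eapi: str) -> str:
    """Fallback for PMS-only package managers: render metadata as ebuild text.

    Simplified on purpose: no SRC_URI, no build phases, and no escaping of
    quotes inside the metadata values.
    """
    deps = " ".join(pkg.depends)
    lines = [
        f'EAPI="{eapi}"',
        "",
        f'DESCRIPTION="{pkg.description}"',
        f'HOMEPAGE="{pkg.homepage}"',
        f'LICENSE="{pkg.license}"',
        'SLOT="0"',
        'KEYWORDS=""',
        f'DEPEND="{deps}"',
        'RDEPEND="${DEPEND}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for pkg in StaticCranLikeRepository().packages():
        print(f"# {pkg.name}-{pkg.version}.ebuild")
        print(to_ebuild(pkg, eapi="2"))
```

A package manager with native support for other repository types
could consume the PackageMetadata objects directly and skip the
ebuild step; only PMS-only package managers would need to_ebuild.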

Now, before we roll in the discussion of stability: yes, blindly
importing packages from upstream is dangerous, and yes, it may just
set your cat on fire. However, it's a matter of fact that the package
maintainers simply don't have the time to manually check all packages,
and isn't Gentoo all /about/ at least having /access/ to those nuclear
plants? If this monster turns out to be deadly, we can always disable
support by default. That said, I do believe many packages, especially
in CRAN and Bioconductor, eventually will not do much more harm than
cause inflation of the U.S. dollar, which honestly cannot be helped
anyway. Wait, did I say Bioconductor? Oh, then terrorists with access
to Gentoo may have less trouble developing their own biological
weapons, but so be it, as long as they publish their DNA code as

The final questions I'm really here for, then, are: how do you feel
about standardizing repository format definitions, how should we
support new repository types in current package managers, how should
we go about constructing a common interface for new format
definitions, and why is it always on days like these that I run out
of coffee?

Thanks for all your thoughts!

Auke Booij / tulcod.