10-02-2012, 04:15 AM
Brian Harring

CVS -> git, list of where non-infra folk can contribute

Cross-posting to scm; responses should go to scm please (and the
people who whinge about cross posting should go promptly to hell if
I have any say in the matter).

On Mon, Oct 01, 2012 at 05:58:43PM -0700, Diego Elio Pettenò wrote:
> On 01/10/2012 17:51, Gregory M. Turner wrote:
> >
> > Anyhow, I get it: administering the vcs for a huge project such as
> > Gentoo is very hard work. If I somehow gave some other impression, I'm
> > sorry. Perhaps Rich and I insensitively voiced our shared assumption
> > that Gentoo's continued reliance on cvs stems from a lack of motivation
> > and consensus, rather than a shortage of labor and resources.
>
> That's definitely not the case. While we have had some complaints
> (mostly from Prefix, last I knew) about how git works, the consensus
> for going to git is there. The problems are largely technical.
>
> Problems such as "how many developers would be fine with having to
> check out 2GB of history to be able to commit"? git supports shallow
> clones, but not if you want to commit to them.

A few corrections:
1) You can commit to shallow clones. You can actually push from them
too- you just have to know what you're doing (your parent *has* to be
known to the other side, else you're trying to push a disconnected
history/graph to the other side, which doesn't know how to connect the
two). We won't be doing that fortunately, just noting that it is
possible if you're careful (and I know what the man page says; what
I'm saying is the full version, rather than the short version they
list there).

2) Grafts are what we'll be doing there; kind of shallow, but not.
Basically the same thing the kernel folk did.


As for the "quit your bitching and contribute already" rant angle;
Diego's accurate; minimally, it's more productive to contribute and
you're less likely to crap on folks' motivation, let alone risk the
wrath of a pissy person like me yelling at you.

Herein is the kicker; certain chunks of this can't be handled by
random joe blow off the street- they require core infra access.

Bluntly (no disrespect to people, just being brutally direct) I don't
care if you have infra friends, I don't care if you maintain a couple
of boxes; if you're doing heavy ops in a production environment,
you'll understand the issue of trust/access- thus you'll understand
that some of this work cannot be done by anyone but infra.

Like it or not, very few people have access to the core cvs -> rsync
hosts/machinery- since each and every one of us with access is a
security angle that has to be tracked. That's not arguable, so don't
even try please.

That said, there are non-infra contributions people can make.

I suggest people do that; here's the list off the top of my head
(these are things that, worst case, I'll sort- which means it'll be
months out till I finish them, considering my own time constraints and
focus on getting eapi5 support into pkgcore first).

0) First the rules of the road for this discussion; assume that I'll
be bitchy if you violate this.

0.a) We're not dropping the existing history. Suggesting this is
asking for a killfile entry; dropping history is viable for small or
throw-away projects, and the gentoo-x86 cvs repository is not a
throw-away project.

0.b) Lesser offence since it's not obvious; the various suggestions
that we just snapshot this, then try to fix history after the fact
won't work- look into git's transitive trust: each commit's sha1
covers its parent's sha1, so changing anything in the history changes
the id of every commit after it. To do that sort of proposal means
forcing a full history rewrite down the line; this doesn't fly.

0.c) For whatever I've missed, assume that if it craps on developers'
workflow... it's a no go, and needs to be addressed. Does CVS suck?
Yes, I hate having to use it. But it *works*; switching to git has to
be, minimally, a lateral move for developers in terms of their
workflow- we cannot make it worse, else what's the point of this whole
exercise? There may be an exception or two here- things that aren't
sorted immediately upon conversion, but those exceptions will only fly
if they're minor, don't require history rewrites, and someone is
locked in/guaranteed to be working on it now (else we have no guarantee
it'll actually be sorted).
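
To make 0.b concrete, here is a minimal Python sketch of how git derives
a commit id. The object layout (a "commit <size>\0" header, then tree,
parent, author, committer and message, hashed with sha1) is git's own
format; the identity, timestamp, messages and the reuse of a single
empty tree are made up for illustration.

```python
import hashlib

# Made-up identity and timestamp, for illustration only.
IDENT = "A U Thor <author@example.org> 1349150400 +0000"


def git_object_id(kind, body):
    """sha1 over '<kind> <size>\\0<body>', which is how git names objects."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parent, message):
    """Id of a commit; the parent's id is part of the hashed bytes."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [f"author {IDENT}", f"committer {IDENT}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())


def chain(root_parent, messages, tree):
    """Stack commits on top of root_parent and return their ids in order."""
    ids, parent = [], root_parent
    for message in messages:
        parent = commit_id(tree, parent, message)
        ids.append(parent)
    return ids


EMPTY_TREE = git_object_id("tree", b"")  # every commit here reuses one tree
work = ["snapshot of the tree", "bump foo to 1.1", "fix bar"]

# The proposal from 0.b: start from a bare snapshot, fix history later.
without_history = chain(None, work, EMPTY_TREE)
# "Fixing it later" means the first commit gains a parent...
old_history_tip = commit_id(EMPTY_TREE, None, "converted cvs history")
with_history = chain(old_history_tip, work, EMPTY_TREE)

# ...and every id after that point changes, although the content did not.
for before, after in zip(without_history, with_history):
    print(before[:12], "->", after[:12])
```

Every existing clone would hold the ids on the left, so moving to the
ids on the right is the forced rewrite that 0.b rules out.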


1) We need a thin manifest -> thick manifest converter. Thin
manifests are used for git- they store just DIST entries. Thick
manifests (also known as 'full') are what cvs/rsync users are familiar
with- they hold checksums for all content.

1.a) This converter must use portage APIs; ultimately, this
thin->thick conversion will be signed by an infra key (rather than the
current hodgepodge of devs). I suggest nesting it under the emaint
command.

1.b) This converter needs to be fast. $VCS -> rsync updates occur
every 30 minutes. thin/thick sorting should be sub minute, frankly;
go parallel (multiprocessing) being my suggestion, threadpool worst
case (since most of the work won't be GIL bound).

1.c) This absolutely has to be fucking stable. This will be a core
part of our infrastructure after all.

1.d) I will kneecap the first person who whines about portage on this,
or suggests NIH "let's just hack it"- they won't have to support it,
this goes into portage so it's proper, and so infra isn't stuck w/
more custom code.

1.e) This actually isn't that hard. Ask in #gentoo-portage for
details, look at portage source, look at repoman's existing manifest
command- that manifest command already is the basics of it.

1.5) Incremental signing of a tree is basically required; meaning
whatever scanner there is, shouldn't require re-signing every single
package, only those whose thick manifest has changed.

1.6) Anyone looking to do this should pop into #gentoo-portage, talk
w/ a user named 'carebear', zmedico, etc; zmedico is portage's
maintainer, carebear is the current person volunteering to sort this
(help may be appreciated, talk to him/her/it).
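
For anyone wanting a feel for the shape of #1 before reading the
portage source, here is a rough Python sketch. It is NOT the real thing:
it uses only the standard library, where 1.a requires portage APIs, and
it does no signing. Assumptions: the hash set (SHA256/SHA512) is a
placeholder, entries follow the "TYPE name size HASH hex" Manifest
layout, and a thread pool is used (the "worst case" from 1.b) since
hashing spends its time outside the GIL. The function names are
invented for this sketch.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

HASHES = ("SHA256", "SHA512")  # assumption: the real set is repository policy


def entry_for(pkg_dir, path):
    """Build one Manifest line: TYPE name size HASH hex ..."""
    rel = path.relative_to(pkg_dir)
    if len(rel.parts) > 1 and rel.parts[0] == "files":
        kind, name = "AUX", "/".join(rel.parts[1:])
    elif path.suffix == ".ebuild":
        kind, name = "EBUILD", rel.name
    else:
        kind, name = "MISC", rel.name
    data = path.read_bytes()
    sums = " ".join(
        f"{h} {hashlib.new(h.lower(), data).hexdigest()}" for h in HASHES
    )
    return f"{kind} {name} {len(data)} {sums}"


def thicken(pkg_dir):
    """Rewrite pkg_dir/Manifest as thick; return True only if it changed."""
    manifest = pkg_dir / "Manifest"
    old = manifest.read_text() if manifest.exists() else ""
    # A thin manifest carries only DIST lines; keep those untouched.
    dist = [line for line in old.splitlines() if line.startswith("DIST ")]
    files = [
        p for p in pkg_dir.rglob("*") if p.is_file() and p.name != "Manifest"
    ]
    local = [entry_for(pkg_dir, p) for p in files]
    new = "\n".join(sorted(dist + local)) + "\n"
    if new == old:
        return False  # unchanged, so nothing to re-sign (see 1.5)
    manifest.write_text(new)
    return True


def thicken_tree(pkg_dirs, workers=8):
    """Process packages concurrently; return the ones whose Manifest changed."""
    pkg_dirs = list(pkg_dirs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        changed = list(pool.map(thicken, pkg_dirs))
    return [d for d, was_changed in zip(pkg_dirs, changed) if was_changed]
```

The list returned by thicken_tree() is what a signing step would
consume: only those packages need a new signature. Swapping in
concurrent.futures.ProcessPoolExecutor gives the multiprocessing variant
with the same map() call.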


2) Building off of #1, although *NOT REQUIRED FOR CVS->GIT MIGRATION*,
just very strongly desired, is sorting tree signing GLEPs while we're
at it. Start from http://www.gentoo.org/proj/en/glep/glep-0057.html ;
whatever solution #1 takes (likely an emaint command), tree signing
will be built right smack dab into it.


3) Robin afaik is putting together an email with the details; roughly,
the conversion process is conversion of cvs to svn, then svn2git
conversion; this is done since frankly it's the best/sanest conversion
pathway, and the fastest. The validation of that conversion, and
getting it down to basically a set of known invocations is required.

3.a) Roughly, the plan will be snag the tree, start conversion.
Validate the results, repeat as necessary till we're happy with it.
This is the initial git core history. This step should be <8h; mostly
cpu time, frankly, although re-validation of that pathway is required
(I did a fair amount of optimization to this, but I've not rechecked
the runtime in a while- nor if there is a better option in existence).
Basically, it's strongly preferable we're not sorting this at the time
we're trying to do the live conversion- the core issues need to be
sorted before.

3.b) Take all cvs activity that has occurred since the tree was
snapshotted and conversion started, and replay it into git via tailor;
this is minor- and avoidable if we just shut the tree down for however
long 3.a takes; that said, the tailor route is the intention, and
shouldn't be a problem.
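
One cheap check that could feed the validation in #3 is comparing a
checkout of the cvs tree against a checkout of the converted git tree,
file by file. The Python sketch below does only that: it says nothing
about whether the history itself converted correctly, and it assumes
both checkouts were made the same way (same tag/date, same keyword
settings). The function names and the skip list are made up for this
sketch.

```python
import hashlib
import os

# VCS bookkeeping directories that are expected to differ.
SKIP_DIRS = {"CVS", ".git", ".svn"}


def tree_digest(root):
    """Map each file's path (relative to root) to the sha1 of its content."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into VCS directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as fh:
                result[rel] = hashlib.sha1(fh.read()).hexdigest()
    return result


def compare_checkouts(cvs_root, git_root):
    """Report files missing on either side and files whose content differs."""
    cvs, git = tree_digest(cvs_root), tree_digest(git_root)
    return {
        "only_in_cvs": sorted(cvs.keys() - git.keys()),
        "only_in_git": sorted(git.keys() - cvs.keys()),
        "differing": sorted(
            path for path in cvs.keys() & git.keys() if cvs[path] != git[path]
        ),
    }
```

Three empty lists means the tips match; anything else is a starting
point for digging into that file's converted history.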


4) People who strongly know git hooks would be useful; server side,
all incoming pushes from devs will have their commits validated before
touching the tree- bad validation, commit gets kicked back to them.
The hooks for this need doing (development of this can be done locally
w/out having to access infra either). Hell, someone may already have
done something similar- I've not seen it, but we need something akin
to this; whoever does this, needs to write it such that the auth
backend is configurable (upon deployment, this will be bound into
ldap, or an ldap scraped set of data that it'll consult); assume that
the auth backend will be user->gpg key level of validation (meaning I
cannot take a random commit antarus had against current ToT, and push
that on his behalf- robin may disagree on this point however).
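
As a starting point for #4, here is a small Python sketch of the
decision logic only, with the auth backend as a plain callable so ldap
(or an ldap scraped file) can be dropped in at deployment. Assumptions:
the hook is a pre-receive hook, which git feeds "old new ref" lines on
stdin; how the hook learns who is pushing, and how it extracts each
commit's verified signing key from git, is left out; all names and key
ids below are invented.

```python
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    sha: str
    signing_key: Optional[str]  # key id of a verified signature, else None


def static_backend(table):
    """Dict-backed auth backend for local development.

    Returns a callable mapping an account name to the set of key ids
    registered to it; unknown accounts get an empty set.
    """
    return lambda user: set(table.get(user, ()))


def parse_pre_receive(lines):
    """Yield (old, new, ref) from the lines git passes to pre-receive."""
    for line in lines:
        if line.strip():
            old, new, ref = line.split()
            yield old, new, ref


def rejected(pusher, commits, keys_for):
    """Return (sha, reason) for every commit the pusher may not push."""
    allowed = keys_for(pusher)
    problems = []
    for commit in commits:
        if commit.signing_key is None:
            problems.append((commit.sha, "no verified signature"))
        elif commit.signing_key not in allowed:
            problems.append(
                (commit.sha,
                 f"key {commit.signing_key} is not registered to {pusher}")
            )
    return problems


# Hypothetical accounts and key ids, for illustration only.
backend = static_backend({"ferringb": {"KEY_F"}, "antarus": {"KEY_A"}})
push = [Commit("c1", "KEY_F"), Commit("c2", "KEY_A"), Commit("c3", None)]
for sha, reason in rejected("ferringb", push, backend):
    print(f"rejecting {sha}: {reason}")
```

A non-empty result means the hook exits non-zero and the whole push gets
kicked back. Under this rule ferringb cannot push c2 on antarus's
behalf, which is the user->gpg key level of validation described above.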



Were that to be done, that would leave for infra basically the
following- which is most definitely not a complete list-

1) gitolite configuration/setup, which afaik is basically sorted.
2) cvs -> rsync pathways being rebuilt to be git -> rsync (reliant on
#1 from above, but there is more that occurs there).
3) Thanking people for stepping up and helping to take care of the
stuff we're seriously low on time to sort.

If people don't step up, I'll be working my way through that list; that
said, my timetable were I to do this isn't "next week or the week
after"- it's "over the next few months as time allows".

Also, it's entirely possible I missed something for the non-infra
tasks people can contribute to; that's just a quick brain dump, pardon
any incorrect statements. If one has questions and answers aren't
coming through via the scm ml, then worst case track me down on
freenode via the ferringb nick; just assume I'll be wickedly laggy
in responding.

Finally, pardon the strong tones; the tone in use isn't meant to
dissuade people from contributing, it's meant to ensure people stay
focused on what's required here to get the job done- discussions about
building a git mirroring tier (for example) are for *after* the
initial work is done (understand that 99% of users will be using rsync
even when we switch devs' underlying vcs to git; longer term that may
change, but it's a v2 type thing, not a v1 type thing).

Cheers-
~harring
 
10-02-2012, 04:58 AM
Ben de Groot

Thank you so much for taking the time to give us this clear list of
things that need to be done to take this forward!

--
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin
 
10-02-2012, 08:20 PM
"Gregory M. Turner"

Brian Harring wrote:

> 1) We need a thin manifest -> thick manifest converter. Thin
> manifests are used for git- they store just DIST entries. Thick (also
> known as 'full'), are what cvs/rsync users are familiar with- it holds
> checksums for all content.

> carebear is the current person volunteering to sort this
> (help may be appreciated, talk to him/her/it).


heh

I'll read up, spend some time on IRC, and see what I can do to help here.



> replay it into git via tailor;



Never knew about that tool... not sure about the wisdom of adding an
extra moving part just to keep the lights on for those few hours...
Given the "2G of history" issue Diego mentioned, which, if I understand
correctly, effectively means that the future gentoo git can never
rebase its commit history, why chance it?


In my last experience with cvs->git (at the time I was building an
rsync (binutils cvs)->git mirror for a client), the most difficult
thing about cvs->git was trying to scrub the identity data.


I don't remember the exact issue, but somehow, git had identity
uniqueness constraints that cvs happily ignored, or something like that.
I never thought to try using svn as an intermediate -- but I like that
idea a lot and wish I had thought of it when I needed to.


Anyhow, wrong ml for this, I'll subscribe to -scm.

-gmt
 
10-02-2012, 08:51 PM
Theo Chatzimichos

On Tuesday 02 of October 2012 12:58:04 Ben de Groot wrote:
> Thank you so much for taking the time to give us this clear list of
> things that need to be done to take this forward!

Disclaimer: I haven't read Brian's long mail (and most of the mails in this
mailing list for the past month)

One of the things that would be nice to have before the Git migration is
Documentation. Feel free to submit docs in the wiki, and I'll help a lot after
the conference as well.

Theo
 
10-03-2012, 03:46 AM
Ben de Groot

On 3 October 2012 04:51, Theo Chatzimichos <tampakrap@gentoo.org> wrote:
> One of the things that would be nice to have before the Git migration is
> Documentation. Feel free to submit docs in the wiki, and I'll help a lot after
> the conference as well.

Can you be more specific as to what kind of docs are needed?

--
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin
 
10-03-2012, 04:58 AM
Jeroen Roovers

On Wed, 3 Oct 2012 11:46:15 +0800
Ben de Groot <yngwin@gentoo.org> wrote:

> On 3 October 2012 04:51, Theo Chatzimichos <tampakrap@gentoo.org>
> wrote:
> > One of the things that would be nice to have before the Git
> > migration is Documentation. Feel free to submit docs in the wiki,
> > and I'll help a lot after the conference as well.
>
> Can you be more specific as to what kind of docs are needed?
>

Just a quick browse through our docs, leaving out examples where there
is merely mention of "CVS repositories" (where CVS is equated with any
version control system) instead of CVS specific material:

http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=1&chap=4

http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&chap=1

http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&chap=3#doc_chap1

http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&chap=4

http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&chap=5

http://devmanual.gentoo.org/general-concepts/cvs-to-rsync/index.html

http://www.gentoo.org/proj/en/infrastructure/cvs-sshkeys.xml

http://www.gentoo.org/doc/en/cvs-tutorial.xml

http://www.gentoo.org/proj/en/devrel/quiz/ebuild-quiz.txt


jer
 

All times are GMT.
