Old 04-06-2010, 02:13 AM
Nirbheek Chauhan
 
Default The problem of ChangeLog generation

One of the few remaining problems to be solved for the migration to
git for our gentoo-x86/ and gentoo/ trees (besides other
projects/overlays) is the problem of how to handle ChangeLogs.

====
Gist:
====
* It makes zero sense to manually manage ChangeLogs in git[1]
- Irritating conflicts while merging branches or remote master
+ Similar argument for having only distfile manifests; but I digress...
- Duplication of effort and information
- Saves space for local checkouts
* Proposed is to generate ChangeLogs from git commits on the rsync
server side when metadata generation is done
- Scripts to do this already exist[1]
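
As a rough illustration of the proposal above, here is a minimal Python sketch that turns one commit's metadata into a ChangeLog-style entry. The `Commit` class, the `render_entry` function and the exact entry layout are assumptions made for this example; they are not taken from the scripts referenced in [1].

```python
import textwrap
from dataclasses import dataclass
from datetime import date
from typing import List

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


@dataclass
class Commit:
    """Hypothetical, minimal view of one git commit."""
    author: str
    email: str
    day: date
    files: List[str]
    message: str


def render_entry(commit: Commit) -> str:
    """Format one commit as a ChangeLog-style entry (layout is an assumption)."""
    d = commit.day
    stamp = f"{d.day:02d} {MONTHS[d.month - 1]} {d.year}"
    header = f"{stamp}; {commit.author} <{commit.email}> {', '.join(commit.files)}:"
    # Collapse whitespace in the commit message and wrap it like hand-written text.
    body = textwrap.fill(" ".join(commit.message.split()), width=76)
    return textwrap.indent(f"{header}\n{body}", "  ") + "\n"
```

A server-side job could call `render_entry` for each commit touching a package directory and write the results into that package's ChangeLog during metadata generation.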


Now, there are obviously problems with this. Some of them are
documented below along with their proposed solutions. If people foresee
other problems with this, they are requested to comment. They are also
welcome to comment if they have a better solution to the problems
listed below.

Also, please try to keep this thread on-topic.


========
Problems:
========
* Messages in ChangeLog are not always the same as the commit messages
(~1% are different)
* Some people place additional information in the commit message which
is intended only for developer use
- Most of the difference in ChangeLog/commit messages comes from this
* Trivial changes are often not documented in ChangeLogs
- This is up to the developer's personal preference
- Some folks do this because of the extra time it takes
+ This use-case becomes irrelevant due to automatic generation of ChangeLog

========
Solutions:
========
* Do not re-generate the existing ChangeLog; rather make the ChangeLog
generation script smart enough to only append
- Solves the "messages not same" problem for existing commits
* Use a separator in the commit message like "==\n" to denote that
everything after this is dev-only information and should be skipped
from the user ChangeLog
- Solves the problem for people who like to add extra dev-only info
in the CVS commit message
* Exclude commits with "[$tag][trivial]" in the tag[2] from the
ChangeLog
- Keeps the wishes of the developer and does not pollute ChangeLog
with such info
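
The three solutions above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the separator is taken to be a line containing only "==", the trivial tag is matched at the start of the message, and the set of already-logged commit ids (`already_logged`) is a hypothetical way for the script to know what it has appended before.

```python
import re
from typing import Iterable, List, Optional, Set, Tuple

SEPARATOR = "=="  # assumed: a line consisting only of "=="
# Assumed tag shape: zero or more "[tag]" groups followed by "[trivial]".
TRIVIAL_TAG = re.compile(r"^(\[[^\]]+\])*\[trivial\]", re.IGNORECASE)


def user_message(commit_message: str) -> Optional[str]:
    """Return the user-facing part of a commit message, or None to skip it."""
    if TRIVIAL_TAG.match(commit_message):
        return None
    kept = []
    for line in commit_message.splitlines():
        if line.strip() == SEPARATOR:
            break  # everything after the separator is dev-only
        kept.append(line)
    return "\n".join(kept).strip()


def new_entries(commits: Iterable[Tuple[str, str]],
                already_logged: Set[str]) -> List[Tuple[str, str]]:
    """Select (commit_id, message) pairs that still need a ChangeLog entry."""
    result = []
    for commit_id, message in commits:
        if commit_id in already_logged:
            continue  # append-only: never rewrite existing entries
        text = user_message(message)
        if text:
            result.append((commit_id, text))
    return result
```

Because existing entries are never regenerated, hand-written ChangeLog text that differs from the old CVS commit messages stays untouched.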


1. http://live.gnome.org/Git/ChangeLog
2. http://live.gnome.org/Git/CommitMessages
--
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team
 
Old 04-06-2010, 06:41 AM
Fabian Groffen
 

On 06-04-2010 07:43:02 +0530, Nirbheek Chauhan wrote:
> * It makes zero sense to manually manage ChangeLogs in git[1]
> - Irritating conflicts while merging branches or remote master
> + Similar argument for having only distfile manifests; but I digress...
> - Duplication of effort and information
> - Saves space for local checkouts

This seems to assume
a) that we will do branches, and
b) that those branches somehow are official and in use

In CVS we are not allowed to use branches as a matter of policy, which
somehow makes sense. Our stable tree is visible via keywords instead.

Why would we suddenly do branches? It still isn't a good thing. If you
talk about branches in the sense of a clone of the entire repo, why
would we suddenly do massive concurrent development on the same ebuilds?

I can tell you from good experience that you only do such things if you
really have to, e.g. when you are in an overlay that needs to have
modifications to nearly everything and you try to keep that overlay
up-to-date with its origin, gentoo-x86. It's no fun, because it
conflicts on pretty much everything, not just ChangeLogs.

It seems to me that if you are in a clone working on something, you
only write the ChangeLog once you merge it with its origin,
gentoo-x86. You have to review what happened at that stage anyway.

If you really have lots of changes, you will find that many commits on
the other side will cause you conflicts, so the ChangeLog is just a very
small part of it. Conclusion, if you can, try hard to keep your changes
minimal, and preferably zero compared to the origin, gentoo-x86.


--
Fabian Groffen
Gentoo on a different level
 
Old 04-06-2010, 07:01 AM
Nirbheek Chauhan
 

On Tue, Apr 6, 2010 at 12:11 PM, Fabian Groffen <grobian@gentoo.org> wrote:
> On 06-04-2010 07:43:02 +0530, Nirbheek Chauhan wrote:
>> * It makes zero sense to manually manage ChangeLogs in git[1]
>> - Irritating conflicts while merging branches or remote master
>> + Similar argument for having only distfile manifests; but I digress...
>> - Duplication of effort and information
>> - Saves space for local checkouts
>
> This seems to assume
> a) that we will do branches, and
> b) that those branches somehow are official and in use
>

No. Conflicts can arise (and I have seen them arise) trivially if you
make changes and try to do a pull --rebase; which is then not
fast-forward, and you're left with an ugly mess of conflicts on your
hands. Say you're moving stuff from an overlay using git format-patch;
how do you handle the conflicts it will generate to ChangeLogs and
Manifests?

Also, this is not the only reason to not use ChangeLogs.

Trivial example purely for demonstrative purposes:

Without ChangeLog:
make change1; commit; test; realise it needs change2; commit; test;
rebase commits; push
With ChangeLog:
make change1; write ChangeLog; commit; test; realise it needs change2;
reset --hard ChangeLog HEAD^; rewrite ChangeLog; commit; test; rebase
commits; push

Now which is easier? Don't forget that the major reason for moving to
git was the ability to make several local commits and pushing them in
an atomic way; so you are bound to make mistakes and want to rebase.

> If you really have lots of changes, you will find that many commits on
> the other side will cause you conflicts, so the ChangeLog is just a very
> small part of it.

I bump an ebuild; arch team member marks older version stable. Two
completely orthogonal changes that conflict now. With ChangeLogs,
*every* *single* change you make conflicts. You do a rebase; and it
conflicts! It's just stupid.

Extreme example: profiles/ChangeLog

> Conclusion, if you can, try hard to keep your changes
> minimal, and preferably zero compared to the origin, gentoo-x86.
>

With the inevitable increased activity on the gentoo-x86 tree, this
will become more and more difficult.

--
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team
 
Old 04-06-2010, 11:40 AM
Fabian Groffen
 

On 06-04-2010 12:31:51 +0530, Nirbheek Chauhan wrote:
> On Tue, Apr 6, 2010 at 12:11 PM, Fabian Groffen <grobian@gentoo.org> wrote:
> > On 06-04-2010 07:43:02 +0530, Nirbheek Chauhan wrote:
> >> * It makes zero sense to manually manage ChangeLogs in git[1]
> >> - Irritating conflicts while merging branches or remote master
> >> + Similar argument for having only distfile manifests; but I digress...
> >> - Duplication of effort and information
> >> - Saves space for local checkouts
> >
> > This seems to assume
> > a) that we will do branches, and
> > b) that those branches somehow are official and in use
> >
>
> No. Conflicts can arise (and I have seen them arise) trivially if you
> make changes and try to do a pull --rebase; which is then not
> fast-forward, and you're left with an ugly mess of conflicts on your
> hands. Say you're moving stuff from an overlay using git format-patch;
> how do you handle the conflicts it will generate to ChangeLogs and
> Manifests?

Ehm, you consider pkg-moving packages an operation you do from day to
day? I surely hope not.
The other changes you talk about sound like a generic problem to me, not
related to ChangeLog files at all.

> Also, this is not the only reason to not use ChangeLogs.
>
> Trivial example purely for demonstrative purposes:
>
> Without ChangeLog:
> make change1; commit; test; realise it needs change2; commit; test;
> rebase commits; push
> With ChangeLog:
> make change1; write ChangeLog; commit; test; realise it needs change2;
> reset --hard ChangeLog HEAD^; rewrite ChangeLog; commit; test; rebase
> commits; push

If you just pull/update before you start your changes and commit/push
afterwards there are no problems. I see no branching in your examples.
In a branch, I wouldn't make ChangeLog changes until I merge with main
and commit there.

> Now which is easier? Don't forget that the major reason for moving to
> git was the ability to make several local commits and pushing them in
> an atomic way; so you are bound to make mistakes and want to rebase.

Oh, was that the major reason... What nonsense. If you need to push
several commits to the same package at the same time, you could have
probably done it with a single commit on CVS as well, just using a
single ChangeLog entry, which looks much cleaner to me.

If you talk about bumping the whole of KDE at the same time, well nice,
but then how are you going to solve the problem with the Manifest file?
I'd say one of the requirements of a new VCS is the ability to do an
atomic commit over multiple directories, which is quite different from
making several local commits, to me.

> > If you really have lots of changes, you will find that many commits on
> > the other side will cause you conflicts, so the ChangeLog is just a very
> > small part of it.
>
> I bump an ebuild; arch team member marks older version stable. Two
> completely orthogonal changes that conflict now. With ChangeLogs,
> *every* *single* change you make conflicts. You do a rebase; and it
> conflicts! It's just stupid.

WHY would you wait a couple of months with pushing your new version out?
The chance of a conflict if you push immediately is as big as the
chance you get with CVS if this happens.
Your argument is more one for getting rid of the ChangeLog and Manifest
files than for starting to use git, IMO.

> > Conclusion, if you can, try hard to keep your changes
> > minimal, and preferably zero compared to the origin, gentoo-x86.
>
> With the inevitable increased activity on the gentoo-x86 tree, this
> will become more and more difficult.

Do Tove's stats actually show there is an increase in activity? One
would have to plot it, but glancing over it, it actually looks about
steady to me.

Git doesn't solve the problem, you still have a couple of untackled
issues standing out. I think Robin tried to address them in previous
mails.


--
Fabian Groffen
Gentoo on a different level
 
Old 04-06-2010, 01:06 PM
Richard Freeman
 

On 04/05/2010 10:13 PM, Nirbheek Chauhan wrote:

> * Proposed is to generate ChangeLogs from git commits on the rsync
> server side when metadata generation is done
> - Scripts to do this already exist[1]


I haven't seen this discussed, so I'm going to toss this out there and duck:

Why not just get rid of the in-tree Changelogs entirely? The scm logs
already document this information, so why have it in a file?


It seems like the main purpose for it is for end-users to have some idea
what changed in an ebuild. However, in my experience the upstream
changes are far more impactful than the ebuild changes, and those aren't
in the Changelogs at all.


Instead, why not just create a script that gets distributed with portage
that will upon request tell a user what changed based on the scm logs?
I can't imagine that the hit on the servers will be all that large, and
since this is read-only traffic it might be manageable through replication.
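
A minimal Python sketch of such an on-request script might look like the following. It assumes the user has a git clone of the tree available at a path called `repo` here; the function names and the output format are hypothetical, and querying a remote server instead of a local clone is not covered.

```python
import subprocess
from typing import List


def git_log_command(package_dir: str, limit: int = 10) -> List[str]:
    """Build a git log invocation restricted to one package directory."""
    return ["git", "log", f"--max-count={limit}", "--date=short",
            "--format=%ad %an: %s", "--", package_dir]


def recent_changes(repo: str, package_dir: str, limit: int = 10) -> str:
    """Run git log inside the clone at `repo` and return its output."""
    result = subprocess.run(git_log_command(package_dir, limit), cwd=repo,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `recent_changes("/path/to/clone", "app-editors/vim", 5)` would return the date, author and subject of the five most recent commits touching that package.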


Rich
 
Old 04-06-2010, 10:21 PM
"Robin H. Johnson"
 

On Tue, Apr 06, 2010 at 09:06:24AM -0400, Richard Freeman wrote:
> Why not just get rid of the in-tree Changelogs entirely? The scm
> logs already document this information, so why have it in a file?
The major concern with this is users who are NOT always connected to
the internet.
If you are connected, you can just use --exclude ChangeLog in your rsync
options.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
 
Old 04-07-2010, 06:25 AM
Hans de Graaff
 

On Tue, 2010-04-06 at 09:06 -0400, Richard Freeman wrote:
> Why not just get rid of the in-tree Changelogs entirely? The scm logs
> already document this information, so why have it in a file?
>
> It seems like the main purpose for it is for end-users to have some idea
> what changed in an ebuild. However, in my experience the upstream
> changes are far more impactful than the ebuild changes, and those aren't
> in the Changelogs at all.

I pretty much always use the -l option of portage to include the
pertinent changes in the ChangeLog, because this is the only way to know
about any changes before the package is merged. Yes, the NEWS from the
package usually contains a lot more detail, but I won't be able to read
it until after the fact. In my experience plenty of ChangeLogs in our
tree at least briefly document what changed in the package as opposed to
the ebuild.

Hans
 
Old 04-07-2010, 07:55 AM
Dirkjan Ochtman
 

On Tue, Apr 6, 2010 at 04:13, Nirbheek Chauhan <nirbheek@gentoo.org> wrote:
> * Use a separator in the commit message like "==\n" to denote that
> everything after this is dev-only information and should be skipped
> from the user ChangeLog

I think this is fairly elegant, and a good solution to this problem.

Cheers,

Dirkjan
 
Old 04-07-2010, 09:58 AM
Angelo Arrifano
 

First, I've been using git to hack Linux for some embedded devices. My
development was in sync with upstream linux-omap, to which I sent several
patches. So consider that I have some experience with git.

On 06-04-2010 08:41, Fabian Groffen wrote:
> On 06-04-2010 07:43:02 +0530, Nirbheek Chauhan wrote:
>> * It makes zero sense to manually manage ChangeLogs in git[1]
>> - Irritating conflicts while merging branches or remote master
>> + Similar argument for having only distfile manifests; but I digress...
>> - Duplication of effort and information
>> - Saves space for local checkouts
>
> This seems to assume
> a) that we will do branches, and
> b) that those branches somehow are official and in use
>
> In CVS we are not allowed to use branches, as a policy, that somehow
> makes sense. Our stable tree is visible via keywords instead.
>
> Why would we suddenly do branches? It still isn't a good thing. If you
> talk about branches in the sense of a clone of the entire repo, why
> would we suddenly do massive concurrent development on the same ebuilds?

IMHO, repository branching would be very useful for Gentoo portage,
especially for third parties and other Gentoo-based distros. It would be
a lot easier for them to keep their own changes to ebuilds while staying
in sync with the main Gentoo tree. This is a big win for everyone.

With my experience in Gentoo-embedded I can also present a problem where
branching is extremely useful:
1) Package foobar-1.2 is in the tree and keyworded only for ~x86 ~amd64.
2) Some dev at -embedded decides that package is useful and applies his
traditional cross-compile hackery.
3) The usual route would be to open a shi*load of bugs, wait a cr*pload
of time for the maintainer response and if the weather feels like it,
there is authorization to commit. Then there is also need to retest for
already keyworded arches so we know we don't break others.

3*) With git, one would just branch the package (let's call it the
embedded branch), apply the patches there, and let people using embedded
profiles emerge from that branch instead of master.
Benefits? I think they are pretty obvious - people can start putting
quick patches in the tree for specific arches while not breaking others.

IMHO, the only bottleneck I see in Gentoo development is the massive
amount of policy (not saying it is not needed) a dev has to follow just
to commit a simple fix. Git, my friends, will be our holy grail.
>
> I can tell you from good experience that you only do such things if you
> really have to, e.g. when you are in an overlay that needs to have
> modifications to nearly everything and you try to keep that overlay
> up-to-date with its origin, gentoo-x86. It's no fun, because it
> conflicts pretty much on lots of things, not ChangeLogs.
>
> It seems to me, that if you are in a clone working on something, you
> just only write the ChangeLog once you merge it with its origin,
> gentoo-x86. You have to review what happened at that stage anyway.
>
> If you really have lots of changes, you will find that many commits on
> the other side will cause you conflicts, so the ChangeLog is just a very
> small part of it. Conclusion, if you can, try hard to keep your changes
> minimal, and preferably zero compared to the origin, gentoo-x86.
>
>

I don't know why, but people seem to be eternally scarred by merge
conflicts. Yes, they happen, and yes, they are trivial to fix if people
don't commit crap that touches a lot of stuff. In portage, the tree is
very well organized, and some good policies, like restricting each
commit to one package, will pretty much prevent conflicts.

I will not comment on whether ChangeLogs are going to cause conflicts or
not. That would be best answered by the people who have been running
portage in git for some time.
 
Old 04-07-2010, 10:03 AM
Ciaran McCreesh
 

On Wed, 07 Apr 2010 11:58:13 +0200
Angelo Arrifano <miknix@gentoo.org> wrote:
> With my experience in Gentoo-embedded I can also present a problem
> where branching is extremely useful:
> 1) Package foobar-1.2 is in the tree and keyworded only for ~x86
> ~amd64. 2) Some dev at -embedded decides that package is useful and
> applies his traditional cross-compile hackery.
> 3) The usual route would be to open a shi*load of bugs, wait a
> cr*pload of time for the maintainer response and if the weather feels
> like it, there is authorization to commit. Then there is also need to
> retest for already keyworded arches so we know we don't break others.
>
> 3*) With git, one would just branch (lets call it embedded branch) the
> package. Apply the patches there and let people using embedded
> profiles to emerge from that branch instead of master.
> Benefits? I think they are pretty obvious - people can start putting
> quick patches in the tree for specific arches while not breaking
> others.

And then you have to keep merging master into your embedded branch
every few hours to keep up. It's a waste of time. Instead, you should
just put a modified foobar-1.2 in your own repository and rely upon the
package manager's extensive and clean handling of multiple repositories
to avoid having to do any merging yourself.

There are uses for merges, but working around the shortcomings in a
package manager shouldn't be one of them. Migrating to Git should be
about addressing problems with CVS, not about addressing problems with
Portage.

--
Ciaran McCreesh
 

All times are GMT.