Linux Archive


Ed W 02-29-2012 01:46 PM

Licence compliance - capturing all source files used to make a build?
 
Hi, how do others handle open source licence compliance when building
some base system using gentoo?


In particular I guess simply capturing the ebuilds is not sufficient and
it's necessary to capture and distribute all the source and patch files
used to create a build. The emerge tool doesn't obviously give a way to
capture this stuff. I looked in the eclasses, particularly the epatch
file and I'm not clear that I can easily hook into that.


At the moment I'm using a bashrc file to grab everything from the build
directory. This seems reasonably robust for source files. However, for
patches I have considered creating a fake patch utility which would
record all the files it operates on. Any other suggestions? Perhaps
catalyst already has done something like that - not familiar with it though?
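
The "fake patch utility" idea could look roughly like the Python sketch below. It is illustrative only: the paths `/usr/bin/patch.real` and `/var/log/applied-patches` and the function names are hypothetical, and it assumes the real tool is GNU patch, which takes its input from `-i`/`--input` or from stdin. A patch file given as a second positional argument is not handled.

```python
import hashlib
import subprocess
from pathlib import Path

# Hypothetical locations, not Portage defaults: adjust to your own setup.
REAL_PATCH = "/usr/bin/patch.real"
LOG_DIR = Path("/var/log/applied-patches")


def patch_input_path(argv):
    """Return the file named by patch's -i/--input option, or None for stdin."""
    for i, arg in enumerate(argv):
        if arg in ("-i", "--input") and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith("--input="):
            return arg.split("=", 1)[1]
        if arg.startswith("-i") and len(arg) > 2:
            return arg[2:]
    return None


def record_patch(data, log_dir=LOG_DIR):
    """Save one patch under a content-derived name and return its path."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    dest = log_dir / (hashlib.sha256(data).hexdigest()[:16] + ".patch")
    dest.write_bytes(data)
    return dest


def run_patch(argv, stdin_bytes=b"", log_dir=LOG_DIR, real_patch=REAL_PATCH):
    """Record the patch, then pass the same arguments and input to the real tool."""
    path = patch_input_path(argv)
    data = Path(path).read_bytes() if path else stdin_bytes
    record_patch(data, log_dir)
    # When the patch came from stdin, replay the bytes we already consumed.
    result = subprocess.run([real_patch, *argv], input=None if path else data)
    return result.returncode
```

Naming each saved file after a hash of its contents means that a patch seen more than once (for example, if the caller does a dry run before the real run) is stored only once. Note that a dry run that fails would still be recorded, so the log is an upper bound on what was actually applied.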


Whilst the above is largely targeting GPL type licences, are there other
things I should consider for other licences? Other things I need to
ensure I distribute for GPL? Any pointers to (simple) documentation on
how one can be a compliant open source citizen..?


Thanks

Ed W

Ed W 02-29-2012 11:36 PM

Licence compliance - capturing all source files used to make a build?
 
Hi


> It's not simple. You have to learn the requirements of each license
> and see if and how they allow themselves to be combined. There are
> businesses doing exactly that. If you want to DIY I think you just
> have to start by reading the licenses. You may or may not want an
> IP lawyer sitting beside you while doing it.


This is the kind of unhelpful answer that I can find plenty of examples
of through google...


Consider that all software comes with some kind of licence. Generally
if you ask a non opensource company about licensing costs then even the
sales droid can help you out. I do find it quite baffling that on
average if you question an opensource user then their answer on
licensing is that one should redirect the question to one of the most
expensive and opaque professions on earth... If your mate gave you that
answer in the pub when you asked what price for a beer you would
immediately cotton on that they don't really know and are bluffing...


The bit people seem to miss is that legal documents are for forcing
arbitration in the event of dispute - in the meantime people are
supposed to rub along in a cooperative manner. That many OSS advocates
seem to feel that employing expensive lawyers is the only way to talk to
them shows that they are probably missing the bigger picture...


On a more constructive note: I think I do understand the key terms of
the main software licences we use, from my understanding they are not
all that onerous. So can we perhaps move this topic onto tips,
suggestions and practical matters about moving forward? I'm not sure
that one of the most expensive types of lawyer is best employed talking
scripting tips?



> If you have patches which use a different license than the package
> they modify then you have more work to do. Portage doesn't help here.
> A good start would be to add a record of all patches applied by emerge.
> Indeed add it into the epatch command.


OK, so this is what I asked the list. Please don't turn it back at me...

Firstly can we not assume that the patches in gentoo *are* in
compliance, otherwise gentoo's various packaged binaries would cause
Gentoo to be out of compliance? (I'm going to assume that human error
will cause at least some mistakes, but let's hope that just like Gentoo
isn't being sued right now, copyright holders are actually going to be
cooperative in fixing minor issues...!)



So, back to the problem: one of the bigger challenges seems to be how to
actually capture the absolute list of patches applied? Any
suggestions? I already suggested creating my own "patch" utility which
saves its input - seems ugly - other suggestions?


I'm not using catalyst. Any tips from others on capturing, presenting,
managing and deploying GPL code?


Hoping for useful answers here rather than "talk to some really
expensive professional who knows nothing about programming".


Gentoo seems very attractive for building embedded systems - however,
there seem to be some missing steps to help with deployment. I thought
that was ontopic for this list? Any tips from others who are building
things?



Cheers

Ed W

Ed W 03-01-2012 07:20 AM

Licence compliance - capturing all source files used to make a build?
 
Hi


> I wouldn't have mentioned an IP lawyer at all had it not been for the
> fact that I know that you are in the US. :)


I'm in the UK


> I use catalyst, and I control what gets deployed with custom ebuilds
> and snapshots. The fewer packages in the final system the better;
> less stuff to track.


Whilst I guess it should be possible to tear apart catalyst and find out
how they do it, does anyone happen to know or have a heads up on the
code for catalyst? It must be a solved problem so I should think others
have solved this in various ways?


Thanks

Ed W

wireless 03-01-2012 03:18 PM

Licence compliance - capturing all source files used to make a build?
 
On 02/29/12 09:46, Ed W wrote:


> Whilst the above is largely targeting GPL type licences, are there other
> things I should consider for other licences? Other things I need to
> ensure I distribute for GPL? Any pointers to (simple) documentation on
> how one can be a compliant open source citizen..?


Ed,

It may be worth asking your questions on other embedded lists too.
Reading all of these responses, I wonder whether someone has already
worked this out and published a list at some point.

For example, maybe Linux From Scratch advises on what software and code
to use, depending on what you are building. Or maybe OpenEmbedded?

It just seems like this question should be solved and already documented
somewhere? With dozens (hundreds) of commercial Linux distros, surely
they list the licenses for the code they include?

Maybe some research on what Google has published on the licenses it
encounters via the Android packages? Some of the BSD embedded
projects might also be a good source of information.

hth,
James

Ed W 03-01-2012 05:57 PM

Licence compliance - capturing all source files used to make a build?
 
Hi


> > Whilst I guess it should be possible to tear apart catalyst and find out
> > how they do it, does anyone happen to know or have a heads up on the code
> > for catalyst?
>
> The catalyst code has no part in this, but it takes a portage snapshot
> as one of its inputs, and if you maintain a custom snapshot (with
> only packages you need) then you know what gets used.



But not all the patches are in the portage tree? Trivial example might
be the kernel where the ebuild is tiny and references an http location
for the patches? My understanding is that for a GPL licence one should
provide a copy of these patches in the "code dump", not just an http
link? Is that your understanding?


So by implication it's not clear that catalyst does satisfy your GPL
requirements for distribution?


I suspect something more is probably happening, eg some of the linked
patches probably get included into the source download location and
probably you can pick them up there - however, there are now a LOT of
ways of fetching source and patches and it would be hard to be sure of
100% coverage?


Has someone done some actual probing on this? Peter, what does catalyst
provide for say gcc/kernel sources in its source output? All the patches?


Cheers

Ed W

Mike Frysinger 03-02-2012 05:37 AM

Licence compliance - capturing all source files used to make a build?
 
On Wednesday 29 February 2012 09:46:57 Ed W wrote:
> In particular I guess simply capturing the ebuilds is not sufficient and
> it's necessary to capture and distribute all the source and patch files
> used to create a build. The emerge tool doesn't obviously give a way to
> capture this stuff.

file a bug report to add a feature to do this ... something like "buildsrcpkg".
it'd automatically bundle up all the eclasses the pkg is using as well as all
of $CATEGORY/$PN/.

> At the moment I'm using a bashrc file to grab everything from the build
> directory. This seems reasonably robust for source files. However, for
> patches I have considered creating a fake patch utility which would
> record all the files it operates on. Any other suggestions? Perhaps
> catalyst already has done something like that - not familiar with it
> though?

if you capture all of the $PORTDIR/$CATEGORY/$PN/ and $A, then there should be
no need to manually hook into epatch to capture the patches. there's really
no other place these could come from.
-mike
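
A minimal Python sketch of the capture described above, assuming the caller passes in the values of Portage's `$PORTDIR`, `$CATEGORY`, `$PN`, `$DISTDIR` and `$A` (the function name `capture_sources` and the `ebuild/` and `distfiles/` layout of the output directory are invented for illustration):

```python
import shutil
from pathlib import Path


def capture_sources(portdir, category, pn, distdir, a, dest):
    """Copy one package's ebuild directory and its distfiles into dest.

    `a` is the space-separated value of $A (file names inside $DISTDIR).
    Returns the sorted list of copied files, relative to the package's
    directory under dest.
    """
    pkg_dest = Path(dest) / category / pn
    # The ebuild directory holds the ebuilds plus files/ (that is, $FILESDIR).
    shutil.copytree(Path(portdir) / category / pn, pkg_dest / "ebuild",
                    dirs_exist_ok=True)
    distfiles = pkg_dest / "distfiles"
    distfiles.mkdir(parents=True, exist_ok=True)
    for name in a.split():
        shutil.copy2(Path(distdir) / name, distfiles / name)
    return sorted(
        p.relative_to(pkg_dest).as_posix()
        for p in pkg_dest.rglob("*")
        if p.is_file()
    )
```

This copies everything under the package directory, so it captures more than the patches actually applied for a given USE and version. Eclasses are not included here. `dirs_exist_ok` needs Python 3.8 or later.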

Bertrand Jacquin 03-02-2012 02:22 PM

Licence compliance - capturing all source files used to make a build?
 
On 02.03.2012 15:35, Peter Stuge wrote:

> Mike Frysinger wrote:
> > if you capture all of the $PORTDIR/$CATEGORY/$PN/ and $A, then
> > there should be no need to manually hook into epatch
>
> The point of hooking into epatch would be to only have exactly those
> patches which get applied. Some ebuilds come with a huge set of
> patches, but only few may be applied depending on USE and version.
> It's nice to have just the right ones.


epatch is not the only necessary thing, some ebuilds do 'sed -i' on
files. I don't really know for autotools files.


An extension of Mike's idea could be an automagic diff between all
SRC_URI sources freshly src_unpack(ed) and the tree after
install+distclean or after src_prepare, assuming that no other code is
modified during src_(compile|install).
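
As a rough sketch of that idea, the following Python compares two directory trees (say, a copy taken right after src_unpack and the same tree after src_prepare) and returns one unified diff. The function names are made up for illustration, and how the two snapshots get taken is left to the caller. It would pick up 'sed -i' edits as well as epatch changes.

```python
import difflib
from pathlib import Path


def _read_lines(path):
    """Read a file as lines, treating a missing file as empty."""
    if not path.is_file():
        return []
    # Undecodable bytes are replaced, so binary files give only a rough diff.
    return path.read_text(errors="replace").splitlines(keepends=True)


def tree_diff(before, after):
    """Return a unified diff of every file that differs between two trees."""
    before, after = Path(before), Path(after)
    names = sorted(
        {p.relative_to(before).as_posix() for p in before.rglob("*") if p.is_file()}
        | {p.relative_to(after).as_posix() for p in after.rglob("*") if p.is_file()}
    )
    chunks = []
    for name in names:
        old = _read_lines(before / name)
        new = _read_lines(after / name)
        # unified_diff yields nothing when the two files are identical.
        chunks.extend(difflib.unified_diff(old, new, "a/" + name, "b/" + name))
    return "".join(chunks)
```

The result is a single combined diff, so it loses the per-patch boundaries and descriptions that the original patch files carry.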

Ed W 03-03-2012 05:21 PM

Licence compliance - capturing all source files used to make a build?
 
Hi


> > But not all the patches are in the portage tree? Trivial example might
> > be the kernel where the ebuild is tiny and references an http location
> > for the patches?
>
> Then you would change the kernel ebuild in your snapshot, so that it
> becomes self-contained.


That's clearly not a practical suggestion because there are many such
ebuilds with this behaviour and the suggestion to "rewrite all your
ebuilds" kind of defeats the benefit of using gentoo?




> > My understanding is that for a GPL licence one should provide a
> > copy of these patches in the "code dump", not just an http link?
> > Is that your understanding?
>
> I think your understanding is incomplete, and I recommend that you
> read through the license again.


?? Why all the stupid hints rather than just stating the answer!

Under what circumstances do you claim that it's not necessary to
actually supply the code for a patch which has been made to a GPL
licenced code base?? I think you are implying that it's satisfactory to
"supply" code by having a twisted and nested chain of source locations
for all the code, some of which may not be under my control? As you
hint, I then have the risk of servers outside of my control causing my
compliance failure. However, this is all moot because my whole question
is about accurately capturing all the upstream source so that I can
maintain my own cache?



I'm not sure why GPL seems to attract such special behaviour. In every
other industry one will usually provide both a legal licence and also a
non legal "summary of intent". For some reason the open source
advocates seem to excel in leaping on any minor misunderstanding of
their licensing agreements, but then enjoy confounding the situation
with "nah that's not it, but I can't give you any hints as to why I
*think* you are wrong...". Look it's just a straightforward licence -
we don't need to be lawyers to have a stab at complying with it and
generally helping with understanding its nuances...


The big thing which annoys me is that one can comply with the letter
of the GPL with a big code dump that, let's be honest here, benefits
absolutely no one really (what do you do with a lump of undocumented and
obfuscated hacky code?). There are several open letters on the internet
discussing this, but what you are looking for is people to get involved
with the *spirit* of working within the open source process and sharing
in a useful way, not just code dumping.


The piece we are discussing here is really the boring compliance piece
which personally I think is largely unhelpful, last chance saloon kind
of code dump. All the useful pieces of code I try to push upstream.
For sure the GPL provision at least means you get the code even if *I*
don't try and push it upstream and am uncooperative, but really, for the
vast majority of code, it's just boilerplate reproduction of stuff that
you would get from upstream if you needed it anyway...




> > So by implication it's not clear that catalyst does satisfy your GPL
> > requirements for distribution?
>
> I never said it did. I said that it helps with some things.


What "some things"? Previously I asked for help capturing the source
code tree and you implied that it would be correctly captured by
catalyst - however, now it seems to be becoming clear that catalyst
doesn't capture all the patches either? So we seem to be back to the
original question again and catalyst seems to be just a detour that
hasn't advanced us?


With that in mind if you are using only catalyst, how do *you* make sure
you are GPL compliant and provide all patches/sources, etc? (Not a
challenge, just genuinely trying to learn from how others are doing things?)





> > I suspect something more is probably happening, eg some of the linked
> > patches probably get included into the source download location and
> > probably you can pick them up there - however, there are now a LOT of
> > ways of fetching source and patches and it would be hard to be sure
> > of 100% coverage?
>
> Fourth time: Add bookkeeping into the epatch function.


No, it's not "fourth time". It was my idea in the original email!
However, patching portage is unsatisfactory in that it's fragile and
easily overwritten accidentally. By all means if you have a way of
patching which is less fragile, eg if there is some way to patch the
eclass using some overlay in /usr/local/portage then I would be grateful
for *that* information.


You are just saying "do it" like having the idea is the easy bit!
Actually the implementation seems hacky to me. Wrapping the patch
utility seems more robust to me, but it's still not ideal...



> Downloading is irrelevant, especially since sometimes many more
> patches are downloaded than are actually applied.


I'm not sure I follow? My understanding is that we need to supply
patches that are applied, not just every patch to every ebuild - I think
we are agreeing on that?




> It's the other way around:
>
> You provide a snapshot to catalyst, and catalyst builds kernel from
> that. You say what you want catalyst to build, and you create the
> package.
>
> You may end up doing more ebuild maintenance, but you likely want to
> do just that anyway, in order to keep track of what actually goes
> into your system.


Hmm, that's a very superficial description of what is done, but I can
infer some of what you mean.


You might be saying that you figure out every ebuild that you need in
your solution, then manually patch them all to use source pulled down
from your own server, plus sync all the sources from gentoo to your own
server? However, this seems like a desperate amount of work?


You might be saying you just snapshot the gentoo portage tree, however,
I don't see how that helps you capture all the sources and patches
correctly?



Can you please clarify how you generate your portage snapshot for
catalyst and how you create your own offline snapshot of all sources
(including downloaded patches) - this, I think, is what I'm looking to
learn?


Thanks

Ed W

Ed W 03-03-2012 05:34 PM

Licence compliance - capturing all source files used to make a build?
 
On 02/03/2012 15:22, Bertrand Jacquin wrote:
> On 02.03.2012 15:35, Peter Stuge wrote:
> > Mike Frysinger wrote:
> > > if you capture all of the $PORTDIR/$CATEGORY/$PN/ and $A, then
> > > there should be no need to manually hook into epatch
> >
> > The point of hooking into epatch would be to only have exactly those
> > patches which get applied. Some ebuilds come with a huge set of
> > patches, but only few may be applied depending on USE and version.
> > It's nice to have just the right ones.
>
> epatch is not the only necessary thing, some ebuilds do 'sed -i'
> on files. I don't really know for autotools files.

Hmm, this is an interesting thought.



My instinct would be to consider this under the heading of "build
recipe" since it's arguably similar to what the makefile and other
pre-processors are doing. I don't disagree that someone could argue
this all kinds of ways, but I think you would have to be fairly
bloody minded to try for an infringement claim if the ebuild were
provided (since arguably the patch is there)?



It also occurs to me that it's safer to include all of $FILESDIR, or
at least everything without a .patch extension, since
there is also "cp $FILESDIR/somefile $S" to watch for?



> An extension of Mike's idea could be an automagic diff
> between all SRC_URI sources freshly src_unpack(ed) and the tree
> after install+distclean or after src_prepare, assuming that no
> other code is modified during src_(compile|install).





A very simple solution is to diff the code - however, I claim this
is *incredibly* unhelpful to the whole notion of open source. If I
were the copyright holder then I would far rather have a
cooperative, but accidentally non compliant distributor who has
genuinely made an effort, than someone who just provides a massive
code diff... (Yeah, probably some corner case in that argument, but
the point is that patches like "fix segfault on touching some file"
are infinitely more useful than a massive diff...)





Thanks for the feedback



Ed W

Ed W 03-03-2012 06:00 PM

Licence compliance - capturing all source files used to make a build?
 
On 02/03/2012 06:37, Mike Frysinger wrote:

> On Wednesday 29 February 2012 09:46:57 Ed W wrote:
> > In particular I guess simply capturing the ebuilds is not sufficient and
> > it's necessary to capture and distribute all the source and patch files
> > used to create a build. The emerge tool doesn't obviously give a way to
> > capture this stuff.
>
> file a bug report to add a feature to do this ... something like "buildsrcpkg".
> it'd automatically bundle up all the eclasses the pkg is using as well as all
> of $CATEGORY/$PN/.



Submitted

https://bugs.gentoo.org/show_bug.cgi?id=406811

Thanks

Ed W

