 
01-31-2010, 08:48 AM
"Robin H. Johnson"

GLEP58 - MetaManifest

Changes:
- Provide a summary of the generation method.
- Clarify that every file in the tree must be included.
- Manifests at the first level of the directory structure are now
REQUIRED to help mitigate the cost of redistribution. Other levels are
still optional.
- Scripts are permitted to generate multiple levels of Manifests in
parallel to save the cost of traversing the tree multiple times.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
GLEP: 58
Title: Security of distribution of Gentoo software - Infrastructure to User distribution - MetaManifest
Version: $Revision: 1.7 $
Last-Modified: $Date: 2010/01/31 07:53:30 $
Author: Robin Hugh Johnson <robbat2@gentoo.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 44, 60
Created: October 2006
Updated: November 2007, June 2008, July 2008, October 2008, January 2010
Post-History: December 2009, January 2010

========
Abstract
========
MetaManifest provides a means of verifiable distribution from Gentoo
Infrastructure to a user system, while data is conveyed over completely
untrusted networks and systems, by extending the Manifest2 specification
and adding a top-level Manifest file, with support for other nested
Manifests.

==========
Motivation
==========
As part of a comprehensive security plan, we need a way to prove that
something originating from Gentoo as an organization (read Gentoo-owned
hardware, run by infrastructure), has not been tampered with. This
allows the usage of third-party rsync mirrors, without worrying that
they have modified something critical (e.g. eclasses, which are still
unsigned).

Securing the untrusted distribution is one of the easier tasks in the
security plan - in short, all that is required is having a hash of every
item in the tree, and signing that hash to prove it came from Gentoo.

Ironically, we already have a hashed and signed distribution (it's just
not used by most users, due to its drawbacks): our tree snapshot
tarballs have hashes and signatures.

So now we want to add the same verification to our material that is
distributed by rsync. We already provide hashes of subsets of the tree -
our Manifests protect individual packages. However, metadata, eclasses
and profiles are not protected at this time. The directories of
packages and distfiles are NOT covered by this, as they are not
distributed by rsync.

This portion of the tree-signing work provides only the following
guarantee: A user can prove that the tree from the Gentoo infrastructure
has not been tampered with since leaving the Gentoo infrastructure.
No other guarantees, either implicit or explicit, are made.

Additionally, distributing a set of the most recent MetaManifests from a
trusted source allows validation of trees that come from community
mirrors, and allows detection of all cases of malicious mirrors (either
by deliberate delay, replay [C08a, C08b] or alteration).

=============
Specification
=============
For lack of a better name, the following solution should be known as the
MetaManifest. Those responsible for the name have already been sacked.

MetaManifest basically contains hashes of every file in the tree, either
directly or indirectly. The direct case applies to ANY file that does
not appear in an existing Manifest file (e.g. eclasses, Manifest files
themselves). The indirect case is covered by the CONTENTS of existing
Manifest files. If the Manifest itself is correct, we know that by
tracking the hash of the Manifest, we can be assured that the contents
are protected.

In the following, the MetaManifest file is a file named 'Manifest',
located at the root of a repository.

---------------------------------------------
Procedure for creating the MetaManifest file:
---------------------------------------------
Summary:
========
The objective of creating the MetaManifest file(s) is to ensure that
every single file in the tree occurs in at least one Manifest.

Process:
========
1. Start at the root of the Gentoo Portage tree (gentoo-x86, although
this procedure applies to overlays as well).

2. Initialize two unordered sets: COVERED, ALL.

1. 'ALL' shall contain every file that exists in the present tree.
2. 'COVERED' shall contain EVERY file that is mentioned in an existing
Manifest2. If a file is mentioned in a Manifest2, but does not
exist, it must still be included. No files should be excluded.

3. Traverse the tree, depth-first.

1. At the top level only, ignore the following directories: distfiles,
packages, local.
2. If a directory contains a Manifest file, extract all relevant local
files from it (presently: AUX, MISC, EBUILD; but should follow the
evolution of Manifest2 entry types per [#GLEP60]), and place them
into the COVERED set.
3. Recursively add every file in the directory to the ALL set,
pursuant to the exclusion list as mentioned in [#GLEP60].

4. Produce a new set, UNCOVERED, as the set-difference (ALL)-(COVERED).
This is every item that is not covered by another Manifest, or part
of an exclusion list.

5. If an existing MetaManifest file is present, remove it.

6. For each file in UNCOVERED, assign a Manifest2 type, produce the
hashes, and add the entry, with its filetype, to the MetaManifest file.

7. For unique identification of the MetaManifest, a header line should
be included, using the exact contents of the metadata/timestamp.x
file, so that a MetaManifest may be tied back to a tree as
distributed by the rsync mirror system. The string of
'metadata/timestamp.x' should be included to identify this revision
of MetaManifest generation. e.g.:
"Timestamp: metadata/timestamp.x: 1215722461 Thu Jul 10 20:41:01 2008 UTC"
The package manager MUST NOT use the identifying string as a filename.

8. The MetaManifest must ultimately be GnuPG-signed.

1. For the initial implementation, the same key as used for snapshot
tarball signing is sufficient.
2. For the future, the key used for fully automated signing by infra
should not be on the same keyring as developer keys. See [#GLEPxx+3]
for further notes.
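
The set arithmetic in steps 2 to 6 can be sketched in Python as follows.
This is an illustration only, not the Infrastructure generation script:
the function names are hypothetical, every uncovered file is given the
type 'MISC', only SHA256 is emitted, the [#GLEP60] exclusion list is not
applied, and the Timestamp header and GnuPG signature (steps 7 and 8)
are left out. It assumes the Manifest2 line layout of
"TYPE name size HASH value ..." with AUX names relative to the package's
files/ directory.

```python
import hashlib
import os

TOP_LEVEL_IGNORE = {"distfiles", "packages", "local"}
LOCAL_TYPES = {"AUX", "MISC", "EBUILD"}  # entry types naming files in the tree


def covered_by(manifest_path, root):
    """Return tree-relative paths named by one existing Manifest2 file."""
    base = os.path.dirname(manifest_path)
    covered = set()
    with open(manifest_path, encoding="utf-8") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 3 or fields[0] not in LOCAL_TYPES:
                continue  # DIST entries and unknown lines name no local file
            name = fields[1]
            if fields[0] == "AUX":
                # Assumption: AUX names are relative to the files/ directory.
                name = os.path.join("files", name)
            # Files that are listed but missing are still counted as covered.
            covered.add(os.path.relpath(os.path.join(base, name), root))
    return covered


def uncovered_files(root):
    """Compute UNCOVERED = ALL - COVERED for the tree rooted at `root`."""
    all_files, covered = set(), set()
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Step 3.1: skip these directories at the top level only.
            dirnames[:] = [d for d in dirnames if d not in TOP_LEVEL_IGNORE]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if rel == "Manifest":
                continue  # step 5: an old MetaManifest is replaced, not hashed
            all_files.add(rel)
            if name == "Manifest":
                covered |= covered_by(os.path.join(dirpath, name), root)
    return all_files - covered


def metamanifest_lines(root, filetype="MISC"):
    """Yield one Manifest2-style line per uncovered file, in sorted order."""
    for rel in sorted(uncovered_files(root)):
        with open(os.path.join(root, rel), "rb") as fh:
            data = fh.read()
        digest = hashlib.sha256(data).hexdigest()
        yield f"{filetype} {rel} {len(data)} SHA256 {digest}"
```

Note that a per-package Manifest never lists itself, so it lands in
UNCOVERED and is hashed directly, which is the "direct case" described
earlier.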

Notes:
======
The above does not conflict with the proposal contained in GLEP33, which
restructures eclasses to include subdirectories and Manifest files, as
the Manifest rules above still provide indirect verification for all
files after the GLEP33 restructuring, if it comes to pass.

Additional levels of Manifests are required, such as per-category, and
in the eclasses, profiles and metadata directories. This ensures that a
change to a singular file causes the smallest possible overall change in
the Manifests as propagated. Creation of the additional levels of
Manifests uses the same process as described above, simply starting at a
different root point.

MetaManifest generation will take place as part of the existing process
by infrastructure that takes the contents of CVS and prepares it for
distribution via rsync, which includes generating metadata. In-tree
Manifest files are not validated at this point, as they are assumed to
be correct.

--------------------------------------------------------
Verification of one or more items from the MetaManifest:
--------------------------------------------------------
There are two times that this may happen: firstly, immediately after the
rsync has completed - this has the advantage that the kernel file cache
is hot, and checking the entire tree can be accomplished quickly.
Secondly, the MetaManifest should be checked during installation of a
package.

----------------------------------------------------
Procedure for verifying an item in the MetaManifest:
----------------------------------------------------
In the following, the term 'M2-verify' denotes the hash verification
procedure defined by the Manifest2 format, which comprises checking the
file length and that the hashes match. Which filetypes may be ignored
when missing is discussed in [#GLEP60].
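
As a rough illustration of what M2-verify means for a single entry, the
Python sketch below checks the length and then every hash it is able to
compute. The function names are hypothetical, and the entry layout
"TYPE name size HASH value ..." is assumed. Hash names unknown to the
local Python build are skipped, but at least one hash must actually be
checked for the entry to pass; that policy is a choice made for this
sketch, not something the Manifest2 format mandates.

```python
import hashlib
import os


def parse_entry(line):
    """Split 'TYPE name size HASH value [HASH value ...]' into its parts."""
    fields = line.split()
    filetype, name, size = fields[0], fields[1], int(fields[2])
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return filetype, name, size, hashes


def m2_verify(line, base_dir):
    """Return True if the file named by `line` matches its size and hashes."""
    _filetype, name, size, hashes = parse_entry(line)
    path = os.path.join(base_dir, name)
    if not os.path.isfile(path) or os.path.getsize(path) != size:
        return False
    with open(path, "rb") as fh:
        data = fh.read()
    checked = 0
    for algo, expected in hashes.items():
        try:
            actual = hashlib.new(algo.lower(), data).hexdigest()
        except ValueError:
            continue  # hash name not supported by this Python build
        if actual != expected.lower():
            return False
        checked += 1
    # A size match alone is not accepted as verification.
    return checked > 0
```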

1. Check the GnuPG signature on the MetaManifest against the keyring of
automated Gentoo keys. See [#GLEPxx+3] for full details regarding
verification of GnuPG signatures.
1. Abort if the signature check fails.

2. Check the Timestamp header. If it is significantly out of date
compared to the local clock or a trusted source, halt or require
manual intervention from the user.

3. For a verification of the tree following an rsync:

1. Build a set 'ALL' of every file covered by the rsync. (exclude
distfiles/, packages/, local/)
2. M2-verify every entry in the MetaManifest, descending into inferior
Manifests as needed. Place the relative path of every checked item
into a set 'COVERED'.
3. Construct the set 'UNCOVERED' by set-difference between the ALL and
COVERED sets.
4. For each file in the UNCOVERED set, assign a Manifest2 filetype.
5. If the filetype for any file in the UNCOVERED set requires a halt
on error, abort and display a suitable error.
6. Verification is complete.

4. If checking at the installation of a package:

1. M2-verify the entry in MetaManifest for the Manifest.
2. M2-verify all relevant metadata/ contents if metadata/ is being
used in any way (optionally done before dependency checking).
3. M2-verify the contents of the Manifest.
4. Perform M2-verification of all eclasses and profiles used (both
directly and indirectly) by the ebuild.
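
The post-rsync case (operation 3) can be sketched in Python as below.
This is a simplified illustration with hypothetical function names: the
GnuPG and Timestamp checks (operations 1 and 2) are assumed to have
passed already, only SHA256 is checked, DIST entries are skipped because
distfiles are not part of the rsync tree, and the caller is left to
assign filetypes to whatever ends up in UNCOVERED. Lines that do not look
like a Manifest2 entry with a SHA256 hash (the Timestamp header,
signature armour) are ignored, so any file they were meant to protect is
simply reported as uncovered.

```python
import hashlib
import os

IGNORED_TOP_LEVEL = {"distfiles", "packages", "local"}


def check_manifest(root, manifest_rel, covered):
    """M2-verify every entry of one Manifest, descending into nested ones."""
    base = os.path.dirname(manifest_rel)
    with open(os.path.join(root, manifest_rel), encoding="utf-8") as fh:
        entries = [line.split() for line in fh]
    for fields in entries:
        if len(fields) < 5 or fields[0] == "DIST" or not fields[2].isdigit():
            continue
        hashes = dict(zip(fields[3::2], fields[4::2]))
        if "SHA256" not in hashes:
            continue
        filetype, name, size = fields[0], fields[1], int(fields[2])
        if filetype == "AUX":
            # Assumption: AUX names are relative to the files/ directory.
            name = os.path.join("files", name)
        rel = os.path.normpath(os.path.join(base, name))
        with open(os.path.join(root, rel), "rb") as fh:
            data = fh.read()
        if len(data) != size or hashlib.sha256(data).hexdigest() != hashes["SHA256"]:
            raise ValueError(f"M2-verify failed for {rel}")
        covered.add(rel)
        if os.path.basename(rel) == "Manifest":
            check_manifest(root, rel, covered)  # inferior Manifest


def verify_tree(root):
    """Return the UNCOVERED set; raise ValueError on any mismatch."""
    covered = {"Manifest"}  # the MetaManifest is protected by its signature
    check_manifest(root, "Manifest", covered)
    all_files = set()
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            dirnames[:] = [d for d in dirnames if d not in IGNORED_TOP_LEVEL]
        for name in filenames:
            all_files.add(os.path.relpath(os.path.join(dirpath, name), root))
    return all_files - covered
```

An empty result means every file in the tree was reached from the
MetaManifest; a missing file surfaces as FileNotFoundError from open().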

Notes:
======
1. For initial implementations, it is acceptable to check EVERY item in
the eclass and profiles directory, rather than tracking the exact
files used by every eclass (see note #2). Later implementations
should strive to only verify individual eclasses and profiles as
needed.
2. Tracking of exact files is of specific significance to the libtool
eclass, as it stores patches under eclass/ELT-patches, and as such
that would not be picked up by any tracing of the inherit function.
This may be alleviated by a later eclass and ebuild variable that
explicitly declares what files from the tree are used by a package.

====================
Implementation Notes
====================
For this portion of the tree-signing work, no actions are required of
the individual Gentoo developers. They will continue to develop and
commit as they do presently, and the MetaManifest is added by
Infrastructure during the tree generation process, and distributed to
users.

Any scripts generating Manifests and the MetaManifest may find it useful
to generate multiple levels of Manifests in parallel, and this is
explicitly permitted, provided that every file in the tree is covered by
at least one Manifest or the MetaManifest file. The uppermost
Manifest (MetaManifest) is the only item that does not occur in any
other Manifest file, but is instead GPG-signed to enable its
validation.

--------------------------------------------
MetaManifest and the new Manifest2 filetypes
--------------------------------------------
While [#GLEP60] describes the addition of new filetypes, these are NOT
needed for implementation of the MetaManifest proposal. Without the new
filetypes, all entries in the MetaManifest would be of type 'MISC'.

----------------------------------------------------
Timestamps & Additional distribution of MetaManifest
----------------------------------------------------
As discussed by [C08a,C08b], malicious third-party mirrors may use the
principles of exclusion and replay to deny an update to clients, while
at the same time recording the identity of clients to attack.

This should be guarded against by including a timestamp in the header of
the MetaManifest, as well as distributing the latest MetaManifests by a
trusted channel.

On all rsync mirrors directly maintained by the Gentoo infrastructure,
and not on community mirrors, there should be a new module
'gentoo-portage-metamanifests'. Within this module, all MetaManifests
for a recent time frame (e.g. one week) should be kept, named as
"MetaManifest.$TS", where $TS is the timestamp from inside the file.
The most recent MetaManifest should always be symlinked as
MetaManifest.current. The possibility of serving the recent
MetaManifests via HTTPS should also be explored to mitigate
man-in-the-middle attacks.

The package manager should obtain MetaManifest.current and use it to
decide if the tree is too out of date per operation #2 of the
verification process. The decision about freshness should be a
user-configuration setting, with the ability to override.
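
A freshness check along these lines might look like the Python sketch
below. It assumes the header format shown in step 7 of the creation
procedure; the function names and the max_age_seconds parameter are
hypothetical stand-ins for whatever user-configuration setting a package
manager would expose.

```python
import re

# Matches e.g.
# "Timestamp: metadata/timestamp.x: 1215722461 Thu Jul 10 20:41:01 2008 UTC"
HEADER_RE = re.compile(r"^Timestamp: metadata/timestamp\.x: (\d+) ")


def manifest_timestamp(text):
    """Return the epoch seconds from the first Timestamp header in `text`."""
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            return int(match.group(1))
    raise ValueError("no Timestamp header found")


def is_fresh(text, now, max_age_seconds):
    """True if the MetaManifest is at most `max_age_seconds` older than `now`."""
    return now - manifest_timestamp(text) <= max_age_seconds
```

A caller would pass time.time() (or a time obtained from a trusted
source) as `now`, and halt or ask the user when is_fresh() returns
False.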

--------------------------------
MetaManifest size considerations
--------------------------------
With only two levels of Manifests (per-package and top-level), every
rsync will cause a lot of traffic transferring the modified top-level
MetaManifest. To reduce this, first-level directory Manifests are
required. Alternatively, if the distribution method efficiently handles
small patch-like changes in an existing file, using an uncompressed
MetaManifest may be acceptable (this would primarily be distributed
version control systems). Other suggestions in reducing this traffic are
welcomed.

=======================
Backwards Compatibility
=======================
- There are no backwards compatibility issues, as old versions of
Portage do not look for a Manifest file at the top level of the tree.
- Manifest2-aware versions of Portage ignore all entries that they are
not certain how to handle, which allows headers and PGP signatures to
be added easily.

======
Thanks
======
I'd like to thank the following people for input on this GLEP.

- Patrick Lauer (patrick): Prodding me to get all of the tree-signing
work finished, and helping to edit.
- Ciaran McCreesh (ciaranm): Paludis Manifest2
- Brian Harring (ferringb): pkgcore Manifest2
- Marius Mauch (genone) & Zac Medico (zmedico): Portage Manifest2
- Ned Ludd (solar): Security concept review

==========
References
==========

[C08a] Cappos, J et al. (2008). "Package Management Security".
University of Arizona Technical Report TR08-02. Available online
from: ftp://ftp.cs.arizona.edu/reports/2008/TR08-02.pdf
[C08b] Cappos, J et al. (2008). "Attacks on Package Managers"
Available online at:
http://www.cs.arizona.edu/people/justin/packagemanagersecurity/

=========
Copyright
=========
Copyright (c) 2006-2010 by Robin Hugh Johnson. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0.

vim: tw=72 ts=2 expandtab:
 
02-02-2010, 05:27 AM
Denis Dupeyron

GLEP58 - MetaManifest

You'll find below an email from solar to Robin about MetaManifest. I'm
adding it to this thread (with solar's authorization) as it seems
pertinent.

Denis.

On Thu, Jan 21, 2010 at 6:51 PM, Ned Ludd <solar@gentoo.org> wrote:
> Robin,
>
> I recall you wanted me to mail you what we talked about last nite in
> #gentoo-portage and I'll CC: the council so they have an idea what to
> maybe expect.
>
> So in our talking last night we discussed the fact that if the Manifest
> format has to change why not just get rid of it all together, and save
> some serious in tree space with the new MetaManifest's taking over all
> together. This would include MetaManifest's at the 2-level.
> You said the MetaManifest would need about 4 fields in them to describe
> the distfiles etc. Devs would still push normal Manifest's to the cvs
> tree so DIST can be obtained by the backend infra scripts. But those
> Manifest's could be dropped from the mirroring. if [ -e CVS ] then
> portage would need to use the existing Manifest's
>
> This method would hands down win my vote. As you know I'm not a fan of
> format changes in general as they can make the Gentoo experience suck,
> but if we are going to change formats. Lets do it right.
>
> The only downside I can see in this method is for people like drobbins
> who mirror our tree but overlay right on top of it then provide it back
> out. In such cases we should provide our backend scripts to the public
> so they can re MetaManifest.
>
> I'm probably forgetting all sorts of details from the chat. But
> hopefully this is enough to remind you, as well as giving the other
> council ppl an idea of what to maybe expect.
 
02-02-2010, 06:35 AM
"Robin H. Johnson"

GLEP58 - MetaManifest

On Mon, Feb 01, 2010 at 11:27:01PM -0700, Denis Dupeyron wrote:
> You'll find below an email from solar to Robin about MetaManifest. I'm
> adding it to this thread (with solar's authorization) as it seems
> pertinent.
>
> Denis.
>
> On Thu, Jan 21, 2010 at 6:51 PM, Ned Ludd <solar@gentoo.org> wrote:
> > Robin,
> >
> > I recall you wanted me to mail you what we talked about last nite in
> > #gentoo-portage and I'll CC: the council so they have an idea what to
> > maybe expect.
> >
> > So in our talking last night we discussed the fact that if the Manifest
> > format has to change why not just get rid of it all together, and save
> > some serious in tree space with the new MetaManifest's taking over all
> > together. This would include MetaManifest's at the 2-level.
> > You said the MetaManifest would need about 4 fields in them to describe
> > the distfiles etc. Devs would still push normal Manifest's to the cvs
> > tree so DIST can be obtained by the backend infra scripts. But those
> > Manifest's could be dropped from the mirroring. if [ -e CVS ] then
> > portage would need to use the existing Manifest's
First, I'd like to clarify one thing for all other readers, as it isn't
clear to anybody else just reading this email.
================
Solar's proposal does the following:
1. Tree in CVS/VCS:
- drop ALL Manifest2 lines _EXCEPT_ DIST.
2. Tree available via rsync:
- Manifests at the following locations ONLY:
- /MetaManifest
- /${CAT}/Manifest
- /profiles/Manifest
- /eclass/Manifest
- /metadata/cache/${CAT}/Manifest
- /metadata/glsa/Manifest
- Data from ALL Manifests get moved to one of the above.
- MISC/EBUILD etc (non-DIST) lines generated at the same time that the
rsync tree is prepared.
3. Net savings of approximately 13000 inodes, as the per package
Manifest data is now one level up, saving the inode from the package.
================

Now, I believe that this above should be possible WITHIN the framework
of my proposed MetaManifest changes.

I specifically stated in GLEP58:
===
The objective of creating the MetaManifest file(s) is to ensure that
every single file in the tree occurs in at least one Manifest.
===

My proposals did not cover removing other Manifest files per solar's
suggestion, as I perceived that to be a much larger objective than my
goal of actually securing the existing tree distribution.

I am entirely open to solar's suggestions, in an additional GLEP, as
they will require that Portage support IS fully in place, because old
versions WILL fail on a tree without per-package Manifest.

> > This method would hands down win my vote. As you know I'm not a fan of
> > format changes in general as they can make the Gentoo experience suck,
> > but if we are going to change formats. Lets do it right.
A potential plan for GLEP58 and solar's changes would be:
1. Council approves GLEP58.
2. Portage support is added, we add MetaManifests everywhere needed
(top-level, categories, metadata, eclass etc) in the tree.
3. Old Portage versions still work at this point, because they ignore
the other Manifest files.
4. Wait 6-12 months for Portage upgrade cycle.
5. Change the content of the MetaManifests to be per solar's proposal.
6. Drop per-package Manifests from the tree.

Thus there is ZERO breakage.

A similar timeline is required for ALL of the other GLEPs I have proposed.
GLEP59 - Hashes:
- Can add new hashes right now.
- Some of the old hashes we can remove right now.
- Have to keep just one old hash for old Portage to still work.
GLEP60 - Filetypes:
- Can add new types right now.
- Cannot remove ANY types for a full upgrade cycle.
GLEP61 - Compression:
- (unconfirmed) Cannot add the compressed files in per-package locations until
the upgrade cycle is done, as old Portage will complain about their existence.

> The only downside I can see in this method is for people like drobbins
> who mirror our tree but overlay right on top of it then provide it back
> out. In such cases we should provide our backend scripts to the public
> so they can re MetaManifest.
My MetaManifest generation script is already public. I do agree that we could
do better in documenting and publishing our older rsync generation scripts.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
 
