11-20-2007, 03:21 PM
Ed Swierk

InstantMirror Proposal ApacheMirror.py for a site-local Fedora mirror

On 11/19/07, Warren Togami <wtogami@redhat.com> wrote:
> http://fedoraproject.org/wiki/Infrastructure/ProjectHosting/RequestingNewProject
> Could you please create an "upstream" project for it at
> hosted.fedoraproject.org? I think there are a number of improvements
> that can be made.

Done. I like the name InstantMirror.

> I didn't read deeply into your code yet, but I imagine that it needs
> improvement to handle unique synchronization and expiration issues that
> yum repos and rawhide install trees create when file contents change
> without changing filenames.

If a requested file already exists in the local mirror, the handler
compares the Last-Modified time of the upstream file with the local
file, and downloads the file if the upstream version is newer. I'm not
familiar with rawhide, but this seems to work okay for the updates
repos where metadata files are frequently regenerated. It doesn't
remove files that no longer exist upstream, of course.
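[Editor's sketch] The freshness check described above can be illustrated roughly as follows. This is not the actual ApacheMirror.py code; the function name `needs_refresh` and its arguments are hypothetical, and it assumes the upstream Last-Modified header value has already been obtained (for example from a HEAD request).

```python
import os
from email.utils import parsedate_to_datetime


def needs_refresh(local_path, upstream_last_modified):
    """Return True if the local copy is missing or older than upstream.

    upstream_last_modified is the raw Last-Modified header value, e.g.
    "Tue, 20 Nov 2007 15:21:00 GMT", or None if upstream sent no header.
    """
    if not os.path.exists(local_path):
        return True  # cache miss: always download
    if upstream_last_modified is None:
        return False  # nothing to compare against; keep the local copy
    # HTTP dates are in GMT, so this yields a timezone-aware datetime.
    upstream_ts = parsedate_to_datetime(upstream_last_modified).timestamp()
    return upstream_ts > os.path.getmtime(local_path)
```

For this comparison to stay meaningful, the handler would presumably set the local file's mtime to the upstream Last-Modified time after each download (e.g. with `os.utime`); otherwise the local file always looks newer than upstream.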

> Perhaps a separate, asynchronous daemon can monitor upstream (via HTTP
> or whatever) for repomd.xml changes. It should then parse the
> repomd.xml so it knows when to expire the repodata/* files. Then it
> should parse the .xml files in repodata/ to compare it to local storage,
> and intelligently expire the packages if any changed (as happens during
> signing). It can then know exactly which files to delete from the local
> cache because they are no longer in the upstream. This daemon interacts
> with ApacheMirror.py only in deleting files from the local directories,
> effectively expiring the cache. Very simple.
>
> That daemon could be configured to handle intelligent expiry of various
> parts of the mirror tree in different ways. For example:
> - development (rawhide) repo changes at least once per day. It also
> contains install images (boot.iso, bootdisk.img, stage2, etc.) that need
> to be expired every time the tree changes. (We might need to add a
> hashes file to the mirror tree to allow the tool to monitor these changes.)
> - Released distros never change, so don't need to monitor their
> repomd.xml for changes.

An even simpler approach is to have the daemon iterate through every
local file, checking whether the file exists upstream and deleting the
local copy if it doesn't. This requires no repodata parsing, but
flooding the upstream server with HEAD requests might be considered
unfriendly.
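[Editor's sketch] A minimal version of the "iterate and HEAD" approach might look like the following. The names `exists_upstream` and `expire_missing` are hypothetical, and it assumes the local tree mirrors the upstream directory layout one-to-one. Only a 404 is treated as "gone", so a transient upstream error does not delete anything.

```python
import os
import urllib.error
import urllib.parse
import urllib.request


def exists_upstream(url):
    """Issue a HEAD request; a 404 means the file is gone upstream."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=30):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other failure: do not guess, let the caller decide


def expire_missing(local_root, upstream_base, exists=exists_upstream):
    """Delete local files whose upstream counterpart no longer exists.

    Returns the sorted relative paths that were removed.
    """
    removed = []
    base = upstream_base.rstrip("/")
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, local_root).replace(os.sep, "/")
            if not exists(base + "/" + urllib.parse.quote(rel)):
                os.remove(path)
                removed.append(rel)
    return sorted(removed)
```

This issues one HEAD request per cached file on every pass, which is exactly the load on the upstream server that the paragraph above warns about.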

> The default definitions for mirroring download.fedoraproject.org could
> be included in a Fedora/EPEL package that requires ApacheMirror.py and
> the monitor/expiry daemon. That way a sysadmin who wants to create an
> instant Fedora mirror need only install that package and enable it in
> /etc/httpd/conf.d/. yum update handles pulling in updates for tree
> changes (repo locations, how often to poll for repomd.xml changes, etc.)
>
> Example:
> yum install InstantMirror-fedora
> vim /etc/httpd/conf.d/InstantMirror-fedora.conf
> #(enable stuff)
> service httpd restart
> # http://fedora.localdomain.com
> Instant Fedora mirror!
>
> InstantMirror-fedora.noarch.rpm : instant Fedora mirror
> InstantMirror-centos.noarch.rpm : instant CentOS mirror
> InstantMirror-rpmfusion.noarch.rpm : instant RPMFusion mirror
> InstantMirror-foo.noarch.rpm : instant Foo mirror

Sounds good.

> p.p.s.
> Another idea before I forget about it:
> Later add configurable fallbacks to a different upstream if
> download.fp.org is down. mirrors.kernel.org might be a good alternative
> for default, for example.

Yes, it would be easy to configure a list of upstream servers instead
of a single one, and hit them either in priority order or randomly.
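[Editor's sketch] One way the upstream list could work, in either priority or random order. The function `fetch_with_fallback` and its `fetch` callback are hypothetical names for illustration; `fetch` stands in for whatever download routine the handler already uses and is assumed to raise an OSError (which includes urllib's URLError) on failure.

```python
import random


def fetch_with_fallback(path, upstreams, fetch, shuffle=False):
    """Try each upstream base URL in turn until one succeeds.

    upstreams is tried in the given (priority) order, or in random
    order when shuffle is True. fetch(url) must return the content
    or raise OSError.
    """
    candidates = list(upstreams)
    if shuffle:
        random.shuffle(candidates)
    last_error = None
    for base in candidates:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as err:
            last_error = err  # remember the failure, try the next upstream
    raise OSError(f"all upstreams failed for {path!r}") from last_error
```

With a list such as download.fedoraproject.org followed by mirrors.kernel.org, priority order keeps the primary server authoritative and only touches the fallback when the primary is unreachable.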

--Ed

--
fedora-devel-list mailing list
fedora-devel-list@redhat.com
https://www.redhat.com/mailman/listinfo/fedora-devel-list
 
11-20-2007, 08:40 PM
Warren Togami

InstantMirror Proposal ApacheMirror.py for a site-local Fedora mirror

Ed Swierk wrote:
>> I didn't read deeply into your code yet, but I imagine that it needs
>> improvement to handle unique synchronization and expiration issues that
>> yum repos and rawhide install trees create when file contents change
>> without changing filenames.
>
> If a requested file already exists in the local mirror, the handler
> compares the Last-Modified time of the upstream file with the local
> file, and downloads the file if the upstream version is newer. I'm not
> familiar with rawhide, but this seems to work okay for the updates
> repos where metadata files are frequently regenerated. It doesn't
> remove files that no longer exist upstream, of course.


Ah, this works great as an initial implementation. We can at least have
something that works before we make it more efficient (and less
unfriendly in hitting the upstream server too many times).





>> That daemon could be configured to handle intelligent expiry of various
>> parts of the mirror tree in different ways. For example:
>> - development (rawhide) repo changes at least once per day. It also
>> contains install images (boot.iso, bootdisk.img, stage2, etc.) that need
>> to be expired every time the tree changes. (We might need to add a
>> hashes file to the mirror tree to allow the tool to monitor these changes.)
>> - Released distros never change, so don't need to monitor their
>> repomd.xml for changes.
>
> An even simpler approach is to have the daemon iterate through every
> local file, checking whether the file exists upstream and deleting the
> local copy if it doesn't. This requires no repodata parsing, but
> flooding the upstream server with HEAD requests might be considered
> unfriendly.


Why don't we implement the unfriendly approach first because we can get
that out quickly. That way people can have something to run while we
work on the proper version that substantially reduces the number of hits
to the upstream server.
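[Editor's sketch] The "proper version" discussed earlier in the thread would watch repomd.xml and expire only the repodata files whose checksums changed. A rough illustration is below; the function names are hypothetical, and it assumes the usual repomd.xml layout in which each `data` element carries a `checksum` and a `location href` under the `http://linux.duke.edu/metadata/repo` namespace.

```python
import xml.etree.ElementTree as ET

REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def repodata_checksums(repomd_xml):
    """Map each repodata file's href to its checksum, from repomd.xml text."""
    root = ET.fromstring(repomd_xml)
    result = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        checksum = data.find("repo:checksum", REPO_NS)
        if location is not None and checksum is not None:
            result[location.get("href")] = checksum.text
    return result


def stale_repodata(cached_xml, upstream_xml):
    """List cached repodata files that changed or vanished upstream."""
    cached = repodata_checksums(cached_xml)
    upstream = repodata_checksums(upstream_xml)
    return sorted(
        href for href, digest in cached.items()
        if upstream.get(href) != digest
    )
```

The daemon would then delete the returned paths from the local cache, so a single repomd.xml fetch per poll replaces one HEAD request per cached file. Expiring changed packages would additionally require parsing primary.xml, which this sketch does not attempt.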


Warren Togami
wtogami@redhat.com

11-20-2007, 08:57 PM
Jeffrey Ollie

InstantMirror Proposal ApacheMirror.py for a site-local Fedora mirror

On 11/20/07, Ed Swierk <eswierk@arastra.com> wrote:
>
> > Later add configurable fallbacks to a different upstream if
> > download.fp.org is down. mirrors.kernel.org might be a good alternative
> > for default, for example.
>
> Yes, it would be easy to configure a list of upstream servers instead
> of a single one, and hit them either in priority order or randomly.

Why not use the mirrorlist? E.g.:

http://mirrors.fedoraproject.org/mirrorlist?repo=rawhide&arch=i386
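[Editor's sketch] If the mirrorlist were used as the source of fallback upstreams, the response would need to be turned into a list of base URLs. The sketch below assumes the response is plain text with one mirror URL per line and `#` comment lines, which should be verified against the live service; `parse_mirrorlist` is a hypothetical name.

```python
def parse_mirrorlist(text):
    """Extract mirror base URLs from a plain-text mirrorlist response.

    Assumes one URL per line; blank lines and lines starting with '#'
    are ignored.
    """
    mirrors = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            mirrors.append(line)
    return mirrors
```

The resulting list could be cached and refreshed periodically rather than fetched on every request, so the mirrorlist server is not hit once per download.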

Jeff

11-20-2007, 09:31 PM
Jonathan Steffan

InstantMirror Proposal ApacheMirror.py for a site-local Fedora mirror

Fedora Unity currently has a hack of a solution using a modified Yam. I
like the idea of an instant mirror that doesn't have to be pre-synced
but there are some use cases where having the full mirror local is useful.

http://damaestro.us/Members/jon/repo_sync-0.2-1.src.rpm

It still needs a lot of work. We (Fedora Unity) were planning on adding
a hosted project called 'reflector'. Maybe these two efforts should
join forces?


--
Jonathan Steffan
daMaestro
Fedora Unity - http://fedoraunity.org/
GPG Fingerprint: 93A2 3E2F DC26 5570 3472 5B16 AD12 6CE7 0D86 AF59

11-20-2007, 09:33 PM
Jonathan Steffan

InstantMirror Proposal ApacheMirror.py for a site-local Fedora mirror

Jonathan Steffan wrote:
> Fedora Unity currently has a hack of a solution using a modified Yam. I
> like the idea of an instant mirror that doesn't have to be pre-synced
> but there are some use cases where having the full mirror local is useful.
>
> http://damaestro.us/Members/jon/repo_sync-0.2-1.src.rpm
>
> It still needs a lot of work. We (Fedora Unity) were planning on adding
> a hosted project called 'reflector'. Maybe these two efforts should
> join forces?
>
>

Example Configs:
http://damaestro.us/Members/jon/configs.tar.gz


--
Jonathan Steffan
daMaestro
Fedora Unity - http://fedoraunity.org/
GPG Fingerprint: 93A2 3E2F DC26 5570 3472 5B16 AD12 6CE7 0D86 AF59

11-20-2007, 09:57 PM
Warren Togami

InstantMirror Proposal ApacheMirror.py for a site-local Fedora mirror

Jonathan Steffan wrote:
> Fedora Unity currently has a hack of a solution using a modified Yam. I
> like the idea of an instant mirror that doesn't have to be pre-synced
> but there are some use cases where having the full mirror local is useful.
>
> http://damaestro.us/Members/jon/repo_sync-0.2-1.src.rpm
>
> It still needs a lot of work. We (Fedora Unity) were planning on adding
> a hosted project called 'reflector'. Maybe these two efforts should
> join forces?




I took a quick look at your code. It seems that you have yet another
implementation of the rsync wrapper that everyone seems to write to
manage their own mirror server. It would be good to make this into a
proper upstream project and to have people collaborate on it. This,
however, is an entirely different use case from the proposed
InstantMirror project.


You would use reflector to populate a traditional mirror, while
InstantMirror operates as a proxy cache mirror. I suppose the two could
optionally work together to pre-populate the InstantMirror cache
directories, but these are two distinctly different projects.


Warren Togami
wtogami@redhat.com
