03-31-2010, 12:42 PM
Thomas Koch

Hadoop in Debian, was: Hardware trouble ries.debian.org

Joerg Jaspert:
<SNIP>
> The only trouble this setup has is that you have a pretty huge expensive
> machine always on and running, but not actually doing stuff for
> 99.999999999999% of the time.
</SNIP>

Hadoop is now in Debian: http://packages.qa.debian.org/h/hadoop.html
Hadoop is an Open Source implementation of Google's File System, MapReduce and
BigTable (HBase, not yet packaged).

The idea behind Google's infrastructure, and therefore Hadoop, is to have many
cheap commodity servers that together form a powerful cluster. Each node of the
cluster is redundant and can be replaced without downtime.
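
For readers unfamiliar with the MapReduce model mentioned above, here is a minimal single-process sketch in Python. It is not Hadoop's API and does no distribution at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. It only shows the shape of the computation that a cluster would spread across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, the work can in principle be split across machines, which is the property the cluster design relies on.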

I believe, but can't know for sure, that everything FTP-Master does could be
implemented on top of Hadoop.
However, it would certainly mean a lot of work, and many hardcore sysadmins
will feel very uncomfortable using Java, the language Hadoop is written in.

I'm planning to give a presentation on Hadoop at DebConf in Bosnia, and
maybe then we can discuss whether Hadoop should have a place in Debian's
infrastructure. For now I'm happy if somebody has become curious. :-)

http://en.wikipedia.org/wiki/Hadoop

Best regards,

Thomas Koch, http://www.koch.ro


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/201003311442.09336.thomas@koch.ro
 
03-31-2010, 04:20 PM
Obey Arthur Liu

On Wed, Mar 31, 2010 at 2:42 PM, Thomas Koch <thomas@koch.ro> wrote:
> Joerg Jaspert:
> <SNIP>
>> The only trouble this setup has is that you have a pretty huge expensive
>> machine always on and running, but not actually doing stuff for
>> 99.999999999999% of the time.
> </SNIP>
>
> Hadoop is now in Debian: http://packages.qa.debian.org/h/hadoop.html
> Hadoop is an Open Source implementation of Google's File System, MapReduce and
> BigTable (HBase, not yet packaged).
>
> The idea behind Google's infrastructure, and therefore Hadoop, is to have many
> cheap commodity servers that together form a powerful cluster. Each node of the
> cluster is redundant and can be replaced without downtime.
>
> I believe, but can't know for sure, that everything FTP-Master does could be
> implemented on top of Hadoop.
> However, it would certainly mean a lot of work, and many hardcore sysadmins
> will feel very uncomfortable using Java, the language Hadoop is written in.

Isn't there /some/ Python/Jython support?

Would you co-mentor such a project as part of a Summer of Code project?
Do you know someone who would?
It need not be ftpmaster. There is probably other critical Debian
infrastructure that could use this.

> I'm planning to give a presentation on Hadoop at DebConf in Bosnia, and
> maybe then we can discuss whether Hadoop should have a place in Debian's
> infrastructure. For now I'm happy if somebody has become curious. :-)
>
> http://en.wikipedia.org/wiki/Hadoop
>
> Best regards,
>
> Thomas Koch, http://www.koch.ro

Cheers

Arthur


Archive: http://lists.debian.org/g2vc09ddae71003310920icba0397au58fa97f4ebfb788a@mail.gmail.com
 
03-31-2010, 10:41 PM
Stephen Gran

This one time, at band camp, Obey Arthur Liu said:
> On Wed, Mar 31, 2010 at 2:42 PM, Thomas Koch <thomas@koch.ro> wrote:
> >
> > I believe, but can't know for sure, that everything FTP-Master does could be
> > implemented on top of Hadoop.
> > However, it would certainly mean a lot of work, and many hardcore sysadmins
> > will feel very uncomfortable using Java, the language Hadoop is written in.
>
> Isn't there /some/ Python/Jython support?
>
> Would you co-mentor such a project as part of a Summer of Code project?
> Do you know someone who would?
> It need not be ftpmaster. There is probably other critical Debian
> infrastructure that could use this.

Hadoop is not a POSIX file system, as far as I'm aware. As ftp-master
makes heavy use of things like file locks and hard links, I doubt hadoop
would work without a significant rewrite of the software.
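
To make the two primitives named above concrete, here is a minimal Python sketch of a hard link and an exclusive file lock on an ordinary local file system. It is not taken from dak; the file names and the `link_and_lock_demo` function are made up for illustration, and `flock` is used only because it is short to demonstrate (the thread does not say which locking call the archive software uses).

```python
import fcntl
import os
import tempfile


def link_and_lock_demo():
    """Show a hard link and an exclusive lock on a local POSIX-style file system."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hard link: two directory entries that refer to the same inode.
        original = os.path.join(workdir, "package.deb")
        alias = os.path.join(workdir, "pool-package.deb")
        with open(original, "wb") as fh:
            fh.write(b"payload")
        os.link(original, alias)
        same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
        link_count = os.stat(original).st_nlink

        # File lock: a second open of the same file cannot take the
        # exclusive lock while the first one holds it.
        lock_path = os.path.join(workdir, "archive.lock")
        with open(lock_path, "w") as first, open(lock_path, "w") as second:
            fcntl.flock(first, fcntl.LOCK_EX)
            try:
                fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
                second_got_lock = True
            except BlockingIOError:
                second_got_lock = False
            fcntl.flock(first, fcntl.LOCK_UN)

        return same_inode, link_count, second_got_lock


if __name__ == "__main__":
    print(link_and_lock_demo())
```

Software written around these calls assumes the storage layer provides them, which is why moving it onto a store without them would mean rewriting that logic rather than just changing a mount point.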

It would probably be helpful to take a look at the dak codebase before
coming up with other solutions to this - any sort of clustering has to
take the software that actually runs the archive into account.

Cheers,
--
-----------------------------------------------------------------
| ,'`. Stephen Gran |
| : :' : sgran@debian.org |
| `. `' Debian user, admin, and developer |
| `- http://www.debian.org |
-----------------------------------------------------------------


Archive: http://lists.debian.org/20100331224108.GA9763@varinia.lobefin.net
 
04-01-2010, 10:40 AM
Bernd Eckenfels

In article <20100331224108.GA9763@varinia.lobefin.net> you wrote:
> Hadoop is not a POSIX file system, as far as I'm aware. As ftp-master
> makes heavy use of things like file locks and hard links, I doubt hadoop
> would work without a significant rewrite of the software.

And HDFS is optimized for very large files only. For the typical FTP case you
would have to build a filesystem inside it, or maybe use HBase, though I'm not
sure it can store large enough blobs.
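
One way to read "build a filesystem inside" is to pack many small files into a single large container and keep an index of where each one lives. The following Python sketch shows that idea in memory only; it does not use any HDFS or HBase API, and the `pack` and `read` helpers and the file names are made up for illustration.

```python
import io


def pack(files):
    """Concatenate small files into one blob; return (blob, index).

    `files` maps a name to its bytes. The index maps each name to an
    (offset, length) pair inside the blob.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in sorted(files.items()):
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index


def read(blob, index, name):
    """Return the bytes of one packed file by slicing the blob."""
    offset, length = index[name]
    return blob[offset:offset + length]


if __name__ == "__main__":
    blob, index = pack({"a.dsc": b"one", "b.changes": b"three"})
    print(index)
    print(read(blob, index, "b.changes"))
```

The large-file store then only ever sees one big object, at the cost of maintaining the index and of rewriting the container when a member changes.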

Regards,
Bernd


Archive: http://lists.debian.org/201004011040.o31AewYr097932@neskaya.eckenfels.net
 
