Linux Archive

Rainer Traut 08-27-2012 11:55 AM

Deduplication data for CentOS?
 
Hi list,

Is there any working solution for data deduplication on CentOS?
We are trying to find a solution for our backup server, which runs a bash
script invoking xdelta(3). Having this functionality in the filesystem
would be much friendlier...

We have looked into lessfs, sdfs and ddar.
Are these filesystems ready to use (on CentOS)?
ddar is something different, I know.

Thanks
Rainer
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

John Doe 08-27-2012 12:15 PM

Deduplication data for CentOS?
 
From: Rainer Traut <tr.ml@gmx.de>

> is there any working solution for deduplication of data for centos?
> We are trying to find a solution for our backup server which runs a bash
> script invoking xdelta(3). But having this functionality in fs is much
> more friendly...
>
> We have looked into lessfs, sdfs and ddar.
> Are these filesystems ready to use (on centos)?
> ddar is sthg different, I know.

Never tried it, but what about ZFS?

JD

Rainer Traut 08-27-2012 12:23 PM

Deduplication data for CentOS?
 
On 27.08.2012 14:15, John Doe wrote:
> From: Rainer Traut <tr.ml@gmx.de>
>
>> is there any working solution for deduplication of data for centos?
>> We are trying to find a solution for our backup server which runs a bash
>> script invoking xdelta(3). But having this functionality in fs is much
>> more friendly...
>>
>> We have looked into lessfs, sdfs and ddar.
>> Are these filesystems ready to use (on centos)?
>> ddar is sthg different, I know.
>
> Never tried but what about zfs?

Yeah, I know it has this feature, but is there a working ZFS
implementation for Linux?
Linux is a must, because the data we are backing up are Domino databases
and it is also a customer requirement.

And btrfs has not yet implemented this feature, I think.



Janne Snabb 08-27-2012 02:04 PM

Deduplication data for CentOS?
 
On 08/27/2012 07:23 PM, Rainer Traut wrote:

> Yeah I know it has this feature, but is there a working zfs
> implementation for linux?

I have heard some positive feedback about http://zfsonlinux.org/ but I
have not had time to test it myself yet. Whether it fits probably depends
on your intended usage. It is a new in-kernel ZFS implementation
(different from the old FUSE implementation).

RHEL 6.2 x86_64 is listed as one of the supported OSes, so it probably
works fine with CentOS too.

There is some positive and negative feedback in the following links:

https://groups.google.com/a/zfsonlinux.org/group/zfs-discuss/browse_thread/thread/5a739039623f8fb1

http://pingd.org/2012/installing-zfs-raid-z-on-centos-6-2-with-ssd-caching.html

Please share your results if you do any testing :)

--
Janne Snabb / EPIPE Communications
snabb@epipe.com - http://epipe.com/

John R Pierce 08-27-2012 02:23 PM

Deduplication data for CentOS?
 
On 08/27/12 4:55 AM, Rainer Traut wrote:
> is there any working solution for deduplication of data for centos?
> We are trying to find a solution for our backup server which runs a bash
> script invoking xdelta(3). But having this functionality in fs is much
> more friendly...

BackupPC does exactly this. It's not a generalized solution to
deduplication of a file system; instead, it's a backup system, designed
to back up multiple targets, that implements deduplication on the backup
tree it maintains.



--
john r pierce N 37, W 122
santa cruz ca mid-left coast


Les Mikesell 08-27-2012 03:26 PM

Deduplication data for CentOS?
 
On Mon, Aug 27, 2012 at 9:23 AM, John R Pierce <pierce@hogranch.com> wrote:
> On 08/27/12 4:55 AM, Rainer Traut wrote:
>> is there any working solution for deduplication of data for centos?
>> We are trying to find a solution for our backup server which runs a bash
>> script invoking xdelta(3). But having this functionality in fs is much
>> more friendly...
>
> BackupPC does exactly this. its not a generalized solution to
> deduplication of a file system, instead, its a backup system, designed
> to backup multiple targets, that implements deduplication on the backup
> tree it maintains.

Not _exactly_, but maybe close enough, and it is very easy to install
and try. BackupPC will use rsync for transfers and thus only uses
bandwidth for the differences, but it uses hard links to files to dedup
the storage. It will find and link duplicate content even from
different sources, but the complete file must be identical. It does
not store deltas, so large files that change even slightly between
backups end up stored as complete copies (with optional compression).

--
Les Mikesell
lesmikesell@gmail.com
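
The whole-file, hard-link style of deduplication described in this post can be sketched in a few lines of Python. This is only an illustration of the idea, not BackupPC's actual code: the function names, the SHA-256 pool layout and the db.nsf file names are assumptions made up for the example, and the pool is assumed to live on the same filesystem as the backups (hard links cannot cross filesystems) and outside the backup tree.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_tree(backup_root, pool_dir):
    """Hard-link identical files under backup_root to one pooled copy.

    Hypothetical helper: pool_dir holds one entry per distinct content,
    named after its digest. Returns how many files were replaced by links.
    """
    os.makedirs(pool_dir, exist_ok=True)
    linked = 0
    for dirpath, _dirs, files in os.walk(backup_root):
        for name in files:
            path = os.path.join(dirpath, name)
            pooled = os.path.join(pool_dir, file_digest(path))
            if not os.path.exists(pooled):
                # First time this content is seen: add it to the pool.
                os.link(path, pooled)
            elif not os.path.samefile(path, pooled):
                # Content already pooled: swap the file for a hard link.
                tmp = path + ".dedup-tmp"
                os.link(pooled, tmp)
                os.replace(tmp, path)
                linked += 1
    return linked


def demo():
    """Three hosts back up one file each; two of the files are identical."""
    with tempfile.TemporaryDirectory() as top:
        backups = os.path.join(top, "backups")
        pool = os.path.join(top, "pool")
        contents = (("hostA", b"same"), ("hostB", b"same"),
                    ("hostC", b"same but slightly changed"))
        for host, data in contents:
            os.makedirs(os.path.join(backups, host))
            with open(os.path.join(backups, host, "db.nsf"), "wb") as f:
                f.write(data)
        linked = dedup_tree(backups, pool)
        a = os.stat(os.path.join(backups, "hostA", "db.nsf"))
        b = os.stat(os.path.join(backups, "hostB", "db.nsf"))
        c = os.stat(os.path.join(backups, "hostC", "db.nsf"))
        return linked, a.st_ino == b.st_ino, a.st_ino == c.st_ino
```

In the demo, the two identical files end up sharing one inode, while the slightly changed file keeps its own full copy. That is the limitation noted above: nothing is shared unless the complete file matches.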

Dean Jones 08-27-2012 03:45 PM

Deduplication data for CentOS?
 
Deduplication with ZFS takes a lot of RAM.

I would not yet trust any of the Linux ZFS projects for data that I
wanted to keep long term.
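
To put a rough number on the RAM point: ZFS keeps a dedup table with one entry per unique block, and lookups are only fast while that table fits in memory. The sketch below is a back-of-the-envelope estimate, not a measurement. It assumes the commonly quoted rule of thumb of about 320 bytes of RAM per table entry, a default 128 KiB record size, and the worst case where every block is unique. The function name and parameters are made up for the example.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def ddt_ram_bytes(pool_bytes, avg_block_bytes=128 * 1024, entry_bytes=320):
    """Estimate RAM needed to hold the dedup table for pool_bytes of data.

    Assumptions (not measured): entry_bytes is the rule-of-thumb size of
    one in-memory table entry, and every block is unique (worst case).
    """
    blocks = -(-pool_bytes // avg_block_bytes)  # ceiling division
    return blocks * entry_bytes


if __name__ == "__main__":
    for block in (128 * 1024, 8 * 1024):
        need = ddt_ram_bytes(TIB, avg_block_bytes=block)
        print(f"1 TiB at {block // 1024} KiB blocks: {need / GIB:.1f} GiB")
```

Under these assumptions, 1 TiB of unique data needs about 2.5 GiB of table at 128 KiB blocks and about 40 GiB at 8 KiB blocks, so the average block size matters a great deal.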

On Mon, Aug 27, 2012 at 8:26 AM, Les Mikesell <lesmikesell@gmail.com> wrote:
> On Mon, Aug 27, 2012 at 9:23 AM, John R Pierce <pierce@hogranch.com> wrote:
>> On 08/27/12 4:55 AM, Rainer Traut wrote:
>>> is there any working solution for deduplication of data for centos?
>>> We are trying to find a solution for our backup server which runs a bash
>>> script invoking xdelta(3). But having this functionality in fs is much
>>> more friendly...
>>
>> BackupPC does exactly this. its not a generalized solution to
>> deduplication of a file system, instead, its a backup system, designed
>> to backup multiple targets, that implements deduplication on the backup
>> tree it maintains.
>
> Not _exactly_, but maybe close enough and it is very easy to install
> and try. Backuppc will use rsync for transfers and thus only uses
> bandwidth for the differences, but it uses hardlinks to files to dedup
> the storage. It will find and link duplicate content even from
> different sources, but the complete file must be identical. It does
> not store deltas, so large files that change even slightly between
> backups end up stored as complete copies (with optional compression).

Leon Fauster 08-27-2012 03:53 PM

Deduplication data for CentOS?
 
On 27.08.2012 at 16:23, John R Pierce wrote:
> On 08/27/12 4:55 AM, Rainer Traut wrote:
>> is there any working solution for deduplication of data for centos?
>> We are trying to find a solution for our backup server which runs a bash
>> script invoking xdelta(3). But having this functionality in fs is much
>> more friendly...
>
> BackupPC does exactly this. its not a generalized solution to
> deduplication of a file system, instead, its a backup system, designed
> to backup multiple targets, that implements deduplication on the backup
> tree it maintains.


AFAIK, Bacula has deduplication capabilities.

--
LF


Les Mikesell 08-27-2012 04:04 PM

Deduplication data for CentOS?
 
On Mon, Aug 27, 2012 at 6:55 AM, Rainer Traut <tr.ml@gmx.de> wrote:
>
> is there any working solution for deduplication of data for centos?
> We are trying to find a solution for our backup server which runs a bash
> script invoking xdelta(3). But having this functionality in fs is much
> more friendly...
>

Below forwarded on behalf of mroth:

Les,

A favor, please? Could you post this for me? Spamhaus is bouncing me
again, this time because *they* have a bug (see below). I tried asking
Karanbir, but I guess he's not online yet....

Thanks in advance.

John R Pierce wrote:
> On 08/27/12 4:55 AM, Rainer Traut wrote:
>> is there any working solution for deduplication of data for centos?
>> We are trying to find a solution for our backup server which runs a bash
>> script invoking xdelta(3). But having this functionality in fs is much
>> more friendly...
>
> BackupPC does exactly this. its not a generalized solution to
> deduplication of a file system, instead, its a backup system, designed
> to backup multiple targets, that implements deduplication on the backup
> tree it maintains.

I've tried, twice, to suggest a workaround that doesn't involve a new,
and possibly experimental, filesystem: use rsync with hard links, which
is what we do. There's no way we would otherwise have enough disk space
for 5 weeks of terabytes of data....
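
With rsync itself, this kind of snapshot is usually made with the --link-dest option, along the lines of `rsync -a --link-dest=/backup/week1 /data/ /backup/week2/` (the paths are hypothetical). The Python sketch below only illustrates the idea behind it and is not rsync's algorithm. It assumes a file counts as unchanged when its size and whole-second modification time match the previous snapshot, and that all snapshots sit on one filesystem. The function and directory names are made up for the example.

```python
import os
import shutil
import tempfile


def snapshot(source, dest, previous=None):
    """Copy source into dest, hard-linking files unchanged since previous.

    Hypothetical helper: "unchanged" means same size and same
    whole-second mtime as in the previous snapshot.
    Returns (linked, copied).
    """
    linked = copied = 0
    for dirpath, _dirs, files in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            new = os.path.join(target_dir, name)
            same = False
            if previous is not None:
                old = os.path.normpath(os.path.join(previous, rel, name))
                if os.path.isfile(old):
                    st, old_st = os.stat(src), os.stat(old)
                    same = (st.st_size == old_st.st_size
                            and int(st.st_mtime) == int(old_st.st_mtime))
            if same:
                os.link(old, new)  # no extra data blocks are used
                linked += 1
            else:
                shutil.copy2(src, new)  # copy2 keeps the mtime
                copied += 1
    return linked, copied


def demo():
    """Take two weekly snapshots with one file changed in between."""
    with tempfile.TemporaryDirectory() as top:
        src = os.path.join(top, "data")
        os.makedirs(os.path.join(src, "sub"))
        for parts, data in ((("a.txt",), b"alpha"),
                            (("sub", "b.txt"), b"beta")):
            with open(os.path.join(src, *parts), "wb") as f:
                f.write(data)
        week1 = os.path.join(top, "week1")
        week2 = os.path.join(top, "week2")
        first = snapshot(src, week1)
        # Change one file (its size differs); leave the other untouched.
        with open(os.path.join(src, "a.txt"), "wb") as f:
            f.write(b"alpha changed")
        second = snapshot(src, week2, previous=week1)
        shared = os.path.samefile(os.path.join(week1, "sub", "b.txt"),
                                  os.path.join(week2, "sub", "b.txt"))
        return first, second, shared
```

Each snapshot directory looks like a full copy, but an unchanged file costs only a directory entry. As with BackupPC, a large file that changes slightly is still stored again in full.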

However, the reason I haven't been able to suggest it is that I'm being
blocked by Spamhaus. And when I go there, it asserts I'm listed in the
CBL. And when I go *THERE*, it tells me I'm not.

Oh, and now, when I try to go to the CBL, it's down.

I don't suppose the CentOS list has a whitelist....

mark

"David C. Miller" 08-27-2012 04:33 PM

Deduplication data for CentOS?
 
----- Original Message -----
> From: "Rainer Traut" <tr.ml@gmx.de>
> To: centos@centos.org
> Sent: Monday, August 27, 2012 4:55:03 AM
> Subject: [CentOS] Deduplication data for CentOS?
>
> Hi list,
>
> is there any working solution for deduplication of data for centos?
> We are trying to find a solution for our backup server which runs a bash
> script invoking xdelta(3). But having this functionality in fs is much
> more friendly...
>
> We have looked into lessfs, sdfs and ddar.
> Are these filesystems ready to use (on centos)?
> ddar is sthg different, I know.
>
> Thx
> Rainer

Although not open source, CrashPlan PROe only costs $365 for a perpetual five-client license. I use it to back up some of my Linux boxes. It has very good deduplication, compression, and encryption. For example, I have 1.7TB of data on one Linux system and 1.5TB on another. I NFS-mount one system on the other and use a single CrashPlan client to back up both data sets to a single backup archive. The backup archive is only 1.2TB, and that includes 90 days' worth of file modifications and deletions that I can recover.

David.

