08-04-2011, 07:44 AM
Alexander Wirt

Debian mailing lists archives as mbox (was:

Christian PERRIER wrote on Thursday, 04 August 2011:

> (from soc-coordination)
>
> Quoting Sukhbir Singh (sukhbir.in@gmail.com):
> > Hi,
> >
> > Here is what we implemented during this time, in more detail:
> >
> > 1. Wrote a script that fetches messages for lists.debian.org from
> > Gmane and then creates a mbox archive for them. This allows us to
>
>
> I find it really sad that you have to do this while Debian mailing
> lists archives available as mbox *do exist* (but they're only
> available for DDs with accounts in Debian machines).
>
> I understand there could be some reluctance to offering these
> completely anonymously but I wonder if there could be an intermediate
> way for non DDs to get access to these archives. That would also help
> for the spam cleaning process (reviewing spam in one's MUA instead of
> online).
In general it's always great if someone jumps into a discussion without reading
the discussion himself. We had an ongoing discussion about privacy, spam and so
on regarding the mboxes. We even managed to reach consensus yesterday.
Now you have boxes with all spam included. Have fun.

Alex


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20110804074449.GI3348@smithers.snow-crash.org

08-04-2011, 08:22 AM
Andreas Tille

On Thu, Aug 04, 2011 at 09:44:49AM +0200, Alexander Wirt wrote:
> We had an ongoing discussion about privacy and so
> spam and so on about the mboxes. We even managed to get consense yesterday.

To shed some light on this, I would like to publish the consensus we
reached:

A filter needs to be written (most probably this will be done
by Sukhbir, who should test it on any mbox because he is not
allowed to access the original mboxes). The filter should have the
following features:

- Parse the existing mboxes and strip them down to the following
  information:

    Message-id: <ID>
    From: Name of poster <e-mail@of.poster>
    Date: Date
    Subject: Subject
    Content

- Remove those Message-IDs which should be removed (just
  detected SPAM)
- Publish these mboxes (it was not yet specified by listmaster
  whether for general http download or only for specific users)

The filter will be written in Python because this is Sukhbir's
preferred language, and listmaster accepted this as an exception
even though they would have preferred Perl.
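
For illustration only, the filter described above could be sketched with Python's standard mailbox module roughly as follows. This is not Sukhbir's script. The function names, the spam_ids set and the decision to keep only the first text/plain part as "Content" are assumptions made for this sketch. Note that storing the body as UTF-8 adds the MIME headers needed to decode it again, so the output carries slightly more than the four listed fields.

```python
import mailbox

# Headers named in the consensus; extend the tuple if more fields are agreed on.
KEEP_HEADERS = ("Message-ID", "From", "Date", "Subject")


def body_text(msg):
    """Return the first text/plain part of a message as str ('' if none)."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:  # unknown charset label in the mail
            return payload.decode("utf-8", errors="replace")
    return ""


def strip_message(msg, keep=KEEP_HEADERS):
    """Build a new message carrying only the whitelisted headers and the body."""
    out = mailbox.mboxMessage()  # gets a generic "From " separator line
    for name in keep:
        for value in msg.get_all(name, []):
            out[name] = value
    # Storing the body as UTF-8 adds the MIME headers needed to decode it.
    out.set_payload(body_text(msg), charset="utf-8")
    return out


def filter_mbox(src_path, dst_path, spam_ids, keep=KEEP_HEADERS):
    """Copy src_path to dst_path, dropping spam and non-whitelisted headers.

    spam_ids is a set of Message-ID strings to remove (hypothetical input).
    Messages are appended if dst_path already exists.
    Returns the number of messages written.
    """
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    written = 0
    try:
        for msg in src:
            msg_id = str(msg.get("Message-ID", "")).strip()
            if msg_id in spam_ids:
                continue
            dst.add(strip_message(msg, keep))
            written += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return written
```

A whitelist is used rather than a blacklist so that any header nobody thought about is dropped by default, which seems closer to the privacy goal discussed in this thread.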

So much for the consensus we had reached in private discussion. I did
not get a final yes for my suggestion to include the following
information, which I regard as helpful as well:

In-reply-to
References
X-Spam

IMHO the first two might be helpful to reconstruct threads (so this
information is at least implicitly inside the web archive - at least
to my poor understanding).
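
To illustrate why these two fields matter for threading, here is a minimal sketch that links each message to its parent. It assumes the usual convention that In-Reply-To names the direct parent and that the last entry in References is the closest ancestor. The function names are made up for this example, and real threading code would have to handle more corner cases.

```python
import re
from collections import defaultdict

# A Message-ID as it appears in headers: <local-part@domain>
MSGID_RE = re.compile(r"<[^<>\s]+>")


def parent_id(msg):
    """Return the Message-ID this message replies to, or None."""
    in_reply_to = MSGID_RE.findall(str(msg.get("In-Reply-To", "")))
    if in_reply_to:
        return in_reply_to[0]
    # Fall back to References, whose last entry is the closest ancestor.
    references = MSGID_RE.findall(str(msg.get("References", "")))
    return references[-1] if references else None


def build_threads(messages):
    """Return (roots, children): thread starters and a parent -> replies map.

    A message whose parent is not in the given set is treated as a root.
    """
    messages = list(messages)
    ids = [str(m.get("Message-ID", "")).strip() for m in messages]
    known = set(ids)
    roots, children = [], defaultdict(list)
    for msg_id, msg in zip(ids, messages):
        parent = parent_id(msg)
        if parent in known and parent != msg_id:
            children[parent].append(msg_id)
        else:
            roots.append(msg_id)
    return roots, dict(children)
```

Without these two headers the only remaining hints are Subject and Date, which break as soon as someone changes the subject line.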

I also regard the X-Spam fields as valuable information which is
irrelevant for privacy but most probably quite useful for other
purposes like further SPAM removals.
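
If the X-Spam fields were kept, one possible way to select them is by prefix instead of listing every field name. The sketch below is an assumption about how that could look. The select_headers helper is hypothetical, and which X-Spam-* headers the list server actually adds is not stated in this thread.

```python
from email.message import Message


def select_headers(msg: Message,
                   exact=("Message-ID", "From", "Date", "Subject"),
                   prefixes=("X-Spam",)):
    """Return (name, value) pairs for whitelisted headers, in original order.

    A header is kept if its name matches one of `exact` or starts with one
    of `prefixes`; both comparisons ignore case.
    """
    exact_lower = {name.lower() for name in exact}
    prefixes_lower = tuple(prefix.lower() for prefix in prefixes)
    selected = []
    for name, value in msg.items():
        lowered = name.lower()
        if lowered in exact_lower or lowered.startswith(prefixes_lower):
            selected.append((name, str(value)))
    return selected
```

Passing prefixes=() gives the stricter field list from the consensus, so the same helper would cover both outcomes of the discussion.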

If you regard some other fields as interesting but not critical for
privacy, now might be the right moment to speak up.

I hope this helps to find a reasonable solution.

Kind regards

Andreas.

--
http://fam-tille.de


Archive: http://lists.debian.org/20110804082243.GD9933@an3as.eu

08-04-2011, 08:31 AM
Alexander Wirt

Andreas Tille wrote on Thursday, 04 August 2011:

> On Thu, Aug 04, 2011 at 09:44:49AM +0200, Alexander Wirt wrote:
> > We had an ongoing discussion about privacy and so
> > spam and so on about the mboxes. We even managed to get consense yesterday.
>
> To bring some light into this I would like to publish this consense we
> had:
>
> A filter needs to be written (most probably this will be done
> by Sukhbir who should test this on any mbox because he is not
> allowed to access original mboxes). The filter should have the
> following features:
>
> - Parse the existing mboxes and strip them down to the following
> information
>
> Message-id: <ID>
> From: Name of poster <e-mail@of.poster>
> Date: Date
> Subject: Subject
> Content
> - Remove those Message-IDs which should be removed (just
> detected SPAM)
> - Publish these mboxes (it was not yet specified by listmaster
> whether for general http download or only for specific users)
Just for the record: the mboxes are not meant to be published. We are currently
working on improving data privacy protection in the archive, so simply
publishing the mboxes would be counterproductive.

> In-reply-to
> References
> X-Spam
Ok, after thinking about it you can include In-reply-to and References. I
don't see why X-Spam should be useful for you.

Alex


Archive: http://lists.debian.org/20110804083133.GJ3348@smithers.snow-crash.org

08-04-2011, 08:54 AM
Andreas Tille

On Thu, Aug 04, 2011 at 10:31:33AM +0200, Alexander Wirt wrote:
> Just for the record. The mboxes are not for being published. We are currently
> working on getting more data privacy protection in the archive so just
> publishing the mboxes would just be counterproductive.

Thanks for clarifying this. Is this "we" == listmaster, and if so, was
this discussed somewhere?

> > In-reply-to
> > References
> > X-Spam
> Ok, after thinking about it you can include In-reply-to and References. I
> don't see why X-Spam should be useful for you.

Just quoting myself from the mail you responded to:

I also regard the X-Spam fields ... useful for other
purposes like further SPAM removals.

I just know that Christian is behind better SPAM removal.

Kind regards

Andreas.

--
http://fam-tille.de


Archive: http://lists.debian.org/20110804085445.GE9933@an3as.eu

08-04-2011, 09:02 AM
Alexander Wirt

Andreas Tille wrote on Thursday, 04 August 2011:

> On Thu, Aug 04, 2011 at 10:31:33AM +0200, Alexander Wirt wrote:
> > Just for the record. The mboxes are not for being published. We are currently
> > working on getting more data privacy protection in the archive so just
> > publishing the mboxes would just be counterproductive.
>
> Thanks for clarifying this. Is this "we" == listmaster and if yes was
> this discussed somewhere?
Yes, on the listmaster alias, so there is no public trace of the
conversation.

>
> > > In-reply-to
> > > References
> > > X-Spam
> > Ok, after thinking about it you can include In-reply-to and References. I
> > don't see why X-Spam should be useful for you.
>
> Just quoting myself from the mail you responded to:
>
> I also regard the X-Spam fields ... usefull for other
> purposes like further SPAM removals.
>
> I just know that Christian is behind better SPAM removal.
Yeah, but as the mboxes are not for public distribution, that isn't important.
Christian himself has access to the mboxes via master.

Alex


Archive: http://lists.debian.org/20110804090220.GK3348@smithers.snow-crash.org

08-04-2011, 09:21 AM
Andreas Tille

On Thu, Aug 04, 2011 at 11:02:20AM +0200, Alexander Wirt wrote:
> > > Just for the record. The mboxes are not for being published. We are currently
> > > working on getting more data privacy protection in the archive so just
> > > publishing the mboxes would just be counterproductive.
> >
> > Thanks for clarifying this. Is this "we" == listmaster and if yes was
> > this discussed somewhere?
> Yeah with the listmaster alias. So there is no public trace of the
> conversation.

Perhaps I'm the only one, but changing the policy on what information
from the project lists should be published and what should not might
deserve discussion with a larger audience. At least debian-private comes
to mind; I would even think that debian-project is the right place to
discuss it.

> > Just quoting myself from the mail you responded to:
> >
> > I also regard the X-Spam fields ... usefull for other
> > purposes like further SPAM removals.
> >
> > I just know that Christian is behind better SPAM removal.
> Yeah, but as the mboxes are not for public distribution that isn't important.
> Christian itself has access on the mboxes via master.

As far as I know it is not only Christian, and if we really want to get
more people involved in Debian (including non-technicians), tasks like
cleaning SPAM out of the list archive might be interesting.

Kind regards

Andreas.

--
http://fam-tille.de


Archive: http://lists.debian.org/20110804092125.GI9933@an3as.eu

08-04-2011, 09:29 AM
Alexander Wirt

Andreas Tille wrote on Thursday, 04 August 2011:

> On Thu, Aug 04, 2011 at 11:02:20AM +0200, Alexander Wirt wrote:
> > > > Just for the record. The mboxes are not for being published. We are currently
> > > > working on getting more data privacy protection in the archive so just
> > > > publishing the mboxes would just be counterproductive.
> > >
> > > Thanks for clarifying this. Is this "we" == listmaster and if yes was
> > > this discussed somewhere?
> > Yeah with the listmaster alias. So there is no public trace of the
> > conversation.
>
> Perhaps I'm the only one but changing the policy of the project lists
> what information should be published and what not might deserve
> discussion inside some larger audience - at least debian-private comes
> to mind I would even think that debian-project is the right place to
> discuss.
>
> > > Just quoting myself from the mail you responded to:
> > >
> > > I also regard the X-Spam fields ... usefull for other
> > > purposes like further SPAM removals.
> > >
> > > I just know that Christian is behind better SPAM removal.
> > Yeah, but as the mboxes are not for public distribution that isn't important.
> > Christian itself has access on the mboxes via master.
>
> As far as I know it is not only Christian and if we really want to get
> more people involved into Debian (including non-technicans) tasks like
> cleaning up the list archive from SPAM might be somehow interesting.
Yeah, but I don't think that justifies publishing sensitive information.

Just my 2 cents

Alex

P.S. I know it's nice to be open, but publishing real names and mail addresses
is a problem, and at least problematic under German law (and probably in
other countries).

P.P.S. This is not an official statement of the listmaster team.


Archive: http://lists.debian.org/20110804092941.GL3348@smithers.snow-crash.org

08-04-2011, 09:56 AM
Andreas Tille

On Thu, Aug 04, 2011 at 11:29:41AM +0200, Alexander Wirt wrote:
> > As far as I know it is not only Christian and if we really want to get
> > more people involved into Debian (including non-technicans) tasks like
> > cleaning up the list archive from SPAM might be somehow interesting.
> Yeah, but I don't think that deserves publishing sensitive information.

In what way are the lists at lists.d.o different from any other mailing list
I know?

> P.S. I know its nice to be open. But publishing real names and mailaddresses
> is a problem and at least problematic under german law (and probably for
> other countries).

What specific law do you have in mind, and did you consult a lawyer or
debian-legal about your concerns? Disclaimer: While I can read and
understand everyday German, I'm not sure that I will understand
legal texts - but I could give it a try.

Kind regards

Andreas.

--
http://fam-tille.de


Archive: http://lists.debian.org/20110804095629.GL9933@an3as.eu

08-04-2011, 10:04 AM
Alexander Wirt

Andreas Tille wrote on Thursday, 04 August 2011:

> On Thu, Aug 04, 2011 at 11:29:41AM +0200, Alexander Wirt wrote:
> > > As far as I know it is not only Christian and if we really want to get
> > > more people involved into Debian (including non-technicans) tasks like
> > > cleaning up the list archive from SPAM might be somehow interesting.
> > Yeah, but I don't think that deserves publishing sensitive information.
>
> In how far are lists at lists.d.o different than any other mailing list
> I know?
>
> > P.S. I know its nice to be open. But publishing real names and mailaddresses
> > is a problem and at least problematic under german law (and probably for
> > other countries).
>
> What specific law do you have in mind and did you consulted a lawyer or
> debian-legal about your concerns? Disclaimer: While I can read an
> understand German every day language I'm not sure that I will understand
> legal texts - but I could give it a try.
It's the whole bunch of German data protection laws. No, I didn't ask a
lawyer, but we get several complaints a month on the listmaster alias, and I
personally share their complaints.

"An essential principle of the law is the so-called prohibition subject to
permission (Verbotsprinzip mit Erlaubnisvorbehalt). It states that the
collection, processing and use of personal data is prohibited in principle.
It is only permitted if either there is a clear legal basis (i.e. the law
permits the data processing in this case) or the person concerned has
expressly (usually in writing) consented to the collection, processing and
use (§ 13 paragraph 2 ff). Procedures involving automated processing must be
reviewed by the (official or company) data protection officer or, if there
is none, reported to the competent supervisory authority (§ 4d).

The principle of data avoidance and data economy defined in § 3a also
applies: all data processing systems should aim to use no personal data, or
as little as possible, and in particular should make use of the
possibilities of anonymisation and pseudonymisation.
"

Alex


Archive: http://lists.debian.org/20110804100410.GN3348@smithers.snow-crash.org

08-04-2011, 03:24 PM
Michelle Konzack

Hello Andreas Tille,

On 2011-08-04 10:22:43, you hacked out the following:
> - Parse the existing mboxes and strip them down to the following
> information
>
> Message-id: <ID>
> From: Name of poster <e-mail@of.poster>
> Date: Date
> Subject: Subject
> Content
<snip>
> So far for the consensus we had reached in private discussion. I did
> not got a final yes for my suggestion to include the following
> information which I regard as helpful as well:
>
> In-reply-to
> References

I think they should be there because otherwise threads are broken, and if
the archive has 3000 and more messages it will be a nightmare.

> Kind regards
> Andreas.

Thanks, Greetings and nice Day/Evening
Michelle Konzack

--
##################### Debian GNU/Linux Consultant ######################
Development of Intranet and Embedded Systems with Debian GNU/Linux

itsystems@tdnet France                   itsystems@tdnet
Owner Michelle Konzack                   Owner Michelle Konzack

Apt. 917 (homeoffice)                    Gewerbe Straße 3
50, rue de Soultz                        77694 Kehl/Germany
67100 Strasbourg/France                  Tel: +49-177-9351947 mobil
Tel: +33-6-61925193 mobil                Tel: +49-176-86004575 office

<http://www.itsystems.tamay-dogan.net/>  <http://www.flexray4linux.org/>
<http://www.debian.tamay-dogan.net/>     <http://www.can4linux.org/>

Jabber linux4michelle@jabber.ccc.de
ICQ #328449886

Linux-User #280138 with the Linux Counter, http://counter.li.org/
 
