10-17-2011, 03:31 PM
Olivier Berger

Advocating the use of RDF for Debian's published metadata - Was: Proposal for additional metadata in Debian archives (DEP-11)

Hi.

I'm not subscribed to ftpmasters, so feel free to CC me in response, and
I hope d-d@l.d.o is the proper place it will be debated, then (even
though -project already holds some bits too).

On Friday 14 October 2011 at 19:08 +0200, Matthias Klumpp wrote:

> AppStream features XML to store metadata. Because we don't use XML
> somewhere in Debian, DEP-11 features a well-known RFC822-style format.

May I suggest to implement some (standardized) variant of RDF [0] to
represent this meta-data ?


I think it would help here, to adopt standards for more interoperability
of Debian's metadata with others'.
The "package metadata" could even be delivered on the Web of Data
(Linked Open Data), right from the Debian servers, to allow any
application to be created, that would consume such metadata.

If RDF/XML (as seems to be proposed by SPDX, to be verified once the
Linux Foundation site is back) is not suitable, then another format
would be great as long as it relies on some explicit prefix+suffix
combination, in order to allow for extensibility, for instance some JSON
variant of RDF like Turtle [1].

If a package can both be described with some generic purpose
"ontology"/standard/schema (for instance the one you envisioned
initially in DEP 11), and also, depending on context (embedded or
science, for instance) with another set of metadata (spdx or whatever
else), you'd be able to mix in the same file, metadata relating to
different contexts.

Still, I'm not sure RFC822-style is perfectly compliant with the habit
of RDF to separate prefix and suffix with a colon character ':'. Maybe
'_' could act as such a separator (must say I haven't checked the RFC
for allowed tokens in the grammar) ?

Let's try with an example (btw, the DEP
http://wiki.debian.org/AppStreamDebianProposal *lacks* examples IMHO) :

In turtle representation format for RDF, one would have a document that
looks like this :
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
    @prefix dep11: <http://www.debian.org/whatever/dep11#>.
    @prefix debbugs: <http://www.debian.org/whatever/depxx#>.
    @prefix spdx: <http://spdx.org/ontology#>.

    <http://packages.qa.debian.org/iceweasel>
        a dep11:DebianPackage;
        dep11:application "Iceweasel";
        dep11:package "iceweasel";
        spdx:license "MPL-1.1";
        debbugs:bugs <http://bugs.debian.org/iceweasel>.

(Maybe I didn't understand very well the Application and Package
meanings in your DEP11 proposal, btw.)

Anyway, as you can see, here we could have several "domains" of metadata
sources (ontologies / prefixes) to describe the same package combined in
a single document.

In RFC822-style, this could be something like :

DEP11_Application: Iceweasel
DEP11_Package: iceweasel
spdx_license: MPL-1.1
debbugs_bugs: http://bugs.debian.org/iceweasel

etc.
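
On the question of allowed tokens: RFC 822 defines a field name as any run of printable ASCII characters other than space and ':', so '_' is permitted there. As a minimal sketch, the stanza above could be read with Python's standard email parser and each field name split on its first '_'. The function name split_fields and the lowercasing of prefixes are assumptions made for illustration only; neither comes from DEP-11.

```python
from email import message_from_string

# Stanza copied from the example above. The '_' separator is the
# convention suggested in this mail, not part of any standard.
STANZA = """\
DEP11_Application: Iceweasel
DEP11_Package: iceweasel
spdx_license: MPL-1.1
debbugs_bugs: http://bugs.debian.org/iceweasel
"""


def split_fields(stanza):
    """Return a (prefix, suffix, value) tuple for each field of the stanza."""
    fields = []
    for name, value in message_from_string(stanza).items():
        # Split on the first '_' only, so a suffix may itself contain '_'.
        prefix, _, suffix = name.partition("_")
        # Lowercasing the prefix is an assumption, made so that "DEP11"
        # and "dep11" would refer to the same vocabulary.
        fields.append((prefix.lower(), suffix, value))
    return fields


for prefix, suffix, value in split_fields(STANZA):
    print(f"{prefix}:{suffix} = {value}")
```

Only the field name is split; the value is kept verbatim, so the ':' inside the bug tracker URL causes no trouble.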

But clearly, not reinventing the wheel should be a goal, and adopting
existing standards for meta-data representation would be my choice, i.e.
Semantic Web standards (namely RDF).


Of course, translators from/to different syntaxes will be trivial to
develop, but if, from the source, a proper standard is used, it can be
readily delivered to the Web without any transformation needed. Such an
approach (often called Linked Data), clearly favors interoperability
(more at http://linkeddata.org/guides-and-tutorials if I failed to make
my point).


Again, in case you'd doubt it, RDF is just a model, which can be written
in a number of different formats (not only XML), but the key here is the
embedded identification of the reference of the ontologies/prefixes
which render the documents self described and extensible, out of the
box.
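
To illustrate that the model is independent of the syntax, here is a small sketch that expands the prefixed names of the Turtle example into full IRIs and writes the same statements as N-Triples, another RDF syntax. The PREFIXES table reuses the placeholder namespaces from the example; expand and to_ntriples are hypothetical helpers, and escaping of literals is deliberately left out.

```python
# Prefix table taken from the Turtle example in this mail; the
# "whatever" namespace IRIs are placeholders, not real vocabularies.
PREFIXES = {
    "dep11": "http://www.debian.org/whatever/dep11#",
    "debbugs": "http://www.debian.org/whatever/depxx#",
    "spdx": "http://spdx.org/ontology#",
}

SUBJECT = "http://packages.qa.debian.org/iceweasel"

# Each entry: (predicate as prefixed name, object, object is an IRI?)
STATEMENTS = [
    ("dep11:application", "Iceweasel", False),
    ("dep11:package", "iceweasel", False),
    ("spdx:license", "MPL-1.1", False),
    ("debbugs:bugs", "http://bugs.debian.org/iceweasel", True),
]


def expand(prefixed_name):
    """Turn 'prefix:local' into a full IRI using the PREFIXES table."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


def to_ntriples(subject, statements):
    """Serialize the statements as N-Triples lines (no literal escaping)."""
    lines = []
    for predicate, obj, is_iri in statements:
        obj_term = f"<{obj}>" if is_iri else f'"{obj}"'
        lines.append(f"<{subject}> <{expand(predicate)}> {obj_term} .")
    return lines


for line in to_ntriples(SUBJECT, STATEMENTS):
    print(line)
```

Because every predicate resolves to a full IRI, a consumer that has never seen the dep11 vocabulary can still tell its fields apart from the spdx or debbugs ones.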

Note that the same rationale stands for all metadata to be eventually
published on the Web by Debian servers.

Hope this helps.

Best regards,

[0] http://www.w3.org/RDF/
[1] http://www.w3.org/TeamSubmission/turtle/
--
Olivier BERGER <olivier.berger@it-sudparis.eu>
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
Ingénieur Recherche - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/1318865495.3675.36.camel@inf-8657.int-evry.fr
 
10-27-2011, 03:49 PM
Matthias Klumpp

Hi!

2011/10/17 Olivier Berger <olivier.berger@it-sudparis.eu>:
> Hi.
>
> I'm not subscribed to ftpmasters, so feel free to CC me in response, and
> I hope d-d@l.d.o is the proper place it will be debated, then (even
> though -project already holds some bits too).
Unfortunately, ftpmasters don't have a mailing list. I hope it is okay
to CC ftpmasters; if this is *not* okay, please write a short mail.
I haven't received any reply from ftpmasters on this yet.

> On Friday 14 October 2011 at 19:08 +0200, Matthias Klumpp wrote:
>
>> AppStream features XML to store metadata. Because we don't use XML
>> somewhere in Debian, DEP-11 features a well-known RFC822-style format.
>
> May I suggest to implement some (standardized) variant of RDF [0] to
> represent this meta-data ?
>
> I think it would help here, to adopt standards for more interoperability
> of Debian's metadata with others'.
> The "package metadata" could even be delivered on the Web of Data
> (Linked Open Data), right from the Debian servers, to allow any
> application to be created, that would consume such metadata.
>
> If RDF/XML (as seems to be proposed by SPDX, to be verified once the
> Linux Foundation site is back) is not suitable, then another format
> would be great as long as it relies on some explicit prefix+suffix
> combination, in order to allow for extensibility, for instance some JSON
> variant of RDF like Turtle [1].
I would like this very much. The proposal is extensible too, but it
also has a few limitations if someone decides to extend it in the future.
The reason for proposing an RFC822-style format for this data is that
this format is already well known inside Debian and we have nothing
else using XML yet. Because RFC822 is already widely used, it should be
easier for ftpmasters to implement.
RDF would work too, as long as it stores the same information as
described in the DEP-11 proposal. (But as far as I can see, it was
designed to do that.)

> If a package can both be described with some generic purpose
> "ontology"/standard/schema (for instance the one you envisioned
> initially in DEP 11), and also, depending on context (embedded or
> science, for instance) with another set of metadata (spdx or whatever
> else), you'd be able to mix in the same file, metadata relating to
> different contexts.
This sounds like overkill to me... Better to pick only one format for
that instead of mixing stuff.

> Still, I'm not sure RFC822-style is perfectly compliant with the habit
> of RDF to separate prefix and suffix with a colon character ':'. Maybe
> '_' could act as such a separator (must say I haven't checked the RFC
> for allowed tokens in the grammar) ?
We don't have prefix/suffix yet, because we haven't seen a need for it...

> Let's try with an example (btw, the DEP
> http://wiki.debian.org/AppStreamDebianProposal *lacks* examples IMHO) :
Right, maybe I should add one soon :P

> In turtle representation format for RDF, one would have a document that
> looks like this :
>     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
>     @prefix dep11: <http://www.debian.org/whatever/dep11#>.
>     @prefix debbugs: <http://www.debian.org/whatever/depxx#>.
>     @prefix spdx: <http://spdx.org/ontology#>.
>
>     <http://packages.qa.debian.org/iceweasel>
>         a dep11:DebianPackage;
>         dep11:application "Iceweasel";
>         dep11:package "iceweasel";
>         spdx:license "MPL-1.1";
>         debbugs:bugs <http://bugs.debian.org/iceweasel>.
>
> (Maybe I didn't understand very well the Application and Package
> meanings in your DEP11 proposal, btw.)
This looks very clean and extensible. A package is the thing you
install via Synaptic/apt-get/aptitude, while an application is
everything which has a desktop file and appears in the application
menu (at least that's how it is defined at this time).
A "component" is something a package provides, e.g. a shared library.
For example, package "libgee2" provides the component "libgee.so.2" of type
shared library. The same applies to Python modules, Plasma engines,
GNOME Shell extensions etc.
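
A rough sketch of how these three notions relate, written as plain Python data classes. The class and field names and the "shared-library" label are hypothetical and chosen only for illustration; DEP-11 does not define them in this form.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Something a package provides, such as a shared library."""
    name: str  # e.g. "libgee.so.2"
    kind: str  # e.g. "shared-library" (hypothetical type label)


@dataclass
class Package:
    """The unit installed via Synaptic/apt-get/aptitude."""
    name: str
    # Names of applications, i.e. entries that ship a desktop file.
    applications: list = field(default_factory=list)
    components: list = field(default_factory=list)


def provides(package, component_name):
    """Return True if the package provides a component with this name."""
    return any(c.name == component_name for c in package.components)


libgee2 = Package("libgee2",
                  components=[Component("libgee.so.2", "shared-library")])
iceweasel = Package("iceweasel", applications=["Iceweasel"])

print(provides(libgee2, "libgee.so.2"))
```

A package can carry applications, components, both or neither, which is why the two lists are kept separate here.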

> Anyway, as you can see, here we could have several "domains" of metadata
> sources (ontologies / prefixes) to describe the same package combined in
> a single document.
>
> In RFC822-style, this could be something like :
>
> DEP11_Application: Iceweasel
> DEP11_Package: iceweasel
> spdx_license: MPL-1.1
> debbugs_bugs: http://bugs.debian.org/iceweasel
>
> etc.
>
> But clearly, not reinventing the wheel should be a goal, and adopting
> existing standards for meta-data representation would be my choice, i.e.
> Semantic Web standards (namely RDF).
Agreed. Your proposal looks very clean. But again the question is: do
we want RDF in Debian? This is mainly a policy decision and has
nothing to do with technical details.


> Again, in case you'd doubt it, RDF is just a model, which can be written
> in a number of different formats (not only XML), but the key here is the
> embedded identification of the reference of the ontologies/prefixes
> which render the documents self described and extensible, out of the
> box.
For us, it is necessary that APT can process this data (will be
implemented if DEP-11 can make it) and that parts of it can be written
into a Xapian-DB for fast searching. - Both would work perfectly well
with any format.

It would be very nice, if ftpmasters could tell if they would accept a
new format in the archive or if we should stay with RFC822 which is
used for nearly everything else already.

> Note that the same rationale stands for all metadata to be eventually
> published on the Web by Debian servers.
>
> Hope this helps.
Thank you for the information... I think RDF would be much more "open"
for other people and apps to use, as the data wouldn't be in a
Debian-specific format. (I can't imagine yet what others would do with
this data, but if more people used RDF, e.g. other distributors too,
having it all in one standardized and extensible format would be
valuable.)

Cheers,
Matthias

> [0] http://www.w3.org/RDF/
> [1] http://www.w3.org/TeamSubmission/turtle/


Archive: http://lists.debian.org/CAKNHny9SVN8qM9ryMKjmVSq+hWYqG7A_KcQh3DvdJtw4HvZbEw@mail.gmail.com
 
10-29-2011, 05:21 AM
Jonas Smedegaard

On 11-10-17 at 05:31pm, Olivier Berger wrote:
> If RDF/XML (as seems to be proposed by SPDX, to be verified once the
> Linux Foundation site is back) is not suitable, then another format
> would be great as long as it relies on some explicit prefix+suffix
> combination, in order to allow for extensibility, for instance some
> JSON variant of RDF like Turtle [1].

Just for clarification: Turtle is a human-friendly RDF serialization.

RDF can also be expressed in JSON or YAML but this is less common among
semantic web developers and consumers. Popular formats are RDF/XML,
Turtle and RDFa.


> In turtle representation format for RDF, one would have a document that
> looks like this :
>     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
>     @prefix dep11: <http://www.debian.org/whatever/dep11#>.
>     @prefix debbugs: <http://www.debian.org/whatever/depxx#>.
>     @prefix spdx: <http://spdx.org/ontology#>.
>
>     <http://packages.qa.debian.org/iceweasel>
>         a dep11:DebianPackage;
>         dep11:application "Iceweasel";
>         dep11:package "iceweasel";
>         spdx:license "MPL-1.1";
>         debbugs:bugs <http://bugs.debian.org/iceweasel>.

Above is Turtle!


- Jonas

--
* Jonas Smedegaard - idealist & Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/

[x] quote me freely [ ] ask before reusing [ ] keep private


Archive: http://lists.debian.org/20111029052125.GC25825@jones.dk
 
10-31-2011, 01:32 PM
Charles Plessy

On Thu, Oct 27, 2011 at 05:49:12PM +0200, Matthias Klumpp wrote:
>
> For us, it is necessary that APT can process this data (will be
> implemented if DEP-11 can make it) and that parts of it can be written
> into a Xapian-DB for fast searching. - Both would work perfectly well
> with any format.
>
> It would be very nice, if ftpmasters could tell if they would accept a
> new format in the archive or if we should stay with RFC822 which is
> used for nearly everything else already.

Dear Matthias,

I am still not sure I understand how the data will be used. Is it only to be
used via the Internet? In that case perhaps it does not need to be distributed
via the Debian archive. What is the Debian-specific data? If it is the
association between a FreeDesktop menu “.desktop” file and a package name,
there is already a file in the Debian archive that provides this. Then, a
repository of the contents of FreeDesktop menu entries would definitely be
valuable, especially if served semantically, but as it would not contain data
specific to Debian, wouldn't it be better to develop it with fewer ties to the
Debian archive? That would be a great contribution from Debian to the
rest of the Free software ecosystem.

Have a nice day,

--
Charles Plessy
Tsurumi, Kanagawa, Japan


Archive: http://lists.debian.org/20111031143235.GA18493@merveille.plessy.net
 
