Copyright © 2009 Toby Inkster & Kjetil Kjernsmo, some rights reserved.
A set of RDF triples may be considered as a directed graph where the resources and literals form its nodes, and the predicates its edges. This graph in turn can be thought of as a resource in its own right, and dealt with in another graph. This document explores an extension for expressing named graphs in RDFa.
This document is published by buzzword.org.uk, a web site that hosts various specifications, articles and tools of use to web publishers. This is not a W3C recommendation. It is not even a buzzword.org.uk recommendation yet.
The authors welcome feedback on this draft. You can usually find one or both of us on #swig on freenode (our IRC nicks are tobyink and KjetilK).
This document is available under a licence which allows the creation of derivative works under certain conditions. For the purpose of licensing, implementations of the ideas considered in this specification shall not be considered derivative works.
RDFa is a family of attributes intended for the embedding of RDF data in XML markup [RDFA]. Its best known application is XHTML+RDFa — a format for embedding RDF data in XHTML documents — but in theory, RDFa can be incorporated into any XML-based standard. Indeed, the recent SVG Tiny 1.2 recommendation includes RDFa [SVGTINY12].
One limitation of RDFa is that it is only capable of embedding a single RDF graph per document. As this limitation is shared by RDF/XML [RDFXML], N-Triples [NTRIPLES] and Turtle [TURTLE], most people would not class it as a weakness of RDFa. However, a number of formats catering for multiple graphs do exist: Notation 3 [N3], TriX [TRIX], TriG [TRIG] and N-Quads [NQUADS] all include support for multiple graphs, allowing each graph to be identified with a URI, and thus be referred to by other graphs.
In the SPARQL Query Language for RDF [SPARQL], graph names are used to construct the default graph to query and also to restrict the query to certain named graphs.
Another use case for multiple graphs is describing assertions. A particular collection of RDF triples is bundled up as a graph and called an assertion. We can then use another graph to describe that assertion: who asserted it? When? Has it been verified by an independent resource?
Graphs can also be used to model information that has changed over time. For example, one graph might say that Ethelred the Unready is ruler of England; another might say that Elizabeth II is ruler of England. We could then use a third graph to note that the first graph was true in AD 1009, whereas the second is true in AD 2009.
This document investigates one possible method for marking up multiple graphs in RDFa. It does require some small changes to the RDFa parser to implement, but is backwards-compatible with parsers that do not support multiple graphs.
All named graphs generated are subsets of the default RDFa graph. This was a very important decision and is a core feature of the techniques outlined in this document. Processing RDFa using these techniques yields the same information as the standard RDFa algorithm: it just divides it into different "pots".
The default RDFa graph is the union of all the named graphs. There is no information that can be gleaned from the standard RDFa algorithm that cannot be gleaned from the named graphs. There are no triples generated by the standard RDFa algorithm that do not slot into one of the named graphs.
The intersection of the named graphs is nil. Each triple generated by the processing algorithm is placed in exactly one named graph. (Of course, if the same triple is expressed twice on a page, then it may be placed into a different graph each time.)
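The relationship between the named graphs and the default graph can be illustrated with a minimal sketch. It assumes a hypothetical processor that emits one (graph, subject, predicate, object) tuple per triple; the tuple layout and the function names are illustrative and are not defined by this document.

```python
from collections import defaultdict

# Hypothetical processor output: (graph, subject, predicate, object) tuples.
QUADS = [
    ("_:graph0", "_:gavin", "foaf:name", "Gavin"),
    ("#gavins_thoughts", "_:node1", "dc:creator", "Herman Melville"),
    ("#smithys_thoughts", "_:node2", "dc:creator", "Melville Herman"),
]


def split_into_graphs(quads):
    """Group triples by graph name; each emitted triple lands in one pot."""
    graphs = defaultdict(set)
    for graph, s, p, o in quads:
        graphs[graph].add((s, p, o))
    return dict(graphs)


def default_graph(graphs):
    """The default RDFa graph is the union of all the named graphs."""
    return set().union(*graphs.values())


graphs = split_into_graphs(QUADS)
# What a standard RDFa processor would report: the same triples, no graph names.
standard_output = {(s, p, o) for _, s, p, o in QUADS}
print(default_graph(graphs) == standard_output)
```

Dropping the graph name from every tuple recovers the output of the standard algorithm, which is the backwards-compatibility property described above.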
Graph names are URIs. This is a design decision taken by other RDF serialisations with support for multiple graphs, and we saw no reason to depart from it.
No new attributes are added to XHTML+RDFa 1.0. To mark up graphs a new attribute is required, but this document does not define what that attribute is called. Instead, the producer and consumer of an XHTML document must come to a private agreement as to which attribute is used. Markup language specifications that decide to adopt the techniques described in this document may define such an attribute for usage in that particular language.
For convenience, in the rest of this document we will assume that the attribute agreed by the producer and consumer of the example markup is g:graph, where the g namespace is defined as http://example.com/graphing.
The consumer and producer of markup must first agree on an attribute to use for marking up graphs. We will assume that this is g:graph.
Although the value space of this attribute is full URIs, the producer and consumer need to agree on the lexical space of this attribute [XMLSCHEMA2]. This document defines two possibilities, but others are imaginable.
g:graph has a lexical space similar to xhtml:id. Therefore, if the base URI of the document is http://example.com/document then the attribute g:graph="foo" represents the URI http://example.com/document#foo. Note that this decision restricts the producer's ability to mint URIs for named graphs.
g:graph has a lexical space similar to xhtml:about. Therefore, if the base URI of the document is http://example.com/document then the attribute g:graph="foo" represents the URI http://example.com/foo. This allows any absolute or relative URI to be used as a named graph. Safe CURIEs and blank nodes are allowed.
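The difference between the two lexical spaces can be sketched with Python's standard URI resolution. This is only an illustration: the function names are hypothetical, and the second function handles plain URI references only, not safe CURIEs or blank nodes.

```python
from urllib.parse import urljoin


def graph_uri_id_style(base: str, value: str) -> str:
    """Lexical space like xhtml:id: the value names a fragment of the document."""
    return urljoin(base, "#" + value)


def graph_uri_about_style(base: str, value: str) -> str:
    """Lexical space like xhtml:about: the value is a URI reference.

    Simplification: safe CURIEs and blank nodes are not handled here.
    """
    return urljoin(base, value)


base = "http://example.com/document"
print(graph_uri_id_style(base, "foo"))
print(graph_uri_about_style(base, "foo"))
print(graph_uri_about_style(base, "#gavins_thoughts"))
```

With the base URI used in the text, the first call yields http://example.com/document#foo and the second yields http://example.com/foo, matching the two cases described above.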
Once the attribute and its lexical space have been agreed, the producer may publish RDFa containing named graphs. When the agreed attribute has been set on an element, all triples found on that element and its descendants are taken to be part of the named graph specified. The following is an example XHTML document using named graphs, with g:graph defined to have a lexical space similar to xhtml:about:
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:g="http://example.com/graphing"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:dc="http://purl.org/dc/terms/"
      xmlns:mind="http://example.com/mind#">
  <head>
    <title>Example of Named Graphs in RDFa</title>
  </head>
  <body>
    <p about="_:gavin" typeof="foaf:Person">
      <span property="foaf:name">Gavin</span>
      <span rel="mind:thinks" resource="#gavins_thoughts">thinks that</span>
      <span typeof="foaf:Document" g:graph="#gavins_thoughts">
        <i property="dc:title">Moby Dick</i> was written by
        <span property="dc:creator">Herman Melville</span>
      </span>
    </p>
    <p about="_:smithy" typeof="foaf:Person">
      <span property="foaf:name">Smithy</span>
      <span rel="mind:thinks" resource="#smithys_thoughts">thinks that</span>
      <span typeof="foaf:Document" g:graph="#smithys_thoughts">
        <i property="dc:title">Moby Dick</i> was written by
        <span property="dc:creator">Melville Herman</span>
      </span>
    </p>
  </body>
</html>
The information above can be represented in Notation 3 [N3] as:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix mind: <http://example.com/mind#> .

_:graph0 = {
  _:gavin a foaf:Person ;
    foaf:name "Gavin" ;
    mind:thinks <#gavins_thoughts> .
  _:smithy a foaf:Person ;
    foaf:name "Smithy" ;
    mind:thinks <#smithys_thoughts> .
} .

<#gavins_thoughts> = {
  _:node1 a foaf:Document ;
    dc:title "Moby Dick" ;
    dc:creator "Herman Melville" .
} .

<#smithys_thoughts> = {
  _:node2 a foaf:Document ;
    dc:title "Moby Dick" ;
    dc:creator "Melville Herman" .
} .
In RDFa, many triples are generated from attributes split across multiple elements. A slightly contrived example:
<div>
  <div about="#joe">
    <div>
      <div rel="foaf:homepage" rev="foaf:primaryTopic"
           xmlns:foaf="http://xmlns.com/foaf/0.1/">
        <p>
          <a href="http://joe.example.com/">http://joe.example.com/</a>
        </p>
      </div>
    </div>
  </div>
</div>
When graphs are specified, it may be unclear which graph the triples should be added to.
<div g:graph="#g1">
  <div g:graph="#g2" about="#joe">
    <div g:graph="#g3">
      <div g:graph="#g4" rel="foaf:homepage" rev="foaf:primaryTopic"
           xmlns:foaf="http://xmlns.com/foaf/0.1/">
        <p g:graph="#g5">
          <a g:graph="#g6" href="http://joe.example.com/">http://joe.example.com/</a>
        </p>
      </div>
    </div>
  </div>
</div>
The rule is that a triple is added to the graph of the element that sets the predicate of the triple. So, in the previous example, the following Notation 3 is generated:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<#g1> = {} .
<#g2> = {} .
<#g3> = {} .
<#g4> = {
  <#joe> foaf:homepage <http://joe.example.com/> .
  <http://joe.example.com/> foaf:primaryTopic <#joe> .
} .
<#g5> = {} .
<#g6> = {} .
The standard RDFa processing sequence [RDFA] requires only minor modifications to allow for named graphs. The modifications required are as follows:
The initial context created should have a variable called [graph]. The initial value for this is a newly created blank node.
After stage 3 in the sequence, but before stage 4, the [current element] should be checked to see if there is a graph attribute. If there is, then the attribute's value should be converted to a URI depending on how the lexical space of the attribute is defined, and the [graph] variable should be set to that URI.
Any triple created in stages 6, 7 or 9 is considered to be in the graph specified by [graph].
In stage 8, not only are the predicate and direction stored for each incomplete triple, but also the current value of [graph].
Any triple created in stage 10 is considered to be in the graph stored in the incomplete triples list.
In stage 11, the new evaluation context is passed the new value of [graph].
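The modifications above can be sketched as a heavily simplified processor. This is not a conforming RDFa implementation: it handles only the about, href, rel and rev attributes, leaves CURIEs unexpanded, omits stages 6 and 9 (typeof and property), and assumes the g:graph attribute with an xhtml:about-style lexical space. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import count
from urllib.parse import urljoin

# Assumed agreed attribute (g:graph) in ElementTree's {namespace}name form.
G_GRAPH = "{http://example.com/graphing}graph"
_bnodes = count()


def new_bnode():
    return f"_:b{next(_bnodes)}"


def process(el, base, parent_subject, parent_object, graph, incomplete, quads):
    # After stage 3: a graph attribute on the current element resets [graph].
    if G_GRAPH in el.attrib:
        graph = urljoin(base, el.attrib[G_GRAPH])

    rels = el.attrib.get("rel", "").split()
    revs = el.attrib.get("rev", "").split()
    about, href = el.attrib.get("about"), el.attrib.get("href")
    obj, skip = None, False

    if rels or revs:  # stage 5, simplified
        new_subject = urljoin(base, about) if about is not None else parent_object
        obj = urljoin(base, href) if href is not None else None
    else:  # stage 4, simplified
        target = about if about is not None else href
        if target is not None:
            new_subject = urljoin(base, target)
        else:
            new_subject, skip = parent_object, True

    local_incomplete = []
    if obj is not None:  # stage 7: triples go into the current [graph]
        quads += [(graph, new_subject, p, obj) for p in rels]
        quads += [(graph, obj, p, new_subject) for p in revs]
    elif rels or revs:  # stage 8: remember [graph] with each incomplete triple
        obj = new_bnode()
        local_incomplete = [(p, "forward", graph) for p in rels]
        local_incomplete += [(p, "reverse", graph) for p in revs]

    if not skip and new_subject is not None:  # stage 10: use the stored graph
        for p, direction, stored_graph in incomplete:
            if direction == "forward":
                quads.append((stored_graph, parent_subject, p, new_subject))
            else:
                quads.append((stored_graph, new_subject, p, parent_subject))

    for child in el:  # stage 11: pass the new value of [graph] down
        if skip:
            process(child, base, parent_subject, parent_object, graph, incomplete, quads)
        else:
            child_object = obj if obj is not None else new_subject
            process(child, base, new_subject, child_object, graph, local_incomplete, quads)


def parse_named_graphs(xml_text, base):
    quads = []
    # The initial value of [graph] is a newly created blank node.
    process(ET.fromstring(xml_text), base, base, base, new_bnode(), [], quads)
    return quads


EXAMPLE = """
<div xmlns:g="http://example.com/graphing" g:graph="#g1">
  <div g:graph="#g2" about="#joe">
    <div g:graph="#g3">
      <div g:graph="#g4" rel="foaf:homepage" rev="foaf:primaryTopic">
        <p g:graph="#g5"><a g:graph="#g6" href="http://joe.example.com/">home</a></p>
      </div>
    </div>
  </div>
</div>
"""

for quad in parse_named_graphs(EXAMPLE, "http://example.com/document"):
    print(quad)
```

Run against the nested example, both triples are reported in the graph ending in #g4, because that is the value of [graph] stored with the incomplete triples when rel and rev were encountered.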
Known implementations of this idea: