Copyright © 2009–2010 Toby Inkster & Kjetil Kjernsmo, some rights reserved.
A set of RDF triples may be considered as a directed graph where the resources and literals form its nodes, and the predicates its edges. This graph in turn can be thought of as a resource in its own right, and described in another graph. This document explores an extension for expressing multiple graphs in RDFa.
This document is published by buzzword.org.uk, a web site that hosts various specifications, articles and tools of use to web publishers. This is not a W3C recommendation. It is not even a buzzword.org.uk recommendation yet.
The authors welcome feedback on this draft. You can usually find one or both of us on #swig on freenode (our IRC nicks are tobyink and KjetilK).
This document is available under a licence which allows the creation of derivative works under certain conditions. For the purpose of licensing, implementations of the ideas considered in this specification shall not be considered derivative works.
RDFa is a family of attributes intended for the embedding of RDF data in XML markup [RDFA]. Its best known application is XHTML+RDFa — a format for embedding RDF data in XHTML documents — but RDFa can be incorporated into any XML-based or DOM-like file format. Indeed, the recent SVG Tiny 1.2 recommendation includes RDFa [SVGTINY12].
One limitation of RDFa is that it is only capable of embedding a single RDF graph per document. As this limitation is shared by RDF/XML [RDFXML], N-Triples [NTRIPLES] and Turtle [TURTLE], most people would not class it as a weakness of RDFa. However, a number of formats catering for multiple graphs do exist: Notation 3 [N3], TriX [TRIX], TriG [TRIG] and N-Quads [NQUADS] all include support for multiple graphs, allowing each graph to be identified with a URI, and thus be referred to by other graphs.
In the SPARQL Query Language for RDF [SPARQL], graph names are used to construct the default graph to query and also to restrict the query to certain named graphs.
Another use case for multiple graphs is describing assertions. A particular collection of RDF triples is bundled up as a graph and called an assertion. We can then use another graph to describe that assertion: who asserted it? When? Has it been verified by an independent resource?
Graphs can also be used to model information that has changed over time. For example one graph might say that Ethelred the Unready is ruler of England; another might say that Elizabeth II is ruler of England. We could then use a third graph to note that the first graph was true in AD 1009, whereas the second is true in AD 2009.
This document investigates one possible method for marking up multiple graphs in RDFa. It does require some small changes to the RDFa parser to implement, but is backwards-compatible with parsers that do not support multiple graphs.
All graphs generated are subsets of the default RDFa graph. This was a very important decision and is a core feature of the techniques outlined in this document. Processing RDFa using these techniques yields the same information as the standard RDFa algorithm: it just divides it into different "pots". Each pot is a "subgraph".
The RDFa graph generated by graph-unaware processors is the union of all the subgraphs. There is no information that can be gleaned from the standard RDFa algorithm that cannot be gleaned from the subgraphs. There are no triples generated by the standard RDFa algorithm that do not slot into one of the subgraphs.
The subgraphs do not overlap: each triple generated by the processing algorithm is placed in exactly one subgraph. (Of course, if the same triple is expressed twice on a page, then each occurrence may be placed into a different subgraph.)
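The properties above can be illustrated with a minimal Python sketch. The quad tuples, the prefixed names written as plain strings, and the helper names split_into_subgraphs and union_of are all assumptions made for illustration; they are not part of RDFa or of any existing parser.

```python
from collections import defaultdict

# Hypothetical processor output: (subject, predicate, object, graph) tuples.
# The same dc:title triple is expressed twice, once in each of two subgraphs.
quads = [
    ("_:gavin", "foaf:name", '"Gavin"', "_:graph0"),
    ("_:doc", "dc:title", '"Moby Dick"', "#gavins_thoughts"),
    ("_:doc", "dc:title", '"Moby Dick"', "#smithys_thoughts"),
]


def split_into_subgraphs(quads):
    """Place each generated triple in the one subgraph named by its quad."""
    subgraphs = defaultdict(set)
    for s, p, o, g in quads:
        subgraphs[g].add((s, p, o))
    return dict(subgraphs)


def union_of(subgraphs):
    """Merge all subgraphs back into a single set of triples."""
    merged = set()
    for triples in subgraphs.values():
        merged |= triples
    return merged


# What a graph-unaware processor would produce: the same triples, no graph names.
default_graph = {(s, p, o) for s, p, o, _ in quads}
subgraphs = split_into_subgraphs(quads)

# The union of the subgraphs is the default RDFa graph.
assert union_of(subgraphs) == default_graph
# Every generated triple landed in exactly one subgraph.
assert sum(len(t) for t in subgraphs.values()) == len(quads)
```

Because the title triple is expressed twice, it appears in two subgraphs here, while the default graph holds it only once.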
The attribute to use for marking up different subgraphs is graph, in the same namespace as the other RDFa attributes. By default RDFa attributes are not in any namespace, so neither is graph.
Although the value space of this attribute is the set of URIs and blank nodes, graph has a lexical space identical to that of about. Therefore, if the base URI of the document is http://example.com/document then the attribute graph="foo" represents the URI http://example.com/foo. This allows any absolute or relative URI to be used as a named graph. Safe CURIEs and blank nodes are allowed.
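As a rough sketch of how a consumer might turn a graph attribute value into a graph name, the following Python uses the standard library's urljoin for relative URI resolution. The function name resolve_graph_attr and the prefixes dictionary are hypothetical, and a real parser would take its prefix mappings from the in-scope xmlns declarations.

```python
from urllib.parse import urljoin


def resolve_graph_attr(value, base, prefixes):
    """Map a graph attribute value to a URI string or a blank node label.

    `prefixes` is an assumed dict of CURIE prefix -> namespace URI.
    """
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        # Safe CURIE, e.g. [mind:thoughts] or [_:g1]
        prefix, _, reference = value[1:-1].partition(":")
        if prefix == "_":
            return "_:" + reference  # blank node
        return prefixes[prefix] + reference
    # Anything else is an absolute or relative URI.
    return urljoin(base, value)


base = "http://example.com/document"
prefixes = {"mind": "http://example.com/mind#"}

print(resolve_graph_attr("foo", base, prefixes))
print(resolve_graph_attr("#gavins_thoughts", base, prefixes))
print(resolve_graph_attr("[mind:thoughts]", base, prefixes))
print(resolve_graph_attr("[_:g1]", base, prefixes))
```

The first call reproduces the example in the text: "foo" resolved against http://example.com/document gives http://example.com/foo.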
When the graph attribute has been set on an element, all triples found on that element and its descendants are taken to be part of the subgraph specified. The following is an example XHTML document using multiple graphs:
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:dc="http://purl.org/dc/terms/"
      xmlns:mind="http://example.com/mind#">
  <head>
    <title>Example of Named Graphs in RDFa</title>
  </head>
  <body>
    <p about="_:gavin" typeof="foaf:Person">
      <span property="foaf:name">Gavin</span>
      <span rel="mind:thinks" resource="#gavins_thoughts">thinks that</span>
      <span typeof="foaf:Document" graph="#gavins_thoughts">
        <i property="dc:title">Moby Dick</i>
        was written by
        <span property="dc:creator">Herman Melville</span>
      </span>
    </p>
    <p about="_:smithy" typeof="foaf:Person">
      <span property="foaf:name">Smithy</span>
      <span rel="mind:thinks" resource="#smithys_thoughts">thinks that</span>
      <span typeof="foaf:Document" graph="#smithys_thoughts">
        <i property="dc:title">Moby Dick</i>
        was written by
        <span property="dc:creator">Melville Herman</span>
      </span>
    </p>
  </body>
</html>
The information above can be represented in Notation 3 [N3] as:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix mind: <http://example.com/mind#> .

_:graph0 = {
  _:gavin a foaf:Person ;
    foaf:name "Gavin" ;
    mind:thinks <#gavins_thoughts> .
  _:smithy a foaf:Person ;
    foaf:name "Smithy" ;
    mind:thinks <#smithys_thoughts> .
} .

<#gavins_thoughts> = {
  _:node1 a foaf:Document ;
    dc:title "Moby Dick" ;
    dc:creator "Herman Melville" .
} .

<#smithys_thoughts> = {
  _:node2 a foaf:Document ;
    dc:title "Moby Dick" ;
    dc:creator "Melville Herman" .
} .
In RDFa, many triples are generated from attributes split across multiple elements. A slightly contrived example:
<div>
  <div about="#joe">
    <div>
      <div rel="foaf:homepage" rev="foaf:primaryTopic"
           xmlns:foaf="http://xmlns.com/foaf/0.1/">
        <p>
          <a href="http://joe.example.com/">http://joe.example.com/</a>
        </p>
      </div>
    </div>
  </div>
</div>
When subgraphs are specified, it may seem unclear which graph the triples should be added to.
<div graph="#g1">
  <div graph="#g2" about="#joe">
    <div graph="#g3">
      <div graph="#g4" rel="foaf:homepage" rev="foaf:primaryTopic"
           xmlns:foaf="http://xmlns.com/foaf/0.1/">
        <p graph="#g5">
          <a graph="#g6" href="http://joe.example.com/">http://joe.example.com/</a>
        </p>
      </div>
    </div>
  </div>
</div>
The rule is that a triple is added to the graph of the element which set the predicate of the triple. So, in the previous example, the following Notation 3 is generated.
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<#g1> = {} .
<#g2> = {} .
<#g3> = {} .
<#g4> = {
  <#joe> foaf:homepage <http://joe.example.com/> .
  <http://joe.example.com/> foaf:primaryTopic <#joe> .
} .
<#g5> = {} .
<#g6> = {} .
The standard RDFa processing sequence [RDFA] requires only minor modifications to allow for multiple graphs. The modifications required are as follows:
The initial context created should have a variable called [graph]. The initial value for this is a newly created blank node.
After stage 5 in the sequence, but before stage 6, the [current element] should be checked to see if there is a graph attribute. If there is, then the attribute's value should be converted to a URI depending on how the lexical space of the attribute is defined, and the [graph] variable should be set to that URI.
Any triple created in stages 8, 9 or 11 is considered to be in the graph specified by [graph].
In stage 10, not only are the predicate and direction stored for each incomplete triple, but also the current value of [graph].
Any triple created in stage 12 is considered to be in the graph stored in the incomplete triples list.
In stage 13, the new evaluation context is passed the new value of [graph].
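The modifications listed above can be sketched in Python as a heavily simplified tree walk. This is an illustration under stated assumptions, not a conforming RDFa processor: it ignores CURIE expansion, blank node syntax in attribute values, src, content, datatype, language and the head/body special cases; predicates are kept as the raw attribute strings; and the names walk, parse and new_bnode are hypothetical. It is run against the nested graph example given earlier in this document.

```python
import itertools
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

_ids = itertools.count()


def new_bnode():
    return "_:b%d" % next(_ids)


def walk(el, base, context, quads):
    parent_subject, parent_object, incomplete, graph = context
    # Modification: a graph attribute on the current element replaces [graph].
    if "graph" in el.attrib:
        graph = urljoin(base, el.attrib["graph"])
    rels = el.get("rel", "").split()
    revs = el.get("rev", "").split()
    target = el.get("resource", el.get("href"))
    current_object, skip = None, False
    # Establish the new subject (simplified from the standard sequence).
    if "about" in el.attrib:
        new_subject = urljoin(base, el.attrib["about"])
    elif target is not None and not (rels or revs):
        new_subject = urljoin(base, target)
    elif "typeof" in el.attrib:
        new_subject = new_bnode()
    else:
        new_subject = parent_object
        skip = not (rels or revs or "property" in el.attrib)
    if (rels or revs) and target is not None:
        current_object = urljoin(base, target)
    # Triples created on this element go into the graph named by [graph].
    for t in el.get("typeof", "").split():
        quads.append((new_subject, "rdf:type", t, graph))
    local_incomplete = []
    if current_object is not None:
        for p in rels:
            quads.append((new_subject, p, current_object, graph))
        for p in revs:
            quads.append((current_object, p, new_subject, graph))
    elif rels or revs:
        # Incomplete triples remember [graph] alongside predicate and direction.
        current_object = new_bnode()
        local_incomplete = [(p, "forward", graph) for p in rels]
        local_incomplete += [(p, "reverse", graph) for p in revs]
    for p in el.get("property", "").split():
        quads.append((new_subject, p, '"%s"' % "".join(el.itertext()), graph))
    # Completed triples go into the graph stored with the incomplete triple.
    if not skip:
        for p, direction, g in incomplete:
            if direction == "forward":
                quads.append((parent_subject, p, new_subject, g))
            else:
                quads.append((new_subject, p, parent_subject, g))
    # The new evaluation context always receives the current value of [graph].
    if skip:
        child_context = (parent_subject, parent_object, incomplete, graph)
    else:
        child_context = (new_subject, current_object or new_subject,
                         local_incomplete, graph)
    for child in el:
        walk(child, base, child_context, quads)


def parse(xml_text, base):
    quads = []
    # Modification: the initial context carries [graph], a fresh blank node.
    walk(ET.fromstring(xml_text), base, (base, base, [], new_bnode()), quads)
    return quads


EXAMPLE = """<div graph="#g1"><div graph="#g2" about="#joe"><div graph="#g3">
<div graph="#g4" rel="foaf:homepage" rev="foaf:primaryTopic"
     xmlns:foaf="http://xmlns.com/foaf/0.1/">
<p graph="#g5"><a graph="#g6" href="http://joe.example.com/">x</a></p>
</div></div></div></div>"""

BASE = "http://example.com/document"
quads = parse(EXAMPLE, BASE)
for quad in quads:
    print(quad)
```

Both triples end up in the graph named by #g4, because the graph is stored with the incomplete triples on the element carrying rel and rev, and is not taken from the element that later supplies the object.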
An alternative attribute, such as id, may be used to mark up graph information rather than graph, but only through private agreement between producers and consumers. An alternative attribute may have a different lexical space.
"Private agreement" does not mean that consumers and producers need to personally discuss and agree on an attribute to be used. Instead, a consumer that needs a named graph facility would publish a link to this draft in their documentation, together with the specific details of how their parser consumes named graphs. People targeting that particular consumer would then follow the directions in the consumer's documentation.
Known implementations of this idea: