
RDF Extracted Attributes from Styled Elements (RDF-EASE)

Buzzword.org.uk Draft 15 September 2008

This version:
tag:buzzword.org.uk,2008:rdf-ease/spec-20080915
Latest version:
tag:buzzword.org.uk,2008:rdf-ease/spec
Editor:
Toby Inkster

Abstract

GRDDL provides a powerful and robust mechanism for extracting RDF triples from XHTML, and more generally any XML-based markup. By introducing an initial "tidying" stage beforehand, GRDDL can also be applied to HTML. GRDDL enables documents authored using the XHTML class attribute and other semantic XHTML techniques to be transformed to a more formal knowledge representation.

The GRDDL recommendation allows for transformations to be written in any programming language, but only documents the use of XSLT transformations in detail.

This document describes a new document description language "EASE", with a familiar syntax that borrows heavily from Cascading Style Sheets (CSS). The document also describes how EASE can be used as a transformation language for the purpose of GRDDL.

Status of this Document

This document is published by buzzword.org.uk, a web site that hosts various specifications, articles and tools of use to web publishers. This is not a W3C recommendation. It is not even a buzzword.org.uk recommendation yet.

The author welcomes feedback on this draft by e-mail to mail@tobyinkster.co.uk.

Table of Contents

  1. Introduction
  2. RDF-EASE Syntax
  3. RDF-EASE Properties
  4. Linking to RDF-EASE Transformation Sheets
  5. Parsing RDF-EASE and HTML
  6. Mixing RDF-EASE and RDFa
  A. References
  B. Change History
  C. Acknowledgments

1. Introduction

This section is informative.

@@TODO: Introduce the concepts. Differentiate between:

EASE
Abstract CSS-like syntax for setting XML/HTML attributes.
RDF-EASE
Concrete language using EASE concepts, and defining seven properties.
GRDDL + RDF-EASE
Use of RDF-EASE to transform HTML to RDF.

@@TODO: Note that when RDFa and RDF-EASE rules apply to the same element, RDF-EASE can add to RDFa, but not remove from it.

2. RDF-EASE Syntax

This section is normative.

The syntax of RDF-EASE is a subset of the syntax described in 4.1 Syntax of [CSS21]. Any valid CSS 2.1 selector may be used in RDF-EASE.

2.1. Syntax Differences With CSS

2.2. Defining CURIE Prefixes

In RDF-EASE, CURIE prefixes are not scoped; each definition applies globally. Prefixes are defined in one or more special blocks whose selector is a single underscore. Within these blocks, each property name is taken to be a new prefix being defined. The value must be an IRI conforming to [IRI], wrapped in url( and ), and optionally quoted in double or single quotes. An example follows.

_ {
	foaf: url("http://xmlns.com/foaf/0.1/");
	xsd:  url(http://www.w3.org/2001/XMLSchema#);
	dc:   url('http://purl.org/dc/terms/')
}

Figure 2a: Defining CURIE prefixes in RDF-EASE.
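The prefix blocks described above could be read along the following lines. This is only a rough Python sketch, not a conforming parser: the function names parse_prefixes and expand_curie are invented for illustration, the regular expressions ignore CSS comments and escapes, and the rule that a later definition of the same prefix overrides an earlier one is an assumption that this draft does not state.

```python
import re

# One "prefix: url(...)" declaration. The IRI may be unquoted or wrapped
# in single or double quotes, as section 2.2 allows.
DECLARATION = re.compile(
    r"""([A-Za-z_][\w-]*)\s*:\s*url\(\s*(["']?)([^"')\s]+)\2\s*\)"""
)

# A block whose whole selector is a single underscore, found either at the
# start of the sheet or directly after the closing brace of another block.
PREFIX_BLOCK = re.compile(r"(?:^|(?<=\}))\s*_\s*\{([^}]*)\}")


def parse_prefixes(sheet):
    """Collect prefix definitions from every `_ { ... }` block in a sheet."""
    prefixes = {}
    for block in PREFIX_BLOCK.findall(sheet):
        for name, _quote, iri in DECLARATION.findall(block):
            # Assumption: a later definition replaces an earlier one.
            prefixes[name] = iri
    return prefixes


def expand_curie(curie, prefixes):
    """Expand `prefix:reference` using the sheet-wide prefix table."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference
```

With the prefixes from the figure above, expand_curie("foaf:name", ...) would yield http://xmlns.com/foaf/0.1/name.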

2.3. The RDF-EASE Cascade

In CSS, a property can take only one value on a given element, so the cascade is very important. When two CSS rules both match an element, the declaration from the most “specific” rule is used [CSS21]. In the following example, the paragraph has a width of 10em, because the selector p.narrow is more specific than p.

p {
	width: 20em;
}
p.narrow {
	width: 10em;
}
...

<p class="narrow">A narrow paragraph.</p>

Figure 2b: The CSS Cascade.

With RDF-EASE the cascade works slightly differently, because most RDFa attributes can take a list of multiple values. An element collects values from all the RDF-EASE rules whose selectors match it, in addition to any values hard-coded in its own RDFa attributes.

p {
	property: "foaf:name";
}
p.name {
	property: "vcard:fn";
}
...

<p class="name" property="dc:title">Joe Bloggs</p>

Figure 2c: In this example, the paragraph takes all three values.

For instances where this cascading is not required, and you wish for the more specific rule to completely replace the less specific rule, a reset keyword is available. Adding the reset keyword to a value tells the processor to ignore the input of less specific rules. RDFa attributes hard-coded in the XHTML cannot be reset, as they are more specific than any RDF-EASE rules.

p {
	property: "foaf:name";
}
p.name {
	property: reset "vcard:fn";
}
...

<p class="name" property="dc:title">Joe Bloggs</p>

Figure 2d: The paragraph has properties vcard:fn and dc:title, but not foaf:name.
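The behaviour shown in the last two figures can be approximated with a short Python sketch. It assumes that selector matching and specificity calculation have already been done elsewhere; the function name cascade, the tuple layout, and the order of the resulting values are illustrative choices, not something this draft defines.

```python
def cascade(rules, hardcoded=()):
    """Combine the values of one property for one element.

    `rules` holds (specificity, reset, values) tuples, one per matching
    rule, where specificity is a CSS 2.1 style tuple such as (0, 0, 1, 1).
    `hardcoded` holds values from the element's own RDFa attribute.
    """
    collected = []
    # Walk from least to most specific, so that a reset discards only what
    # less specific rules contributed. sorted() is stable, so rules of
    # equal specificity keep their source order (an assumption).
    for _specificity, reset, values in sorted(rules, key=lambda r: r[0]):
        if reset:
            collected = []
        collected.extend(values)
    # Hard-coded RDFa values are more specific than any RDF-EASE rule,
    # so they are never reset.
    collected.extend(hardcoded)
    return collected


p_rule = ((0, 0, 0, 1), False, ["foaf:name"])       # p { ... }
name_rule = ((0, 0, 1, 1), False, ["vcard:fn"])     # p.name { ... }
name_reset = ((0, 0, 1, 1), True, ["vcard:fn"])     # p.name { property: reset ... }

without_reset = cascade([p_rule, name_rule], ["dc:title"])
with_reset = cascade([p_rule, name_reset], ["dc:title"])
```

Here without_reset holds all three values, while with_reset holds only vcard:fn and dc:title.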

3. RDF-EASE Properties

This section is normative.

3.1. about

about
Value: 'document' | 'reset' | 'normal'
Initial: 'normal'
Applies to: any element
Inherited: no

Setting the RDF-EASE about property is equivalent to setting the about attribute, with a meaning as specified in 2.1. The RDFa Attributes of [RDFA]. RDF-EASE does not allow arbitrary URIs to be specified for this property; instead, the property takes one of three values:

document
This is equivalent to setting the about attribute to the empty string. Any properties found within the element will refer to the document as a whole.
reset
Any about properties on less specific RDF-EASE selectors that match this element should be ignored.
normal
This value is a no-op, equivalent to not having set the property at all.

3.2. content

A future version of this specification may alter the name of this property, as it conflicts with CSS. If conflicts with CSS can be avoided, then it may be possible to combine RDF-EASE and CSS in the same file, and link to it using rel="stylesheet transformation".

content
Value: attr(ATTR) | 'normal'
Initial: 'normal'
Applies to: any element
Inherited: no

Setting the RDF-EASE content property is equivalent to setting the content attribute, with a meaning as specified in 2.1. The RDFa Attributes of [RDFA]. Rather than setting the content attribute to a specific string, the property names another attribute from which the content value is read.

abbr { content: attr(title); }

Figure 3a: Example usage of content.

The value 'normal' is a no-op, equivalent to not having set the property at all.
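As a rough illustration of how a processor might resolve this property, the Python sketch below reads the value from the named attribute. The function name effective_content and the sample title text are invented for the example, and giving a hard-coded content attribute priority over the RDF-EASE rule is an assumption extrapolated from section 2.3 rather than a rule stated here.

```python
import xml.etree.ElementTree as ET


def effective_content(element, ease_value="normal"):
    """Return the content value a processor might use for `element`."""
    # Assumption: a hard-coded RDFa content attribute beats any rule,
    # in line with the cascade described in section 2.3.
    hardcoded = element.get("content")
    if hardcoded is not None:
        return hardcoded
    if ease_value.startswith("attr(") and ease_value.endswith(")"):
        # attr(ATTR): read the value from the attribute named ATTR.
        source = ease_value[len("attr("):-1].strip()
        return element.get(source)
    # 'normal' is a no-op, so no content value is supplied.
    return None


# Hypothetical markup matched by the rule: abbr { content: attr(title); }
abbr = ET.fromstring('<abbr title="World Wide Web">WWW</abbr>')
resolved = effective_content(abbr, "attr(title)")
```

In this sketch, resolved is taken from the title attribute rather than from the element's text.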

3.3. datatype, property, rel, rev and typeof

datatype, property, rel, rev, typeof
Value: ['reset'] URI List | 'normal'
Initial: 'normal'
Applies to: any element
Inherited: no

Setting any of these last five RDF-EASE properties is equivalent to setting the attribute with the same name, with a meaning as specified in 2.1. The RDFa Attributes of [RDFA]. The typical value of these properties is a list of tokens, each representing a URI, separated by whitespace. Optionally, the first item in the list may be the token reset instead of a token representing a URI. The value 'normal' is a no-op, equivalent to not having set the property at all.

Each token in a URI List should conform to one of the following syntaxes:

CURIE
A compact URI as defined by [CURIE], quoted in either double or single quotes.
For example: "ex:Event"
Full IRI
An absolute IRI as defined by [IRI], optionally quoted in either double or single quotes, and required to be prefixed by url( and suffixed by ).
For example: url('http://www.w3.org/2001/XMLSchema#dateTime')

An example use of these properties:

div.event
{
	typeof: "ex:Event";
}
div.event span.starting
{
	property: reset "ex:start" "ical:dtstart";
	datatype: url('http://www.w3.org/2001/XMLSchema#dateTime') "ex:date-time";
}
div.event span.place
{
	rel: reset "ex:location";
	rev: reset "ex:event-here";
}

Figure 3b: Example usage of various RDF-EASE properties.
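A value such as those in the figure above could be tokenised roughly as follows. This Python sketch is illustrative only: the name parse_uri_list, the error handling, and the prefix IRIs used in the usage lines are assumptions, and the pattern does not handle CSS escapes or comments.

```python
import re

# One token of a URI List: the keyword reset, a url(...) IRI, or a quoted CURIE.
TOKEN = re.compile(
    r"""\s*(?:
        (?P<reset>reset\b)
      | url\(\s*(?P<q>["']?)(?P<iri>[^"')\s]+)(?P=q)\s*\)
      | (?P<cq>["'])(?P<curie>[^"']+)(?P=cq)
    )""",
    re.VERBOSE,
)


def parse_uri_list(value, prefixes):
    """Return (reset, [absolute IRIs]) for one property value."""
    value = value.strip()
    if value == "normal":
        return False, []
    reset, iris, pos = False, [], 0
    while pos < len(value):
        match = TOKEN.match(value, pos)
        if match is None:
            raise ValueError(f"unexpected token at offset {pos}")
        if match.group("reset"):
            # reset may only appear as the first item in the list.
            if reset or iris:
                raise ValueError("reset is only allowed as the first token")
            reset = True
        elif match.group("iri"):
            iris.append(match.group("iri"))
        else:
            # Quoted CURIE: expand it with the sheet-wide prefix table.
            prefix, _, reference = match.group("curie").partition(":")
            iris.append(prefixes[prefix] + reference)
        pos = match.end()
    return reset, iris


# Placeholder prefix IRIs, chosen only for this example.
prefixes = {"ex": "http://example.org/ns#", "ical": "http://example.org/ical#"}
start = parse_uri_list('reset "ex:start" "ical:dtstart"', prefixes)
dtype = parse_uri_list(
    "url('http://www.w3.org/2001/XMLSchema#dateTime') \"ex:date-time\"", prefixes
)
```

Returning the reset flag separately from the IRIs lets the caller apply the cascade before the values are merged with those of less specific rules.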

4. Linking to RDF-EASE Transformation Sheets

This section is normative.

@@TODO

4.1. XHTML Metadata Profiles

@@TODO

4.2. rel="transformation"

@@TODO

5. Parsing RDF-EASE and HTML

This section is normative.

@@TODO

6. Mixing RDF-EASE and RDFa

This section is informative.

When writing pages, authors need to take care that visual styles specified as HTML attributes and CSS properties not only work well together, but also work when the CSS is absent. In the following example, the text “Hello world!” is clearly visible as white text on a blue background when CSS is enabled, but invisible white text on a white background if CSS is disabled or unavailable.

<html>
	<head>
		<title>A Contrived Example</title>
		<style type="text/css">
			p { background: blue; }
		</style>
	</head>
	<body bgcolor="white">
		<p>
			<font color="white">Hello world!</font>
		</p>
	</body>
</html>

Figure 6a: Unsafe usage of CSS.

Similarly, authors using RDF-EASE need to check that the RDF triples generated by a pure RDFa parser, which cannot handle RDF-EASE, still make sense. An example where the meaning changes radically follows: an RDF-EASE parser would infer that Alice knows someone named Bob, whereas a tool without RDF-EASE support would parse this as meaning that Alice's name is “Bob”!

.knows { rel: "foaf:knows"; }
...

<div about="#alice">
	<p class="knows">
		<span property="foaf:name">Bob</span>
	</p>
</div>

Figure 6b: Unsafe usage of RDF-EASE.

6.1. Safe Combinations

RDF-EASE is capable of specifying any combination of the subject, predicate and object of an RDF triple, but only certain combinations of RDF-EASE and RDFa are safe, in the sense that a pure RDFa parser either parses the triple correctly or ignores it.

Figure 6c: Table illustrating combinations of RDFa and RDF-EASE which will normally be safe.

Method of Specifying Resource        Behaviour of Parser
Subject    Predicate  Object         RDFa Parser      RDF-EASE Parser   Safe?
RDFa       RDFa       RDFa           parsed           parsed            Yes
RDFa       RDFa       RDF-EASE       misinterpreted   parsed            No
RDFa       RDF-EASE   RDFa           ignored          parsed            Yes
RDFa       RDF-EASE   RDF-EASE       ignored          parsed            Yes
RDF-EASE   RDFa       RDFa           misinterpreted   parsed            No
RDF-EASE   RDFa       RDF-EASE       misinterpreted   parsed            No
RDF-EASE   RDF-EASE   RDFa           ignored          parsed            Yes
RDF-EASE   RDF-EASE   RDF-EASE       ignored          parsed            Yes

For styling, there is a “rule of thumb” that says “whenever you set a foreground colour, set a background colour.” The equivalent rule for combining RDF-EASE with RDFa is “whenever you use an RDFa property, rel or rev attribute, make sure the subject and object of the triple can be determined through pure RDFa.”
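The table in Figure 6c and the rule of thumb above reduce to a single check, sketched here in Python. The function name is_safe and the string constants are illustrative; the logic is simply a restatement of the eight table rows.

```python
from itertools import product

RDFA, EASE = "RDFa", "RDF-EASE"


def is_safe(subject, predicate, obj):
    """True when a pure RDFa parser sees either the whole triple or none of it."""
    if predicate == EASE:
        # A pure RDFa parser sees no predicate, so it ignores the triple.
        return True
    # The predicate is visible to a pure RDFa parser, so the subject and
    # object must be determinable through pure RDFa as well.
    return subject == RDFA and obj == RDFA


# Enumerate all eight combinations, in the same order as the table rows.
results = {
    combo: is_safe(*combo) for combo in product((RDFA, EASE), repeat=3)
}
```

Exactly three of the eight combinations come out unsafe, matching the three “misinterpreted” rows of the table.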

Appendix A. References

CONCEPTS
Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne, Jeremy J Carroll, Editors, Brian McBride, Series editor, W3C Recommendation 10 February 2004 <http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/>. Latest version <http://www.w3.org/TR/rdf-concepts/>.
CSS21
Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification, Bert Bos, Tantek Çelik, Ian Hickson, Håkon Wium Lie, Editors, W3C Candidate Recommendation 19 July 2007 <http://www.w3.org/TR/2007/CR-CSS21-20070719>. Latest version <http://www.w3.org/TR/CSS21>.
CURIE
CURIE Syntax 1.0, Mark Birbeck, Shane McCarron, Editors, W3C Working Draft 6 May 2008 <http://www.w3.org/TR/2008/WD-curie-20080506>. Latest version <http://www.w3.org/TR/curie>.
GRDDL
Gleaning Resource Descriptions from Dialects of Languages (GRDDL), Dan Connolly, Editor, W3C Recommendation 11 September 2007 <http://www.w3.org/TR/2007/REC-grddl-20070911/>. Latest version <http://www.w3.org/TR/grddl/>.
IRI
Internationalized Resource Identifiers (IRI), Martin Dürst, Michel Suignard, RFC 3987 <http://www.rfc-editor.org/rfc/rfc3987.txt>.
RDFA
RDFa in XHTML: Syntax and Processing, Ben Adida, Mark Birbeck, Shane McCarron, Steven Pemberton, Editors, W3C Proposed Recommendation 4 September 2008 <http://www.w3.org/TR/2008/PR-rdfa-syntax-20080904>. Latest version <http://www.w3.org/TR/rdfa-syntax>.
URI
Uniform Resource Identifiers (URI): Generic Syntax, Tim Berners-Lee et al, RFC 3986 <http://www.rfc-editor.org/rfc/rfc3986.txt>.
XHTML
XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition), W3C HTML Working Group, W3C Recommendation 26 January 2000, revised 1 August 2002 <http://www.w3.org/TR/2002/REC-xhtml1-20020801>. Latest version <http://www.w3.org/TR/xhtml1>.

Appendix B. Change History

@@TODO

Appendix C. Acknowledgments

@@TODO