Swignition Microformat Extensions
Swignition implements a number of proposed extensions to existing Microformats.
1. Design Patterns
1.1. ABBR Pattern
Due to accessibility problems with the ABBR pattern, an alternative syntax is also supported: the title attribute may be used on non-ABBR elements, but only if the value is prefixed with the string "data:". Human-readable information may be included in the title, before the "data:" prefix. The data prefix and value following it may be wrapped in brackets. For example, the following are considered equivalent:
<span class="foo">XXXX</span>
<abbr class="foo" title="XXXX">X</abbr>
<span class="foo" title="data:XXXX">X</span>
<span class="foo" title="Foo is X data:XXXX">X</span>
<span class="foo" title="Foo is X [data:XXXX]">X</span>
Brackets, braces and parentheses are considered bracketing characters. Left and right do not have to match.
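The title-attribute rules above can be sketched as follows. This is only an illustration, not Swignition's own (Perl) code: the function name `extract_data_value` is hypothetical, and using the first occurrence of "data:" is an assumption.

```python
OPENERS = "[{("
CLOSERS = "]})"

def extract_data_value(title):
    """Return the value after the first "data:" prefix, or None if absent."""
    start = title.find("data:")
    if start == -1:
        return None
    value = title[start + len("data:"):].strip()
    # Left and right brackets need not match, so any opener pairs with any closer.
    opened = start > 0 and title[start - 1] in OPENERS
    if opened and value and value[-1] in CLOSERS:
        value = value[:-1].rstrip()
    return value

print(extract_data_value("Foo is X [data:XXXX]"))
```

A title with no "data:" prefix yields None, so a caller could fall back to the element's text content in that case.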
Experimental support is also included for the data-* class pattern, but to take advantage of it, publishers must include the profile URI.
1.2. Datetime Pattern
The datetime design pattern specifies two profiles (subsets) of ISO8601 datetime format for use in microformats. Swignition additionally supports other ISO8601 datetimes, as parsed by the DateTime::Format::ISO8601 Perl module.
As a last resort, datetimes that cannot be parsed as above are passed to DateTime::Format::Natural and parsed as natural-language dates. Ambiguous dates (e.g. does "Sunday" refer to last Sunday or next Sunday?) are assumed to be in the future and to be specified in UTC. Authors should not rely on natural-language parsing, as it is not particularly predictable.
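To illustrate the "assume the future" rule for a bare weekday name, here is a small sketch. It does not reproduce DateTime::Format::Natural; the function name `next_weekday` is hypothetical, and treating "today" as not counting (so "Monday" on a Monday means next week) is an assumption.

```python
from datetime import datetime, timedelta, timezone

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def next_weekday(name, now):
    """Return midnight UTC on the next day called `name`, after `now`'s date."""
    target = WEEKDAYS.index(name.strip().lower())
    now_utc = now.astimezone(timezone.utc)
    # Assumption: a difference of zero days means a full week ahead.
    days_ahead = (target - now_utc.weekday()) % 7 or 7
    day = now_utc + timedelta(days=days_ahead)
    return day.replace(hour=0, minute=0, second=0, microsecond=0)

# 2008-07-07 was a Monday.
now = datetime(2008, 7, 7, 12, 0, tzinfo=timezone.utc)
print(next_weekday("Sunday", now).date())
```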
Swignition supports datetimes with nanosecond-level precision.
1.3. Durations
A duration is an abstract period of time with no fixed starting or ending point, but a fixed length. For example, "three and a half minutes". Various microformats include properties which take a duration as their value: for example, hAudio's duration property and hCalendar's trigger.
The standard way to mark up durations in Microformats is using ISO 8601 durations. For instance:
<span class="duration">PT1H30M</span>
<abbr class="duration" title="PT1H30M">an hour and a half</abbr>
Swignition experimentally supports several alternative methods to mark up durations as well:
- SI notation, as a single number, optionally followed by an "s", representing the length of duration in seconds.
- The proposed ISO 31-1 class names, d, h, min and s.
- Draft hMeasure microformat, with the measurement's item set to null, the type of measurement left blank or set to "duration" and the unit of measurement in days, hours, minutes or seconds.
Examples follow:
The song is only <span class="duration">54 s</span>econds long!
<abbr class="duration" title="5400 s">an hour and a half</abbr>
<p class="duration">
  The meeting will last <abbr class="h" title="1">an</abbr> hour
  and <abbr class="min" title="30">thirty</abbr> minutes.
</p>
<span class="duration">
  <span class="h">1</span>:<span class="min">23</span>:<span class="s">45.67</span>
</span>
<span class="duration hmeasure">1 h</span>
<span class="duration hmeasure">2.41 s</span>
<span class="duration hmeasure">5 d</span>
<span class="duration hmeasure">
  <span class="num">4</span> of your Earth "<span class="unit">min</span>utes"!
</span>
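The arithmetic behind the SI and ISO 31-1 notations can be sketched as below. The helper names (`parse_si_seconds`, `components_to_seconds`) are hypothetical, and the accepted SI syntax is a guess at what "a single number, optionally followed by an s" allows; Swignition's real parser may be more lenient.

```python
import re

# Seconds per unit for the proposed ISO 31-1 class names.
SECONDS_PER_UNIT = {"d": 86400, "h": 3600, "min": 60, "s": 1}

def parse_si_seconds(text):
    """Parse '54 s' or '5400' into seconds; return None if it does not fit."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*s?\s*", text)
    return float(match.group(1)) if match else None

def components_to_seconds(components):
    """Sum a mapping such as {'h': '1', 'min': '30'} into seconds."""
    return sum(SECONDS_PER_UNIT[unit] * float(value)
               for unit, value in components.items())

print(parse_si_seconds("5400 s"))
print(components_to_seconds({"h": "1", "min": "30"}))
```

Both calls describe the same hour and a half, which is why the second and third examples above are interchangeable.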
1.4. Intervals
Intervals differ from durations in that they have a fixed start and end time. To illustrate, while "three and a half minutes" is a duration, "three and a half minutes starting now" is an interval. Currently, intervals are only used in one microformat: hCalendar. They are used to indicate periods of time when a person is free or busy.
Again, ISO 8601 defines a syntax for time intervals.
Experimentally, Swignition also allows the ISO 31-1 and hMeasure methods of durations explained above to be used as intervals if a start or end date is indicated. The start date may be indicated using the class start or after (there is a subtle difference between the two: "start" is considered inclusive); the end date indicated using end or before (again, "end" is inclusive).
Examples follow.
<span class="value">
  <span class="h">1</span>:<span class="min">23</span>:<span class="s">45.67</span>
  starting <span class="start">2008-01-01T13:00</span>
</span>
<span class="value">
  <span class="hmeasure">
    <span class="num">5025.67</span>
    <span class="unit">s</span>
  </span>
  ending <span class="before">2008-01-01T14:23:45.67</span>
</span>
<span class="value">
  From <span class="start">2008-01-01T13:00</span>
  until <span class="before">2008-01-01T14:23:45.67</span>
</span>
<span class="value">
  From <span class="start">2008-01-01T13:00</span>
  until <span class="before">14:23:45.67</span>
</span>
(Note: 'before' has no date, so same date as 'start' is assumed.)
<span class="value">2008-01-01T13:00:00/15:00:00</span>
While intervals can be expressed as start+end, start+duration or duration+end combinations, Swignition's output will always be canonicalised to start+duration.
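The canonicalisation to start+duration can be sketched as below. The function name `canonicalise_interval` is hypothetical, and the sketch deliberately ignores the inclusive/exclusive distinction between start/after and end/before.

```python
from datetime import datetime, timedelta

def canonicalise_interval(start=None, end=None, duration=None):
    """Return (start, duration) given any two of start, end and duration."""
    if start is not None and duration is not None:
        return start, duration
    if start is not None and end is not None:
        return start, end - start
    if end is not None and duration is not None:
        return end - duration, duration
    raise ValueError("need two of start, end and duration")

start, length = canonicalise_interval(
    start=datetime(2008, 1, 1, 13, 0),
    end=datetime(2008, 1, 1, 14, 23, 45, 670000),
)
print(start.isoformat(), length.total_seconds())
```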
1.5. Include Pattern
The proposed non-verbose class-based solution is supported in addition to the standard method.
1.6. Microformat Opacity
The author of Swignition is following the MFO effort with interest. Currently Swignition implements this algorithm to deal with nested and embedded microformats:
- Swignition maintains an as up-to-date as possible list of root class names used by current and draft microformats, even if the parser does not fully support those formats. (e.g. Swignition doesn't parse hAtom, hReview or hResume yet, but does have "hatom", "hreview" and "hresume" in that list.) The pseudo-root-class-name "mfo" is on the list.
- When parsing a compound microformat:
  - The parser first attempts to parse "meaningfully embedded" microformats. For example, "adr", "geo" and "agent vcard" within an hCard.
  - The parser then runs through its list of root class names, excluding any elements bearing those class names from being parsed as part of the current object. For example, any element with class "hatom" (and its children) would be excluded from being parsed as part of an hCard.
    - This is achieved by (temporarily) setting all rel, rev and class attributes attached to elements within nested microformats to the empty string.
    - Note that this doesn't mean that Swignition will fail to parse the embedded hAtom at all, but that it will parse it completely independently of the hCard.
  - The parsing will then continue as normal. Within our example hCard that may be to look for elements with class "fn", "role", "org", etc.
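The masking step can be sketched as below, using Python's ElementTree rather than Swignition's own code. The `ROOT_CLASSES` set is a short illustrative subset, not the real list, and working on a deep copy stands in for the "temporary" change. The earlier step of parsing meaningfully embedded microformats is not shown.

```python
import copy
import xml.etree.ElementTree as ET

# Illustrative subset of root class names (assumption, not Swignition's list).
ROOT_CLASSES = {"vcard", "vcalendar", "vevent", "hatom", "hreview", "hresume", "mfo"}

def mask_nested(root):
    """Return a copy of `root` with class/rel/rev blanked inside nested roots."""
    clone = copy.deepcopy(root)

    def walk(element, inside_nested):
        for child in element:
            classes = set(child.get("class", "").split())
            nested = inside_nested or bool(classes & ROOT_CLASSES)
            if nested:
                for attr in ("class", "rel", "rev"):
                    if attr in child.attrib:
                        child.set(attr, "")
            walk(child, nested)

    walk(clone, False)
    return clone

card = ET.fromstring(
    '<div class="vcard"><span class="fn">Alice</span>'
    '<div class="hatom"><span class="fn">Not Alice</span></div></div>'
)
masked = mask_nested(card)
print([e.text for e in masked.iter() if "fn" in e.get("class", "").split()])
```

Because only the copy is changed, the original tree still carries its class names, so the nested hAtom can be parsed independently afterwards.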
2. hCard
The following additional properties are supported, taken from the vCard 4.0 draft:
- kind
- Type of contact. Usually "individual", "org" or "group". See kind optimisation.
- gender
- Gender of a contact. Usually "male" or "female".
- birth
- Place of birth. May be a nested hCard, adr or geo; or plain text.
- dday
- Date of death.
- death
- Place of death. May be a nested hCard, adr or geo; or plain text.
- impp
- Extension to vCard for instant messaging and presence protocols, defined in RFC 4770. Similar syntax to "email" and "tel", with "type" and "value" subproperties.
- lang
- Language(s) spoken by this contact.
- member
- Where the hCard represents a group or organisation, the "member" property may be used to indicate someone who is a member of the group. The member should be either a URL or a nested hCard.
- caladruri
- The URI (often an e-mail address) to which calendar requests (e.g. invitations) should be sent.
- caluri
- URI for the contact's calendar. Alternatively an embedded hCalendar may be provided. (Note this must be a full hCalendar with an explicit class="vcalendar", not just a collection of hCalendar events.)
- fburl
- URI for the contact's free/busy information. Alternatively an embedded hCalendar may be provided. (Note this must be a full hCalendar with an explicit class="vcalendar", not just a collection of hCalendar freebusys.)
2.1. Additional Organisation Properties
The following additional organisation sub-properties are supported:
- x-vat-number
- VAT registration number for an organisation. (Alias "vat-number".)
- x-company-number
- Registration number for a company registered with an appropriate regulatory body. (Alias "company-number".)
- x-charity-number
- Registration number for a charity or other non-profit organisation registered with an appropriate body. (Alias "charity-number".)
If there is only one organisation listed as part of an hCard, then organisation-name and other organisation sub-properties may be used without an org wrapper element.
2.2. Kind Optimisation
The hCard specification offers a method for determining whether an hCard refers to an individual or an organisation. Swignition extends this to allow hCards to also refer to organisation units (e.g. departments, working groups) and any address properties (buildings, cities, regions, countries, etc). This is done by setting the "fn" property identically to the other property. e.g.
<div class="vcard">
  <a class="org url" href="http://la.ctu.gov.invalid">
    <span class="organization-name">Counter-Terrorist Unit</span>:
    <span class="organization-unit fn">Los Angeles Division</span>
  </a>
</div>
The "kind" property is automatically set (unless a "kind" has been specified explicitly); in the case above it is set to "group".
Property equal to FN | Inferred KIND |
---|---|
organization-name | org |
organization-unit | group |
post-office-box | x-post-office-box |
extended-address | x-extended-address |
street-address | x-street-address |
locality | x-locality |
region | x-region |
postal-code | x-postal-code |
country-name | x-country-name |
Otherwise, "kind" is assumed to be "individual".
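The inference table can be sketched as a simple lookup. The function name `infer_kind` and the flat-dictionary representation of an hCard are hypothetical, and checking properties in table order when several equal "fn" is an assumption.

```python
# (property, inferred kind) pairs, in the order of the table above.
KIND_BY_PROPERTY = [
    ("organization-name", "org"),
    ("organization-unit", "group"),
    ("post-office-box", "x-post-office-box"),
    ("extended-address", "x-extended-address"),
    ("street-address", "x-street-address"),
    ("locality", "x-locality"),
    ("region", "x-region"),
    ("postal-code", "x-postal-code"),
    ("country-name", "x-country-name"),
]

def infer_kind(card):
    """Return the explicit kind if present, else infer it from "fn"."""
    if card.get("kind"):
        return card["kind"]
    fn = card.get("fn")
    for prop, kind in KIND_BY_PROPERTY:
        if fn is not None and card.get(prop) == fn:
            return kind
    return "individual"

print(infer_kind({
    "fn": "Los Angeles Division",
    "organization-name": "Counter-Terrorist Unit",
    "organization-unit": "Los Angeles Division",
}))
```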
3. hCalendar
The hCalendar specification is hopelessly incomplete. As a result, I have drafted hCalendar 1.1, and Swignition more-or-less supports that.
4. adr
The type property is parsed, even when an address is given outside an hCard.
An address may contain embedded geo microformats.
See also: geo.
5. geo
If a geo is missing its latitude or longitude, then the raw XML string for the entire element is searched for the following regular expression which represents two semicolon/comma-delimited decimal numbers:
/ \s* (\-?[0-9\.]+) \s* [\;\,] \s* (\-?[0-9\.]+) \s* /x
The first number is taken to be the latitude; the second, the longitude. This will allow the parsing of constructs like:
<a class="geo" href="http://maps.google.com/maps?q=50.8730,0.0005">home</a>
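For readers who want to try the fallback, here is a Python transliteration of the regular expression above. The function name `fallback_lat_long` is hypothetical, and rejecting matches that are not valid numbers (such as a lone ".") is an added assumption.

```python
import re

# Same pattern as above, written with Python's verbose flag.
COORD_PAIR = re.compile(r"""
    \s* (-?[0-9.]+) \s* [;,] \s* (-?[0-9.]+) \s*
""", re.VERBOSE)

def fallback_lat_long(raw_xml):
    """Return (latitude, longitude) from the first number pair, or None."""
    match = COORD_PAIR.search(raw_xml)
    if match is None:
        return None
    try:
        return float(match.group(1)), float(match.group(2))
    except ValueError:
        # The character class also matches strings like "." or "1.2.3".
        return None

markup = '<a class="geo" href="http://maps.google.com/maps?q=50.8730,0.0005">home</a>'
print(fallback_lat_long(markup))
```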
The following additional optional properties are supported:
- body
- The planet or astronomical body to which the co-ordinates apply. If not specified, then "Earth" is assumed. Names should be taken from the International Astronomical Union's Gazetteer of Planetary Nomenclature.
- reference-frame
- The co-ordinate system used. For Earth, the default co-ordinate system is "WGS84". Other appropriate values include "ETRS89" and "ITRF2005".
- altitude
- An altitude, above (negative values: below) sea level on Earth, or an agreed zero elevation on other planets. Units should be specified. (Currently there is no microformat for dealing with weights and measures.) When no unit is specified, metres are assumed.
6. rel-tag
The rel-tag specification should be fully supported as specified.
As an alternative to rel="tag", Swignition also supports class="tag". While rel values are defined as case-insensitive by the HTML 4.01 spec, classes are not, so lower-case must be used. class="tag" tags are parsed differently from rel="tag" tags, in that the link text (subject to the abbr pattern and value excerpting) is used instead of the final URL component, in order to accommodate alternative URL formats. When both rel="tag" and class="tag" are found on the same element, then the element is parsed using standard rel-tag rules. As with rel="tag", class="tag" must only be used on <a> and <area> elements.
@@TODO: Document profile requirement!
The following examples are all parsed as the tag "Example":
<a rel="tag" href="/tag/Example">NotThis</a>
<a class="tag" href="/tag/NotThis">Example</a>
<a class="tag" rel="tag" href="/tag/Example">NotThis</a>
<a class="tag" href="/tag/NotThis">NotThis <span class="value">Example</span></a>
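The difference between the two parsing rules can be sketched as below. The function name `tag_name` is hypothetical, and the sketch is simplified: it handles value excerpting but not the abbr pattern, and it only percent-decodes the final URL component.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urlparse

def tag_name(markup):
    """Return the tag for a single <a> element, or None if it is not a tag."""
    link = ET.fromstring(markup)
    rels = link.get("rel", "").lower().split()   # rel is case-insensitive
    classes = link.get("class", "").split()      # class is case-sensitive
    if "tag" in rels:
        # Standard rel-tag: the final URL path component is the tag.
        path = urlparse(link.get("href", "")).path
        return unquote(path.rstrip("/").rsplit("/", 1)[-1])
    if "tag" in classes:
        # class="tag": use the link text, honouring value excerpting.
        values = [el for el in link.iter() if "value" in el.get("class", "").split()]
        if values:
            return "".join("".join(el.itertext()) for el in values).strip()
        return "".join(link.itertext()).strip()
    return None

print(tag_name('<a class="tag" href="/tag/NotThis">Example</a>'))
```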
7. hAtom
hAtom is mostly implemented as per the hAtom 0.1 spec. In addition, there is support for zero or more rel-enclosure links within each hEntry, which is predicted to appear in the hAtom 0.2 spec. The nearest-in-parent algorithm for discovering authors for authorless entries is not fully implemented as it is predicted that this algorithm will be simplified for hAtom 0.2. Instead, the following algorithm is implemented:
- Look for the author in an element with class "author".
- If not found, look for an hCard within an <address> element found within the entry.
- If not found, look for an hCard within an <address> element found within the feed.
- Otherwise, no author has been specified.
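The author lookup above can be sketched with ElementTree as follows. The function names are hypothetical, and two simplifications are assumed: an <address> element that itself carries class "vcard" counts as an hCard within it, and the feed-level search does not skip <address> elements belonging to other entries.

```python
import xml.etree.ElementTree as ET

def has_class(element, name):
    return name in element.get("class", "").split()

def find_author(entry, feed):
    """Return the element describing the entry's author, or None."""
    # 1. An element with class "author" inside the entry.
    for element in entry.iter():
        if has_class(element, "author"):
            return element
    # 2 and 3. An hCard within an <address>, first in the entry, then the feed.
    for scope in (entry, feed):
        for address in scope.iter("address"):
            for element in address.iter():
                if has_class(element, "vcard"):
                    return element
    # 4. No author has been specified.
    return None

feed = ET.fromstring(
    '<div class="hfeed"><address class="vcard"><span class="fn">Feed Author</span></address>'
    '<div class="hentry"><h2 class="entry-title">Untitled</h2></div></div>'
)
author = find_author(feed.find("div"), feed)
print(author.tag if author is not None else None)
```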
hSlice is supported as a synonym for hEntry.
8. XFN
XFN is supported as per the spec and parsed into RDF using the guidelines that I published on the microformats wiki. This includes a procedure for working out the “representative hCard” for the page being parsed. The following rules are followed to determine the hCard:
- If a representative hCard has been explicitly declared using RDFa through a triple of <pageURI> <http://purl.org/uF/hCard/terms/representative> _:foo then that is taken to be the hCard;
- Otherwise if a foaf:primaryTopic exists for the page and the object represents a person, then that is the hCard;
- Otherwise, the first hCard with rel="me" specified on a link with class="url";
- Otherwise, the first hCard with a class="url" link back to the page being parsed;
- Otherwise, the first hCard on the page.
The rev attribute is properly supported, and inverse and symmetric relationships are fully understood. For example, if using rev="child", Swignition knows that this is the same as rel="parent".
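A sketch of how rev values might be mapped onto rel values follows. The tables are an illustrative subset based on the inverse and symmetric relationships described in the XFN profile, not Swignition's internal data, and the function name `rev_to_rel` is hypothetical. Values with no inverse (such as "muse") are simply dropped here.

```python
# Illustrative subset only (assumption): inverse and symmetric XFN values.
INVERSE = {"child": "parent", "parent": "child"}
SYMMETRIC = {"sibling", "spouse", "kin", "met", "co-worker", "colleague",
             "co-resident", "neighbor", "date", "sweetheart", "me"}

def rev_to_rel(rev_value):
    """Translate a rev attribute value into equivalent rel values."""
    result = []
    for token in rev_value.lower().split():
        if token in INVERSE:
            result.append(INVERSE[token])
        elif token in SYMMETRIC:
            result.append(token)
    return result

print(rev_to_rel("child"))
```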
Swignition has specific support for XFN 1.0 (i.e. it will ignore the new properties defined in XFN 1.1), but only if you explicitly include the XFN 1.0 profile URI in your document head. Swignition includes support for the XHTML Enemies Network 1.0 (XEN), but again, only if you include the profile URI.
9. xFolk & hReview
Support for xFolk entries was introduced in Swignition 0.1-α8; hReview followed in 0.1-α9. As of 0.1-α10, the parsers for both have been united: xFolk is treated as funny-looking hReview. As a consequence, xFolk entries may include additional classes from hReview, such as dtreviewed and reviewer.
10. species
Swignition 0.1-α11 includes experimental support for the species microformat using the root class name biota. As some taxonomic ranks are used differently by botanists and zoologists, you may use the additional class names botany and zoology to resolve any ambiguities. For example, <i class="biota zoology">...</i>.
Within the root element, the following singular properties are allowed for marking up the various taxonomic ranks. (If you're using a CSS-capable browser, you should see that core terms are in bold, zoology-only terms in red, and botany-only in green.) If the rank you wanted is not on the list, then use the generic (plural) class="rank" instead.
aberration
aggregate
authority
biovar
branch
breed
class
claudius
cohort
complex
convariety
cultivar
cultivar-group
division
domain
empire
falanx
family
family-group
form
genus
genus-group
gigaorder
grade
grandorder
group
group-of-breeds
hybrid
hyperorder
infraclass
infradomain
infrafamily
infraform
infragenus
infrakingdom
infralegion
infraorder
infraphylum
infrasection
infraseries
infraspecies
infratribe
infravariety
interkingdom
kingdom
klepton
legion
lusus
magnorder
megaorder
microspecies
midkingdom
midphylum
mirorder
nation
order
parvclass
parvorder
pathovar
phylum
population
section
section-of-breeds
series
serogroup
serovar
species (a.k.a. specific)
species-group
species-subgroup
strain
subclass
subcohort
subdivision
subdomain
subfamily
subfamily-group
subform
subgenus
subgroup
subkingdom
sublegion
suborder
subphylum
subsection
subseries
subspecies
subtribe
subvariety
superclass
supercohort
superdivision
superdomain
superfamily
superform
supergenus
superkingdom
superlegion
superorder
superphylum
supersection
superseries
superspecies
supertribe
supervariety
suprakingdom
supraphylum
synklepton
tribe
variety
Further plural classes binomial and trinomial are supported for marking up the binomial or trinomial name, and common-name (a.k.a. vernacular, cname) for the common name of a species. The plural class authority is supported for marking up the classification authority.
For convenience, as many of these properties use such generic names (class, form, section, etc) you may prefix any of these classes with taxo or taxo-. For example, instead of class="tribe" you could equivalently use class="taxotribe" or class="taxo-tribe". This is not a namespacing mechanism, but a simple method for you to avoid clashes with class names.
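The prefix handling can be sketched as a small normalisation step. The function name `normalise_rank_class` is hypothetical, and only stripping a prefix when the remainder is a known rank is an assumption (it keeps unrelated classes such as "taxonomy" from being misread).

```python
def normalise_rank_class(class_name, known_ranks):
    """Map 'tribe', 'taxotribe' or 'taxo-tribe' to 'tribe'; else None."""
    if class_name in known_ranks:
        return class_name
    # Try the longer prefix first so "taxo-tribe" is not read as "-tribe".
    for prefix in ("taxo-", "taxo"):
        remainder = class_name[len(prefix):]
        if class_name.startswith(prefix) and remainder in known_ranks:
            return remainder
    return None

ranks = {"class", "form", "section", "tribe", "order"}
print(normalise_rank_class("taxo-tribe", ranks))
```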
Lastly, as an optimisation, if none of the above properties are found within the root element, then the entire string contents of the root element are taken to be a binomial/trinomial name.
Two examples of species parsed by Swignition follow:
<span class="biota" lang="zxx">Homo sapiens</span>
<p class="biota zoology" lang="en">
  He is a <span class="common-name">human</span>, or as they say in
  Basque, a <span lang="eu" class="common-name">Gizakia</span>. What
  scientists would classify as a
  <i class="trinomial" lang="zxx">
    <span class="binomial">
      <span class="genus">Homo</span>
      <span class="species">sapiens</span>
    </span>
    <span class="subspecies">sapiens</span>
  </i>, a member of the
  <span class="family" lang="zxx">hominidae</span> family of
  <span lang="zxx" class="taxo-order">primates</span>.
</p>
[Note the use of lang="zxx" (language code for "no linguistic content") rather than lang="la" (language code for Latin). Despite the fact that these scientific terms are often called "Latin names", in reality they are often derived from a mixture of Latin, Greek, English and other sources; they are not usually even close to the Latin terms for the forms of life described. lang="zxx" is a better way of marking up these terms and also indicates that a translation of the terms should not be attempted.]
10.1. Nesting
Swignition applies special meaning to instances of the species microformat found nested within hCard and hCalendar events. It is strongly suggested that for either of these purposes, you should supply at least one of these properties (which are normally optional):
- binomial
- trinomial
- common-name (or one of its aliases)
- Both genus and species (or its alias specific), from which a binomial can be implied
- No properties at all, and hence rely on binomial/trinomial optimisation.
10.1.1. hCard
When class="biota" is found nested inside an hCard, then it is implied that the person/thing described by the hCard is a member of the species.
10.1.2. hCalendar Events
When class="biota attendee" is found within an hCalendar event, at least one member of the species described is taken to have been present at the event. Combined with location/geo and dtstart this is roughly equivalent to a "sighting" of the species.
The above combination with attendee may cause problems with some naive parsers, especially ones with no support for the species microformat. Because of this, an alternative syntax is supported to avoid triggering bugs: class="biota x-sighting-of".
Additionally, to specify a sighting of a species, class="vcard attendee" may be used in conjunction with the hCard nesting described above to record additional information such as the name or date of birth of the creature sighted. For example:
<p class="vevent">
  <abbr class="dtstart" title="20080706">Yesterday</abbr>
  I saw a
  <span class="attendee vcard">
    <span class="biota"><span class="common-name">goat</span></span>
    called <span class="fn">Steve</span>
  </span>
</p>
Example iCalendar output:
BEGIN:VEVENT
DTSTART:20080706T000000Z
X-SIGHTING-OF:goat
ATTENDEE;CN=Steve;CUTYPE=INDIVIDUAL;VALUE=TEXT:Steve
END:VEVENT
11. hRecipe
Swignition 0.1-α14 has experimental support for the proposed hRecipe microformat. Class names supported:
- hrecipe
  - recipe-title (required, singular)
  - recipe-summary (optional, singular)
  - author (optional, plural)
    - embedded hCard
  - published (optional, plural)
  - photo (optional, plural)
  - method (if plural, then concatenated)
  - ingredient (required, plural)
    - quantity
      - embedded hmeasure
    - item
    - note
    - optional
  - yield (optional, singular)
  - preparation-time (optional, singular)
  - embedded rel-tag.
You can also use class="ingredients" on an element as a shorthand for putting class="ingredient" on all its direct child elements. That is, the following two lists are considered exactly equivalent:
<ul>
  <li class="ingredient">Tomato juice</li>
  <li class="ingredient">
    <span class="quantity hmeasure">1 tbsp</span>
    <span class="item">Worcestershire sauce</span>
  </li>
  <li class="ingredient">
    <span class="item">Tabasco sauce</span>
    <span class="note">to taste</span>
  </li>
</ul>
<ul class="ingredients">
  <li>Tomato juice</li>
  <li>
    <span class="quantity hmeasure">1 tbsp</span>
    <span class="item">Worcestershire sauce</span>
  </li>
  <li>
    <span class="item">Tabasco sauce</span>
    <span class="note">to taste</span>
  </li>
</ul>
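The shorthand can be thought of as a pre-processing step that rewrites the second list into the first. A minimal sketch using ElementTree follows; the function name `expand_ingredients` is hypothetical and this is not how Swignition itself is implemented.

```python
import copy
import xml.etree.ElementTree as ET

def expand_ingredients(root):
    """Return a copy where direct children of class="ingredients" gain class="ingredient"."""
    clone = copy.deepcopy(root)
    for element in clone.iter():
        if "ingredients" in element.get("class", "").split():
            for child in element:  # direct children only
                classes = child.get("class", "").split()
                if "ingredient" not in classes:
                    child.set("class", " ".join(classes + ["ingredient"]))
    return clone

shorthand = ET.fromstring(
    '<ul class="ingredients"><li>Tomato juice</li>'
    '<li><span class="item">Tabasco sauce</span></li></ul>'
)
expanded = expand_ingredients(shorthand)
print([li.get("class") for li in expanded])
```

Only direct children are touched, so nested elements such as the "item" span keep their own classes.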