SWIGNITION

Swignition Microformat Extensions

Swignition implements a number of proposed extensions to existing Microformats.

  1. Design Patterns
    1. ABBR design pattern
    2. Datetime design pattern
    3. Durations
    4. Intervals
    5. Include pattern
    6. Microformat Opacity
  2. hCard
    1. Additional Organisation Properties
    2. Kind Optimisation
  3. hCalendar
  4. adr
  5. geo
  6. rel-tag
  7. hAtom
  8. XFN
  9. xFolk & hReview
  10. species
    1. Nesting
  11. hRecipe

1. Design Patterns

1.1. ABBR Pattern

Due to accessibility problems with the ABBR pattern, an alternative syntax is also supported: the title attribute may be used on non-ABBR elements, but only if the value is prefixed with the string "data:". Human-readable information may be included in the title, before the "data:" prefix. The data prefix and value following it may be wrapped in brackets. For example, the following are considered equivalent:

  • <span class="foo">XXXX</span>
  • <abbr class="foo" title="XXXX">X</abbr>
  • <span class="foo" title="data:XXXX">X</span>
  • <span class="foo" title="Foo is X data:XXXX">X</span>
  • <span class="foo" title="Foo is X [data:XXXX]">X</span>

Brackets, braces and parentheses are considered bracketing characters. Left and right do not have to match.
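
The title-based alternative can be sketched as follows. This is a minimal Python illustration of the rules described above, not Swignition's own (Perl) code; the function name data_from_title is hypothetical, and the choice to use the first occurrence of "data:" in the title is an assumption.

```python
import re
from typing import Optional

# Optional opening bracket, the literal "data:" prefix, then the rest of the title.
_DATA_TITLE = re.compile(r"(?P<open>[\[({])?data:(?P<value>.*)\Z", re.DOTALL)
_CLOSERS = ("]", ")", "}")


def data_from_title(title: str) -> Optional[str]:
    """Return the machine-readable value from a title attribute, or None."""
    match = _DATA_TITLE.search(title)
    if match is None:
        return None  # no "data:" prefix, so the title carries no data value
    value = match.group("value").strip()
    # Left and right brackets need not match, so any closer ends a bracketed value.
    if match.group("open") and value.endswith(_CLOSERS):
        value = value[:-1].rstrip()
    return value
```

With this sketch, the titles "data:XXXX", "Foo is X data:XXXX" and "Foo is X [data:XXXX]" all yield the same value.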

Experimental support is also included for the data-* class pattern, but to take advantage of it, publishers must include the profile URI.

1.2. Datetime Pattern

The datetime design pattern specifies two profiles (subsets) of the ISO 8601 datetime format for use in microformats. Swignition additionally supports other ISO 8601 datetimes, as parsed by the DateTime::Format::ISO8601 Perl module.

As a last resort, datetimes that cannot be parsed as above are parsed as natural language dates using DateTime::Format::Natural. Ambiguous dates (e.g. does "Sunday" refer to last Sunday or next Sunday?) are assumed to be in the future and to be specified in UTC. Authors should not rely on natural language parsing, as it is not particularly predictable.

Swignition supports datetimes with nanosecond-level precision.

1.3. Durations

A duration is an abstract period of time with no fixed starting or ending point, but a fixed length. For example, "three and a half minutes". Various microformats include properties which take a duration as their value — for example, hAudio's duration property and hCalendar's trigger.

The standard way to mark up durations in Microformats is using ISO 8601 durations. For instance:

<span class="duration">PT1H30M</span>

<abbr class="duration" title="PT1H30M">an hour and a half</abbr>

Swignition experimentally supports several alternative methods to mark up durations as well:

  1. SI notation, as a single number, optionally followed by an "s", giving the length of the duration in seconds.
  2. The proposed ISO 31-1 class names, d, h, min and s.
  3. Draft hMeasure microformat, with the measurement's item set to null, the type of measurement left blank or set to "duration" and the unit of measurement in days, hours, minutes or seconds.

Examples follow:

The song is only <span class="duration">54 s</span>econds long!
		
<abbr class="duration" title="5400 s">an hour and a half</abbr>

<p class="duration">
    The meeting will last <abbr class="h" title="1">an</abbr> hour and
    <abbr class="min" title="30">thirty</abbr> minutes.
</p>

<span class="duration">
    <span class="h">1</span>:<span class="min">23</span>:<span class="s">45.67</span>
</span>

<span class="duration hmeasure">1 h</span>

<span class="duration hmeasure">2.41 s</span>

<span class="duration hmeasure">5 d</span>

<span class="duration hmeasure">
    <span class="num">4</span> of your Earth
    "<span class="unit">min</span>utes"!
</span>
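
To make the arithmetic behind the first two alternatives concrete, here is a rough Python sketch that reduces SI notation and ISO 31-1 components to a number of seconds. The function names are hypothetical, a day is assumed to be exactly 86400 seconds, and the component values are assumed to have already been extracted from the markup.

```python
import re
from decimal import Decimal
from typing import Mapping, Optional

# Seconds per ISO 31-1 unit class; treating a day as 86400 s is an assumption.
UNIT_SECONDS = {
    "d": Decimal(86400),
    "h": Decimal(3600),
    "min": Decimal(60),
    "s": Decimal(1),
}


def si_duration_seconds(text: str) -> Optional[Decimal]:
    """Parse SI notation: a single number, optionally followed by "s"."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*s?\s*", text)
    return Decimal(match.group(1)) if match else None


def iso31_duration_seconds(components: Mapping[str, str]) -> Decimal:
    """Sum components keyed by the class names d, h, min and s."""
    return sum(
        (Decimal(value) * UNIT_SECONDS[unit] for unit, value in components.items()),
        Decimal(0),
    )
```

Under these assumptions, "5400 s" and the components h=1, min=30 describe the same duration.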

1.4. Intervals

Intervals differ from durations in that they have a fixed start and end time. To illustrate, while "three and a half minutes" is a duration, "three and a half minutes starting now" is an interval. Currently, intervals are only used in one microformat: hCalendar. They are used to indicate periods of time when a person is free or busy.

Again, ISO 8601 defines a syntax for time intervals.

Experimentally, Swignition also allows the ISO 31-1 and hMeasure methods of durations explained above to be used as intervals if a start or end date is indicated. The start date may be indicated using the class start or after (there is a subtle difference between the two: "start" is considered inclusive); the end date indicated using end or before (again, "end" is inclusive). Examples follow.

<span class="value">
    <span class="h">1</span>:<span class="min">23</span>:<span class="s">45.67</span>
    starting <span class="start">2008-01-01T13:00</span>
</span>

<span class="value">
    <span class="hmeasure">
        <span class="num">5025.67</span>
        <span class="unit">s</span>
    </span>
    ending
    <span class="before">2008-01-01T14:23:45.67</span>
</span>

<span class="value">
    From  <span class="start">2008-01-01T13:00</span>
    until <span class="before">2008-01-01T14:23:45.67</span>
</span>

<span class="value">
    From  <span class="start">2008-01-01T13:00</span>
    until <span class="before">14:23:45.67</span>
</span>
(Note: 'before' has no date, so same date as 'start' is assumed.)

<span class="value">2008-01-01T13:00:00/15:00:00</span>

While intervals can be expressed as start+end, start+duration or duration+end combinations, Swignition's output will always be canonicalised to start+duration.
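
The canonicalisation to start+duration might look something like the following Python sketch. The function name is hypothetical, and the sketch deliberately ignores the inclusive/exclusive distinction between start/after and end/before.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def canonical_interval(
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    duration: Optional[timedelta] = None,
) -> Tuple[datetime, timedelta]:
    """Reduce any two of start, end and duration to a (start, duration) pair."""
    if start is not None and duration is not None:
        return start, duration              # already in canonical form
    if start is not None and end is not None:
        return start, end - start           # start+end
    if end is not None and duration is not None:
        return end - duration, duration     # duration+end
    raise ValueError("need at least two of start, end and duration")
```

For the examples above, a start of 2008-01-01T13:00 and an end of 2008-01-01T14:23:45.67 give the same pair as that end combined with a duration of 1:23:45.67.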

1.5. Include Pattern

The proposed non-verbose class-based solution is supported in addition to the standard method.

1.6. Microformat Opacity

The author of Swignition is following the MFO effort with interest. Currently Swignition implements this algorithm to deal with nested and embedded microformats:

  • Swignition maintains a list, kept as up to date as possible, of root class names used by current and draft microformats, even if the parser does not fully support those formats. (e.g. Swignition doesn't parse hAtom, hReview or hResume yet, but does have "hatom", "hreview" and "hresume" in that list.) The pseudo-root-class-name "mfo" is also on the list.
  • When parsing a compound microformat:
    1. The parser first attempts to parse "meaningfully embedded" microformats. For example, "adr", "geo" and "agent vcard" within an hCard.
    2. The parser then runs through its list of root class names, excluding any elements bearing those class names from being parsed as part of the current object. For example, any element with class "hatom" (and its children) would be excluded from being parsed as part of an hCard.
      • This is achieved by (temporarily) setting all rel, rev and class attributes attached to elements within nested microformats to the empty string.
      • Note that this doesn't mean that Swignition will fail to parse the embedded hAtom at all — but that it will parse it completely independently of the hCard.
    3. The parsing will then continue as normal. Within our example hCard that may be to look for elements with class "fn", "role", "org", etc.
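
Step 2 of the algorithm above can be illustrated with a short Python sketch using the standard xml.etree.ElementTree module. It is an approximation rather than Swignition's implementation: the ROOT_CLASSES set is an abridged, illustrative list, the function name is hypothetical, step 1 (meaningfully embedded microformats) is left out, and the "temporary" masking is modelled by working on a copy.

```python
import copy
import xml.etree.ElementTree as ET

# Abridged, illustrative list of root class names (an assumption, not Swignition's list).
ROOT_CLASSES = {"vcard", "vcalendar", "adr", "geo", "hatom", "hreview", "hresume", "mfo"}


def mask_nested_microformats(root: ET.Element) -> ET.Element:
    """Return a copy of root with class, rel and rev blanked inside nested microformats."""
    clone = copy.deepcopy(root)  # the original tree is left untouched

    def blank(element: ET.Element) -> None:
        for node in element.iter():  # the nested root element and all its descendants
            for attr in ("class", "rel", "rev"):
                if attr in node.attrib:
                    node.set(attr, "")

    def walk(element: ET.Element) -> None:
        for child in element:
            if set(child.get("class", "").split()) & ROOT_CLASSES:
                blank(child)
            else:
                walk(child)

    walk(clone)
    return clone
```

Parsing the masked copy for "fn", "role", "org" and so on then only sees elements that belong to the outer object, while the nested microformat can still be parsed independently from the original tree.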

2. hCard

The following additional properties are supported, taken from the vCard 4.0 draft:

kind
Type of contact. Usually "individual", "org" or "group". See kind optimisation.
gender
Gender of a contact. Usually "male" or "female".
birth
Place of birth. May be a nested hCard, adr or geo; or plain text.
dday
Date of death.
death
Place of death. May be a nested hCard, adr or geo; or plain text.
impp
Extension to vCard for instant messaging and presence protocols, defined in RFC 4770. Similar syntax to "email" and "tel", with "type" and "value" subproperties.
lang
Language(s) spoken by this contact.
member
Where the hCard represents a group or organisation, the "member" property may be used to indicate someone who is a member of the group. The member should be either a URL or a nested hCard.
caladruri
The URI (often an e-mail address) to which calendar requests (e.g. invitations) should be sent.
caluri
URI for the contact's calendar. Alternatively an embedded hCalendar may be provided. (Note this must be a full hCalendar with an explicit class="vcalendar" — not just a collection of hCalendar events.)
fburl
URI for the contact's free/busy information. Alternatively an embedded hCalendar may be provided. (Note this must be a full hCalendar with an explicit class="vcalendar" — not just a collection of hCalendar freebusys.)

See also: adr, geo.

2.1. Additional Organisation Properties

The following additional organisation sub-properties are supported:

x-vat-number
VAT registration number for an organisation. (Alias "vat-number".)
x-company-number
Registration number for a company registered with an appropriate regulatory body. (Alias "company-number".)
x-charity-number
Registration number for a charity or other non-profit organisation registered with an appropriate body. (Alias "charity-number".)

If there is only one organisation listed as part of an hCard, then organisation-name and other organisation sub-properties may be used without an org wrapper element.

2.2. Kind Optimisation

The hCard specification offers a method for determining whether an hCard refers to an individual or an organisation. Swignition extends this to allow hCards to also refer to organisation units (e.g. departments, working groups) and any address properties (buildings, cities, regions, countries, etc). This is done by giving the "fn" property exactly the same value as the other property, typically by placing both class names on the same element. For example:

<div class="vcard">
  <a class="org url" href="http://la.ctu.gov.invalid">
    <span class="organization-name">Counter-Terrorist Unit</span>:
    <span class="organization-unit fn">Los Angeles Division</span>
  </a>
</div>

The "kind" property is automatically set (unless a "kind" has been specified explicitly); in the case above it is set to "group".

Explanation of kinds inferred by Swignition

Property equal to FN    Inferred KIND
organization-name       org
organization-unit       group
post-office-box         x-post-office-box
extended-address        x-extended-address
street-address          x-street-address
locality                x-locality
region                  x-region
postal-code             x-postal-code
country-name            x-country-name

Otherwise, "kind" is assumed to be "individual".
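
The inference can be summarised in a few lines of Python. This is a sketch only: the hCard is modelled as a flat dictionary of property values, the function name is hypothetical, and checking the properties in table order is an assumption.

```python
from typing import Mapping

# (property equal to FN, inferred KIND), in the order of the table above.
INFERRED_KIND = [
    ("organization-name", "org"),
    ("organization-unit", "group"),
    ("post-office-box", "x-post-office-box"),
    ("extended-address", "x-extended-address"),
    ("street-address", "x-street-address"),
    ("locality", "x-locality"),
    ("region", "x-region"),
    ("postal-code", "x-postal-code"),
    ("country-name", "x-country-name"),
]


def infer_kind(card: Mapping[str, str]) -> str:
    """Return the explicit kind if present, otherwise infer it from FN."""
    if card.get("kind"):
        return card["kind"]  # an explicitly specified kind always wins
    fn = card.get("fn")
    if fn is not None:
        for prop, kind in INFERRED_KIND:
            if card.get(prop) == fn:
                return kind
    return "individual"
```

For the Counter-Terrorist Unit example, "fn" equals "organization-unit", so the sketch returns "group".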

3. hCalendar

The hCalendar specification is hopelessly incomplete. As a result, I have drafted hCalendar 1.1, and Swignition more-or-less supports that.

4. adr

The type property is parsed, even when an address is given outside an hCard.

An address may contain embedded geo microformats.

See also: geo.

5. geo

If a geo is missing its latitude or longitude, then the raw XML string for the entire element is searched using the following regular expression, which matches two decimal numbers delimited by a semicolon or comma:

/ \s* (\-?[0-9\.]+) \s* [\;\,] \s* (\-?[0-9\.]+) \s* /x

The first number is taken to be the latitude; the second, the longitude. This will allow the parsing of constructs like:

<a class="geo" href="http://maps.google.com/maps?q=50.8730,0.0005">home</a>
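
For readers more familiar with Python than Perl, the fallback can be reproduced roughly as below. The pattern is a transliteration of the expression above using re.VERBOSE in place of the /x flag; the function name is hypothetical, and the values are returned as strings because the pattern itself does not guarantee well-formed numbers.

```python
import re
from typing import Optional, Tuple

# Transliteration of the Perl /x pattern: two numbers separated by ";" or ",".
GEO_FALLBACK = re.compile(r"\s* (-?[0-9.]+) \s* [;,] \s* (-?[0-9.]+) \s*", re.VERBOSE)


def fallback_lat_long(raw_xml: str) -> Optional[Tuple[str, str]]:
    """Return (latitude, longitude) as strings from the first match, or None."""
    match = GEO_FALLBACK.search(raw_xml)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Applied to the link above, the first number found is taken as the latitude and the second as the longitude.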

The following additional optional properties are supported:

body
The planet or astronomical body to which the co-ordinates apply. If not specified, then "Earth" is assumed. Names should be taken from the International Astronomical Union's Gazetteer of Planetary Nomenclature.
reference-frame
The co-ordinate system used. For Earth, the default co-ordinate system is "WGS84". Other appropriate values include "ETRS89" and "ITRF2005".
altitude
An altitude, above (negative values: below) sea level on Earth, or an agreed zero elevation on other planets. Units should be specified. (Currently there is no microformat for dealing with weights and measures.) When no unit is specified, metres are assumed.

6. rel-tag

The rel-tag specification should be fully supported as specified.

As an alternative to rel="tag", Swignition also supports class="tag". While rel values are defined as case-insensitive by the HTML 4.01 spec, classes are not, so lower-case must be used. class="tag" tags are parsed differently from rel="tag" tags, in that the link text (subject to the abbr pattern and value excerpting) is used instead of the final URL component, in order to accommodate alternative URL formats. When both rel="tag" and class="tag" are found on the same element, then the element is parsed using standard rel-tag rules. As with rel="tag", class="tag" must only be used on <a> and <area> elements.

@@TODO: Document profile requirement!

The following examples are all parsed as the tag "Example":

  • <a rel="tag" href="/tag/Example">NotThis</a>
  • <a class="tag" href="/tag/NotThis">Example</a>
  • <a class="tag" rel="tag" href="/tag/Example">NotThis</a>
  • <a class="tag" href="/tag/NotThis">NotThis <span class="value">Example</span></a>
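
The precedence between the two forms can be sketched in Python as follows. The function name and signature are hypothetical, and the sketch assumes the link text passed in has already been through the abbr pattern and value excerpting (so the last example above would arrive as "Example").

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def tag_name(href: str, text: str, rel: str = "", cls: str = "") -> Optional[str]:
    """Return the tag for a link, preferring standard rel-tag rules."""
    if "tag" in rel.lower().split():   # rel values are case-insensitive
        path = urlparse(href).path.rstrip("/")
        return unquote(path.rsplit("/", 1)[-1])  # final URL path component
    if "tag" in cls.split():           # class names are case-sensitive
        return text.strip()            # link text rather than the URL
    return None
```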

7. hAtom

hAtom is mostly implemented as per the hAtom 0.1 spec. In addition, there is support for zero or more rel-enclosure links within each hEntry, which is predicted to appear in the hAtom 0.2 spec. The nearest-in-parent algorithm for discovering authors for authorless entries is not fully implemented as it is predicted that this algorithm will be simplified for hAtom 0.2. Instead, the following algorithm is implemented:

  1. Look for the author in an element with class "author".
  2. If not found, look for an hCard within an <address> element found within the entry.
  3. If not found, look for an hCard within an <address> element found within the feed.
  4. Otherwise, no author has been specified.
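
As an illustration only, the four steps could be written like this in Python with xml.etree.ElementTree. The function names are hypothetical, and two details are assumptions: an <address> element that itself carries class "vcard" counts as a match, and the feed-level search looks at every <address> anywhere within the feed.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def _has_class(element: ET.Element, name: str) -> bool:
    return name in element.get("class", "").split()


def find_author(entry: ET.Element, feed: ET.Element) -> Optional[ET.Element]:
    """Return the element describing the entry's author, or None."""
    # Step 1: an element with class "author" inside the entry.
    for element in entry.iter():
        if _has_class(element, "author"):
            return element
    # Steps 2 and 3: an hCard within an <address>, first in the entry, then in the feed.
    for scope in (entry, feed):
        for address in scope.iter("address"):
            for element in address.iter():
                if _has_class(element, "vcard"):
                    return element
    # Step 4: no author has been specified.
    return None
```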

hSlice is supported as a synonym for hEntry.

8. XFN

XFN is supported as per the spec and parsed into RDF using the guidelines that I published on the microformats wiki. This includes a procedure for working out the “representative hCard” for the page being parsed. The following rules are followed to determine the hCard:

  1. If a representative hCard has been explicitly declared using RDFa through a triple of <pageURI> <http://purl.org/uF/hCard/terms/representative> _:foo then that is taken to be the hCard;
  2. Otherwise if a foaf:primaryTopic exists for the page and the object represents a person, then that is the hCard;
  3. Otherwise, the first hCard with rel="me" specified on a link with class="url";
  4. Otherwise, the first hCard with a class="url" link back to the page being parsed;
  5. Otherwise, the first hCard on the page.

The rev attribute is properly supported, and inverse and symmetric relationships are fully understood. For example, if using rev="child", Swignition knows that this is the same as rel="parent".
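
A minimal Python sketch of this rev handling is shown below. It covers only the family relationships from XFN, the function name is hypothetical, and returning None for values with no known inverse is an assumption about how such cases might be treated, not a description of Swignition's behaviour.

```python
from typing import Optional

# Abridged tables: only XFN family relationships are listed here.
INVERSE = {"parent": "child", "child": "parent"}
SYMMETRIC = {"sibling", "spouse", "kin"}


def rev_to_rel(rev_value: str) -> Optional[str]:
    """Translate a rev value into the equivalent rel value, if one is known."""
    value = rev_value.lower()
    if value in SYMMETRIC:
        return value               # symmetric: the same in both directions
    return INVERSE.get(value)      # inverse pair, or None if unknown
```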

Swignition has specific support for XFN 1.0 (i.e. it will ignore the new properties defined in XFN 1.1), but only if you explicitly include the XFN 1.0 profile URI in your document head. Swignition includes support for the XHTML Enemies Network 1.0 (XEN), but again, only if you include the profile URI.

9. xFolk & hReview

Support for xFolk entries was introduced in Swignition 0.1-α8; hReview followed in 0.1-α9. As of 0.1-α10, the parsers for both have been united: xFolk is treated as funny-looking hReview. As a consequence, xFolk entries may include additional classes from hReview, such as dtreviewed and reviewer.

10. species

Swignition 0.1-α11 includes experimental support for the species microformat using the root class name biota. As some taxonomic ranks are used differently by botanists and zoologists, you may use the additional class names botany and zoology to resolve any ambiguities. For example, <i class="biota zoology">...</i>.

Within the root element, the following singular properties are allowed for marking up the various taxonomic ranks. (If you're using a CSS-capable browser, you should see that core terms are in bold, zoology-only terms in red, and botany-only in green.) If the rank you wanted is not on the list, then use the generic (plural) class="rank" instead.

  • aberration
  • aggregate
  • authority
  • biovar
  • branch
  • breed
  • class
  • claudius
  • cohort
  • complex
  • convariety
  • cultivar
  • cultivar-group
  • division
  • domain
  • empire
  • falanx
  • family
  • family-group
  • form
  • genus
  • genus-group
  • gigaorder
  • grade
  • grandorder
  • group
  • group-of-breeds
  • hybrid
  • hyperorder
  • infraclass
  • infradomain
  • infrafamily
  • infraform
  • infragenus
  • infrakingdom
  • infralegion
  • infraorder
  • infraphylum
  • infrasection
  • infraseries
  • infraspecies
  • infratribe
  • infravariety
  • interkingdom
  • kingdom
  • klepton
  • legion
  • lusus
  • magnorder
  • megaorder
  • microspecies
  • midkingdom
  • midphylum
  • mirorder
  • nation
  • order
  • parvclass
  • parvorder
  • pathovar
  • phylum
  • population
  • section
  • section-of-breeds
  • series
  • serogroup
  • serovar
  • species (a.k.a. specific)
  • species-group
  • species-subgroup
  • strain
  • subclass
  • subcohort
  • subdivision
  • subdomain
  • subfamily
  • subfamily-group
  • subform
  • subgenus
  • subgroup
  • subkingdom
  • sublegion
  • suborder
  • subphylum
  • subsection
  • subseries
  • subspecies
  • subtribe
  • subvariety
  • superclass
  • supercohort
  • superdivision
  • superdomain
  • superfamily
  • superform
  • supergenus
  • superkingdom
  • superlegion
  • superorder
  • superphylum
  • supersection
  • superseries
  • superspecies
  • supertribe
  • supervariety
  • suprakingdom
  • supraphylum
  • synklepton
  • tribe
  • variety

Further plural classes binomial and trinomial are supported for marking up the binomial or trinomial name, and common-name (a.k.a. vernacular, cname) for the common name of a species. The plural class authority is supported for marking up the classification authority.

For convenience, as many of these properties use such generic names (class, form, section, etc) you may prefix any of these classes with taxo or taxo-. For example, instead of class="tribe" you could equivalently use class="taxotribe" or class="taxo-tribe". This is not a namespacing mechanism, but a simple method for you to avoid clashes with class names.
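
One way to picture the prefix handling is the Python sketch below. The KNOWN set is heavily abridged, the function name is hypothetical, and allowing the prefix on aliases (specific, vernacular, cname) is an assumption. The prefix is only stripped when what remains is a recognised property, so unrelated classes that happen to start with "taxo" are left alone.

```python
from typing import Optional

# Abridged set of recognised properties, for illustration only.
KNOWN = {
    "class", "form", "section", "tribe", "genus", "species", "order",
    "family", "binomial", "trinomial", "common-name", "authority", "rank",
}
ALIASES = {"specific": "species", "vernacular": "common-name", "cname": "common-name"}


def normalise_class(name: str) -> Optional[str]:
    """Map a class name to its canonical property, or None if unrecognised."""
    candidates = [name]
    for prefix in ("taxo-", "taxo"):
        if name.startswith(prefix):
            candidates.append(name[len(prefix):])
    for candidate in candidates:
        candidate = ALIASES.get(candidate, candidate)
        if candidate in KNOWN:
            return candidate
    return None
```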

Lastly, as an optimisation, if none of the above properties are found within the root element, then the entire string contents of the root element are taken to be a binomial/trinomial name.

Two examples of species parsed by Swignition follow:

<span class="biota" lang="zxx">Homo sapiens</span>

<p class="biota zoology" lang="en">
  He is a <span class="common-name">human</span>, or as they say
  in Basque, a <span lang="eu" class="common-name">Gizakia</span>.
  What scientists would classify as a
  <i class="trinomial" lang="zxx">
    <span class="binomial">
      <span class="genus">Homo</span>
      <span class="species">sapiens</span>
    </span>
    <span class="subspecies">sapiens</span>
  </i>,
  a member of the <span class="family" lang="zxx">hominidae</span> family
  of <span lang="zxx" class="taxo-order">primates</span>.
</p>

[Note the use of lang="zxx" (language code for "no linguistic content") rather than lang="la" (language code for Latin). Despite the fact that these scientific terms are often called "Latin names", in reality they are often derived from a mixture of Latin, Greek, English and other sources — they are not usually even close to the Latin terms for the forms of life described. lang="zxx" is a better way of marking up these terms and also indicates that a translation of the terms should not be attempted.]

10.1. Nesting

Swignition applies special meaning to instances of the species microformat found nested within hCard and hCalendar events. It is strongly suggested that for either of these purposes, you should supply at least one of these properties (which are normally optional):

  • binomial
  • trinomial
  • common-name (or one of its aliases)
  • Both genus and species (or its alias specific), from which a binomial can be implied
  • No properties at all, and hence rely on binomial/trinomial optimisation.

10.1.1. hCard

When class="biota" is found nested inside an hCard, then it is implied that the person/thing described by the hCard is a member of the species.

10.1.2. hCalendar Events

When class="biota attendee" is found within an hCalendar event, at least one member of the species described is taken to have been present at the event. Combined with location/geo and dtstart this is roughly equivalent to a "sighting" of the species.

The above combination with attendee may cause problems with some naive parsers, especially ones with no support for the species microformat. Because of this, an alternative syntax is supported to avoid triggering bugs: class="biota x-sighting-of".

Additionally, to specify a sighting of a species, class="vcard attendee" may be used in conjunction with the hCard nesting described above to record additional information, such as the name or date of birth of the creature sighted. For example:

<p class="vevent">
  <abbr class="dtstart" title="20080706">Yesterday</abbr> I saw a
  <span class="attendee vcard">
    <span class="biota"><span class="common-name">goat</span></span>
    called <span class="fn">Steve</span>
  </span>
</p>

Example iCalendar output:

BEGIN:VEVENT
DTSTART:20080706T000000Z
X-SIGHTING-OF:goat
ATTENDEE;CN=Steve;CUTYPE=INDIVIDUAL;VALUE=TEXT:Steve
END:VEVENT

11. hRecipe

Swignition 0.1-α14 has experimental support for the proposed hRecipe microformat. Class names supported:

  • hrecipe
    • recipe-title (required, singular)
    • recipe-summary (optional, singular)
    • author (optional, plural)
    • published (optional, plural)
    • photo (optional, plural)
    • method (if plural, then concatenated)
    • ingredient (required, plural)
      • quantity
      • item
      • note
      • optional
    • yield (optional, singular)
    • preparation-time (optional, singular)
    • embedded rel-tag.

Also you can use class="ingredients" on an element as a shorthand for putting class="ingredient" on all its direct child elements. That is, the following two lists are considered exactly equivalent:

<ul>
  <li class="ingredient">Tomato juice</li>
  <li class="ingredient">
    <span class="quantity hmeasure">1 tbsp</span>
    <span class="item">Worcestershire sauce</span>
  </li>
  <li class="ingredient">
    <span class="item">Tabasco sauce</span>
    <span class="note">to taste</span>
  </li>  
</ul>

<ul class="ingredients">
  <li>Tomato juice</li>
  <li>
    <span class="quantity hmeasure">1 tbsp</span>
    <span class="item">Worcestershire sauce</span>
  </li>
  <li>
    <span class="item">Tabasco sauce</span>
    <span class="note">to taste</span>
  </li>  
</ul>
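
The shorthand amounts to a simple tree rewrite, sketched here in Python with xml.etree.ElementTree. The function name is hypothetical and the sketch modifies the tree in place; it is an illustration of the equivalence, not Swignition's code.

```python
import xml.etree.ElementTree as ET


def expand_ingredients(root: ET.Element) -> ET.Element:
    """Add class "ingredient" to each direct child of a class="ingredients" element."""
    for parent in root.iter():
        if "ingredients" not in parent.get("class", "").split():
            continue
        for child in parent:  # direct children only, not deeper descendants
            classes = child.get("class", "").split()
            if "ingredient" not in classes:
                classes.append("ingredient")
                child.set("class", " ".join(classes))
    return root
```

After expansion, the second list above carries the same "ingredient" classes as the first, so both can be handled by the same parsing code.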