Version History
===============


cognition/0.1-alpha1 :-

* initial release
* metadata: <meta>, <link>, <title>, @role, eRDF
* eRDF does not support rdf:type syntax
* RFC 2731 is supported for namespaces
* microformats: hcard, hcalendar, adr, geo
	- hcalendar support assumes page is one giant calendar
	- no support for rel-tag, so no support for categories in hcard or
	  hcalendar
	- geo support includes body, altitude and reference-frame extensions
	- microformats patterns: include-pattern, abbr-pattern, extensions
		+ include-pattern supports my alternative syntax
		+ abbr-pattern supports Andy Mabbett's alternative
* RDF output of namespaced metadata


cognition/0.1-alpha2 :-

* drop usage of XML::XPath module, using XML::DOM instead
	- might use XML::DOM::XPath in future if XPath support is needed
* support XML namespaces used as metadata namespaces.
* microformats: hcalendar (complete), rel-tag, rel-license, figure, xoxo
	- rel-license extended to support searches for 'license' in CC or
	  DCTERMS namespaces; or 'rights.license' in DC or DCTERMS namespaces
	- experimental figure microformat based on current brainstorming
* parse document structure (headings + semantic tables + semantic
  images/figures microformat? + xoxo lists)


cognition/0.1-alpha2.1 :-

* Fix handling for entities.
* Fix delay on LWP::RobotUA.


cognition/0.1-alpha3 :-

* Switch from XML::DOM to XML::LibXML. Should be my last big parser change!
* Restructure object to be more tuple-like.
* URLs:
	- Support for CURIEs.
	- support for geo: and tag: URIs
	- use XPointer to provide URLs for document fragments without identifiers
* RDF:
	- use <rdf:Bag> to wrap multiple tuples with the same subject and property
	- Remove duplicate values within bags
	- add support for microformats to RDF output
	- RDF subjects may have multiple URIs defined to help match up properties
	  that actually belong to the same subject (e.g. some properties might be
	  attached to a fragment identifier, and others to an hcard, but if we
	  know that the hcard root element has an id attribute which matches the
	  fragment identifier, then we can equate the subjects)
	- support "vocabularies" for RDF
	- convert document structure to RDF <http://purl.org/dc/terms/hasPart>,
	  <http://purl.org/dc/terms/isPartOf>.
* Improve STRINGIFY to prevent all these leading and trailing spaces
* Recognise (X)HTML predefined link types and put them in XHTML namespace.
* More reliable support for namespaces.
* Microformats:
	- Properly parse DateTimes found in microformats.
	- support table cell header pattern
	- support hcalendar 1.1 draft
* Complete support for RDFa
* Much improved support for eRDF, support rdf:type. Any bugs?
* Improved support for XHTML role attribute
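
Note on the <rdf:Bag> wrapping above: a rough sketch in Python, not the
Cognition (Perl) code. The function names, the tuple layout and the sample
"ex:" properties are hypothetical, and it assumes duplicates are removed
before deciding whether a bag is needed.

```python
def group_tuples(tuples):
    """Group (subject, property, value) tuples by subject and property.

    Repeated values for the same subject/property pair are dropped,
    keeping the first-seen order.
    """
    grouped = {}
    for subject, prop, value in tuples:
        values = grouped.setdefault((subject, prop), [])
        if value not in values:  # remove duplicate values within the group
            values.append(value)
    return grouped


def wants_bag(values):
    """Assumption: only groups with more than one value get an <rdf:Bag>."""
    return len(values) > 1


if __name__ == "__main__":
    sample = [
        ("#a", "ex:p", "x"),
        ("#a", "ex:p", "y"),
        ("#a", "ex:p", "x"),  # duplicate, dropped
        ("#a", "ex:q", "z"),
    ]
    for key, values in group_tuples(sample).items():
        print(key, values, "bag" if wants_bag(values) else "single value")
```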


cognition/0.1-alpha4 :-

* Support rel=meta: retrieve additional document metadata, parse as RDF
* GRDDL:
	- Beginnings of GRDDL support.
	- Support for rel=transformation linking to XSLT to transform doc to RDF
	- Support for grddl:transformation="" style transformations.
	- No support for <head profile> yet.
* Microformats:
	- Table cell header pattern has been changed on wiki. Implement changes.
	- Better microformat nesting handling.
* Improvements in charset handling and support for tag-soup HTML.
* Comment out pre-RDFa <link rel>, <a rel> support. It's not really useful.
* Disable eRDF by default as it seems to generate too many false positives.


cognition/0.1-alpha5 :-

* Various minor improvements to hCard and hCalendar parsing.
* Export framework
	- Add vCard export option.
		+ Parses data: URIs and outputs as base64 embedded data.
		+ Pulls in data from full gamut of supported semantics, so that, say,
		  RDFa FOAF data may end up as part of the vCard output.
		+ Test input: <http://examples.tobyinkster.co.uk/hcard>.
	- Add KML export option.
		+ Data can come from hCard, (e)RDF(a) vCard, (e)RDF(a) GeoRSS, etc.
* Re-enabled eRDF by default, but eRDF parsing is now stricter. It *requires*
  a profile of <http://purl.org/NET/erdf/profile> to be found on the <head>
  element.
* Improved command-line client. Use Getopt::Long, Pod::Usage.
* Support RDF embedded in HTML <!-- comments -->. (Trackback uses this.)
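
Note on the data: URI handling in the vCard export above: a minimal Python
sketch of turning a data: URI into base64 text, following the general shape
of RFC 2397. It is not the Cognition implementation; the function name and
return shape are hypothetical, and media-type parameters such as charset are
ignored.

```python
import base64
from urllib.parse import unquote, unquote_to_bytes


def data_uri_to_base64(uri):
    """Return (media_type, base64_text) for a data: URI."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data: URI has no comma separator")
    params = header.split(";")
    media_type = params[0] or "text/plain"  # default when the type is omitted
    if "base64" in params[1:]:
        # Payload is already base64; decode it so the output is normalised.
        raw = base64.b64decode(unquote(payload))
    else:
        # Payload is percent-encoded octets.
        raw = unquote_to_bytes(payload)
    return media_type, base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    print(data_uri_to_base64("data:text/plain,hi%21"))
```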


cognition/0.1-alpha6 :-

* Microformats:
	- Add option (disabled by default) to require <head profile> for microformat
	  support. Microformat profiles are treated as OPAQUE STRINGS! Supports the
	  following profiles:
		+ http://purl.org/uF/2008/03/
		+ http://www.w3.org/2006/03/hcard or http://purl.org/uF/hCard/1.0/
		+ http://dannyayers.com/microformats/hcalendar-profile or
		  http://purl.org/uF/hCalendar/1.0/
		+ http://purl.org/uF/hAtom/0.1/
		+ http://purl.org/uF/rel-tag/1.0/
		+ http://purl.org/uF/rel-license/1.0/
		+ No profiles required for rel-enclosure, adr or geo (yet).
	- Support for hAtom, WebSlices.
		+ In addition to hAtom 0.1, rel-enclosure is supported within hEntries.
	- Improve include-pattern support to prevent some infinite loops.
* GRDDL:
	- Add option (disabled by default) to require <head profile> for GRDDL.
	- Add option to check profile URLs for profileTransformation links.
* Export:
	- Atom output. (Supports RDF/RSS and hAtom as input.)
	- iCalendar export option.
		+ hCalendar 1.1 events.
		+ hCalendar 1.1 todo items
		+ hCalendar 1.1 freebusy info.
		+ hCalendar 1.1 alarms.
		+ hAtom entries (as VJOURNAL).
		+ W3C's iCal RDF vocab (but see note in Cognition/Export/Calendar.pm)
		+ RSS Event Module <http://web.resource.org/rss/1.0/modules/event/>
* Added a "--nofollow" option to prevent secondary fetching from particular
  hosts. (Secondary fetching = requesting <head profile>, <link rel="meta">,
  <link rel="transformation">.)
* Support <rdf:RDF> elements found directly in (X)HTML.
* Much improved HTML->Text conversion. Namely: word wrapping, line breaks added
  after block elements, quote marks around <q> elements, bullet points and
  numbers before <li> elements in unordered and ordered lists, brackets around
  superscript text, parentheses around subscripts, tab characters between table
  cells, Usenet-style quoting for <blockquote>, alt text from <img> and <input
  type="image">, values from other <input> tags. Should be able to handle nested
  elements like //ul/li/ol/li/dl/dd/blockquote/img[@alt]. Won't be completely
  foolproof, but should be an improvement over what was there before!
* Fix so that the entire page is not given a rdf:type of ical:vcalendar unless
  it contains some bona fide vevent/vtodo/valarm/vfreebusy nodes.
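
Note on "profiles are treated as OPAQUE STRINGS" above: a small Python
sketch of what that implies, namely exact string comparison with no URI
normalisation. It is illustrative only, not the Cognition code. The function
name is hypothetical, and it assumes that the <head profile> value is a
whitespace-separated list and that the umbrella profile
http://purl.org/uF/2008/03/ also enables hCard.

```python
# Profile strings copied from the list above.
HCARD_PROFILES = {
    "http://purl.org/uF/2008/03/",
    "http://www.w3.org/2006/03/hcard",
    "http://purl.org/uF/hCard/1.0/",
}


def profile_enables(profile_attr, accepted):
    """True if any token of the profile attribute is an accepted profile.

    Tokens are compared as opaque strings: no case folding, no
    trailing-slash fix-ups, no relative URI resolution.
    """
    return any(token in accepted for token in profile_attr.split())


if __name__ == "__main__":
    print(profile_enables("http://purl.org/uF/hCard/1.0/", HCARD_PROFILES))
    # Same profile without the trailing slash is a different string.
    print(profile_enables("http://purl.org/uF/hCard/1.0", HCARD_PROFILES))
```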


cognition/0.1-alpha7 :-

* Set '_xmllang' attribute on all elements, a la '_xpath'.
* Microformats:
	- hCard:
		+ Rename date-of-death to "dday", and implement other properties from
		  vCard 4.0 draft
		  <http://www.ietf.org/internet-drafts/draft-resnick-vcarddav-vcardrev-01.txt>.
		+ Empty TEL, EMAIL and IMPP no longer parsed. (e.g. telephone numbers
		  with usages but no actual number.)
		+ Automatically detect the representative hCard and contact hCard.
		  <http://microformats.org/wiki/representative-hcard>
	- hCalendar:
		+ support rel="vcalendar-(parent|sibling|child)" and class="related-to".
		+ support implicit relationships gleaned from nesting.
		+ Explicitly set RDF datatype for integers.
		+ Better support for vfreebusys.
		+ @title on root element parsed as dc:title.
		+ Support x-wr-calname/x-wr-caldesc/calscale/prodid/method.
	- XFN: <http://microformats.org/wiki/xfn-to-foaf>.
* Exports:
	- Cognition::Export::findSubject - I won't go into an explanation of why
	  this is important, but it is.
	- jCard export.
	- vCard improvements:
		+ Set TYPE parameter when ENCODING=b.
		+ Output vCard 4.0 properties. Detect instant messaging protocols which
		  have been forced into the URLs and output them as IMPP properties.
	- iCalendar improvements:
		+ Set TYPE parameter when ENCODING=b.
		+ Add RELATED-TO properties.
		+ Support X-WR-CALDESC/CALSCALE/PRODID/METHOD/VERSION.
		+ Big improvements for ATTENDEE/CONTACT/ORGANIZER.
	- RDF output no longer handled by HTMLParser -- it is in an Export module:
		+ Output RDF datatypes (e.g. <http://www.w3.org/2001/XMLSchema#date>).
		+ Output xml:lang where we can.
		+ s/rdf:Description/FOO/ where FOO is the rdf:type.
		+ Improved output for rdf:XMLLiterals.
		+ Instead of <foo:bar rdf:nodeID="X">, nest the RDF description for X.
	- RDF JSON <http://n2.talis.com/wiki/RDF_JSON_Specification> export.
* RDFa:
	- RDFa DTD has s/instanceof/typeof/. Cognition supports both (for now), but
	  prefers @typeof. Fixed this attribute to allow whitespace-delimited list
	  of (CURIE|URI)s.
	- In accordance with RDFa rules, drop resolution of absolute URIs from
	  relative URIs specified in @xmlns. This actually makes parsing dumber, but
	  it's in the recommended algorithm.
	- Improved parsing of rdf:XMLLiterals.
	- Extension to RDFa: @title parsed as rdfs:label.
* When parsing and outputting dates, retain "resolution".
* Create a data type Cognition::MagicString used in place of strings in many
  places which retains the language and XML representation of a string.
  MagicString-aware code can then pick up this data and use it if required.
  Non-MagicString-aware code should usually be able to treat the MagicString
  as if it were a string, and not notice any difference, as MagicString
  overloads the stringify function.
* More improvements to STRINGIFY:
	- Better algorithm for inserting whitespace between CDATA and inline element
	  nodes. Should prevent words from accidentally running together.
	- Implement @start and @type for lists. For unordered lists, disc markers are
	  implemented as asterisks, circle markers as hyphens, and square markers as
	  plus signs. (Much like the markers used in this ChangeLog.) For ordered
	  lists, roman numeral markers work up to 3999, and alphabetical markers up
	  to 26 -- after that, the list will revert to numeric markers.
	- Better support for microformats "value excerpting".
	- Stringify now takes care of value excerpting and the ABBR pattern.
* Better HTML->XHTML conversion routine.
* Better framework for namespaces. Old system didn't handle scoped namespaces
  (e.g. xmlns attribute on a non-root element).
* Introduce a BNode concept into the Cognition RDF model. Stored in the RDF
  triple store with dummy URIs like <bnode:///string>. This pretty much
  eliminates those ugly XPointers which littered the RDF output previously. As
  a deliberate change, <div class="vcard vcalendar"> will now result in two
  different RDF subjects, however they can be united into one subject by giving
  that node an ID attribute (because then they have proper URIs, not node IDs).
	- Adjust "->uri" methods for microformats.
	- Adjust RDFa parser to create BNodes instead of #fakeid URIs.
	- Adjust RDF export to use rdf:nodeID instead of rdf:resource/rdf:about.
* Document structure parsing was disabled in alpha4 as it made the RDF output
  ugly. Because of improvements in RDF output, and ability to use BNodes, it
  is now re-enabled by default without uglying everything up. It can still be
  disabled via options.
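
Note on the list markers described under STRINGIFY above: a Python sketch of
the stated rules (asterisk/hyphen/plus for unordered lists; roman numerals up
to 3999 and letters up to 26, otherwise numeric). It is not the Cognition
code. The function names are hypothetical, and it assumes the fallback to
numeric markers is applied per item and that @start simply offsets the item
index.

```python
# Unordered list markers as described in the changelog entry.
UNORDERED_MARKERS = {"disc": "*", "circle": "-", "square": "+"}

_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Upper-case roman numeral for 1 <= n <= 3999."""
    out = []
    for value, symbol in _ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def ordered_marker(index, list_type="1"):
    """Marker text for one <li>; list_type mirrors @type (1, a, A, i, I)."""
    if list_type in ("i", "I") and 1 <= index <= 3999:
        roman = to_roman(index)
        return roman if list_type == "I" else roman.lower()
    if list_type in ("a", "A") and 1 <= index <= 26:
        letter = chr(ord("A") + index - 1)
        return letter if list_type == "A" else letter.lower()
    return str(index)  # out of range or plain numeric list


def ordered_markers(count, list_type="1", start=1):
    """Markers for a whole list, honouring @start."""
    return [ordered_marker(start + i, list_type) for i in range(count)]


if __name__ == "__main__":
    print(ordered_markers(3, "i", start=3))
    print(ordered_markers(3, "a", start=25))
```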

future work :-

* Implement support for @xml:base and <base>.
* Microformats: hreview, hresume, hlisting?
