Where is there an end of it? | Alex Brown's weblog

The Tree by King's College Bridge


I seem to have got into the habit of taking HDR pictures into the sun; I like the result and the way that this "justifies" the use of HDR (a normal exposure just wouldn't work here).

Real Conformance for ODF?

There has been quite a lot of hubbub recently about ODF conformance, in particular about how conformance to the forthcoming ODF 1.2 specification should be defined.

A New Conformance Clause

Earlier versions of ODF (including ISO/IEC 26300) already defined conformance - it was simply a question of obeying the schema. So in ODF 1.1, for example, we had this text:

Conforming applications [...] shall read documents that are valid against the OpenDocument schema if all foreign elements and attributes are removed before validation takes place [...] (1.5)

and that was the simple essence of ODF conformance.
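To make the ODF 1.1 rule concrete, here is a minimal sketch of the "remove foreign elements and attributes" step that precedes validation. It rests on a loud assumption: "foreign" is approximated as "in a namespace that does not start with the OpenDocument URN prefix", whereas the specification defines it by reference to the schema (which also uses a few non-OASIS namespaces). The helper names (strip_foreign, is_foreign) are hypothetical, and the schema validation itself is left out.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: anything outside this URN prefix is treated as foreign.
# The real schema also draws on other namespaces (XLink, Dublin Core, ...),
# so a faithful implementation would need a fuller allow-list.
ODF_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:"


def namespace_of(name):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""


def is_foreign(name):
    """True if the name is namespaced and the namespace is not an ODF one."""
    if not isinstance(name, str):  # comments and PIs have non-string tags
        return False
    ns = namespace_of(name)
    return ns != "" and not ns.startswith(ODF_PREFIX)


def strip_foreign(element):
    """Remove foreign attributes and child elements, recursively, in place."""
    for attr in [a for a in element.attrib if is_foreign(a)]:
        del element.attrib[attr]
    for child in list(element):
        if is_foreign(child.tag):
            # Note: this also discards any text trailing the removed child.
            element.remove(child)
        else:
            strip_foreign(child)
    return element
```

Whatever survives the stripping is what would then be handed to a schema validator; under this reading a document conforms if that remainder is valid.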

This is now up for reconsideration. The impetus for altering the existing conformance criteria appears to have come from a change in OASIS's procedures, which now require that specifications have “a set of numbered conformance clauses”, a requirement which seems sensible enough.

However, the freshly-drafted proposal which the OASIS TC has been considering goes further than just introducing numbered clauses: it now defines two categories of conformance:

  1. “Conforming OpenDocument Document” conformance
  2. “Conforming OpenDocument Extended Document” conformance

As shorthand, we might like to characterise these as the “pure” and “buggered-up” versions of ODF respectively.

The difference is that the “pure” version now forbids the use of foreign elements and attributes (i.e. those not declared by the ODF schema), while the “buggered-up” version permits them.

Ructions

The proposal caused much debate. In support of the new conformance clause, IBM's Rob Weir described foreign elements (formerly so welcome in ODF) as proprietary extensions that are “evil” and as a “nuclear death ray gun”. Questioning the proposal, KOffice's Thomas Zander wrote that he was “worried that we are trying to remove a core feature that I depend on in both KOffice and Qt”. Meanwhile Microsoft's Doug Mahugh made a counter-proposal suggesting that ODF might adopt the Markup Compatibility and Extensibility mechanisms from ISO/IEC 29500 (OOXML).

Things came to a head in a 9-2-2 split vote last week which saw the new conformance text adopted in the new ODF committee specification by will of the majority. Following this there was some traffic in the blogosphere with IBM's Rob Weir commenting and Microsoft's Doug Mahugh counter-commenting on the vote and the circumstances surrounding it.

Shadow Play

What is to be made of all this? Maybe Sun, whose corporate memory still smarts from Microsoft's “extend and embrace” Java attempts, thinks this is a way to prevent a repeat of similar stunts for ODF. Or perhaps this is a way to carve out a niche for OpenOffice to enjoy “pure” status while competitor applications are relegated to the “buggered-up” bin. Maybe it is envisaged that governments might be encouraged to procure only systems that deal in “pure” ODF. Maybe foreign elements really are the harbinger of nuclear death.

Who knows?

Whatever the reasons behind the reasons, there is clearly an “absent presence” in all these discussions: Microsoft Office. And in particular the forthcoming Microsoft Office 2007 SP2 with its ODF support. It is never mentioned, except in an occasional nudge-nudge wink-wink sort of way.

This controversy is most bemusing. This is in part because the “Microsoft factor” appears not to be a factor anyway, since MS Office will (we are told) not use foreign elements for its ODF 1.1 support. But the main reason why this is bemusing is that this discussion (whether or not to permit foreign elements) is completely unreal. There seems to be an assumption that it matters – that conformance as defined in the ODF spec means something important when it comes to real users, real procurement, real development or real interoperability.

It doesn't mean anything real - and here's why...

Making an ODF-conformant Office Application

Let us consider the procurement rules of an imaginary country (Vulgaria, say). Let us further imagine that Vulgaria's government wants to standardize on using ODF for all its many departments. After many hours of meetings, and the expenditure of many Vulgarian Dollars on consultancy fees, the decision is finally made and an official draws up procurement rules to stipulate this:

Any office application software procured by the Government of Vulgaria must support ODF (ISO/IEC 26300), and must conform to the 'pure' conformance class defined in clause x.y of that Standard, reading and emitting only ODF documents that are so conformant.

Sorted, they think.

Now imagine a software company that has its eye on making a big sale of software licenses to Vulgaria. Unfortunately, its office application does not meet the ODF conformance criterion set out by the procurement officer. The marketing department is duly sad. But one day a bright young developer gets to hear of the problem and proposes a solution. He boldly proclaims “I can make our format ODF-conformant today!”, and proceeds to show how.

First he gets a template ODF document, like this:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:version="1.0">
  <office:body>
    <office:text>
      <text:p></text:p>
    </office:text>
  </office:body>
</office:document-content>

This document (he points out) meets the “pure” conformance criteria. Our young hacker then does a curious thing: he takes an existing (non-ODF) file from their office software, BASE-64 encodes it, and inserts the resulting text string into the <text:p> element in the template document.

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:version="1.0">
  <office:body>
    <office:text>
      <text:p><!-- several MBs of BASE-64 encoded content here --></text:p>
    </office:text>
  </office:body>
</office:document-content>

“There,” he proudly proclaims. “All we need to do is to wrap our current documents with the ODF wrapper when we save, and unwrap when we load – I can have a fresh build for you tomorrow.”

The rest of the story is not so happy: the software company makes the sale and the government of Vulgaria finds after installation that none of the files from it will interoperate with any other ODF files from other sources, despite the software company having met its procurement rules to the letter.
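For the sceptical, the young developer's wrap-on-save and unwrap-on-load trick can be sketched in a few lines of Python. This is an illustration only: the function names wrap and unwrap are hypothetical, the payload stands in for any proprietary file, and a real ODF file would also need the ZIP packaging and manifest, which are ignored here.

```python
import base64
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Keep the familiar prefixes when serialising.
ET.register_namespace("office", OFFICE_NS)
ET.register_namespace("text", TEXT_NS)


def wrap(payload: bytes) -> bytes:
    """Hide arbitrary bytes inside a single paragraph of an ODF-shaped document."""
    root = ET.Element(
        f"{{{OFFICE_NS}}}document-content",
        {f"{{{OFFICE_NS}}}version": "1.0"},
    )
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    para = ET.SubElement(text, f"{{{TEXT_NS}}}p")
    para.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def unwrap(document: bytes) -> bytes:
    """Recover the original bytes from a document produced by wrap()."""
    root = ET.fromstring(document)
    para = root.find(f".//{{{TEXT_NS}}}p")
    if para is None:
        raise ValueError("no text:p element found")
    return base64.b64decode(para.text or "")
```

The round trip is lossless for the vendor's own software, while any other ODF application would see only a single paragraph of gibberish.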

Far fetched?

Okay, that story makes an extreme example – but it nevertheless illustrates the point. It is possible for a smart developer to represent pretty much anything as a “pure” ODF document; any differences and incompatibilities can ever-so-easily be shoehorned into conformant ODF documents. That some software deals only in such pure ODF means precisely zero in the real world of interoperability.

The central consideration here is that ODF conformance only ever was (and is only projected to be) stated in terms of XML, and XML is (in)famously “all syntax and no semantics”. The semantics of an ODF document (broadly, all the narrative text in the specification) play no part in conformance and can remain unimplemented in a conformant processor. An ODF developer can safely use just the schema and never read much else. All those descriptions of element behaviour can be ignored for the purposes of achieving ODF conformance. [N.B. mistakes in this para corrected following comment from Rob Weir, below]

So my question is: what is the current debate on ODF conformance really about? It looks to me like mis-directed effort.

What ODF might usefully do is to look at the “application description” feature introduced into OOXML. This describes several types of applications, including a type called “full”. Such applications have “a semantic understanding of every feature within [their] conformance class”, and

“Semantic understanding” is to be interpreted that an application shall treat the information in Office Open XML documents in a manner consistent with the semantic definitions given in this Specification.

In other words, it is possible to specify in OOXML procurement that the processor should heed the narrative description within that Standard (not just the XML grammar). ODF currently lacks this. In my view if there is to be any connection between a definition of ODF conformance and the experience of users in the real world, then something like OOXML's “application description” feature is urgently needed. And it might be better done now, than hastily inserted during a JTC 1 BRM ...

SC 34 Meetings, Okinawa – Day 5 and Summary

Okinawan Entertainment
Singer Azusa Miyagi from the Okinawan pop duo Tink Tink

Day 5 found us all attending a joint session of the working groups to sort out more administrative details and share the recommendations made by WG 5. Anybody interested in seeing the state of play in WG 4’s work can consult the WG 4 web site, where progress on defect correction can be tracked.

Overall this has been, I think, a successful meeting: the two new working groups are up and running and their work is well underway. There was perhaps the occasional trace of residual angst overhanging from last year: NBs are keen to assert their sovereignty in the decision making process, and the Ecma delegates are keen to be assured the JTC 1 processes can deliver the mechanisms and timeliness needed to keep IS 29500 in shape. In general however, there has been a decided “unclenching” as delegates warmed to the (let’s face it) sometime drudgery of maintaining XML document formats. This was all helped by the exceptional hospitality shown by JISC and ITSCJ in hosting the meetings, and in particular by the efforts of WG 4 convenor Murata Makoto. Whenever one wanted to know where to eat, what to drink, or where the prettiest singers could be found, Murata-san was your man!

It was great also to work with Jesper Lund Stocholm, who has also been covering these meetings on his blog. It would be better still if more countries followed Denmark, and more companies followed the shining example of CIBER, in supplying experts for assisting in this important work.

It was something of a shock coming from the 23°C sunshine of Okinawa to freezing snow-bound Britain. And also a shock to review the amount of standards work piling up to be done before the next SC 34 meetings in Prague: defects to be filed, maintenance agreements to be hammered out, agendas to write, ballots to vote on and proposals to draft. I am expecting the Prague meeting to be particularly vibrant, not least since it is preceded by XML Prague 2009. I have not been to an XML Prague before, but have heard only good things about it. Certainly, the programme looks fascinating (though I make no claims for my own presentation). It certainly seems that Prague is going to be the centre of the world for XML-heads everywhere in late March …

SC 34 Meetings, Okinawa - Days 3 & 4

Hotel Walkway at Sunset

Two days of hard grind. A lot of administrivia to sort (meeting dates, etc.); many paragraphs of the Directives to read; many defect reports on OOXML to address; and some vigorous discussion to be had about interoperability.

Some concrete progress was made, notably:

  • The first defect reports on ISO/IEC 29500 (aka OOXML) were addressed, and fixes agreed
  • Some principles were established for how updates (as opposed to fixes) for OOXML might be processed
  • Some useful discussions in WG 5 clarified the scope of the ongoing work drafting a technical report giving guidance on how 29500 (OOXML) and 26300 (ODF) can interoperate

From my perspective, the most exciting discussion during these meetings centred on a presentation from the ODF editor, Patrick Durusau, on what he called “true” interoperability. Patrick (betraying his Topic Maps background) set out a suggestion that a PSI might be created to identify the document constructs described by the two document format Standards, and that each PSI might be in turn associated with metadata and documentation related to that construct. Essentially, this approach views the “problem” of interoperability between ODF and OOXML as a problem of documentation — though Patrick also pointed out that the interoperability problem had already been solved by corporations (maybe he meant Microsoft, for example) and that these corporations were, perhaps churlishly, keeping the information to themselves.

I see the establishment of such rich descriptive material as being a first important step on a road which leads to the dissolving of what we currently see as meaningful differences between the document formats. Perhaps in time the rest of the world will come to realise too that when we talk of a preference for ODF and OOXML we are, in the main, expressing a preference for syntax, and that the juvenile “OOXML vs ODF” arguments – however much they are loaded with corporate agendas masquerading as moral superiority – will achieve precisely nothing for those who matter: the end users.

SC 34 Meetings, Okinawa - Day 2

Sea Snake for Supper
A soup of sea snake, pig's trotter and seaweed

Another day of work in the hotel: which is a shame since the weather outside has been even warmer and very sunny. This morning was mostly given over to a meeting (via Skype™) with OASIS people to discuss how the future maintenance of ODF might be handled. This was a very constructive exchange, and while there are many details to work out over the coming weeks, my personal impression was that all parties felt confident a good solution was in reach, and that the era of megaphone diplomacy on this topic was behind us all.

The afternoon was given over to drafting meeting notes, further readings of the JTC 1 Directives, and preparations for the WG meetings tomorrow. The coming-together of a number of people interested in both OOXML and ODF has led to some interesting lobby discussions over future directions for these standards. The groovy (but as yet unimplemented) new feature of RDF in ODF for metadata capture has certainly caught the imagination: might an NB propose that this feature is added to OOXML via an amendment? Conversely, the fact that a whole bunch of spreadsheet functions have been standardised in ISO/IEC 29500 (OOXML) potentially saves ODF a lot of work/pages. Certainly any new International Standard version of ODF would need a cast-iron reason to eschew borrowing any of these existing function definitions. Harmonious times may lie ahead …

In the evening Murata Makoto (who seems determined to test our Western sensibilities) took us for a meal of sea snake: a rare Okinawan delicacy. The charming old lady proprietor of the restaurant had been cooking our snake all day (we had had to place our orders yesterday). She explained that traditionally the sea snake was the food of kings, not because of rarity but because of the difficulty of preparation. Once the snake is caught it is smoked, turning it black. The snake is then boiled for one or two days (before domestic ovens this was a real chore) and at some point the many tiny bones in it have to be removed by hand.

And the taste? Well, it was certainly not like chicken. Quite chewy (so much muscle!), and a little like a gamier version of smoked mackerel. Yumsk.