Where is there an end of it? | All posts tagged 'TC1'

Real Conformance for ODF?

There has been quite a lot of hubbub recently about ODF conformance, in particular about how conformance to the forthcoming ODF 1.2 specification should be defined.

A New Conformance Clause

Earlier versions of ODF (including ISO/IEC 26300) already defined conformance - it was simply a question of obeying the schema. So in ODF 1.1, for example, we had this text:

Conforming applications [...] shall read documents that are valid against the OpenDocument schema if all foreign elements and attributes are removed before validation takes place [...] (1.5)

and that was the simple essence of ODF conformance.

This is now up for reconsideration. The impetus for altering the existing conformance criteria appears to have come from a change in OASIS's procedures, which now require that specifications have “a set of numbered conformance clauses”, a requirement which seems sensible enough.

However, the freshly-drafted proposal which the OASIS TC has been considering goes further than just introducing numbered clauses: it now defines two categories of conformance:

  1. “Conforming OpenDocument Document” conformance
  2. “Conforming OpenDocument Extended Document” conformance

As shorthand, we might like to characterise these as the “pure” and “buggered-up” versions of ODF respectively.

The difference is that the “pure” version now forbids the use of foreign elements and attributes (i.e. those not declared by the ODF schema), while the “buggered-up” version permits them.
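To make the distinction concrete, here is a minimal sketch in Python of how one might tell the two categories apart. It rests on a deliberate simplification: any namespace beginning with urn:oasis:names:tc:opendocument:xmlns: is assumed to be "declared by the ODF schema" (the real schema also draws on other namespaces, and a real check would validate against the schema itself). The function name find_foreign_names, the acme namespace and the sample documents are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: every namespace under this prefix counts as "declared by the ODF schema".
ODF_NS_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:"


def _namespace(qname):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    if qname.startswith("{"):
        return qname[1:].split("}", 1)[0]
    return ""


def find_foreign_names(xml_text):
    """List element and attribute names that sit outside the ODF namespaces."""
    foreign = []
    for elem in ET.fromstring(xml_text).iter():
        if not _namespace(elem.tag).startswith(ODF_NS_PREFIX):
            foreign.append(elem.tag)
        for attr in elem.attrib:
            ns = _namespace(attr)
            # Unprefixed attributes carry no namespace; only flag namespaced ones.
            if ns and not ns.startswith(ODF_NS_PREFIX):
                foreign.append(attr)
    return foreign


# Hypothetical sample: a document using only ODF-declared names.
PURE = (
    '<office:document-content'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' office:version="1.0">'
    '<office:body><office:text><text:p>Hello</text:p></office:text></office:body>'
    '</office:document-content>'
)

# The same document with one made-up foreign element added.
EXTENDED = PURE.replace(
    "<text:p>Hello</text:p>",
    '<text:p>Hello</text:p><acme:note xmlns:acme="http://example.com/acme"/>',
)

print(find_foreign_names(PURE))      # no foreign names: a "pure" candidate
print(find_foreign_names(EXTENDED))  # the acme element is reported
```

Under this sketch, a document with an empty result would be a candidate for the “pure” category, while anything else could at best be “buggered-up”.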

Ructions

The proposal caused much debate. In support of the new conformance clause, IBM's Rob Weir described foreign elements (formerly so welcome in ODF) as proprietary extensions that are “evil” and as a “nuclear death ray gun”. Questioning the proposal, KOffice's Thomas Zander wrote that he was “worried that we are trying to remove a core feature that I depend on in both KOffice and Qt”. Meanwhile Microsoft's Doug Mahugh made a counter-proposal suggesting that ODF might adopt the Markup Compatibility and Extensibility mechanisms from ISO/IEC 29500 (OOXML).

Things came to a head in a 9-2-2 split vote last week which saw the new conformance text adopted in the new ODF committee specification by will of the majority. Following this there was some traffic in the blogosphere with IBM's Rob Weir commenting and Microsoft's Doug Mahugh counter-commenting on the vote and the circumstances surrounding it.

Shadow Play

What is to be made of all this? Maybe Sun, whose corporate memory still smarts from Microsoft's “extend and embrace” Java attempts, thinks this is a way to prevent a repeat of similar stunts for ODF. Or perhaps this is a way to carve out a niche for OpenOffice to enjoy “pure” status while competitor applications are relegated to the “buggered-up” bin. Maybe it is envisaged that governments might be encouraged to procure only systems that deal in “pure” ODF. Maybe foreign elements really are the harbinger of nuclear death.

Who knows?

Whatever the reasons behind the reasons, there is clearly an “absent presence” in all these discussions: Microsoft Office. And in particular the forthcoming Microsoft Office 2007 SP2 with its ODF support. It is never mentioned, except in an occasional nudge-nudge wink-wink sort of way.

This controversy is most bemusing. This is in part because the “Microsoft factor” appears not to be a factor anyway, since MS Office will (we are told) not use foreign elements for its ODF 1.1 support. But the main reason why this is bemusing is that this discussion (whether or not to permit foreign elements) is completely unreal. There seems to be an assumption that it matters – that conformance as defined in the ODF spec means something important when it comes to real users, real procurement, real development or real interoperability.

It doesn't mean anything real - and here's why...

Making an ODF-conformant Office Application

Let us consider the procurement rules of an imaginary country (Vulgaria, say). Let us further imagine that Vulgaria's government wants to standardize on using ODF for all its many departments. After many hours of meetings, and the expenditure of many Vulgarian Dollars on consultancy fees, the decision is finally made and an official draws up procurement rules to stipulate this:

Any office application software procured by the Government of Vulgaria must support ODF (ISO/IEC 26300), and must conform to the 'pure' conformance class defined in clause x.y of that Standard, reading and emitting only ODF documents that are so conformant.

Sorted, they think.

Now imagine a software company that has its eye on making a big sale of software licenses to Vulgaria. Unfortunately, its office application does not meet the ODF conformance criterion set out by the procurement officer. The marketing department is duly sad. But one day a bright young developer gets to hear of the problem and proposes a solution. He boldly proclaims “I can make our format ODF-conformant today!”, and proceeds to show how.

First he gets a template ODF document, like this:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:version="1.0">
  <office:body>
    <office:text>
      <text:p></text:p>
    </office:text>
  </office:body>
</office:document-content>

This document (he points out) meets the “pure” conformance criteria. Our young hacker then does a curious thing: he takes an existing (non-ODF) file from their office software, BASE-64 encodes it, and inserts the resulting text string into the text:p element of the template document.

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:version="1.0">
  <office:body>
    <office:text>
      <text:p><!-- several MBs of BASE-64 encoded content here --></text:p>
    </office:text>
  </office:body>
</office:document-content>

“There,” he proudly proclaims. “All we need to do is to wrap our current documents with the ODF wrapper when we save, and unwrap when we load – I can have a fresh build for you tomorrow.”
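For the curious, the young developer's "fresh build" need not amount to much. The following is a rough sketch in Python of the wrap-on-save and unwrap-on-load steps, not anybody's actual product code: the names wrap, unwrap and TEMPLATE are invented here, and the template is just the one-paragraph document shown above with a slot for the payload.

```python
import base64
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# The one-paragraph template, with a slot where the encoded payload goes.
TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<office:document-content\n'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"\n'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"\n'
    ' office:version="1.0">\n'
    '<office:body><office:text>\n'
    '<text:p>{payload}</text:p>\n'
    '</office:text></office:body>\n'
    '</office:document-content>\n'
)


def wrap(native_bytes):
    """Save: hide a non-ODF file inside the single text:p of the template."""
    # The Base64 alphabet contains no XML special characters, so no escaping is needed.
    payload = base64.b64encode(native_bytes).decode("ascii")
    return TEMPLATE.format(payload=payload)


def unwrap(odf_xml):
    """Load: pull the Base64 string back out of text:p and decode it."""
    root = ET.fromstring(odf_xml.encode("utf-8"))
    paragraph = root.find(".//{%s}p" % TEXT_NS)
    return base64.b64decode(paragraph.text or "")


# Stand-in for a file in the company's own (non-ODF) format.
original = b"\x00\x01 some proprietary binary format \xff"
wrapped = wrap(original)
assert unwrap(wrapped) == original  # the round trip loses nothing
```

Every element and attribute in the wrapped file is declared by the ODF schema; all that any other ODF application will ever see in it is one very long paragraph of gibberish.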

The rest of the story is not so happy: the software company makes the sale and the government of Vulgaria finds after installation that none of the files from it will interoperate with any other ODF files from other sources, despite the software company having met its procurement rules to the letter.

Far fetched?

Okay, that story makes an extreme example – but it nevertheless illustrates the point. It is possible for a smart developer to represent pretty much anything as a “pure” ODF document; any differences and incompatibilities can ever-so-easily be shoehorned into conformant ODF documents. That some software deals only in such pure ODF means precisely zero in the real world of interoperability.

The central consideration here is that ODF conformance only ever was (and is only projected to be) stated in terms of XML, and XML is (in)famously “all syntax and no semantics”. The semantics of an ODF document (broadly, all the narrative text in the specification) play no part in conformance and can remain unimplemented in a conformant processor. An ODF developer can safely use just the schema and never read much else. All those descriptions of element behaviour can be ignored for the purposes of achieving ODF conformance. [N.B. mistakes in this para corrected following comment from Rob Weir, below]

So my question is: what is the current debate on ODF conformance really about? It looks to me like mis-directed effort.

What ODF might usefully do is to look at the “application description” feature introduced into OOXML. This describes several types of applications, including a type called “full”. Such applications have “a semantic understanding of every feature within [their] conformance class”, and

“Semantic understanding” is to be interpreted that an application shall treat the information in Office Open XML documents in a manner consistent with the semantic definitions given in this Specification.

In other words, it is possible to specify in OOXML procurement that the processor should heed the narrative description within that Standard (not just the XML grammar). ODF currently lacks this. In my view if there is to be any connection between a definition of ODF conformance and the experience of users in the real world, then something like OOXML's “application description” feature is urgently needed. And it might be better done now, than hastily inserted during a JTC 1 BRM ...

SC 34 Meetings, Okinawa – Day 5 and Summary

Okinawan Entertainment
Singer Azusa Miyagi from the Okinawan pop duo Tink Tink

Day 5 found us all attending a joint session of the working groups to sort out more administrative details and share the recommendations made by WG 5. Anybody interested in seeing the state of play in WG 4’s work can consult the WG 4 web site, where progress on defect correction can be tracked.

Overall this has been, I think, a successful meeting: the two new working groups are up and running and their work is well underway. There was perhaps the occasional trace of residual angst overhanging from last year: NBs are keen to assert their sovereignty in the decision making process, and the Ecma delegates are keen to be assured the JTC 1 processes can deliver the mechanisms and timeliness needed to keep IS 29500 in shape. In general however, there has been a decided “unclenching” as delegates warmed to the (let’s face it) sometime drudgery of maintaining XML document formats. This was all helped by the exceptional hospitality shown by JISC and ITSCJ in hosting the meetings, and in particular by the efforts of WG 4 convenor Murata Makoto. Whenever one wanted to know where to eat, what to drink, or where the prettiest singers could be found, Murata-san was your man!

It was great also to work with Jesper Lund Stocholm, who has also been covering these meetings on his blog. It would be better still if more countries followed Denmark, and more companies followed the shining example of CIBER, in supplying experts for assisting in this important work.

It was something of a shock coming from the 23°C sunshine of Okinawa to freezing snow-bound Britain. And also a shock to review the amount of standards work piling up to be done before the next SC 34 meetings in Prague: defects to be filed, maintenance agreements to be hammered out, agendas to write, ballots to vote on and proposals to draft. I am expecting the Prague meeting to be particularly vibrant, not least since it is preceded by XML Prague 2009. I have not been to an XML Prague before, but have heard only good things about it. Certainly, the programme looks fascinating (though I make no claims for my own presentation). It certainly seems that Prague is going to be the centre of the world for XML-heads everywhere in late March …

SC 34 Meetings, Okinawa - Day 2

Sea Snake for Supper
A soup of sea snake, pig's trotter and seaweed

Another day of work in the hotel: which is a shame since the weather outside has been even warmer and very sunny. This morning was mostly given over to a meeting (via Skype™) with OASIS people to discuss how the future maintenance of ODF might be handled. This was a very constructive exchange, and while there are many details to work out over the coming weeks, my personal impression was that all parties felt confident a good solution was in reach, and that the era of megaphone diplomacy on this topic was behind us all.

The afternoon was given over to drafting meeting notes, further readings of the JTC 1 Directives, and preparations for the WG meetings tomorrow. The coming-together of a number of people interested in both OOXML and ODF has led to some interesting lobby discussions over future directions for these standards. The groovy (but as yet unimplemented) new feature of RDF in ODF for metadata capture has certainly caught the imagination: might an NB propose that this feature is added to OOXML via an amendment? Conversely, the fact that a whole bunch of spreadsheet functions have been standardised in ISO/IEC 29500 (OOXML) potentially saves ODF a lot of work/pages. Certainly any new International Standard version of ODF would need a cast-iron reason to eschew borrowing any of these existing function definitions. Harmonious times may lie ahead …

In the evening Murata Makoto (who seems determined to test our Western sensibilities) took us for a meal of sea snake: a rare Okinawan delicacy. The charming old lady proprietor of the restaurant had been cooking our snake all day (we had had to place our orders yesterday). She explained that traditionally the sea snake was the food of kings, not because of rarity but because of the difficulty of preparation. Once the snake is caught it is smoked, turning it black. The snake is then boiled for one or two days (before domestic ovens this was a real chore) and at some point the many tiny bones in it have to be removed by hand.

And the taste? Well, it was certainly not like chicken. Quite chewy (so much muscle!), and a little like a gamier version of smoked mackerel. Yumsk.

SC 34 Meetings, Okinawa - Day 1

The Ruins of Nakagusuku
The Ruins at Nakagusuku

Today (apart from visiting the Ruins of Nakagusuku Castle) was mostly given over to a discussion of JTC 1 Procedure. The JTC 1 Directives collectively make an ever-surprising document: just when you think you've got your head around some point, a new paragraph is discovered which calls it all into question. When combined with jet lag, this can be heavy work.

I have been following the tweets of some fellow SC 34 people as they make their ways from various corners of the globe to join the Okinawa meetings. I expect tomorrow will see the influx of even more standards wonks into the hotel. Already there is a certain amount of "geeking out" going on. Over the breakfast table the hot topic of discussion concerned marking-up pagination decisions in documents from systems which flowed footnotes over multiple pages. And at lunch there was much debate over whether editors should prefer using "shall" to "has to" in standards documents (consensus: non-native English speakers find "shall" easier to understand).

SC 34 Meetings, Okinawa - Day 0

Okinawan Bloom
It is nice to get away from the freezing drizzle of the UK,
to the milder climes and bright sunshine of Okinawa.

I am in Okinawa for a week of ISO/IEC JTC 1 SC 34 meetings. To be precise, these are not meetings of SC 34 itself (there will be no plenary), rather the week will be taken up with two activities by parts of SC 34:

  • On Monday and Tuesday, a team picked by our Chairman will meet to discuss the maintenance procedures for ODF among themselves, and with OASIS representatives.
  • On Wednesday, Thursday and Friday SC 34’s two new working groups, WG 4 and WG 5, will meet.

These in turn will generate plenty of input for SC 34’s full Prague meeting in March.

ODF Maintenance

I have already written about the background to this activity, both the issues caused by the current lack of agreement on how maintenance should proceed, and JTC 1’s instruction to SC 34 from Nara that SC 34 and OASIS should develop a document specifying “detailed operation of joint maintenance procedures”.

At this stage the negotiations are completely informal, and expected simply to offer an opportunity for all parties to have an open discussion aimed at increasing the level of mutual understanding to a point where they are ready to start working together in earnest on drafting the agreement text. For SC 34, this text will need to be presented to members in time for consideration in Prague, at which meeting it will seek SC 34’s blessing to be passed up to JTC 1 for further consideration.

WG 4 & WG 5

Okinawa will see the first two meetings of our two new working groups, WG 4 (dedicated to maintenance of ISO/IEC 29500, aka OOXML), and WG 5 (dedicated to document file format interoperability). Both groups are expected to meet face-to-face more frequently than the rest of SC 34, and to make heavy use of the newfangled teleconferencing technology that JTC 1 has recently embraced.

WG 4’s business in the short term will be largely taken up with correcting defects in the 29500 text (in JTC 1 parlance, producing corrigenda) in response to reported defects. A number of these have been submitted already, by Japan, the UK and Ecma themselves. The UK has a large number of additional ones brewing and is likely to submit a second batch in February.

WG 5’s short-term work is to concentrate on the Technical Report (a more informal document than an International Standard) being drafted which sets out some of the considerations when mapping between ISO/IEC 26300 (ODF 1.0) and ISO/IEC 29500 (OOXML). I’m wondering too whether there will be any moves in this WG to garner support for new work in this area. Now that the dust has settled over document formats themselves, even non XML experts are beginning to grok that by themselves these standards don’t actually give us that much, but are a useful foundation on which to work. “Interoperability” in particular requires so much more than simply having standardised document formats. I await developments in this space with interested anticipation …