Mastodon
Where is there an end of it? | All posts tagged 'enmark'

SC 34 meetings, Copenhagen

This week I attended 4 days of meetings of SC 34 working groups. WG 4 (OOXML maintenance) and WG 5 (OOXML/ODF interoperability). Last year I predicted that OOXML would get boring and, on the whole, the content of these meetings fulfilled that prophecy (while noting, of course, that for us markup geeks and standards wonks “boring” is actually rather exciting). There was however, one hot issue, which I’ll come to later …

Defects

Since the publication of OOXML in November last year, the work of WG 4 has been almost exclusively focussed on defect correction. To date over 200 defects have been submitted (the UK leading the way having submitted 38% of them). Anybody interested in what kinds of thing are being fixed can consult the material on the WG 4 web site for a quick overview. During the Copenhagen meeting WG 4 closed 53 issues meaning that 71% of all submitted defects submitted have now been resolved. By JTC 1 standards that is impressively rapid. The defects will now go forward to be approved by JTC 1 National Bodies before they can become official Amendments and Corrigenda to the base Standard. Among the many more minor fixes, a couple of agreed changes are noteworthy:

  • In response to a defect report from Switzerland, for the Strict version (only) of IS 29500, the XML Namespace has been changed, so that consumers can know unambiguously whether they are consuming a Strict or Transitional document without any risk of silent data loss. This is (editorially) a lot of work, but the results will be I think worthwhile.
  • As I wrote following the Prague meeting, there was a move afoot to re-instate the values “on” and “off” as permissible Boolean values (alongside “true”, “1”, “false” and “0”) so that Transitional documents would accurately reflect the existing corpus of Office documents, in accord with the stated scope of the standard. This change has now been agreed by the WG.

The ISO Date Issue

The “hot issue” I referred to earlier is ISO dates. What better way to illustrate the problem we face than by using one of Denmark’s most famous inventions, the LEGO® brick …


f*cked-up lego brick
OOXML Transitional imagined as a LEGO® brick

More precisely, the problem is about date representation in spreadsheet cells. One of the innovations of the BRM was to introduce ISO 8601 date representation into OOXML. However the details of how this have been done are problematic:

  1. Despite the text of the original resolution declaring that ISO 8601 dates should live alongside serial dates (for compatibility with older documents), one possible reading of the text today is that all spreadsheet cell dates have to be in ISO 8601 format
  2. Having any such dates in ISO 8601 format is particularly problematic for Transitional OOXML, which is meant to represent the existing corpus of office documents. Since none of these existing documents contain ISO 8601 dates having them here makes no sense
  3. Even more seriously, if people start creating “Transitional” OOXML documents which contain 8601 dates, then a huge installed base of software expecting Ecma-376 documents will silently corrupt that data and/or produce surprising results. (My concern here is more for things like big ERP and Financial systems, rather than desktop apps like MS Office). Hence the odd LEGO brick above: like those bricks, such files would embody an interoperability disaster
  4. Even the idea of using ISO 8601 is pretty daft unless it is profiled (currently it is not). ISO 8601 is a big complex standards with many options irrelevant to office documents: it would be far more sensible for OOXML to target a tiny subset of ISO 8601, such as that specified by W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes
  5. Many date/time functions declared in spreadsheetML appear to be predicated on date/time values being represented as serial values and not ISO 8601 dates. It is not clear if the Standard as written makes any sense when ISO 8601 dates are used.

The solution?

Opinions vary about how the ISO date problem might best be solved. My own strong preference would be to see the Standard clarified so that users were sure that Transitional documents were guaranteed to correspond to the existing document reality – in other words that Transitional documents only contain serial date representations, and not “ISO 8601” date representations. In my view the ISO dates should be for use in the Strict variant of OOXML only.

If a major implementation (Excel 2010, say) appears which starts pumping incompatible, ISO 8601-flavoured Transitional documents into circulation, then that would be an interop disaster. The standards process would be rightly criticised for producing a dangerous format; users would be at risk of corrupted data; and guilty vendors too would be blamed accordingly.

It is imperative that this problem is fixed, and fixed soon.

Impressions of Copenhagen

Our meetings took place at the height of midsummer, and every day was glorious (all meetings started with a sad closing of the curtains). Something of a pall was cast over proceeding by thefts, in two separate incidents, of delegates’ laptop computers; but there is no doubt Copenhagen is a wondeful city blessed with excellent food, tasty beer, and an abundance of good-looking women. Dansk Standard provided most civilised meeting facilities, and entertainment chief Jesper Lund Stocholm worked hard to ensure everyone enjoyed Copenhagen to the full, especially the midsummer night witch-burning festivities! Next up is the SC 34 Plenary in Seattle in September; I’m sure there will be many more defect reports on OOXML to consider by then, and that WG 4's tireless convenor Murata-san will be keeping the pressure on to mantain the pace of fixes  …


Jesper Among the Beers
Jesper Lund Stocholm