Notes on Document Conformance and Portability #1

Richard Gillam’s handy book, Unicode Demystified: A Practical Programmer’s Guide to the Encoding Standard, contains an example of right-to-left text appearing in a prevailing left-to-right writing direction:

Avram said “מזל טוב.‏” and smiled.

Whether you see here what you are meant to see here will depend on your browser's Unicode support, and whether you have Hebrew fonts installed. Properly rendered, it will look something like this:

In reading order, the first character after “said” is the “מ” character to the left of the closing quotation mark. The text then runs from right to left until the full-stop, and then resumes with “and smiled”. In Unicode, this text is not represented in rendering order, but reading order – it is up to the renderer to make space and reverse direction at the correct points. Here is the text represented as XML in a paragraph in an ODF document (get the document here):

<text:p>Avram said “&#x5de;&#x5d6;&#x5dc; &#x5d8;&#x5d5;&#x5d1;.&#x200f;” and smiled.</text:p>

One of the great things about XML is its solid basis in Unicode and therefore its use of the Universal Character Set (ISO/IEC 10646). XML permits a number of encodings of this character set, and in the XML above the numeric character reference mechanism is used for the Hebrew characters. Notice, immediately after the full stop in the character sequence (and so to its left when rendered), the use of U+200F 'RIGHT-TO-LEFT MARK', which specifies that the full stop is part of the right-to-left character sequence.
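To see what those numeric character references resolve to, here is a minimal Python sketch. It assumes the paragraph is parsed on its own, so the ODF text namespace has to be declared on the element by hand; the `describe` helper and the variable names are illustrative, not part of any ODF tooling. It lists each character in stored (reading) order along with its Unicode bidirectional class.

```python
import unicodedata
import xml.etree.ElementTree as ET

# ODF text namespace, declared here only so the fragment parses standalone.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

fragment = (
    f'<text:p xmlns:text="{TEXT_NS}">Avram said \u201c'
    "&#x5de;&#x5d6;&#x5dc; &#x5d8;&#x5d5;&#x5d1;.&#x200f;"
    "\u201d and smiled.</text:p>"
)

# The XML parser resolves the numeric character references for us.
text = ET.fromstring(fragment).text


def describe(s):
    """Return (code point, name, bidi class) for each character, in stored order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.bidirectional(ch))
        for ch in s
    ]


# Print only the quoted Hebrew run: everything between the curly quotes.
start = text.index("\u201c") + 1
end = text.index("\u201d")
for code_point, name, bidi_class in describe(text[start:end]):
    print(code_point, bidi_class, name)
```

The Hebrew letters and the RIGHT-TO-LEFT MARK report the strong class `R`, while the full stop has no strong direction of its own. That is why the mark is needed: it gives the full stop a right-to-left neighbour on both sides, so a renderer following the Unicode bidirectional algorithm keeps it inside the Hebrew run.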

Viewing this document in three ODF applications (OpenOffice 3, Google Docs with Firefox, and the new MS Office 2007 SP2) gives the correct result every time. That is good news.

And if, for an ODF application, the character sequence did not appear correctly (if, say, the full stop was out of place), we would be able to say unequivocally that it was faulty, and we would be able to point to the Unicode specification where the correct behaviour was described. We (the users) would be able to bang the table and demand that the bug be fixed.

This kind of process is one of the pillars of conformance testing: application conformance testing, to be exact. Where we have a solid spec and observable behaviour we can compare the two and make a judgement.

Where we don't have a solid spec, things get trickier. From the standardiser's viewpoint, and if it's not too highfalutin (and anyway, I claim Cambridge resident's special rights), we might want to quote Wittgenstein on such occasions: "Whereof one cannot speak, thereof one must be silent".

Comments (13) -

  • Alex

    5/8/2009 1:43:20 AM |

    Truly we Illuminati have been rumbled!

    Luckily they didn't find out about Kevin Bacon ...

  • anonymous-insider

    5/8/2009 2:05:43 AM |

    Let me bring back an important issue...

    **********************************************

    Apple v. EFF: The iPhone Jailbreaking Showdown

    By David Kravets May 2, 2009 http://www.wired.com

    PALO ALTO, California – To jailbreak or not to jailbreak the iPhone.

    That was the heated topic of discussion late Friday between Apple’s iPhone marketing czar Greg Joswiak, Fred von Lohmann, the Electronic Frontier Foundation’s copyright genius, Copyright Office officials including registrar Marybeth Peters, the record labels, movie studios and software industry.

    Apple vigorously opposed authorizing jailbreaking, saying copyright protections is what gave birth to the iPhone, the 1 billion app sales, 50,000 app developers and 35,000 apps. The EFF vigorously urged the Copyright Office to authorize jailbreaking, which in this case is hacking the phone’s OS, and hence allowing consumers to run any app on the phone they want, including those not authorized by Apple.

    “It is my automobile at the end of the day,” von Lohmann said, a reference that iPhone users should be allowed to do what they want with their phones, just like car owners do.

    At stake for Apple is the very closed business model Apple has enjoyed since 2007, when the iPhone debuted. More than 30 million have been sold so far. “This would severely limit our ability to continue what we are doing as well as innovate for the future,” Joswiak said.


    The panelists squared off here as part of the Copyright Office’s three-year review on whether to grant exemptions to the Digital Millennium Copyright Act of 1998. The act forbids circumventing encryption technology to copy or modify copyrighted works – in this instance encryption protecting the bootloader connected to the OS operating system itself.

    Neither Peters nor the three Copyright Office attorneys at the three-hour hearing here tipped their hats on whether they would recommend the librarian of Congress to grant the exemption, a decision expected later this year. Still, the changeover seemed unlikely as the Copyright Office has repeatedly denied consumer-friendly oriented fair use changes, such as requests to make up backup copies of DVDs or video games, as well as requests for exemptions to enable copying DVDs to laptops and portable devices.

    Despite the iPhone’s popularity, about three-dozen people attended the marathon, three-hour jailbreaking hearing here at Stanford University. The hearing came just months before Apple releases its OS 3.0, its latest operating iPhone system. There is an estimated 1 million-plus jailbroken iPhones, von Lohmann said.

    Apple said jailbreaking would amount to such a major modification – for example with apps turning the iPhone into a WiFi center — that the law does not permit it. Apple maintained allowing any app on the iPhone could be detrimental to the phone’s functionality that Apple will be overrun by service calls from angry customers. It also goes against the agreements it has with its 30 phone-connection carriers worldwide, Joswiak said.

    Ben Golant, the Copyright Office’s assistant general counsel, asked whether AT&T, the exclusive provider for the iPhone in the United States, “prohibits you from implementing certain applications?”

    “We don’t allow any bandwidth hogs,” Joswiak said. He added that the Cupertino company does not allow apps associated with porn or other distasteful content, including a so-called “Baby Shaker” app barred last week.

    The EFF, of San Francisco, requested the exemption.

    Apple (.pdf) fears opening its iPhone platform to non-approved apps could cost it money. It earns 30 percent for every app sold, but would get nothing from those not sold via iTunes. Most important, however, it fears that opening up the OS would lead to piracy of sanctioned iPhone apps as well as create a giant iPhone platform to play and copy infringing content like movies and games.

    The Motion Picture Association of America, the Recording industry Association of America, the Business Software Alliance and others said the DMCA does not allow  circumvention exemptions to create a “venue for infringing activity.”

    “The impact will be to open up fast fields for the manufacturers and purveyors of pirated games,” said Steve Metalitz, a representative for those groups.

    But von Lohmann countered and said the exemption, which would apply to all mobile phones including Google’s Android platform, is warranted because opening a venue for third-party apps is, by itself, a non-infringing activity the DMCA authorizes.

    “This is a close ecosystem of a business model,” von Lohmann said, adding:
    “I don’t think Congress meant that when they passed the DMCA.”

    The DMCA, which President Clinton signed ten years ago, dictates “no person shall circumvent a technological measure that effectively controls access to a work protected under this title.”

    But under the law, every three years the Librarian of Congress is charged with considering the public’s request for exemptions to that anti-circumvention language.

    An exception adopted during the last review in 2006 granted mobile-phone owners the right to circumvent the technological locks on their phones.

    Doing that allows users to switch phone carriers without buying a new phone. That is up for review again this year, and Peters and the other members of the Copyright Office entertained proposals to extend that for another three years as well.

    Over the decade, a handful of other exemptions have been granted. They include circumvention of anti-copying restrictions on DVDs for the purpose of making compilations of portions of those works for educational use in a classroom. Another was directed at the blind, allowing the circumvention of an e-book’s shuttered read-aloud function. Another allows the circumvention of access controls on CDs to research for security flaws.

    Copyright 2009

    **********************************************

    Groklaw wrote:

    "It may be your automobile; but it's Apple's business, associated with the apps, not the gadget alone, and it's Apple's brand. Brand matters, and brand is built on quality. If, for example, someone passed a law that everyone could put whatever they wish on Groklaw and I lost control over that, I'd shut Groklaw down rather than see it ruined. From dealing with spammers, I know exactly how ruined it would quickly be. Apple says they block bandwidth hog apps and porn and other things they don't want to be associated with, like Baby Shaker. I'd want to be able to do that too, because it affects the brand, not just the business model. At the end of the day, this will be decided on the basis of the legal issues the EFF has raised, but common sense tells me that Apple won't sell certain gadgets that people really love if this exemption is approved. This is part of the periodic review of the DMCA, and the article has a link to the 9 proposed exemptions being considered now."

    Screenshot: 2.bp.blogspot.com/.../groklaw-perspective.jpg

    Is Groklaw against interoperability on the iPhone?

  • Alan Bell

    5/8/2009 5:40:36 AM |

    It is indeed a tricky issue to assess conformance and interoperability to a standard. I can think of only one way to halve the workload.
    From what I can work out it seems that wordprocessed documents are now reasonably interoperable as .odt however spreadsheets are a basket case thanks to a lack of a formula specification in ODF 1.1 and Microsoft just dumping in their own formula syntax in a new namespace rather than attempting to be interoperable. Actually I don't think they made the worst decision in this aspect, if they tweaked their formula language to be mostly like OpenOffice.org and used the same namespace there would be weird edge cases where interoperability failed (probably date handling). As it is they broke it altogether. The really bad thing they did is ripping out other people's formulas when they open and save a spreadsheet, it then saves just the values of the most recent calculation. I think that is pretty bad application behaviour, but as you point out it could still be compliant technically.
    So if odt works now and ods is a future goal then progress is being made.

    @anonymous-insider what is the point you are attempting to make? what does it have to do with document standards?

  • Alex

    5/8/2009 4:08:13 PM |

    @Alan

    Something I'd like to have more facts on is how interoperable the OO.o "convention" really is. "Maya's Wedding Planner" obviously isn't a sufficient guarantee that this (undocumented) technology is fit for purpose, safe, or consistently implemented.

    There are some problems with Microsoft's .odt implementation (as there are with other implementations also)  - I'll come to those in a bit.

  • Alan Bell

    5/9/2009 1:57:57 AM |

    I could link to the PDF, but for irony's sake here is the sc.openoffice.org/.../...tion%20of%20Functions.doc .doc file where function results have been carefully compared across various products to assess interoperability and improve compatibility. OOo makes real efforts to be interoperable.

  • Alex

    5/9/2009 3:00:57 PM |

    @anonymous-insider

    I'm not sure which is the less valuable: the tinfoil brigade's chunterings, or your commentary on it!

  • anonymous-insider

    5/10/2009 3:09:03 AM |

    I'm always baffled by the Eminence Rouge of Groklaw!

  • Alex

    5/10/2009 3:22:48 PM |

    @Alan

    That certainly looks useful -- but is a long way from offering comfort. Showing that an implementation has black box behaviour (nearly) identical to other applications doesn't guarantee accurate or predictable behaviour for this "convention". Behaviour really needs to be precisely detailed in all respects - or in other words, standardized.

    Microsoft's Doug Mahugh has posted a new piece at blogs.msdn.com/.../1-2-1.aspx - this seems to call the safety of the "convention" into some doubt...

    - Alex.

Comments are closed