Michiel Leenaars

 


Comments on DIS 29500

Michiel Leenaars


Author acronym | Clause No./Subclause No./Annex/Page | Type of comment (ge = general, te = technical, ed = editorial) | Comment (justification for change) | Proposed change | Secretariat observations on each comment submitted

MAGJL



te

New graphics formats such as DrawingML and VML are not the expertise of this particular subcommittee nor of ISO/IEC JTC 1/SC 34 and require other expertise than is available.

Any such components need to be extracted from DIS 29500 and resubmitted to ISO/IEC JTC1/SC24. As an alternative to resubmitting, existing standards with similar functionality might be normatively referenced by DIS 29500.


MAGJL

2.15.1.28
(p. 1941)

3.2.29
(p. 2698)

3.3.1.69 (p.2786)


te

Hashing algorithms for password protection and other security measures are not the expertise of this particular subcommittee nor of ISO/IEC JTC 1/SC 34.

Any such components need to be extracted from DIS 29500 and resubmitted to ISO/IEC JTC1/SC27. As an alternative to resubmitting, existing standards with similar functionality (such as ISO 10118-3) might be normatively referenced by DIS 29500.
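To illustrate what referencing an existing, independently reviewed hash standard could look like for an implementer, here is a minimal sketch. It assumes SHA-256 as the algorithm (on the assumption that it is one of the dedicated hash functions covered by ISO 10118-3); the function names and the simple salt-plus-password construction are hypothetical and are not taken from DIS 29500.

```python
import hashlib
import hmac

def hash_password(password: str, salt: bytes) -> str:
    """Hex SHA-256 digest of salt + UTF-8 password (illustrative construction only)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def verify_password(password: str, salt: bytes, expected_hex: str) -> bool:
    """Compare in constant time to avoid leaking where the digests differ."""
    return hmac.compare_digest(hash_password(password, salt), expected_hex)
```

The point of the sketch is only that a standard library call suffices once the algorithm is a normative reference; a real protection scheme would need more than a single salted digest.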


MAGJL

All


ge

The proposed compound standard is unmaintainable and unworkable due to its size and the vast differences in maturity, relevance, textual style and required area of expertise among its components (see previous comments). Procedures like the ISO fast track are simply outscaled by DIS 29500, which as a family of document formats in an area of economic importance deserves to be maintained better than a single, slow-moving blockbuster specification will allow. Assessment is currently impossible, and any significant edit of one of the minor components in its current form requires publication of a new version of the whole standard. This would create unnecessary update pressure and bring high costs as well as conformance chaos among implementers, and cause much uncertainty for users as to whether they can still safely use critical applications based on DIS 29500.

The main categorisation into WordprocessingML, SpreadsheetML and PresentationML seems undesirable, as it reflects historical application needs on the presentation level rather than trying to generically represent abstract types of document content in the broadest sense across many different applications. For example, WordprocessingML and SpreadsheetML might profit from better templating facilities such as those available in the Master elements of PresentationML. There are two separate printer settings in WordprocessingML and SpreadsheetML that should be integrated. There are separate items on drawing objects in different applications.

The context in which content is placed should be used to render it, not an application-specific tag jungle. Instead of application-centric clustering, components could be clustered around representation of text, representation of embedded media, representation of math and calculations, representation of formulae, presentation styles, container formats etc. Note that ISO/IEC 26300 can be reused as a basis for most if not all tags.

Withdraw DIS 29500 in its current form and resubmit with a lightweight document container format. There are at least two options:

The best option would be to reduce DIS 29500 in size by taking out duplicate functionality with ISO/IEC 26300 and rewriting any functionality as an extension to ISO/IEC 26300 as this standard explicitly allows new namespaces in order to be able to adopt new functionality (see section 1.5 of ISO 26300).

The alternative is to resubmit a shaved standard with all supporting MLs as separate standards in the appropriate ISO committees (see earlier remarks about the lack of expertise on security and graphics in this subcommittee). Support for existing alternatives to the supporting MLs, such as SVG, SMIL etc., should be added to give the market a choice of what formats to use and produce.



MAGJL

All


ed

Since DIS 29500 in its final form cannot be identical to the Microsoft Office 2007 binary format, and the docx/xlsx/pptx extensions used in many examples reference files produced in the latter application, a new set of file extensions will need to be created in order not to confuse the marketplace and to protect existing users of MS Office.

Invent new extensions for these application files and rename file examples throughout the specification.


MAGJL

2.3.1.8, 2.4.7,
2.4.8,
2.4.51
2.4.52,
2.8.2.16
2.8.7, 2.15.1.86 2.15.1.87, 2.18.11, etc.


te

Inclusion of bitmasks in XML components of DIS 29500 adds significant cost in complexity and performance, as well as in the area of security. Bitmasks in the spec also break functionality in existing XML tools and are a serious danger to interoperability with other XML-related standards. Additionally, many bitmasks in DIS 29500 are of fixed length (i.e. have a fixed number of bits), which means that extensibility is extremely limited.

All bitmask functionality throughout the specification should be replaced by XML alternatives, e.g. attributes.
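The proposed replacement can be sketched as follows: a hexadecimal bitmask value is expanded into one named boolean attribute per flag, which generic XML tools can then validate and query. The flag names and bit positions below are hypothetical and chosen for illustration only; they are not asserted to match any particular clause of DIS 29500.

```python
import xml.etree.ElementTree as ET

# Hypothetical flag names and bit positions, for illustration only.
FLAG_BITS = {"firstRow": 0x0020, "lastRow": 0x0040, "firstColumn": 0x0080}

def bitmask_to_attributes(element_name: str, hex_mask: str) -> ET.Element:
    """Expand a hex bitmask string into explicit true/false attributes."""
    mask = int(hex_mask, 16)
    elem = ET.Element(element_name)
    for name, bit in FLAG_BITS.items():
        elem.set(name, "true" if mask & bit else "false")
    return elem

if __name__ == "__main__":
    # 0x00A0 sets the hypothetical firstRow and firstColumn bits.
    print(ET.tostring(bitmask_to_attributes("look", "00A0"), encoding="unicode"))
```

With explicit attributes, adding a new flag is a schema change rather than a reinterpretation of a fixed number of bits.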


MAGJL

p. 10


te

Text states: preserving investment in existing files

Contrary to what is stated here, it is unclear to me how DIS 29500 can and will faithfully represent the internal meaningful structure of so-called 'complex' binary Office 95-2007 documents that have potentially many different 'streams' spread meaningfully across the binary format. The level of detail in the documentation of the proprietary Office 95-2007 binary file format (p. 21 - ) is insufficient to judge the size of this problem reliably.

Unless the statement can be publicly demonstrated by a roundtrip, it needs to be altered into: "Your documents are transformed into another document model. Depending on how you have stored your documents, there is a chance that DIS 29500 is unable to retain some characteristics of historical Office documents".


MAGJL

all



te

The division between normative and informative sections seems in some cases incorrect and would leave important gaps that will create significant implementation problems.

Where necessary change informative sections to normative.


MAGJL

all


ed

XML namespaces should not reference individual commercial stakeholders' domain names or domain names of dependent entities like schemas.openxmlformats.org.

Inconsistencies in namespace usage are hard to find because no overview of all namespace prefixes is available anywhere in the document.

Use the purl.oclc.org domain exclusively, as for other SC34 ISO standards.

Provide an overview of all namespace prefixes used in the specification.
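As an indication of how such an overview could be generated mechanically from the markup examples and schemas, here is a minimal sketch that lists every namespace prefix declared in an XML document. The function name and the example.org URIs used when trying it out are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def collect_namespaces(xml_text: str) -> list:
    """Return a sorted list of (prefix, uri) pairs declared anywhere in the document."""
    seen = set()
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events are reported as (prefix, uri) tuples.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        seen.add((prefix, uri))
    return sorted(seen)
```

Running this over all parts of a package and merging the results would expose prefixes that are bound to different URIs in different places.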


MAGJL

p.3032


ge

Unclear reference:

For more information on balanced hierarchies, see the documentation provided for your OLAP server.

This introduces a risk of misalignment between applications due to differing vendor definitions.

Specify exactly what balanced hierarchies mean in this context.


MAGJL

15.2.14 (p. 166)


ge

Use of binary content elements, e.g. DEVMODE containing binary printer software, poses an unacceptable security threat across all platforms (risk of a malicious payload being smuggled in, either loaded directly by the application or through some form of scripting), and it is not compliant with the platform-agnostic nature of the spec.

Shared ownership of operating system and DIS 29500 compliant producer software is not going to be the common denominator, which may have legal consequences for the ability to include software code in documents.

The limitation to only allow one printer setting in SpreadsheetML will frustrate collaboration across platforms.

Delete all binary content elements and replace them with a textual alternative or other reliable mechanism, similar to the XML schema for network device configuration registered by Microsoft at the European Patent Office (EPO) under EP1555789 - 2005-07-20.

Allow multiple printer settings to be included for cross-platform interoperability and roundtrip purposes.


MAGJL

12.3.5


te

Again, an unspecified binary part that cannot be checked by security tools implies a risk of a malicious payload being smuggled in.

Delete.


MAGJL

3.11.1.25


te

undo (Undo) is an application level run-time issue. What is the need for it in this spec? Why does it need to be addressed in the specification of a file format?

Specify or delete.


MAGJL

3.11.2.2


te

3.11.2.2 users (User List) is an application level run-time issue and need not be addressed in the specification of a file format.

Delete


MAGJL

3.13.8


te

No proper syntax is provided for linking to remote file systems and methods, although it is stipulated elsewhere that both 'relative and absolute references' are valid. The given example (c:\source.xlsx) is relevant only for accessing the local disk on the Microsoft Windows platform, and would fail in many other important (cross-platform) areas including non-Windows MS Office implementations.

Specify.
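The platform dependence of the given example can be shown with a short sketch that interprets the same reference string under Windows and POSIX path rules. The helper name is hypothetical, and the second reference used when trying it out is an invented counterexample.

```python
from pathlib import PurePosixPath, PureWindowsPath

def describe_reference(ref: str) -> dict:
    """Report whether a link target is an absolute path under each convention."""
    return {
        "absolute_on_windows": PureWindowsPath(ref).is_absolute(),
        "absolute_on_posix": PurePosixPath(ref).is_absolute(),
    }

if __name__ == "__main__":
    for ref in (r"c:\source.xlsx", "/home/user/source.xlsx"):
        print(ref, describe_reference(ref))
```

The drive-letter form is absolute only under Windows rules, and the POSIX form only under POSIX rules, which is why a platform-neutral reference syntax needs to be specified.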


MAGJL

p. 1, all


ed

Since the file format described by DIS 29500 is not actually just XML but a binary container file format including an internal file structure with interdependencies, the proposed 'friendly' name Office Open XML does not accurately describe the technical status of the proposed standard. Also, the proposed document format is not limited to use in office environments but is intended to be used by private persons as well.

Change 'friendly' name of ISO specification to Document Binary Container Format.


MAGJL

all


ge

Review possible issues with ISO Technical Report 9573:1988 and ISO 12083:1993 "Electronic manuscript preparation and markup".

Various.


MAGJL

4.1, p.548, 549

4.1.2.8

4.1.3

ed

Text states

"There is a set of utilities that facilitate the storage of customer XML data within the file format. Although a topic for a separate paper, essentially, this functionality comes down to the ability to store customer-defined XML in the file format in a way that it can be easily queried, modified and/or surfaced in the presentation. Suffice it to say, the data is stored in a separate part within the package, and hence the utility pairs the object using it with the part within the package."

It is totally unclear what this text means by utilities. The reference to 'papers' recurs on page 549 ('In addition to structural and presentation-level data defined by this schema, there are also definitions for handling customer data and future extensibility. Again, both of these will be addressed in additional papers.'). The text seems to have been imported from internal documentation.

Specify how and where customer XML data can be stored in the file format for a presentation to be able to easily query, modify and/or surface this data. Supply additional information as promised. Provide normative reference to which XML version is intended. Clean up wrong statements.


MAGJL

?


te

It is seemingly not specified in the specification where and how scripting is embedded in the ZIP container file, which is necessary for interoperable implementations of DIS 29500. Deletion is probably not in line with the need to be compatible with legacy formats.

Add specification.


MAGJL

15.2.8 (p. 155)


te

Control elements should be defined normatively and fully, not imported from specific versions of operating environments' APIs.



MAGJL

4.1.4.1


te

DIS 29500 fails to reference the proper W3C recommendations and IETF RFCs for HTML/MHTML rendering and instead stipulates the use of unspecified 'web browser generations':

"Indirectly, the HTML Publish properties can prime the Web Properties by defining a target web browser generation (i.e., third, fourth or third and fourth). This is done by setting the appropriate ST_HtmlPublishWebBrowserSupport attribute".

MHTML is limited in scope to email applications and does not at all conform to any web browser generation.

Translate to support for versions of actual W3C standards such as HTML 3.2, HTML 4.01 and different CSS versions, and for RFC 1942 and RFC 2557.


MAGJL

4.1.4.1.1


te

The suggested publication of a presentation on the web in frames is deprecated behaviour, as content is likely to be reached mid-way through search engines like Google. This will pose accessibility problems.

Find alternative presentation that is not troubled by accessibility problems.


MAGJL

4.3.3, p. 564


te

Comment Author List contains an unnecessary component called Color Index (clrIdx). This defines an integer into a color table that is used to provide the solid background fill for the comment shape. The utility that this provides is that all of the comments by a particular author share the same color.

Should be replaced by a proper color or color name, or be deleted.


MAGJL

4.3.3, p. 564


te

Last Index (lastIdx) mirrors data that is implicit in the text already. It documents how many comments an associated author has made in a presentation. This value needs to be updated too much, is error-prone and does not belong in the spec. It can be replaced by application level functionality.

Delete.
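That the value is derivable rather than something that must be stored can be sketched as follows: the highest comment index per author is computed directly from the comments themselves. The markup below is a simplified, namespace-free stand-in with hypothetical element and attribute names, not the exact PresentationML schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, hypothetical comment list: one element per comment.
SAMPLE = """<cmLst>
  <cm authorId="0" idx="1"/>
  <cm authorId="0" idx="2"/>
  <cm authorId="1" idx="1"/>
</cmLst>"""

def last_index_per_author(xml_text: str) -> dict:
    """Derive the highest comment index for each author from the comments."""
    last = defaultdict(int)
    for cm in ET.fromstring(xml_text).iter("cm"):
        author = cm.get("authorId")
        last[author] = max(last[author], int(cm.get("idx")))
    return dict(last)
```

An application can compute this on load, so no stored counter has to be kept in sync with the comment data.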


MAGJL

5.1.5, p. 578


te

'Legacy drawings' break compatibility with other consumers and producers than the generating application. "Compatibility deals with the notion of legacy drawings. Legacy drawings are objects that were supported by previous versions of a generating application, but are no longer provided as an option. In order to store these drawing objects correctly, we introduce the notion of legacy drawing compatibility. This allows for the specification of information used to identify this legacy object and thus allow for full rendering support within current versions of the generating application."

Fully specify legacy drawing formats.


MAGJL

5.1.5, p. 578


te

There seems to be no proper specification of what the locked canvas actually does or looks like in DIS 29500, where it is created and how it can be unlocked - and thus leaves it open under what conditions other consumers and producers will have problems accessing said content. It is unclear from the description of behaviour what the relevance of this is in the current specification as it seems an application level issue; yet it might pose significant problems for interoperability (also with current installed base) and should be solved.

Locked Canvas is a minor topic that is similar to compatibility in that it is used to render drawing objects that would otherwise not be recognized due to a lack of information. Locked Canvas, however, goes in the opposite direction from compatibility, and deals with objects that have been created and saved in the current version of a generating application and are being opened in a previous version of the generating application. The locked canvas element acts as a container for more advanced drawing objects. The notion of a locked canvas comes from the fact that the generating application opening the file cannot create this object and thus cannot perform edits either. Thus, the drawing object is locked from all UI adjustments that would normally take place.

Further describe or delete.


MAGJL

9.1.8, p. 27


te

Unnecessary future incompatibility risk:

ZIP archive items that do not conform to OPC part naming guidelines or are not associated with a content type shall not be allowed in an Office Open XML document, with the exception of items specifically defined by Part 2: "Open Packaging Conventions" and trash items.

Add a mechanism to add new features at ZIP item level in future specs without breaking DIS 29500.
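To make the rule under discussion concrete, here is a minimal sketch that lists the ZIP items in a package that have no associated content type, which is the class of items a future extension would fall into. It assumes the usual OPC layout with a [Content_Types].xml item containing Default and Override entries; it ignores trash items and the other exceptions defined by Part 2, and the sample item names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed content-types namespace as used in OPC packages.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

def items_without_content_type(zip_bytes: bytes) -> list:
    """Return ZIP item names matched by neither a Default nor an Override entry."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
        types = ET.fromstring(zf.read("[Content_Types].xml"))
    extensions = {d.get("Extension").lower() for d in types.iter(CT_NS + "Default")}
    overrides = {o.get("PartName") for o in types.iter(CT_NS + "Override")}
    unknown = []
    for name in names:
        if name == "[Content_Types].xml":
            continue
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if "/" + name not in overrides and ext not in extensions:
            unknown.append(name)
    return unknown

def build_sample() -> bytes:
    """Build a tiny in-memory package with one hypothetical future item."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml",
                    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
                    '<Default Extension="xml" ContentType="application/xml"/></Types>')
        zf.writestr("word/document.xml", "<doc/>")
        zf.writestr("future/feature.bin", b"\x00")
    return buf.getvalue()
```

Under the quoted rule, a consumer that finds any item in the returned list would have to treat the document as non-conformant, which is the future incompatibility risk raised above.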


MAGJL

8.3 (p. 769)


ed + te

Future-proofing does not take a multiple-vendor scenario into account. The text references a single (fictitious) PresentationML consumer/producer called PML, which makes too many presumptions about vendor control. A 2003 version of PML is used, which is irrelevant to future-proofing as no DIS 29500 existed at the time.

Rewrite the text to properly reflect a multiple-implementation scenario.


MAGJL

p. 812, 813


ed

Apparently some missing characters and elements (PDF version of ECMA 376).

?


MAGJL

p. 1950


ed

Table contains improper wording that may break future-proofing.

"Undefined. Shall not be used."

Replace with "reserved".


MAGJL

p. 823


ed

Trailing " and > missing in example:

<w:bottom … w:space="24/>
</w:pgBorders



MAGJL

p. 1966


ed

Incomplete sentence.

Consider a WordprocessingML document which should no visual indication of form fields.

Complete.


MAGJL

2.15.1.40 and elsewhere


ed + te

Track changes does not belong to WordprocessingML parts alone, and therefore the described behaviour and definition of elements like doNotTrackMoves deserve a separate supporting ML.

Move up to a higher level by itself.


MAGJL

p. 169


te

XML content without proper namespace:

The root element for a part of this content type shall be xml in the null namespace, encapsulating an arbitrary amount of VML markup as defined by this Standard.

Since this part will be moved out of the spec, add a namespace before resubmission to ISO/IEC JTC1/SC24.


MAGJL

Annexes


ge

Review period should be extended in order to review (very sizeable and complex) annexes that were initially missing as ECMA staff had not submitted them to ISO.

Extend final voting procedure for three months.


MAGJL

many


ge + te + ed

Many of the answers in the Response Document for Fast Track Ballot of ISO/IEC DIS 29500 (ECMA-376) point towards resolution in the five month ballot period. Many answers are also generally very unsatisfactory and in the light of the five month review procedure need to be discussed more extensively - especially the parts about legal issues, presence of non-ISO-conformant elements etc.

T.b.d.


MAGJL

all


ge

Due to the limited time available in the fast track procedure, regrettably only a small subset of the spec could be reviewed. However, a survey of similar (openly documented) review efforts by other national standards bodies, such as those of the UK, USA, Czech Republic and India, as well as community efforts like Grokdoc, has provided the insight that many hundreds if not thousands of other legitimate technical and editorial issues exist, many of them in important areas and certainly complementary to the comments raised by us so far. In order to make DIS 29500 a successful candidate we need to make sure that all significant comments are submitted to ISO, but due to its ailing consensus procedure one cannot be certain of this until the last minute. Since these reports are in most cases already publicly available on the web, and I do not want to repeat all of them under my own name as this would burden the Netherlands mirror committee and ISO secretariat unnecessarily at this point, I would like to reserve the right to use the comments available in these resources at the moment of writing in the Netherlands discussion where necessary. If necessary I can specify to which sources this proviso would apply.

T.b.d.

