What InDesign Cannot Do (or Do Well) with XML

The 1:1 Import Conundrum

It's fairly natural to expect that you could use one piece of XML data in multiple places in an InDesign layout—but that's not at all the way that InDesign works. Once you've imported XML, there is a one-to-one correspondence between the elements in the Structure view and their expression in the layout. If you want an element to appear multiple times, you've got to duplicate the element for each appearance on a document page. (Obviously, you can get around this in some cases by placing the XML element on a master page.)

Olav Martin Kvern and David Blatner, Real World Adobe InDesign CS4

As the quote above states (and this is still true for InDesign versions up to CS5), InDesign expects you to import one XML file to fill one content area (text flow) in your document. This runs contrary to the spirit of XML, which is all about reusing content in multiple documents and in multiple ways. For example, you might want a standard warning, copyright notice, or other block of content to appear in many places in a single document or in a set of documents collected as a book. However, you cannot drag the same piece of structure from the Structure pane into multiple locations in an InDesign document. If you drag an element into the layout a second time, InDesign removes it from its first location in the layout.

Bad Characters

Note regarding CS2 only: InDesign CS2 XML export controls are more limited than those in InDesign CS3 and later. CS2 does not have the Remap Break, Whitespace and Special Characters option. As a consequence, the XML that you generate from InDesign CS2 may contain characters that are used in publishing applications but are problematic in XML processing. Chief among these are the characters that make paragraph breaks and manual line breaks in the text layout. XML doesn't use these types of characters, and depending on the processes you run after exporting XML from InDesign, you may have to clean up the XML to remove them.

Example 14. Unwanted characters (square) in XML exported from InDesign CS2

☐Course offerings subject to change. Please check the web site for current course offerings.☐
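If the squares turn out to be Unicode separator characters, a small script run after export can remove them. The following is a minimal sketch, not a definitive fix: it assumes the unwanted characters are U+2029 (paragraph separator) and U+2028 (line separator), which you should confirm by inspecting your own exported file. The function name and the `<note>` element are hypothetical.

```python
# Assumption: the "square" characters are the Unicode paragraph separator
# (U+2029) and line separator (U+2028). Check your own export to confirm.
LAYOUT_BREAKS = "\u2028\u2029"


def strip_layout_breaks(xml_text: str) -> str:
    """Return xml_text with the assumed layout break characters removed."""
    # str.maketrans with a third argument maps each listed character to None,
    # so translate() deletes it.
    return xml_text.translate(str.maketrans("", "", LAYOUT_BREAKS))


# Hypothetical exported fragment modeled on the example above.
exported = "<note>\u2029Course offerings subject to change.\u2029</note>"
cleaned = strip_layout_breaks(exported)
```

If a break should survive as a space rather than disappear, replace the characters with `" "` instead of deleting them.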

Some typographic controls may generate characters even in CS3-5 that are not XML-compatible. Adobe warns about this in the Help section about exporting XML: "Not all characters are supported in XML (such as the Automatic Page Number character). InDesign warns you if it cannot include a character in the XML file."

In all versions of InDesign CS: Related to the "bad characters" export problem above is the issue of imported XML that might contain tabs, spaces, and line breaks. Often this is seen in applications that "pretty print" XML files with indents and coloring to make them easier to read. For example, I use XML Spy by Altova and the oXygen XML Editor by Syncro Soft. When pretty-printed XML created in XML Spy is used for import, the extra tabs, spaces, and line breaks create unwanted effects in the layout.

To get a clean import, it is sometimes necessary to remove the tabs and spaces in a text editor, to experiment with the whitespace controls in the import dialog (Do not import contents of whitespace-only elements, in CS3 and later), or to run an XSL transformation that removes line endings and tabs from the XML before importing it into InDesign.
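As an alternative to hand editing or XSLT, the indentation can be stripped with a short script before import. This is a sketch using Python's standard `xml.etree.ElementTree` module; the element names in the sample are hypothetical. It removes only text that consists entirely of whitespace, so it is best suited to data-style XML. In mixed content, a lone space between two inline elements would also be removed, and ElementTree does not carry a DOCTYPE declaration or comments through to the output.

```python
import xml.etree.ElementTree as ET


def strip_indentation(xml_text: str) -> str:
    """Drop whitespace-only text between elements and return compact XML."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # elem.text is the text before the first child element;
        # elem.tail is the text after the element's closing tag.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    return ET.tostring(root, encoding="unicode")


# Hypothetical pretty-printed input.
pretty = """<course>
    <title>XML Basics</title>
    <note>Subject to change.</note>
</course>"""
compact = strip_indentation(pretty)
```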

Inscrutable Errors, Messages, and Crashes

The Devilish DTD suggestions

Are you missing a required attribute? Have you forgotten to put a required element in your structure? The validation window at the bottom of the structure pane will tell you the sad story of your incompetence with the DTD, but the suggestions it offers won't always tell you enough about fixing the problem—see the section on Validating XML in InDesign for more information.

Exporting from the element with the included DTD will not be valid

Several times, when I had a DTD included in the XML that I was exporting and checked the box to include the DTD declaration on export, I saw a message that the XML I was exporting was not going to be valid against the DTD. It seemed to me at the time that the message was bogus, as I had validated the content with the DTD before export. I opened the exported content in XML Spy to check it, and found that there was some kind of invisible (line break) character in the XML between elements. When I switched to EditPlus and looked at the same file, I saw square box characters in these places in the XML file. I had to do a search and replace on that character to get an XML file that would validate in XML Spy. This is related to the issue described in the topic "Bad Characters."
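Before doing a search and replace, it helps to know exactly which character is hiding between the elements. The sketch below lists control, format, and separator characters along with their positions and Unicode names. Which Unicode categories count as "suspect" is my assumption, and the function name and sample string are hypothetical.

```python
import unicodedata

# Assumption: characters in these Unicode categories are the likely culprits:
# Cc (control), Cf (format), Zl (line separator), Zp (paragraph separator).
SUSPECT_CATEGORIES = {"Cc", "Cf", "Zl", "Zp"}


def find_suspect_characters(xml_text: str):
    """Return (offset, code point, name) for each suspect character."""
    hits = []
    for offset, ch in enumerate(xml_text):
        if ch in "\t\n\r":
            continue  # ordinary whitespace that XML allows
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            name = unicodedata.name(ch, "UNNAMED")
            hits.append((offset, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical export with a paragraph separator between two elements.
sample = "<a>x</a>\u2029<b>y</b>"
report = find_suspect_characters(sample)
```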

Don't make InDesign "think" too hard on import or export with XSL

It seemed to me that the most likely times to make InDesign crash were if I tried to get too fancy with my XSLT. I am accustomed to being able to sort, filter, wrap and unwrap elements, make substring operations on text in elements, and other tricks of the XSL trade. If I used these types of functions in XSLT that I was using when importing or exporting XML with InDesign, sometimes it didn't work, and sometimes it froze the application. See Advanced Topics: Transforming XML with XSL. My recommendation, if you need to do a lot of fancy manipulation of your XML, would be to use XSLT as a pre- or post-processing step external to InDesign.

InDesign CS5: XML Structure option for exporting XHTML

This aspect of exporting XML relates to CS5 only. Related content is under the section Exporting XHTML when XML is in your InDesign file.

When you tag content in InDesign, the order of the elements can be seen in the Structure pane. If you first tag a heading as <title>, then the author's name as <byline>, then you jump across the page spread and tag a text frame as <sidebar>, then you return to the main story and tag it and its paragraphs with various XML tags, that will be the XML order of the content. In other words, you do not have to be in a top-to-bottom and left-to-right sequence of tagging the content with XML.

In InDesign, if you check "Base On Page Layout" when exporting XHTML, the sequence of the content is determined by the X-Y coordinate system starting with the upper-left corner of a page, then to the next page of the spread containing the XML, and so on. If you tagged XML content in exactly that order, this setting will work fine, as long as the required sequence of elements in the DTD for your XML and the order of the XML content in InDesign are the same.

If you check "Same As XML Structure", then your XML will be exported in the sequence it was tagged. Again, the content may or may not be valid, depending on whether that order matches your DTD, if you use a DTD.

In most cases, using the "Same As XML Structure" option will provide a closer match to your DTD.

You can reorder XML structure to match the DTD in use in InDesign (or a different DTD) with an XSLT file applied during export. For example, if you need the <title> element always to be the first element inside your root element, then the XSLT used on export can process the <title> element first, then go on to process other elements in the order that you need. XSLT can also rename, group, sort, wrap, and unwrap XML elements, add or rename attributes, and add a reference to a DTD or schema to your exported XML. Sometimes you will need to chain together several XSLT transformations to get the final XML that you want. See Advanced Topics: Transforming XML with XSL.
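If you would rather reorder the elements outside InDesign, the same idea can be sketched as a post-processing script. Note that this is Python rather than XSLT, and it only illustrates moving <title> to the front of the root element. The root element name `article`, the sample content, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET


def move_first(root: ET.Element, tag: str) -> None:
    """Move the first direct child named `tag` to the front of `root`."""
    child = root.find(tag)
    if child is not None:
        root.remove(child)
        root.insert(0, child)


# Hypothetical export in which <title> was tagged last.
tagged_order = (
    "<article><byline>A. Writer</byline>"
    "<sidebar>Aside</sidebar><title>Headline</title></article>"
)
article = ET.fromstring(tagged_order)
move_first(article, "title")
reordered = ET.tostring(article, encoding="unicode")
```

All other children keep their relative order, so a DTD that requires <title> first and leaves the rest unconstrained would be satisfied by this one move.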
