Microsoft Corporation

November 2003

© 2003 Microsoft Corp. All rights reserved.

The information contained in this document represents the current view of Microsoft Corp. on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise), or for any purpose, without the express written permission of Microsoft Corp.

Microsoft may have patents, patent applications, trademarks, copyrights or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights or other intellectual property.

Microsoft, ActiveX, Outlook, and Visual Basic are either registered trademarks or trademarks of Microsoft Corp. in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Microsoft Corp., One Microsoft Way, Redmond, WA 98052-6399, USA

Introduction

This document describes the elements in the WordprocessingML Schema that are important to document developers and to application developers whose programs will read and write WordprocessingML documents.
The text assumes that you have a basic understanding of XML 1.0, XML namespaces, and the functionality of Microsoft® Office Word. Each major section of this document introduces new features of the language and describes those features in the context of concrete examples. In this document, you'll see how to:
Following this introduction to WordprocessingML is a reference to the WordprocessingML elements that are most useful to developers.

Structure of This Document

After an initial overview of WordprocessingML and document-level properties and information, this white paper looks at WordprocessingML topics in the order that developers will presumably need them. This structure means that some elements are not discussed in detail in one location. For instance, the documentProperties element contains elements that affect how fields and headers are handled. As a result, the child elements of the documentProperties element are discussed in two different places in the document.

Section 1: WordprocessingML Overview

Top-Level Elements, Namespace, Basic Document Structure

The top-level elements in a WordprocessingML document are:
However, the simplest Word document consists of just five elements (and a single namespace). The five elements are:
The namespace for the root WordprocessingML Schema (also known as the XML Document 2003 Schema) is "http://schemas.microsoft.com/office/word/2003/wordml". This namespace is normally associated with the WordprocessingML elements by using a prefix of "w". The simplest possible WordprocessingML document looks like this:
<?xml version="1.0"?>
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
</w:wordDocument>
In Figure 1, you can see the resulting document, displayed in Microsoft Word.
Figure 1. A WordprocessingML document in Microsoft Word

Tying the Document to Microsoft Word

If you save a Word document with the .xml extension, Windows will treat the file like any other XML file. Double-clicking the file, for instance, will open it in the standard XML processor (usually Microsoft Internet Explorer). However, adding the mso-application processing instruction, as in the following example, causes Windows to open the file in Word instead:
<?xml version="1.0"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
</w:wordDocument>
A side effect of this automatic behavior, however, is that it prevents Internet Explorer from displaying the XML markup of XML files saved by Word. You can temporarily disable this behavior by deleting the following registry entry and value:

Word.Document = "application/msword"

from the following subkey:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\11.0\Common\Filter\text/xml

Section 2: Adding Text to the Document

The document's content is held in the body element. Text within the body element is kept in a nested set of three elements: t (a piece of text), r (a run of text within a paragraph), and p (a paragraph).

The t (Text), r (Run), and p (Paragraph) Elements

The lowest level of this hierarchy is the t element, which is the container for the text that makes up the document's content. You can put as much text as you want in a t element, up to and including all of your document's content. However, in most WordprocessingML documents, long runs of text will be broken up into paragraphs and strings with different formats, or be interrupted by line breaks, graphics, tables, and other items in a Word document.

A t element must be enclosed by an r element (a run of text). An r element can contain multiple occurrences of t elements, among other elements. The r element allows the WordprocessingML author to combine breaks, styles, and other components but apply the same characteristics to all the parts of the run.

All of the elements inside an r element have their properties controlled by the rPr element (for run properties), which is the first child of the r element. The rPr element, in turn, is a container for a set of property elements that are applied to the rest of the children of the r element. The elements inside the rPr container element allow you to control, among other options, whether the text in the following t elements is bold, underlined, or visible.
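For application developers who generate documents from code, the p, r, and t nesting can be produced with any XML library. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the function name build_minimal_document is hypothetical and is not part of WordprocessingML. It writes one paragraph, run, and text element per input string and prepends the mso-application processing instruction shown earlier.

```python
import xml.etree.ElementTree as ET

# Namespace of the root WordprocessingML schema, as given in Section 1.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def build_minimal_document(paragraphs):
    """Return WordprocessingML text with one p/r/t chain per input string.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("w", W_NS)  # serialize with the usual "w" prefix
    root = ET.Element("{%s}wordDocument" % W_NS)
    body = ET.SubElement(root, "{%s}body" % W_NS)
    for text in paragraphs:
        p = ET.SubElement(body, "{%s}p" % W_NS)
        r = ET.SubElement(p, "{%s}r" % W_NS)
        t = ET.SubElement(r, "{%s}t" % W_NS)
        t.text = text
    markup = ET.tostring(root, encoding="unicode")
    # The processing instruction asks Windows to open the file in Word.
    return (
        '<?xml version="1.0"?>\n'
        '<?mso-application progid="Word.Document"?>\n' + markup
    )


document = build_minimal_document(["Hello, World.", "How are you, today?"])
print(document)
```

Saving the returned string to a file with the .xml extension should give a document equivalent to the hand-written samples, although the sketch emits the markup on a single line rather than indented.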
Sections

In a WordprocessingML document, the layout of the page that your text appears in is controlled by the properties for that section of the document. However, there is no container element for sections in WordprocessingML. Instead, the information about a section is kept inside a sectPr (section properties) element that appears at the end of each section. Though a sectPr element isn't necessary in a WordprocessingML document, Word always inserts a sectPr element at the end of any new document that it creates. Here is a typical sectPr element generated by Word when a document is created:
<w:sectPr>
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:line-pitch="360"/>
</w:sectPr>
When new sections are added to a WordprocessingML document, the new sectPr elements must appear inside pPr elements (which are discussed later) inside p elements. This example shows a sectPr element added to a document to mark the end of a section:
<w:p>
<w:pPr>
<w:sectPr>
<w:pgSz w:w="6120" w:h="7420" />
<w:pgMar w:top="720" w:right="720" w:bottom="720"
w:left="720" w:header="0" w:footer="0"
w:gutter="0" />
</w:sectPr>
</w:pPr>
</w:p>
Each sectPr element marks the end of a section and the start of a new section. The child elements of the sectPr element provide the definition of the section just ended. All the child elements for the sectPr element are listed in Table 4.

While WordprocessingML does not have a container for sections, Word does generate sect elements that act as containers for sections. These are not part of WordprocessingML but belong to the Auxiliary XML Document 2003 namespace ("http://schemas.microsoft.com/office/word/2003/auxHint"). The sect elements (and other auxiliary elements) are discussed later in this document.

Organizing Text

The following example has multiple t elements inside an r element (for the following examples, only the body element and its children are shown):
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
<w:t> How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
Although this document is valid, duplicating the t element isn't necessary. The following example gives the same result as the previous one:
<w:body>
<w:p>
<w:r>
<w:t>Hello, World. How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
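A program that reads WordprocessingML usually has to put the text back together, because one paragraph may be split across several r and t elements. The sketch below assumes Python's standard xml.etree.ElementTree module; the function name paragraph_texts is hypothetical. It concatenates the t elements of each paragraph, so both forms of the example above produce the same string.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W_NS}
W = "{%s}" % W_NS

SAMPLE = """<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:body>
    <w:p>
      <w:r>
        <w:t>Hello, World.</w:t>
        <w:t> How are you, today?</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:wordDocument>"""


def paragraph_texts(xml_text):
    """Return one string per p element, joining the text of its r/t children.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    texts = []
    # iter() also finds paragraphs that Word has wrapped in auxiliary
    # sect containers, which sit between body and p.
    for p in root.iter(W + "p"):
        pieces = [t.text or "" for t in p.findall("w:r/w:t", NS)]
        texts.append("".join(pieces))
    return texts


print(paragraph_texts(SAMPLE))  # ['Hello, World. How are you, today?']
```

The sketch ignores everything except t elements, so breaks and tabs inside a run are dropped; a real reader would need to decide how to represent them.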
Inserting Breaks

Typically, if you have multiple t elements in an r element, it's because you need to insert some other element in between the pieces of text. In the following example, a br element appears between the two t elements. The br element will force the second t element to a new line when the text is displayed in Word:
<w:body>
<w:p>
<w:r>
<w:t>Hello, World. </w:t>
<w:br w:type="text-wrapping"/>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
The br element's type attribute allows you to specify the kind of break ("page", "column", "text-wrapping"). Because the default is "text-wrapping" (a new line), the type attribute in the previous example could have been omitted. Figure 2 shows the results of using a br element between t elements.

Figure 2. A Word document with a br element between t elements

Creating Paragraphs

You use p elements to define new paragraphs. (A br element with text-wrapping is equivalent to the "soft break" in Word that's created by pressing SHIFT+ENTER, and it doesn't start a new paragraph.) A WordprocessingML document with text in two separate paragraphs would look like this:
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
<w:p>
<w:r>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
The resulting document can be seen in Figure 3. As a comparison of Figures 2 and 3 shows, depending on your formatting options, the difference between using br elements and p elements may not be visible. The display of a WordprocessingML document in Word may not reveal the underlying structure of the document.

Figure 3. A Word document with multiple p elements

Tabs

The tab element allows you to position text horizontally on a line. Tab elements move the following text to the next tab stop. Exactly where on the line that will be depends on how tab stops are defined in the document. In this example, the text will appear on a single line but with each t element's text positioned at a separate tab stop:
<w:p>
<w:r>
<w:tab/>
<w:t>Hello, World.</w:t>
</w:r>
<w:r>
<w:tab/>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
Tab stops are defined in the pPr element, which is also a child of the p element. Within the pPr element, you can set the tab stops for the paragraph by using tab elements within the tabs element. Three attributes on the tab element define the tab stop:
For example, this paragraph has three tab stops at 1 inch (1,440 twips), 3 inches (4,320 twips), and 5 inches (7,200 twips), with each tab stop being a different type. In the example, the two tab elements before the t element move the text to the second tab stop:
<w:p>
<w:pPr>
<w:tabs>
<w:tab w:val="center" w:pos="1440"/>
<w:tab w:val="left" w:pos="4320"/>
<w:tab w:val="decimal" w:pos="7200"/>
</w:tabs>
</w:pPr>
<w:r>
<w:tab/>
<w:tab/>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
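The positions in these examples are given in twips, with 1,440 twips to the inch (a twip is one twentieth of a point, and there are 72 points to an inch). When tab stops are generated from code, it can be easier to work in inches and convert at the last moment. The following is a small sketch in Python; the helper names inches_to_twips, twips_to_inches, and tab_stop are hypothetical.

```python
TWIPS_PER_INCH = 1440  # 72 points per inch, 20 twips per point


def inches_to_twips(inches):
    """Convert a measurement in inches to whole twips."""
    return round(inches * TWIPS_PER_INCH)


def twips_to_inches(twips):
    """Convert a measurement in twips to inches."""
    return twips / TWIPS_PER_INCH


def tab_stop(kind, inches):
    """Format one tab element for a tabs container (hypothetical helper)."""
    return '<w:tab w:val="%s" w:pos="%d"/>' % (kind, inches_to_twips(inches))


# Reproduce the three tab stops from the example above.
for kind, inches in [("center", 1), ("left", 3), ("decimal", 5)]:
    print(tab_stop(kind, inches))
```

The same conversion applies to the page sizes and margins in sectPr elements, which are also expressed in twips.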
Tabs function only if the xml:space attribute is present on the wordDocument element and is set to "preserve":
<w:wordDocument xml:space="preserve">
Table 7 lists the attributes for the tab element and the options that you can use.

Section 3: Formatting Text

The most powerful formatting tool discussed in this section is WordprocessingML styles. Although you can format your document by setting individual properties at the paragraph and run level, this approach may not be your best choice. If you're doing more than setting bold, underline, or italics for a single run, using styles to format your document makes it easier to manage the appearance of your document.

Formatting Runs of Text

The rPr (run property) element is a container that holds the property elements that define how a run is to be treated by Word. You can have multiple rPr elements within an r element. Table 2 lists all of the elements that can be included inside the rPr element with their description (taken from the WordprocessingML schema).

Most of the children of the rPr element have a single val attribute that is limited to a specific set of values. For instance, the b (bold) element causes the text that follows it to be bold when the b element has its val attribute set to "on". In this example, both "Hello, World." and "How are you, today?" will be bold because both sets of text are in the same run and follow the rPr element with the b element (note that the prefix "w:" on the val attribute is not optional):
<w:r>
<w:rPr>
<w:b w:val="on"/>
</w:rPr>
<w:t>Hello, World.</w:t>
<w:br/>
<w:t>How are you, today?</w:t>
</w:r>
Figure 4 shows the result of this change.
Figure 4. Text in an r element with the b element used in the rPr element to turn on bold formatting

If the val attribute isn't present for the b element, it defaults to "on". So the element <w:b/> is equivalent to the element <w:b w:val="on"/>. You can also use the b element to suppress bold formatting like this:
<w:r>
<w:rPr>
<w:b />
</w:rPr>
<w:t>Hello, World.</w:t>
<w:rPr>
<w:b w:val="off"/>
</w:rPr>
<w:t>How are you, today?</w:t>
</w:r>
While most children of the rPr element use just the val attribute, there are exceptions (the asianLayout property, for instance, takes several attributes). Table 2 provides the values for the val attribute for each of the rPr properties, provided that the list of values is short. Where the element has multiple attributes, doesn't use the val attribute, or has a large number of values, the table gives the name of the type definition in the WordprocessingML schema that describes the element.

For example, the u (underline) element uses the val attribute but offers more choices than "on" and "off". This example gives the text a single, continuous underline (other options include "words", "double", and "thick"):
<w:r>
<w:rPr>
<w:u w:val="single"/>
</w:rPr>
<w:t>How are you today?</w:t>
</w:r>
The result appears in Figure 5.
Figure 5. Applying the u element to text

Formatting Paragraphs

The pPr element defines the properties for a paragraph. Table 3 lists the permitted child elements. For example, within the pPr element, the jc element is used to control the paragraph's alignment. In this document, the text in the paragraph will be centered (see Figure 6):
<w:p>
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
<w:br/>
</w:r>
<w:r>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
Figure 6. Centered text

Styles

Styles allow you to create a group of style properties that can be applied as a unit either to individual paragraphs (within the pPr element) or runs (within the rPr element). Styles reduce the amount of WordprocessingML text that you have to produce and the amount of work required to make changes to your document's appearance. With styles, changing the appearance of all the pieces of text that share a common style has to be done in only one place: the style definition.

Using Styles

The pStyle element inside the pPr element specifies which style is to be used for all runs in the paragraph. In the rPr elements, the rStyle element specifies the style for individual runs. The text inside the t elements will reflect a merging of the styles set at the pPr level and at the rPr level. There are no child elements in common between the pPr and rPr elements, so merging the two property sets is straightforward. In this example:
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="MyStyle"/>
</w:pPr>
<w:r>
<w:rPr>
<w:rStyle w:val="MyFirstRunStyle"/>
</w:rPr>
<w:t>Hello, World.</w:t>
</w:r>
<w:r>
<w:rPr>
<w:rStyle w:val="MySecondRunStyle"/>
</w:rPr>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
Defining Styles

Styles are defined in the WordprocessingML styles element, which is a top-level element under the wordDocument element. Within the styles element, each style element defines a single style. A style element is a container element for elements that define the style (all the children for the style element are listed in Table 6). The style element itself takes three attributes: type, styleId, and default.

The type attribute allows you to indicate what kind of style you're defining: paragraph, character, table, or list. Styles used in the pPr element must be paragraph styles; styles in the rPr element must be character styles.

The styleId attribute gives your style the name that you use to invoke the style in your WordprocessingML document.

When the default attribute is set to "on", it indicates that this style is the default style for a particular type of style: paragraph, character, table, or list.

In the following example, three styles are defined:
<w:styles>
<w:style w:type="paragraph" w:styleId="MyParagraphStyle" w:default="on"/>
<w:style w:type="paragraph" w:styleId="AnotherParagraph" w:default="off"/>
<w:style w:type="character" w:styleId="EmphasisStyle" w:default="off"/>
</w:styles>
The following sample applies those styles. "AnotherParagraph" is used for the first paragraph in the document. In the second paragraph, no paragraph style is specified, so the second paragraph will be formatted using the default style ("MyParagraphStyle"). However, within the r element in the second paragraph, a character style is used to control the appearance of the text:
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="AnotherParagraph"/>
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
<w:p>
<w:r>
<w:rPr>
<w:rStyle w:val="EmphasisStyle"/>
</w:rPr>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
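Because a pStyle element must name a paragraph style and an rStyle element must name a character style, a misspelled or mistyped reference is easy to introduce when markup is written by hand or generated. The following sketch, which assumes Python's standard xml.etree.ElementTree module and a hypothetical function name unresolved_style_refs, lists the references in a document that match no style element of the required type. The sample data deliberately refers to "Emphasis" where the style is defined as "EmphasisStyle".

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W_NS}
W = "{%s}" % W_NS

SAMPLE = """<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:styles>
    <w:style w:type="paragraph" w:styleId="MyParagraphStyle" w:default="on"/>
    <w:style w:type="paragraph" w:styleId="AnotherParagraph" w:default="off"/>
    <w:style w:type="character" w:styleId="EmphasisStyle" w:default="off"/>
  </w:styles>
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="AnotherParagraph"/></w:pPr>
      <w:r><w:t>Hello, World.</w:t></w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:rPr><w:rStyle w:val="Emphasis"/></w:rPr>
        <w:t>How are you, today?</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:wordDocument>"""

# Which style type each reference element is allowed to name.
REQUIRED_TYPE = {"pStyle": "paragraph", "rStyle": "character"}


def unresolved_style_refs(root):
    """Return (element name, style id) pairs with no matching style element.

    Hypothetical helper for illustration only.
    """
    defined = {
        (style.get(W + "type"), style.get(W + "styleId"))
        for style in root.findall("w:styles/w:style", NS)
    }
    problems = []
    for name, style_type in REQUIRED_TYPE.items():
        for ref in root.iter(W + name):
            style_id = ref.get(W + "val")
            if (style_type, style_id) not in defined:
                problems.append((name, style_id))
    return problems


print(unresolved_style_refs(ET.fromstring(SAMPLE)))  # [('rStyle', 'Emphasis')]
```

A check like this is only a convenience for the author; it says nothing about how Word itself treats an unknown style name.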
Extending Styles

To create a style by extending another style, you use the basedOn element. The basedOn element allows you to create variations on a style by adding or overriding the properties of the base style. This example defines an "Italic" style and then uses it as the base for an "ItalicBold" style:
<w:styles>
<w:style w:type="paragraph" w:styleId="Italic" >
<w:rPr>
<w:i w:val="on"/>
</w:rPr>
</w:style>
<w:style w:type="paragraph" w:styleId="ItalicBold" >
<w:basedOn w:val="Italic"/>
<w:rPr>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="ItalicBold" />
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
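One way to picture what basedOn does is to walk the chain of base styles and merge their run properties, with each derived style adding to or overriding what it inherits. The sketch below does that for the rPr children only. It assumes Python's standard xml.etree.ElementTree module, the function name effective_run_properties is hypothetical, and treating a missing val attribute as "on" is a simplification that fits on/off properties such as b and i. It illustrates the idea and does not describe how Word resolves styles internally.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W_NS}
W = "{%s}" % W_NS

STYLES = """<w:styles xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:style w:type="paragraph" w:styleId="Italic">
    <w:rPr><w:i w:val="on"/></w:rPr>
  </w:style>
  <w:style w:type="paragraph" w:styleId="ItalicBold">
    <w:basedOn w:val="Italic"/>
    <w:rPr><w:b w:val="on"/></w:rPr>
  </w:style>
</w:styles>"""


def effective_run_properties(styles, style_id):
    """Merge rPr children along the basedOn chain (hypothetical helper).

    Properties of a derived style override those of its base style.
    """
    by_id = {s.get(W + "styleId"): s for s in styles.findall("w:style", NS)}
    chain = []
    seen = set()  # guards against a basedOn loop
    current = style_id
    while current in by_id and current not in seen:
        seen.add(current)
        style = by_id[current]
        chain.append(style)
        based_on = style.find("w:basedOn", NS)
        current = based_on.get(W + "val") if based_on is not None else None
    props = {}
    for style in reversed(chain):  # apply the base style first
        for prop in style.findall("w:rPr/*", NS):
            name = prop.tag.replace(W, "")
            props[name] = prop.get(W + "val", "on")
    return props


styles = ET.fromstring(STYLES)
print(effective_run_properties(styles, "ItalicBold"))
```

Because the lookup is keyed on styleId rather than on position, the sketch gives the same answer whichever order the style elements appear in.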
The order of the style elements within the styles element doesn't matter: a basedOn element can refer to a style element that precedes or follows it. Other useful child elements of the style element include:
As a more comprehensive example, the following style element establishes:
<w:style w:type="paragraph" w:styleId="ReferenceName" >
<w:name w:val="DisplayName" />
<w:locked w:val="on" />
<w:hidden w:val="off"/>
<w:next w:val="ItalicBold"/>
<w:rPr>
<w:i w:val="on"/>
</w:rPr>
</w:style>
This paragraph uses the style just defined, by setting the val attribute of its pStyle element to the name specified in the style's styleId attribute:
<w:p>
<w:pPr>
<w:pStyle w:val="ReferenceName"/>
</w:pPr>
<w:r>
<w:t>Hello, World</w:t>
</w:r>
</w:p>
Figure 7 shows the style applied to the first paragraph in the document. On the Formatting toolbar in Word, the Style drop-down list shows the name established for the style through the name element in the style element. The second paragraph in Figure 7 was created by pressing the ENTER key at the end of the first paragraph and is in the "ItalicBold" style, as specified by the next element.
Figure 7. The "DisplayName" style in use

Style Properties

You define a style by adding child elements to the style element (all of the children are listed in Table 6). Within the style element, rPr and pPr elements allow you to define the formatting to be used at the r and p levels. The only limitation is that pPr elements used in a character style are ignored (and, as mentioned before, you can only refer to paragraph styles within a pPr element and only to character styles within an rPr element).

Putting it all together, this document defines a style that sets the justification for the paragraph (in the pPr element of the style) and combines bold and italic (in the rPr element of the style). The style is then used to format a paragraph:
<w:styles>
<w:style w:type="paragraph" w:styleId="ItalicBold">
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:rPr>
<w:i w:val="on"/>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="ItalicBold" />
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
The result of applying this paragraph style using the pPr element inside the body element is that the text will be italic, bold, and centered.

If you violate the restrictions that Word puts on using styles, Word won't raise an error, but Word also won't apply your styles. Consider this example, which is similar to the previous example but has some key changes that prevent the style from being applied:
<w:styles>
<w:style w:type="character" w:styleId="ItalicBold" >
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:rPr>
<w:i w:val="on"/>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="ItalicBold" />
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
In this example, the "ItalicBold" style has its type attribute set to character. The result is that Word will ignore the use of the style in the pPr element inside the body element. In the following example, the character version of the style is used correctly inside the rPr element, but the result will still not reflect all of the settings made in the "ItalicBold" style:
<w:body>
<w:p>
<w:r>
<w:rPr>
<w:rStyle w:val="ItalicBold" />
</w:rPr>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
Because the style is specified as being a character style, the pPr element in the style definition (where the center justification is specified) will be ignored. The rPr element inside the style is applied, though. As a result, the text will be bold and italic but not centered.

You could also make text centered, bold, and italic by making the "ItalicBold" style the default paragraph style and by not specifying a style at the paragraph level:
<w:styles>
<w:style w:type="paragraph" w:styleId="ItalicBold"
w:default="on">
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:rPr>
<w:i w:val="on"/>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
Property Conflicts

Because properties can be set at the style, p, and r element levels, Word must deal with conflicts between the three levels. In this example, for instance, Word must reconcile:
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="MyStyle"/>
</w:pPr>
<w:r>
<w:rPr>
<w:rStyle w:val="MyFirstRunStyle"/>
<w:b w:val="on"/>
</w:rPr>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
Three rules are used to reconcile settings for properties that are either "on" or "off":
Applying the rules to the previous example, the text "Hello, World." will be bold because of the b element in the rPr element of the run. If the b element hadn't been used in the rPr element, the text would still be bold if either "MyStyle" or "MyFirstRunStyle" turned on bold formatting, even if the other style turned bold off. Finally, any bold setting in a default paragraph or character style would apply only if no style was applied to the text.

Fonts

WordprocessingML provides two separate kinds of support for fonts:
Specifying Fonts

You can specify which fonts are used in your document by using the fonts element. For each font that you use, you can specify a variety of properties that allow Word to manage the document's fonts and make appropriate substitutions when the requested font isn't available. Setting these properties requires an understanding of font information that's beyond the scope of this document. The fonts element has no direct effect on your document's appearance but is simply a place to supply Word with information about the fonts used by the document.

Using Fonts

Within the fonts element, the defaultFonts element specifies the default fonts for the document. This element directly controls which font is used to display the text in the document (unless overridden by a style or an rPr child element). The defaultFonts element has a set of attributes that let you specify the default fonts for four character sets: ascii, fareast, h-ansi, and cs (complex scripts, such as those scripts that allow bidirectional rendering).

You can override the default font by using the rFonts element in the rPr element. This can be done in the rPr element preceding the t element with the text, in an rPr element inside a pPr element, or in a style. The rFonts element takes the same attributes as the defaultFonts element. For example, the following element sets the font for a run to Tahoma:
<w:r>
<w:rPr>
<w:rFonts w:ascii="Tahoma" w:h-ansi="Tahoma" w:cs="Tahoma"/>
</w:rPr>
<w:t>Hello, World</w:t>
</w:r>
The font that you use to display your text doesn't have to be listed in the fonts element at the start of the document. However, without the information in the fonts element, if the font that you specify in the rFonts element isn't available on the computer where Word is displaying the document, Word may not make the best choice in selecting a substitute font.

Formatting a Section

At the section level, formatting information is held in the sectPr element at the end of the section. Within the sectPr element, child elements allow you to control the page's size and margins and to define columns for the page.

Setting a Page's Size and Margins

In the sectPr element, there are two elements that control your page's layout:
The following pgSz element uses the w attribute to set a page width of 12,240 twips (8.5 inches) and the h attribute to set the height at 15,840 twips (11 inches):
<w:sectPr>
<w:pgSz w:w="12240" w:h="15840" w:code="1"/>
</w:sectPr>
The following pgMar element sets the top and bottom margins at 1,440 twips (1 inch) and the left and right margins at 1,800 twips (1.25 inches). In addition, the header and footer are 720 twips (0.5 inches) each.
<w:sectPr>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" />
</w:sectPr>
The pgMar element also lets you specify how much space is to be set aside for the gutter, which is the part of the page that is lost to the binding process when pages are bound together. In the previous example, no space has been left for the gutter. This next example sets aside 360 twips (0.25 inches) for the gutter:
<w:sectPr>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="360"/>
</w:sectPr>
Typically, documents are bound down their inside edges. If your documents are bound along the top, you'll need to specify that in the docPr element:
<w:docPr>
<w:gutterAtTop w:val="on"/>
</w:docPr>
To force the gutter to the right side of the page, set the rtlGutter element to "on" in the sectPr element:
<w:sectPr>
<w:rtlGutter w:val="on"/>
</w:sectPr>

Columns

You can define columns in the sectPr element by using the cols element. If your columns are all the same width, you need only to specify the number of columns (in the num attribute) and the space between columns (in the space attribute):
<w:sectPr>
<w:cols w:num="4" w:space="720"/>
</w:sectPr>
If the columns have different widths, you must insert col elements inside the cols element. However, you must still specify the number of columns on the cols element. You must also turn off the equalWidth attribute.
<w:sectPr>
<w:cols w:num="4" w:sep="on" w:space="1440" w:equalWidth="off">
<!-- col elements go here -->
</w:cols>
</w:sectPr>
For each col element except the last one, you specify the width of the column and the space following it. For the last col element, you specify only the width.
<w:cols w:num="4" w:sep="on" w:space="1440" w:equalWidth="off">
<w:col w:w="1440" w:space="500"/>
<w:col w:w="2880" w:space="500"/>
<w:col w:w="1080" w:space="750"/>
<w:col w:w="1080"/>
</w:cols>
You do not have to do anything further. Word will make the content of the t elements in the document's body flow, or "snake," through the columns.
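When column widths are set by hand, a quick arithmetic check can show whether the columns and the spaces between them fit inside the text area left by the page margins. The sketch below is one such check, assuming Python's standard xml.etree.ElementTree module. The function names are hypothetical, the gutter is ignored, and whether Word requires the widths to add up to the text area exactly is not stated in this document, so treat the comparison as a sanity check only.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W_NS}
W = "{%s}" % W_NS

COLS = """<w:cols xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    w:num="4" w:sep="on" w:space="1440" w:equalWidth="off">
  <w:col w:w="1440" w:space="500"/>
  <w:col w:w="2880" w:space="500"/>
  <w:col w:w="1080" w:space="750"/>
  <w:col w:w="1080"/>
</w:cols>"""


def columns_total_width(cols_xml):
    """Sum the widths and trailing spaces of all col elements, in twips."""
    cols = ET.fromstring(cols_xml)
    total = 0
    for col in cols.findall("w:col", NS):
        total += int(col.get(W + "w"))
        total += int(col.get(W + "space", "0"))  # the last column has no space
    return total


def text_area_width(page_width, left_margin, right_margin):
    """Width left between the side margins, in twips (gutter ignored)."""
    return page_width - left_margin - right_margin


total = columns_total_width(COLS)
available = text_area_width(12240, 1800, 1800)
print(total, available, total <= available)  # 8230 8640 True
```

With the page size and margins used earlier in this section, the four columns need 8,230 twips and the text area provides 8,640 twips.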
Section 4: Document Components

This section shows how to add lists, tables, headers, footers, and title page elements to a WordprocessingML document. You'll also see how to add both document properties and document information to your document.

Lists

In WordprocessingML, lists are a series of paragraphs that have a list style applied to them, with each item in the list in a separate paragraph. What distinguishes a "list paragraph" from an ordinary paragraph is the presence of a listPr element in the pPr element in the paragraph. The listPr element specifies the list style to be used with the paragraph's content and the level of the list. Here is a sample of a list with two items, which are represented by the two paragraphs with listPr elements:
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="1" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Item 1</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="1" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Item 2</w:t>
</w:r>
</w:p>
Only two elements can appear inside the listPr element:
As an example of the ilvl element in action, consider the list shown in Figure 8:
Figure 8. Example of the ilvl element in use

There are actually three lists in the example. First, there is an outer list with two items ("Types of Web sites" and "WayFinding", numbered 1 and 2). Within those items are two nested lists. The first is the list consisting of "Applications", "Content", and "Hybrid"; the second list consists of "Planning strategies" and "Executing plans with feedback".

In WordprocessingML, this example consists of seven paragraphs, one for each list item. The paragraphs in different lists are at different paragraph levels and have different list styles assigned to them. For the first three paragraphs, the WordprocessingML would look like this:
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="2" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Types of Web sites</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="1" />
<w:ilfo w:val="2" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Applications</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="1" />
<w:ilfo w:val="2" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Content</w:t>
</w:r>
</w:p>
The entry in the outermost list ("Types of Web sites") has an ilvl element with a val attribute of "0". The next paragraph, which is the first item of the nested list ("Applications"), has an ilvl element with a val attribute of "1", indicating that the paragraph is nested one level deep. All of the paragraphs use the same list style, specified in the ilfo element as "2".

The value in the ilfo element refers to a list element that appears inside the lists element before the body element. The list element, in turn, associates the ilfo element id with a particular list definition by using the ilst element. The following list element, for instance, defines list "1" as using list definition "2":
<w:lists>
<w:list w:ilfo="1">
<w:ilst w:val="2" />
</w:list>
</w:lists>
The list element can contain one other element, lvlOverride. The lvlOverride element contains elements that override settings in the list definition. These override settings can include a new starting number for the list and different formatting. By using the lvlOverride element, you can specify settings for a particular list (or part of a list) without having to create a whole new list definition. The actual list definitions are defined inside the listDef element, which also appears inside the lists element. Within the listDef element, the listDefId attribute (which must be numeric) specifies the list name that is used in the ilfo element of the list element. All of the children of the listDef element are given in Table 16. Within the listDef element, lvl elements define how the list items at each level are to be formatted. The format information inside an lvl element can include a pPr element (containing formatting for p elements) and an rPr element (containing formatting for r elements), among other elements. The pPr and rPr settings will automatically be applied to the p and r elements that make up the list item's paragraph. Also within the lvl element, the start element specifies the starting number for the list. The following sample listDef style definition defines two levels of a list (Word typically generates eight levels of definition for a listDef element). The list definition is tied to the actual list in the body of the document through a list element. The example also demonstrates that linkage. Starting from the listDef element, The listDef element has its listDefId attribute set to "2", identifying this as list definition 2. Following the listDef element, a list element has the val attribute of its ilst element set to "2". This ties this list element to list definition 2. The list element has its ilfo attribute is set to "1", which identifies this as list 1. 
In the following body element, a listPr element has an ilfo element with its val attribute set to "1". This ties the listPr element back to list 1 which, in turn, is tied back to list definition 2.
<w:lists>
<w:listDef w:listDefId="2">
<w:lvl w:ilvl="0">
<w:start w:val="1" />
<w:lvlText w:val="%1." />
<w:lvlJc w:val="left" />
<w:pPr>
<w:tabs>
<w:tab w:val="list" w:pos="1080" />
</w:tabs>
<w:ind w:left="1080" w:hanging="720" />
</w:pPr>
<w:rPr>
<w:rFonts w:hint="default" />
</w:rPr>
</w:lvl>
<w:lvl w:ilvl="1" w:tplc="56325532">
<w:start w:val="1" />
<w:nfc w:val="4" />
<w:lvlText w:val="%2." />
<w:lvlJc w:val="left" />
<w:pPr>
<w:tabs>
<w:tab w:val="list" w:pos="1800" />
</w:tabs>
<w:ind w:left="1800" w:hanging="720" />
</w:pPr>
<w:rPr>
<w:rFonts w:hint="default" />
</w:rPr>
</w:lvl>
</w:listDef>
<w:list w:ilfo="1">
<w:ilst w:val="2" />
</w:list>
</w:lists>
<w:body>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="1" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Types of Web sites</w:t>
</w:r>
</w:p>
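The chain from a paragraph to its list formatting (listPr, then list, then listDef, then lvl) can also be followed in code. The following is a minimal sketch in Python using only the standard library, assuming a trimmed-down, hypothetical copy of the sample above; the helper names wattr and resolve_level are illustrative and are not part of WordprocessingML or of any Word API.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Hypothetical, trimmed-down copy of the sample (formatting children omitted).
SAMPLE = f"""
<w:wordDocument xmlns:w="{W}">
  <w:lists>
    <w:listDef w:listDefId="2">
      <w:lvl w:ilvl="0"><w:start w:val="1"/><w:lvlText w:val="%1."/></w:lvl>
      <w:lvl w:ilvl="1"><w:start w:val="1"/><w:lvlText w:val="%2."/></w:lvl>
    </w:listDef>
    <w:list w:ilfo="1"><w:ilst w:val="2"/></w:list>
  </w:lists>
  <w:body>
    <w:p>
      <w:pPr><w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="1"/></w:listPr></w:pPr>
      <w:r><w:t>Types of Web sites</w:t></w:r>
    </w:p>
  </w:body>
</w:wordDocument>
"""

def wattr(elem, name):
    """Read an attribute that lives in the WordprocessingML namespace."""
    return elem.get(f"{{{W}}}{name}")

def resolve_level(root, paragraph):
    """Return the lvl element that formats this paragraph, or None."""
    list_pr = paragraph.find("w:pPr/w:listPr", NS)
    if list_pr is None:
        return None
    ilfo = wattr(list_pr.find("w:ilfo", NS), "val")
    ilvl = wattr(list_pr.find("w:ilvl", NS), "val")
    # Step 1: listPr/ilfo -> list element with the same ilfo attribute.
    for lst in root.findall("w:lists/w:list", NS):
        if wattr(lst, "ilfo") == ilfo:
            def_id = wattr(lst.find("w:ilst", NS), "val")
            break
    else:
        return None
    # Step 2: list/ilst -> listDef with that listDefId, then the matching lvl.
    for list_def in root.findall("w:lists/w:listDef", NS):
        if wattr(list_def, "listDefId") == def_id:
            for lvl in list_def.findall("w:lvl", NS):
                if wattr(lvl, "ilvl") == ilvl:
                    return lvl
    return None

root = ET.fromstring(SAMPLE)
para = root.find("w:body/w:p", NS)
lvl = resolve_level(root, para)
level_text = wattr(lvl.find("w:lvlText", NS), "val")
print(level_text)
```

The sketch ignores lvlOverride elements; a fuller reader would apply any overrides found in the list element before using the lvl settings.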
Using the list element as an intermediary between the list definition in the listDef element and the listPr element in the body makes it easy to change a style for a group of lists. Rather than rewrite the list definition or set all the lists in the document to use a different list definition, all that's necessary is to update the val attribute of the ilst element in the list element so that it points to a different listDef element. All the listPr elements that use that list element will now be displayed according to the new listDef element.
Headers, Footers, and Title Pages
WordprocessingML lets you add headers, footers, and a title page to your document. In WordprocessingML, headers and footers are just another kind of paragraph. Headers and footers are defined in the sectPr element that marks the end of the section. In the sectPr element, the hdr element contains the definition of the header for the section, and the ftr element contains the definition for the footer. Within the hdr and ftr elements, the content of the element is treated like the content of the body element: p, r, and t elements are used to hold the text that makes up the header or footer. Here's an example of the definition of a header and a footer:
<w:sectPr>
<w:hdr w:type="odd" >
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Header</w:t>
</w:r>
</w:p>
</w:hdr>
<w:ftr w:type="odd">
<w:p>
<w:pPr>
<w:pStyle w:val="Footer"/>
</w:pPr>
<w:r>
<w:t>My Footer</w:t>
</w:r>
</w:p>
</w:ftr>
</w:sectPr>
You can use any style that you want to control the formatting of a header or footer. A typical header style might look like this: <w:style w:type="paragraph" w:styleId="Header" >
<w:name w:val="header"/>
<w:basedOn w:val="Normal"/>
<w:pPr>
<w:pStyle w:val="Header"/>
<w:tabs>
<w:tab w:val="center" w:pos="4320"/>
<w:tab w:val="right" w:pos="8640"/>
</w:tabs>
</w:pPr>
</w:style>
The hdr and ftr elements have a type attribute that takes one of three values: "even", "odd", and "first". If you're only using one hdr or ftr element, the type attribute must be set to "odd". To have a different header (or footer) on even and odd pages, you will need two hdr elements, one with its type attribute set to "even" and one with its type attribute set to "odd". For example: <w:sectPr>
<w:hdr w:type="odd">
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Odd Header</w:t>
</w:r>
</w:p>
</w:hdr>
<w:hdr w:type="even">
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Even Header</w:t>
</w:r>
</w:p>
</w:hdr>
</w:sectPr>
You must also add the evenAndOddHeaders element to the docPr element at the top of the document: <w:docPr> <w:evenAndOddHeaders/> </w:docPr> If you set the type attribute of a hdr or ftr element to "first", the hdr or ftr will be used only on the first page (even if it's the only hdr or ftr element in the document). You don't have to add any elements to the document properties to use this option, but you do need to add the titlePg element to the end of the sectPr element, following the definition of your headers and footers: <w:sectPr>
<w:hdr w:type="first">
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Title Page Header</w:t>
</w:r>
</w:p>
</w:hdr>
<w:titlePg/>
</w:sectPr>
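When reading a document, it can help to collect the headers by type and to check the switches that the text above says each type depends on. The following Python sketch assumes a hypothetical sectPr fragment that combines a "first" and an "odd" header; the function names header_texts and missing_switches are illustrative only.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Hypothetical section properties with a title-page header and a regular header.
SECT_PR = f"""
<w:sectPr xmlns:w="{W}">
  <w:hdr w:type="first"><w:p><w:r><w:t>My Title Page Header</w:t></w:r></w:p></w:hdr>
  <w:hdr w:type="odd"><w:p><w:r><w:t>My Odd Header</w:t></w:r></w:p></w:hdr>
  <w:titlePg/>
</w:sectPr>
"""

def header_texts(sect_pr):
    """Map each hdr type ("odd", "even", "first") to its concatenated text."""
    result = {}
    for hdr in sect_pr.findall("w:hdr", NS):
        text = "".join(t.text or "" for t in hdr.iter(f"{{{W}}}t"))
        result[hdr.get(f"{{{W}}}type")] = text
    return result

def missing_switches(sect_pr, doc_pr=None):
    """List the switch elements that the headers in use need but that are absent."""
    types = set(header_texts(sect_pr))
    missing = []
    if "first" in types and sect_pr.find("w:titlePg", NS) is None:
        missing.append("titlePg")
    if "even" in types and (
        doc_pr is None or doc_pr.find("w:evenAndOddHeaders", NS) is None
    ):
        missing.append("evenAndOddHeaders")
    return missing

sect = ET.fromstring(SECT_PR)
headers = header_texts(sect)
missing = missing_switches(sect)
print(headers, missing)
```

The same approach would work for ftr elements by searching for "w:ftr" instead of "w:hdr".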
To ensure that your headers and footers display correctly, you should allocate the space on the page to display them. For this, you'll need to control your page's layout, as described earlier in "Formatting a Section."
Tables
In WordprocessingML, tables are defined with the tbl element (Table 10 lists the high-level table elements). The following elements are used within the tbl element:
The following example shows a table with two columns and a single row. The tbl element is followed by a tblPr element, which contains a set of table properties. As is typical in WordprocessingML, each property is an empty element with a single val attribute that contains the value for the property. In this example:
<w:tbl>
<w:tblPr>
<w:tblStyle w:val="TableGrid"/>
<w:tblW w:w="0" w:type="auto"/>
<w:tblLook w:val="000001E0"/>
</w:tblPr>
The next element inside the tbl element is the tblGrid element, which contains one gridCol element for each column in the table. The w attribute of the gridCol element gives the width of the column in twips. In this example, there are two columns, one 1770 twips and one 1400 twips wide: <w:tblGrid> <w:gridCol w:w="1770"/> <w:gridCol w:w="1400"/> </w:tblGrid> With the table now defined, tr elements are added to contain the cells with the table's content. The tr element can contain a trPr element, which holds the properties for the row (for example, the row's height and whether it can be split across a page). The following example omits the trPr element. Within the tr element, the row's cells, which are defined by tc elements, contain the table's content. Within a tc element, the tcPr element contains the properties for the cell. In the following example:
Also within the tc element is the cell's content. In this example, the content is a p element with a single run with a single piece of text: <w:tr>
<w:tc>
<w:tcPr>
<w:tcW w:w="1770" w:type="dxa"/>
</w:tcPr>
<w:p><w:r><w:t>Hello, World</w:t></w:r></w:p>
</w:tc>
</w:tr>
</w:tbl>
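Because gridCol widths are stored in twips, a program that reads a table usually has to convert them before use. The sketch below is a minimal Python example that reads the tblGrid from the sample above and converts the widths, relying on the general definition of a twip (1/20 of a point, so 1,440 twips per inch); the function name column_widths_twips is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}
TWIPS_PER_POINT = 20    # 1 twip = 1/20 point
TWIPS_PER_INCH = 1440   # 72 points per inch * 20 twips per point

TBL_GRID = f"""
<w:tblGrid xmlns:w="{W}">
  <w:gridCol w:w="1770"/>
  <w:gridCol w:w="1400"/>
</w:tblGrid>
"""

def column_widths_twips(tbl_grid):
    """Return the width of each gridCol, in twips, in document order."""
    return [int(col.get(f"{{{W}}}w")) for col in tbl_grid.findall("w:gridCol", NS)]

grid = ET.fromstring(TBL_GRID)
widths = column_widths_twips(grid)
widths_pt = [w / TWIPS_PER_POINT for w in widths]
widths_in = [round(w / TWIPS_PER_INCH, 3) for w in widths]
print(widths, widths_pt, widths_in)
```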
You can merge cells by using the vmerge (merge cells vertically) and hmerge (merge cells horizontally) elements in the tcPr element. An empty vmerge or hmerge element with its val attribute set to "restart" marks the start of a merged range. A vmerge or hmerge element without any attributes marks the end of the merged cells. Cells between the first and last merged cell must have a vmerge or hmerge element with the val attribute set to "continue". In this example, the last cell in the first row starts a merge that is completed in the cell below it:
<w:tr>
<w:tc>
<w:p><w:r><w:t>First cell, first row</w:t></w:r></w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:vmerge w:val="restart"/>
</w:tcPr>
<w:p><w:r><w:t>Last cell, first row </w:t></w:r></w:p>
</w:tc>
</w:tr>
<w:tr>
<w:tc>
<w:p><w:r><w:t>First cell, second row</w:t></w:r></w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:vmerge />
</w:tcPr>
<w:p><w:r><w:t>Last cell, second row </w:t></w:r></w:p>
</w:tc>
</w:tr>
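A program that reads such a table needs to know which cells begin a vertical merge and which are absorbed into it. The following Python sketch classifies each cell according to the vmerge rules described above, using a hypothetical copy of the two-row sample wrapped in a tbl element; the names merge_role and cell_text are illustrative.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

TABLE = f"""
<w:tbl xmlns:w="{W}">
  <w:tr>
    <w:tc><w:p><w:r><w:t>First cell, first row</w:t></w:r></w:p></w:tc>
    <w:tc>
      <w:tcPr><w:vmerge w:val="restart"/></w:tcPr>
      <w:p><w:r><w:t>Last cell, first row</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>First cell, second row</w:t></w:r></w:p></w:tc>
    <w:tc>
      <w:tcPr><w:vmerge/></w:tcPr>
      <w:p><w:r><w:t>Last cell, second row</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
</w:tbl>
"""

def merge_role(tc):
    """Classify a cell as "none", "start", "continue", or "end" of a vertical merge."""
    vmerge = tc.find("w:tcPr/w:vmerge", NS)
    if vmerge is None:
        return "none"
    val = vmerge.get(f"{{{W}}}val")
    if val == "restart":
        return "start"
    if val == "continue":
        return "continue"
    return "end"  # vmerge present with no val attribute

def cell_text(tc):
    return "".join(t.text or "" for t in tc.iter(f"{{{W}}}t"))

tbl = ET.fromstring(TABLE)
roles = [[merge_role(tc) for tc in tr.findall("w:tc", NS)]
         for tr in tbl.findall("w:tr", NS)]
# Text of absorbed cells stays in the file but is not displayed, so report None.
visible = [[cell_text(tc) if merge_role(tc) in ("none", "start") else None
            for tc in tr.findall("w:tc", NS)]
           for tr in tbl.findall("w:tr", NS)]
print(roles)
print(visible)
```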
Figure 9 shows the results of merging the two cells. The space that the second cell of the last row would occupy is now just a continuation of the cell above it and can have no separate content. The content specified in the WordprocessingML file for the second cell of the last row disappears when the WordprocessingML is displayed in Word. The content is still present, though, and can be retrieved through Word's object model.
Figure 9. The results of merging two cells
For more information, Table 11 lists the table property elements, Table 12 lists the table positioning elements, Table 13 lists the child elements for table row properties, and Table 14 lists the child elements for table cell properties.
Document Properties
The document has a set of properties, held in the docPr element. Table 9 lists the properties that can be set at the document level. Some useful settings for developers who are creating documents include:
The following example shows a set of document properties that set the user's view to normal, zoom the view to full page, and prevent the user from changing the formatting in the document: <w:docPr>
<w:view w:val="normal"/>
<w:zoom w:val="full-page" w:percent="100"/>
<w:documentProtection w:formatting="on" w:enforcement="on"/>
</w:docPr>
Document Information
The DocumentProperties element performs a different function from the docPr element. Like docPr, DocumentProperties is a container for other elements. The DocumentProperties element, however, is not part of the WordprocessingML namespace but is part of the Common Properties namespace ("urn:schemas-microsoft-com:office:office"), a set of elements common to all Office applications. The DocumentProperties element contains meta-information about the document, including the document's title, version, and author. Some statistics about the document are also kept in the DocumentProperties element, including the number of characters, pages, lines, and paragraphs. Here's a sample DocumentProperties element:
<o:DocumentProperties>
<o:Title>Sample Document</o:Title>
<o:Author>Peter Vogel</o:Author>
<o:Pages>1</o:Pages>
<o:Words>2</o:Words>
<o:Characters>15</o:Characters>
<o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:Version>11.4920</o:Version>
</o:DocumentProperties>
Section 5: Other Topics
Graphics
WordprocessingML stores graphics as a combination of Vector Markup Language (VML) and a binary representation of the image. A discussion of VML is outside the scope of this document, but this section shows how picture data fits into the structure of a WordprocessingML document. Some shapes are very easy to add. For instance, to add a rectangle to your document, you only need the VML rect (rectangle) element. The element's style attribute holds the information to draw a rectangle in the right place at the right size:
<v:rect id="_x0000_s1032" style="position:absolute;margin-left:63pt;margin-top:4.2pt;width:54pt;height:45pt;z-index:1" />
To use the VML rect element, you must add the Vector Markup Language (VML) namespace ("urn:schemas-microsoft-com:vml") to the namespaces declared in your document. The Common Properties namespace ("urn:schemas-microsoft-com:office:office") may also be required if you intend to include anything more than the simplest AutoShapes.
Typically, you'll establish these namespaces in the document's root element: <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xml:space="preserve"> Your graphic must appear inside a pict element inside an r element inside a p element. This example uses the VML rect element to add a simple rectangle to the document: <w:p>
<w:r>
<w:pict>
<v:rect id="_x0000_s1032"
style="position:absolute;margin-left:63pt;margin-top:4.2pt;
width:54pt;height:45pt;z-index:1" />
</w:pict>
</w:r>
</w:p>
If a graphic consists of more than just a simple shape, you will also need to include a base64-encoded version of the graphic. Considerably more VML is required inside the pict element: <w:p>
<w:r>
<w:pict>
<v:shapetype id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe"
filled="f" stroked="f">
<v:stroke joinstyle="miter" />
<v:formulas>
<v:f eqn="if lineDrawn pixelLineWidth 0" />
<v:f eqn="sum @0 1 0" />
<v:f eqn="sum 0 0 @1" />
<v:f eqn="prod @2 1 2" />
<v:f eqn="prod @3 21600 pixelWidth" />
<v:f eqn="prod @3 21600 pixelHeight" />
<v:f eqn="sum @0 0 1" />
<v:f eqn="prod @6 1 2" />
<v:f eqn="prod @7 21600 pixelWidth" />
<v:f eqn="sum @8 21600 0" />
<v:f eqn="prod @7 21600 pixelHeight" />
<v:f eqn="sum @10 21600 0" />
</v:formulas>
<v:path o:extrusionok="f" gradientshapeok="t"
o:connecttype="rect" />
<o:lock v:ext="edit" aspectratio="t" />
</v:shapetype>
<w:binData w:name="http://01000001.gif">R0lGODlhQQAzAKI
...more base64-encoded data...
q18Ldi1baGzZt1/nZr07dW/Tv0cHDz3cc3HOxzMnt7x8
</w:binData>
<v:shape id="_x0000_i1025" type="_x0000_t75"
style="width:48.75pt;height:38.25pt">
<v:imagedata src="http://01000001.gif"
o:title="FolderN" />
</v:shape>
</w:pict>
</w:r>
</w:p>
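To recover the picture from a binData element, a reader decodes the base64 text back into bytes. The sketch below is a minimal Python example; because the payload in the sample above is elided, it uses a stand-in payload (a commonly used 1-by-1-pixel GIF) that is an assumption and not data from the sample, and it omits the VML elements. The function names decode_bin_data and gif_size are hypothetical.

```python
import base64
import struct
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Stand-in payload: a 1x1-pixel GIF, NOT the (elided) image from the sample.
PICT = f"""
<w:pict xmlns:w="{W}">
  <w:binData w:name="http://01000001.gif">
    R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAA
    LAAAAAABAAEAAAIBRAA7
  </w:binData>
</w:pict>
"""

def decode_bin_data(bin_data):
    """Strip the line breaks and indentation, then base64-decode the text."""
    payload = "".join((bin_data.text or "").split())
    return base64.b64decode(payload)

def gif_size(data):
    """Return (width, height) from a GIF header; both are little-endian 16-bit."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF image")
    return struct.unpack("<HH", data[6:10])

pict = ET.fromstring(PICT)
bin_data = pict.find("w:binData", NS)
name = bin_data.get(f"{{{W}}}name")
data = decode_bin_data(bin_data)
size = gif_size(data)
print(name, len(data), size)
```

The value of the name attribute is what the src attribute of the VML imagedata element refers to, so a reader can use it to match a shape to its decoded bytes.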
Bookmarks
Bookmarks are not part of the WordprocessingML namespace but are part of the Annotation Markup Language namespace ("http://schemas.microsoft.com/aml/2001/core"), which is conventionally prefixed with "aml". In a WordprocessingML document, annotation elements, which are empty elements, bracket the area that is bookmarked. An annotation element with its type attribute set to "Word.Bookmark.Start" marks the start of a bookmark area; an annotation element with its type attribute set to "Word.Bookmark.End" marks the end of the bookmark. In this example, a complete paragraph (containing the text "Inside bookmark") has been bookmarked with a bookmark called "MyBookmark":
<w:p><w:r><w:t>Before bookmark</w:t></w:r></w:p>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="MyBookmark" />
<w:p><w:r><w:t>Inside bookmark</w:t></w:r></w:p>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
<w:p><w:r><w:t>After bookmark</w:t></w:r></w:p>
If a bookmark is inserted without enclosing any text, the annotation elements will be inserted between r elements:
<w:p><w:r><w:t>text sur</w:t></w:r>
<aml:annotation aml:id="1" w:type="Word.Bookmark.Start" w:name="MyOtherBookmark" />
<aml:annotation aml:id="1" w:type="Word.Bookmark.End" />
<w:r><w:t>rounding bookmark</w:t></w:r></w:p>
In addition to the type attribute, which identifies an annotation element as being used as a bookmark, two attributes of the annotation element are used with bookmarks:
Fields
A field is, effectively, a kind of declarative programming. A field is a set of instructions on how part of the document is to be processed. Also included in the field definition are any input parameters and the results of the processing. A typical Word document with several form fields can be seen in Figure 10.
Figure 10. A Word document with several fields to enter
WordprocessingML supports two kinds of fields:
Simple Fields
Simple fields are defined with the fldSimple element. The fldSimple element has an instr (instruction) attribute whose contents define the field's behavior. Within the fldSimple element, an r element holds the results of processing the instructions. For instance, this example creates a simple field that will insert the name of the author from the document properties into the text:
<w:p>
<w:fldSimple w:instr="AUTHOR \* Upper \* MERGEFORMAT">
<w:r>
<w:t>Peter Vogel</w:t>
</w:r>
</w:fldSimple>
</w:p>
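A reader can separate a simple field's instruction from its stored result. The following Python sketch assumes that the instruction is a whitespace-separated string and that each \* token introduces a formatting switch, as in Word's general field syntax; it does not handle quoted arguments or other switch types. The function name parse_instruction is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Raw f-string so the backslashes in the instruction are kept literally.
FIELD = rf"""
<w:p xmlns:w="{W}">
  <w:fldSimple w:instr="AUTHOR \* Upper \* MERGEFORMAT">
    <w:r><w:t>Peter Vogel</w:t></w:r>
  </w:fldSimple>
</w:p>
"""

def parse_instruction(instr):
    """Split an instruction into (field name, list of \\* formatting switches)."""
    tokens = instr.split()
    name, switches = tokens[0], []
    i = 1
    while i < len(tokens):
        if tokens[i] == "\\*" and i + 1 < len(tokens):
            switches.append(tokens[i + 1])
            i += 2
        else:
            i += 1
    return name, switches

para = ET.fromstring(FIELD)
fld = para.find("w:fldSimple", NS)
name, switches = parse_instruction(fld.get(f"{{{W}}}instr"))
result = "".join(t.text or "" for t in fld.iter(f"{{{W}}}t"))
print(name, switches, result)
```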
Complex Fields
Complex fields appear in WordprocessingML as a series of r elements inside a paragraph. Each r element contains one part of the field's definition. Three r elements contain fldChar elements, which mark the three parts of a complex field definition:
The fldChar element is used to mark each of these three parts. The fldCharType attribute of the fldChar element is set to "begin", "separate", or "end" to mark the parts of the field definition. The field instructions are placed in the instrText elements. The instrText elements appear between the r element that marks the beginning of the field definition and the r element that marks the end of the instructions. The results of the field's processing are placed between the r element that marks the end of the instructions and the r element that marks the end of the field definition.
Some fields need additional data in their definition. Form fields, for example, require a fldData element, which holds binary data required by the field. To make it easier to find a form field when processing the document, you can add a bookmark to identify the field. One kind of complex field is a form text field. A set of WordprocessingML elements that creates a single form text field inside a p element would look like this:
<w:p>
<w:r>
<w:fldChar w:fldCharType="begin">
<w:fldData>////</w:fldData>
</w:fldChar>
</w:r>
<w:r>
<w:instrText>FORMTEXT</w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start"
w:name="MyField" />
<w:t> </w:t>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
Looking at the previous example in detail:
For Word to process form fields correctly, the document must be protected for form editing only. You can turn on this level of protection by adding a documentProtection element to the docPr element at the start of the WordprocessingML document. The edit attribute of the documentProtection element must be set to "forms" and the enforcement attribute must be set to "on". Here's an example: <w:docPr> <w:documentProtection w:edit="forms" w:enforcement="on" /> </w:docPr> After the field has been filled in by the user, the r element that contained the original value will hold the value entered by the user. The result would look like this if the user entered "My Data Entered" into the form field: <w:p>
<w:r>
<w:fldChar w:fldCharType="begin">
<w:fldData>////</w:fldData>
</w:fldChar>
</w:r>
<w:r>
<w:instrText>FORMTEXT</w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start"
w:name="MyField" />
<w:t>My Data Entered</w:t>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
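An application that collects the data entered into a form can walk the r elements of a paragraph and use the fldChar markers to tell the instruction part from the result part. The following is a minimal Python sketch under stated assumptions: the sample is a hypothetical, well-formed copy of the filled-in field above, fields are not nested, and the function name read_fields is illustrative.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
AML = "http://schemas.microsoft.com/aml/2001/core"
NS = {"w": W, "aml": AML}

PARAGRAPH = f"""
<w:p xmlns:w="{W}" xmlns:aml="{AML}">
  <w:r><w:fldChar w:fldCharType="begin"><w:fldData>////</w:fldData></w:fldChar></w:r>
  <w:r><w:instrText>FORMTEXT</w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r>
    <aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="MyField"/>
    <w:t>My Data Entered</w:t>
    <aml:annotation aml:id="0" w:type="Word.Bookmark.End"/>
  </w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>
"""

def read_fields(paragraph):
    """Return one dict per complex field: instruction, bookmark name, result text."""
    fields, current, state = [], None, None
    for run in paragraph.findall("w:r", NS):
        fld_char = run.find("w:fldChar", NS)
        if fld_char is not None:
            kind = fld_char.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                current, state = {"instr": "", "name": None, "result": ""}, "instr"
            elif kind == "separate" and current is not None:
                state = "result"
            elif kind == "end" and current is not None:
                current["instr"] = current["instr"].strip()
                fields.append(current)
                current, state = None, None
            continue
        if current is None:
            continue  # ordinary run outside any field
        if state == "instr":
            current["instr"] += "".join(
                i.text or "" for i in run.findall("w:instrText", NS))
        elif state == "result":
            for ann in run.findall("aml:annotation", NS):
                if ann.get(f"{{{W}}}type") == "Word.Bookmark.Start":
                    current["name"] = ann.get(f"{{{W}}}name")
            current["result"] += "".join(
                t.text or "" for t in run.findall("w:t", NS))
    return fields

fields = read_fields(ET.fromstring(PARAGRAPH))
print(fields)
```

Keying the results by bookmark name is what makes the optional bookmark useful: the processing code can look up "MyField" without depending on the field's position in the document.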
Hyperlinks
A hyperlink has two components: the hyperlink itself (the text the user will click) and the target for the link. Potential targets include external files, e-mail addresses, and bookmarks. If you are creating a hyperlink in Microsoft Word, other targets are supported (for example, the top of the document and headings). However, all of those targets are implemented by adding a bookmark at the appropriate location in the document. In this section, you'll see how to create a bookmark for a target within the document.
For a bookmark to be the target of a hyperlink, it must be a complete bookmark pair and have a name assigned to it. For instance, in Word if the user creates a hyperlink to the top of the document, a bookmark called "_top" is inserted at the top of the document. The resulting WordprocessingML looks like this:
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="_top" />
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
The hyperlink that points to this bookmark is represented in WordprocessingML by an hlink element that has "_top" in its bookmark attribute. The text that is displayed by Word as the hyperlink must be inside an r element between the hlink element's opening and closing tags (see Figure 11 for how the link appears in Word):
<w:p>
<w:hlink w:bookmark="_top">
<w:r>
<w:rPr>
<w:rStyle w:val="Hyperlink" />
</w:rPr>
<w:t>Go To Top</w:t>
</w:r>
</w:hlink>
</w:p>
Figure 11. A hyperlink in Microsoft Word
You can use any style that you want with your hyperlink. However, the "Hyperlink" style that is generated by Microsoft Word is what most users will recognize as the visual cue for a hyperlink (underlined blue text). For consistency's sake, you should consider adding this style to your document and using it with your hyperlinks:
<w:style w:type="character" w:styleId="Hyperlink">
<w:name w:val="Hyperlink" />
<w:basedOn w:val="DefaultParagraphFont" />
<w:rsid w:val="365462" />
<w:rPr>
<w:color w:val="0000FF" />
<w:u w:val="single" />
</w:rPr>
</w:style>
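Because a bookmark hyperlink only works when the named bookmark exists, a tool that generates or checks documents may want to verify each target. The following Python sketch lists the hlink elements in a hypothetical body fragment and reports the ones whose bookmark is not defined; the "Appendix" link is invented for the example, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
AML = "http://schemas.microsoft.com/aml/2001/core"
NS = {"w": W, "aml": AML}

BODY = f"""
<w:body xmlns:w="{W}" xmlns:aml="{AML}">
  <aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="_top"/>
  <aml:annotation aml:id="0" w:type="Word.Bookmark.End"/>
  <w:p>
    <w:hlink w:bookmark="_top"><w:r><w:t>Go To Top</w:t></w:r></w:hlink>
  </w:p>
  <w:p>
    <w:hlink w:bookmark="Appendix"><w:r><w:t>See the appendix</w:t></w:r></w:hlink>
  </w:p>
</w:body>
"""

def bookmark_names(root):
    """Collect the names carried by Word.Bookmark.Start annotations."""
    return {
        ann.get(f"{{{W}}}name")
        for ann in root.iter(f"{{{AML}}}annotation")
        if ann.get(f"{{{W}}}type") == "Word.Bookmark.Start"
    }

def bookmark_links(root):
    """Return (display text, bookmark name) for each hlink that targets a bookmark."""
    links = []
    for hlink in root.iter(f"{{{W}}}hlink"):
        target = hlink.get(f"{{{W}}}bookmark")
        if target is not None:
            text = "".join(t.text or "" for t in hlink.iter(f"{{{W}}}t"))
            links.append((text, target))
    return links

body = ET.fromstring(BODY)
names = bookmark_names(body)
links = bookmark_links(body)
dangling = [link for link in links if link[1] not in names]
print(names, links, dangling)
```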
Two other attributes of the hlink element can be useful in generating a WordprocessingML hyperlink that will be read by Word:
Figure 12. A Word hyperlink with a ScreenTip
Macros and Components
A document can also contain Visual Basic for Applications (VBA) code, toolbar modifications, OLE custom controls (OCX) and other "active" components. All of these items can be represented in WordprocessingML. In this section, you'll be introduced to how WordprocessingML stores VBA code and OCX controls. You'll also see how Word ensures that software can detect whether these components are present in the document so that the component can, for instance, be scanned for viruses. Word also ensures that if components are not made visible in WordprocessingML, they will not be executed.
For VBA code, a base64-encoded version of the binary file generated by the VBA editor is held in the binData element inside the docSuppData element. The binData element has a name attribute whose value must be set to "editdata.mso". The docSuppData element is a top-level element under the wordDocument root element, and follows the styles element in a document created by Word. A typical VBA module in a WordprocessingML document looks like this:
<w:docSuppData><w:binData w:name="editdata.mso">
QWN0aXZlTWltZQAAAfAEAAAA/////wAAB/AbDwAABA
...more base64-encoded data...
LgBNAFkATQBPAEQAVQBMAEUAAABAAAAL8AQAAAASNFZ4
</w:binData></w:docSuppData>
Representing an OCX control in WordprocessingML is more complicated than storing VBA code because an OCX control also has a graphical representation in the document. For OCX controls, a binData element within a docOleData element is used to hold the OLE data. The name attribute of this binData element must be set to "oledata.mso":
<w:docOleData>
<w:binData w:name="oledata.mso">
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAA
...more base64-encoded data...
C4zcL+WTKDhJozVltEGRkTOwQAROjpejLDyT5d+/F5BeLt5n3wv4P/Cl4BK=
</w:binData>
</w:docOleData>
Later in the document, a set of VML-related elements will handle the display of the component.
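Software that screens documents before opening them can look for these two containers directly under the root element. The following Python sketch reports which containers are present and whether each holds a binData element with the expected name; the document and its payload are placeholders rather than a real VBA project, and the function name active_content is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Placeholder document: the payload is a short stand-in, not a real VBA project.
DOC = f"""
<w:wordDocument xmlns:w="{W}">
  <w:docSuppData>
    <w:binData w:name="editdata.mso">QWN0aXZlTWltZQAA</w:binData>
  </w:docSuppData>
  <w:body/>
</w:wordDocument>
"""

# Container element -> binData name that the text above says it must carry.
EXPECTED = (("docSuppData", "editdata.mso"), ("docOleData", "oledata.mso"))

def active_content(root):
    """Map each container that is present to True if its binData name is as expected."""
    found = {}
    for container, expected_name in EXPECTED:
        elem = root.find(f"w:{container}", NS)
        if elem is None:
            continue
        names = [b.get(f"{{{W}}}name") for b in elem.findall("w:binData", NS)]
        found[container] = expected_name in names
    return found

root = ET.fromstring(DOC)
report = active_content(root)
print(report)
```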
Two attributes of the wordDocument element are used to indicate the presence of the VBA code and OCX controls: macrosPresent for VBA code and embeddedObjectPresent for OCX controls. The macrosPresent attribute is used to indicate that macros are present in the document. If the attribute is missing or if it's set to "no", Word won't load a document that has a docSuppData element. This attribute is strictly enforced. If, for instance, the attribute is present and is set to "yes" (indicating that macros are supposed to be present), and Word doesn't find a docSuppData element before it finds the body element, Word will not load the document.
Note Once the document is loaded, Word's security settings will control whether the VBA code will be allowed to execute.
The second attribute is the embeddedObjectPresent attribute, which indicates that an OCX control may have been used in the document. If the attribute is missing or if it's set to "no", Word won't load a document that has a docOleData element. This attribute is not, however, strictly enforced. If the attribute is present and is set to "yes", but Word doesn't find a docOleData element before the body element, Word will still load the document.
Section 6: Auxiliary Elements
When a WordprocessingML document is created in Word, a number of elements are included that provide information to any applications used to read the document. These auxHint elements, from the Auxiliary XML Document 2003 namespace ("http://schemas.microsoft.com/office/word/2003/auxHint"), provide information about how Word handled various elements. Setting the auxHint attributes and elements has no effect on how Word behaves. These elements are provided for the use of other WordprocessingML processing tools and provide a convenient way to access information that would otherwise be difficult to determine.
When you are creating a document, there is no problem with using the WordprocessingML sectPr elements and omitting the auxHint section elements in your document. However, when a WordprocessingML document is read, the sect elements provide containers for the sections of your document. These containers can be very useful to the application that is processing the document, especially when XSL transformations (XSLTs) are used, because XSLTs are oriented towards processing child elements inside container elements.
Sections and Subsections
WordprocessingML does not use a container element for a section but, instead, marks the end of a section with a sectPr element. However, Word does generate sect elements to enclose the p elements that make up a section, creating a true XML container for sections. Within a sect element, each table of contents heading generates sub-section elements that enclose content at a particular heading level or lower.
The sect Element
A WordprocessingML document will consist of at least one sect element. If the document contains multiple sectPr elements, which define multiple sections in the document, the document will consist of a series of sect elements. Once the sect elements are included in the definition of a WordprocessingML body element, there are three possible structures for the body element:
The sub-section Element
A sub-section element is generated by Word whenever a paragraph is found that has an outlineLvl element assigned in the p element's pPr element. In this example, for instance, the paragraph is assigned to the third level of the outline (the lowest level is 0):
<w:p>
<w:pPr>
<w:outlineLvl w:val="2" />
</w:pPr>
<w:r>
<w:t>x</w:t>
</w:r>
</w:p>
Outline levels are frequently assigned through styles. In Word, the "Heading 1" style has an outline level of 0 set in its pPr element. Any text formatted with the "Heading 1" style picks up that outline level and generates a sub-section element.
Word nests sub-section elements within each other, depending on the outline level. When Word finds a paragraph with an outlineLvl element assigned to it, Word generates an opening sub-section element. If the outlineLvl value just found is greater than the previous outlineLvl value (that is, the new heading is at a deeper level), the new sub-section element is nested within the sub-section created for the earlier outlineLvl. If the previous outlineLvl value was equal to or greater than the value just found, closing tags for the open sub-section elements at that level or deeper are generated before the new sub-section element is opened. In this example, for instance, there are four headings at various heading levels:
Heading Level 1
Paragraph1
Paragraph2
Heading Level 2
Paragraph3
Paragraph4
Heading Level 2
Paragraph5
Paragraph6
Heading Level 1
Paragraph7
Omitting all other WordprocessingML elements, the auxiliary sect and sub-section elements that Word would generate would look like this:
<wx:sect>
<wx:sub-section>
Heading Level 1
Paragraph1
Paragraph2
<wx:sub-section>
Heading Level 2
Paragraph3
Paragraph4
</wx:sub-section>
<wx:sub-section>
Heading Level 2
Paragraph5
Paragraph6
</wx:sub-section>
</wx:sub-section>
<wx:sub-section>
Heading Level 1
Paragraph7
</wx:sub-section>
</wx:sect>
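The nesting rule described above amounts to keeping a stack of open sub-sections. The following Python sketch applies that rule to the sample outline, under the assumption that each paragraph is represented as a pair of its outline level (None for body text) and its text; the function name build_sub_sections and the dictionary layout are illustrative and do not reflect how Word itself is implemented.

```python
def build_sub_sections(paragraphs):
    """Nest paragraphs into sub-sections.

    paragraphs is a list of (outline_level, text) pairs, where outline_level
    is None for ordinary paragraphs. Returns the list of top-level nodes; each
    node is a dict with "level" and "items" (texts and nested nodes).
    """
    root = {"level": -1, "items": []}
    stack = [root]  # currently open sub-sections, outermost first
    for level, text in paragraphs:
        if level is None:
            stack[-1]["items"].append(text)
            continue
        # Close every open sub-section at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "items": [text]}
        stack[-1]["items"].append(node)
        stack.append(node)
    return root["items"]

OUTLINE = [
    (0, "Heading Level 1"), (None, "Paragraph1"), (None, "Paragraph2"),
    (1, "Heading Level 2"), (None, "Paragraph3"), (None, "Paragraph4"),
    (1, "Heading Level 2"), (None, "Paragraph5"), (None, "Paragraph6"),
    (0, "Heading Level 1"), (None, "Paragraph7"),
]

tree = build_sub_sections(OUTLINE)
print(len(tree))
print(tree[1]["items"])
```

For the sample outline this yields two top-level sub-sections, the first of which contains the two "Heading Level 2" sub-sections.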
Using the sect and sub-section Elements
Inserting a section break will create a new sect element in the document and close all open sub-section elements. In this sample, a section break has been added after Paragraph4:
Heading Level 1
Paragraph1
Paragraph2
Heading Level 2
Paragraph3
Paragraph4
Section Break
Heading Level 1
Paragraph5
The resulting sect and sub-section elements would look like this:
<wx:sect>
<wx:sub-section>
Heading Level 1
Paragraph1
Paragraph2
<wx:sub-section>
Heading Level 2
Paragraph3
Paragraph4
</wx:sub-section>
</wx:sub-section>
</wx:sect>
<wx:sect>
<wx:sub-section>
Heading Level 1
Paragraph5
</wx:sub-section>
</wx:sect>
Auxiliary Attributes of the Tab Element
The tab element takes three attributes from the auxiliary namespace: wTab, tlc, and cTlc. If you're reading a document and need to determine where some text that is positioned on a tab stop will fall horizontally on the page, these properties provide useful information. In WordprocessingML, the tab element moves the text following it to the next tab stop in the document. The wTab attribute that Word adds to the tab element provides the distance (in twips) between the previous character in the document and the first character of the text at the tab stop. In this example, the word "Hello" is 2,880 twips from the end of the previous text:
<w:tab wx:wTab="2880" wx:tlc="none" wx:cTlc="14"/><w:t>Hello</w:t>
To get the absolute distance between tab stops, you should reference the settings in the tab elements of the pPr (paragraph properties) element. The tlc attribute reports on how the space before the tab is filled. Values for this attribute are:
The cTlc attribute states how many dots were used in the leader before the tab stop. Going back to the previous example, the word "Hello" would have 14 dots between it and the previous text. However, in the example no leader was shown, as indicated by the tlc setting of "none".
The Auxiliary font Element
The font element describes the font used by Word for part of the document. In this example, the run is displayed in the Times New Roman font:
<w:rPr>
<wx:font wx:val="Times New Roman"/>
</w:rPr>
<w:t>Hello, World</w:t>
You can't use the font element to set which font is used; you must use the rFont or pFont elements. However, the auxiliary font element is useful when determining what font was used with the text. Without the auxiliary font element, any tool processing a WordprocessingML document would have to determine the document's default font and then check to see if the default font had been overridden by a style or by property settings at the paragraph or run level.
The Auxiliary estimate Attribute
The estimate attribute can appear as an attribute on a number of elements that hold numerical information. Where the estimate attribute appears, it will be set to either "true" or "false" and indicates whether Word has estimated the value in the element ("true" indicates that the value has been estimated).
Reference
Table 1: WordprocessingML Elements
Notes
Table 2: rPr Child Elements (Run Properties) Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 3: pPr Child Elements (Paragraph Properties) Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 4: sectPr Child Elements (Section Properties) Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 5: style Element Attributes
Table 6: style Child Elements (Style Definitions) Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 7: tab Element Attributes
Table 8: WordprocessingML Auxiliary Elements and Attributes
Table 9: docPr Child Elements (Document Properties)
Table 10: Table-Related Elements
Table 11: tblPr Child Elements (Table Properties)
Table 12: tblpPr Child Elements (Table Positioning Properties)
Table 13: trPr Child Elements (Table Row Properties)
Table 14: tcPr Child Elements (Table Cell Properties)
Table 15: listDef Child Elements (List Definitions)
Table 16: lvl Child Elements (List-Level Definitions)
Table 17: nfc Element Integer Values
|