Proposal: Convention for repeats and conditionals

Topics: Developer Forum, User Forum
Jul 15, 2010 at 5:53 AM
Edited Jul 15, 2010 at 5:59 AM

Hi

This is the text of a proposal I've just written for handling repeats and conditionals.  The original (better formatted) Word document may be found at http://dev.plutext.org/svn/docx4j/trunk/docx4j/sample-docs/databinding/conventions.docx

Feedback welcome.  I'd particularly like to see this proposal (or something like it) integrated into the Toolkit distribution.

cheers

Jason

===

Word's content control data binding provides a natural way to insert text, for example
 Dear [Click here to enter text.]

That leaves the question of how to handle things like:
• conditional inclusion of paragraphs or other units of content
• repeat (eg of list items, table rows, or other units of content)
• inclusion of other documents as well (altChunk)

Conditional Inclusion
------------------------

A content control is said to be conditional if it (and its contents) is included in or excluded from the document based on whether some condition is true or false.

Without a way to say that a content control is conditional, an XML file can't control whether paragraphs, tables etc appear in a document. 

Repeats
----------

A content control is a repeat if it designates that its contents are to be included more than once.

For example, a table row for each invoice/order item, or for each person.

Without a way to say that a content control is to be repeated, an XML file can't contain variable amounts of repetitive content. 

Problem Statement
----------------------

The problem is that the Open XML specification does not standardise how conditionals and repeats can be done, nor does Microsoft give any guidance or convention.

This is a significant limitation on document generation, which each system typically has to address.

A standardised way of doing repeats and conditionals would prevent businesses from re-inventing the wheel, and provide for enhanced interoperability.

The purpose of this document is to suggest a convention, which various tools could implement.

Suggested Convention
----------------------------

The suggested convention is to include bindingrole=conditional|repeat in the Content Control's tag. 

Putting the control information in the content control properties is a better design than putting it in the bound XML.

Processing model
---------------------

The binding role tag is preprocessed via an appropriate tool, to produce a new docx document. 

Any content controls whose bindingrole=conditional evaluated to false will be missing from this new docx document.

Any content controls which had bindingrole=repeat will have their content appear n times, where n is the number of child nodes of the XML element the control points at.

Example documents
-----------------------

An example can be found in http://dev.plutext.org/svn/docx4j/trunk/docx4j/sample-docs/databinding/

invoice.docx contains examples of conditionals and repeats, using the proposed conventions.

The custom xml used in the example is:

<invoice>
  <customer>
    <name>Joe Bloggs</name>
  </customer>

  <items>
    <item>
      <name>apples</name>
      <price>$20</price>
    </item>
    <item>
      <name>bananas</name>
      <price>$30</price>
    </item>
    <item>
      <name>cherries</name>
      <price>$40</price>
    </item>
    <total>$90</total>
  </items>
  <misc>
    <includeBankDetails>true</includeBankDetails>
  </misc>
   
</invoice>

invoice_preprocessed_OUT.xml is the result of processing invoice.docx, using the docx4j implementation of this convention.

invoice_bound_OUT.xml is the result of processing all the binding information (i.e. the equivalent of what Word does when opening invoice_preprocessed_OUT.xml).

Notice that Word 2007 can open all 3 documents, and behaves as one would expect.

bindingrole=conditional
------------------------------

The content control is excluded only if its databinding points to an XML element or attribute which has the case-insensitive value "false".
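
The rule above can be sketched in a few lines of Python. This is only an illustration, not the docx4j code: the function names resolve_value and is_included are made up for this example, it assumes un-namespaced custom XML and simple absolute XPaths like the ones in invoice.docx, and trimming whitespace around the value is my own assumption.

```python
import re
import xml.etree.ElementTree as ET


def resolve_value(root, xpath):
    """Return the text of the element or attribute that a simple absolute
    XPath points at, or None if nothing matches (hypothetical helper)."""
    steps = xpath.strip("/").split("/")
    # The first step names the document element, e.g. "invoice[1]".
    if re.sub(r"\[1\]$", "", steps[0]) != root.tag:
        return None
    rest = steps[1:]
    attr = None
    if rest and rest[-1].startswith("@"):
        attr = rest.pop()[1:]  # trailing "@name" step selects an attribute
    node = root.find("/".join(rest)) if rest else root
    if node is None:
        return None
    return node.get(attr) if attr else (node.text or "")


def is_included(root, xpath):
    """Exclude only when the bound value is "false", ignoring case.
    A missing node therefore means the content stays in."""
    value = resolve_value(root, xpath)
    return not (value is not None and value.strip().lower() == "false")


doc = ET.fromstring(
    "<invoice><misc><includeBankDetails>true</includeBankDetails></misc></invoice>"
)
print(is_included(doc, "/invoice[1]/misc/includeBankDetails"))
```

Note that anything other than "false" (including a missing element) keeps the content, which matches the "excluded only if" wording.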

If you look at invoice.docx, you'll see it contains an sdt with:

      <w:sdtPr>
        <w:tag w:val="bindingrole=conditional&amp;w:xpath=/invoice[1]/misc/includeBankDetails&amp;w:storeItemID={8b049945-9dfe-4726-9de9-cf5691e53858}" />
      </w:sdtPr>

Notice that the information which would ordinarily be included in a w:dataBinding tag is instead encoded in the tag.

This approach ensures Word 2007 behaves as expected.
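
For anyone writing their own preprocessor, splitting the tag value back into its parts is straightforward. The following is a rough Python sketch (parse_binding_tag is a hypothetical name, not something from docx4j or the Toolkit). It assumes the value has already been XML-unescaped, so the separators are plain "&" characters, and that the XPath itself contains no "&".

```python
def parse_binding_tag(tag_value):
    """Split 'key=value&key=value' into a dict (illustrative helper).

    Only the first '=' in each part separates key from value, so an
    '=' inside an XPath predicate or a prefix mapping is preserved.
    """
    fields = {}
    for part in tag_value.split("&"):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    return fields


tag = ("bindingrole=conditional"
       "&w:xpath=/invoice[1]/misc/includeBankDetails"
       "&w:storeItemID={8b049945-9dfe-4726-9de9-cf5691e53858}")
fields = parse_binding_tag(tag)
print(fields["bindingrole"], fields["w:xpath"])
```

Stripping each part also tolerates a tag value that has been wrapped over several lines for readability.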

Word Content Control Toolkit
-----------------------------------

The Content Control Toolkit is often used to set up data bindings.  This program is easy to modify, so that if an element has a w:tag with w:val containing the word "bindingrole" then any binding information is encoded as above and not as a w:dataBinding element.

bindingrole=repeat
-------------------------

invoice.docx contains the following example:

      <w:sdt>
        <w:sdtPr>
          <w:tag w:val="bindingrole=repeat
                        &amp;w:xpath=/invoice[1]/items
                        &amp;w:storeItemID={8b049945-9dfe-4726-9de9-cf5691e53858}" />
        </w:sdtPr>
        <w:sdtContent>
          <w:tr>
            <w:sdt>
              <w:sdtPr>
                <w:dataBinding w:xpath="/invoice[1]/items/item[1]/name" 
                               w:storeItemID="{8B049945-9DFE-4726-9DE9-CF5691E53858}" />
                <w:text />
              </w:sdtPr>
              <w:sdtContent>
                <w:tc>
                  <w:p>
                    <w:r>
                      <w:t>apples</w:t>
                    </w:r>
                  </w:p>
                </w:tc>
              </w:sdtContent>
            </w:sdt>
            <w:sdt>
              <w:sdtPr>
                <w:dataBinding w:xpath="/invoice[1]/items/item[1]/price" 
                               w:storeItemID="{8B049945-9DFE-4726-9DE9-CF5691E53858}" />
                <w:text />
              </w:sdtPr>
              <w:sdtContent>
                <w:tc>
                  <w:p>
                    <w:r>
                      <w:t>$20</w:t>
                    </w:r>
                  </w:p>
                </w:tc>
              </w:sdtContent>
            </w:sdt>
          </w:tr>
        </w:sdtContent>
      </w:sdt>

Here, the table row will be duplicated, once for each /invoice[1]/items/item.

When the repeat is being processed, any w:dataBinding on any child sdt will need to be altered to point at the nth item.
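
To make that rewriting step concrete, here is a small Python sketch. It is not taken from docx4j; retarget_xpath is a name invented for this example, and it assumes the child binding is a plain absolute XPath that starts with the repeat's own XPath.

```python
import re


def retarget_xpath(child_xpath, repeat_xpath, n):
    """Point a child binding at the nth repeated node (illustrative helper).

    Bindings that do not sit under the repeat's XPath are returned unchanged.
    """
    prefix = repeat_xpath.rstrip("/") + "/"
    if not child_xpath.startswith(prefix):
        return child_xpath
    rest = child_xpath[len(prefix):]
    step, sep, tail = rest.partition("/")
    name = re.sub(r"\[\d+\]$", "", step)  # drop any existing index, e.g. item[1]
    return f"{prefix}{name}[{n}]{sep}{tail}"


for n in (1, 2, 3):
    print(retarget_xpath("/invoice[1]/items/item[1]/name", "/invoice[1]/items", n))
```

One thing to keep in mind: item[n] counts only siblings named item, so n has to be the position among same-named children rather than among all child nodes.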

A more sophisticated model would be to say it is cloned once for each child node that has a certain specified name. For example, given:

  <items>
    <item>
      <name>apples</name>
      <price>$20</price>
    </item>
    <item>
      <name>bananas</name>
      <price>$30</price>
    </item>
    <item>
      <name>cherries</name>
      <price>$40</price>
    </item>
    <total>$90</total>
  </items>

bindingrole=repeat[item] could produce a row for each item, and ignore the <total> node.  Feedback is sought as to whether this flexibility is required.
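
To illustrate the difference between the two models, here is a Python sketch. The repeat[name] syntax is only the suggestion made above, and parse_repeat_role and repeat_nodes are hypothetical names; none of this is implemented anywhere as far as this proposal states.

```python
import re
import xml.etree.ElementTree as ET


def parse_repeat_role(role):
    """Return the child name from 'repeat[name]', or None for plain 'repeat'."""
    match = re.fullmatch(r"repeat(?:\[([^\]]+)\])?", role)
    if match is None:
        raise ValueError("not a repeat role: " + role)
    return match.group(1)


def repeat_nodes(parent, role):
    """Child elements that would each produce one copy of the content."""
    name = parse_repeat_role(role)
    return [child for child in parent if name is None or child.tag == name]


items = ET.fromstring(
    "<items>"
    "<item><name>apples</name><price>$20</price></item>"
    "<item><name>bananas</name><price>$30</price></item>"
    "<item><name>cherries</name><price>$40</price></item>"
    "<total>$90</total>"
    "</items>"
)
print(len(repeat_nodes(items, "repeat")), len(repeat_nodes(items, "repeat[item]")))
```

With plain repeat every child node counts, including <total>; with repeat[item] only the three item nodes do.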

Implementation
------------------

This proposed convention is implemented in docx4j v2.5.0.

Source code can be found at http://dev.plutext.org/svn/docx4j/trunk/docx4j/src/main/java/org/docx4j/openpackaging/parts/CustomXmlDataStoragePart.java

Namespace?
---------------

It is up to us to choose a namespace (databindingconventions.org?).  Really, what we need is a convention for the content of w:tag, since it's in there that we have the key bindingrole (without a namespace).

document version 0.1

Coordinator
Jul 17, 2010 at 11:20 AM

Hi Jharrop

Your proposal sounds promising. I am interested in extending the toolkit with this concept. From your proposal it sounds like you may have already modified the source to make this happen. If this is true, would you mind sending it to me for a review?

Thanks
Matt

Jul 17, 2010 at 1:28 PM

Hi Matt

I just modified WriteCCDataFromDalToXd in PkgWriter, so that if w:tag contains "bindingrole" then the binding information is written to w:tag instead of w:dataBinding.  The revised method is below.

        private void WriteCCDataFromDalToXd(XmlDocument xd, Dal dal)
        {
            // Example content control XML definition:
            //<w:sdtPr>
            //  <w:rPr>
            //    <w:rStyle w:val="Entry" /> 
            //    <w:szCs w:val="18" /> 
            //   </w:rPr>
            //   <w:alias w:val="SSN" /> 
            //   <w:tag w:val="SSN Tag" /> 
            //   <w:id w:val="4065844" /> 
            //   <w:placeholder>
            //        <w:docPart w:val="9870526EE9CB4DF49255A067C4B66787" /> 
            //    </w:placeholder>
            //   <w:dataBinding w:prefixMappings="xmlns:ns0='http://schemas.microsoft.com/office/2006/metadata/properties' xmlns:ns1='http://www.w3.org/2001/XMLSchema-instance' xmlns:ns2='5c6f7550-9fbd-464c-87de-72bbae7c0540'" w:xpath="/ns0:properties[1]/documentManagement[1]/ns2:SSN[1]" w:storeItemID="{3E0BD050-4BDB-4E91-9197-8CA144C78648}" /> 
            //   <w:text /> 
            //</w:sdtPr>

            // From the XSD:
            /*
             * Note that XPATH and STOREITEMID are required
            <xsd:complexType name="CT_DataBinding" odoc:ID="c1314638-ae46-4bb0-9fb3-ddc3a4c31946">
		        <xsd:attribute name="prefixMappings" wbld:cname="prefixMappings" type="ST_String" wbld:comment="Defines the prefix-&gt;namespace mappings for the xpath" odoc:ID="7c85af68-916d-4ffc-9586-f74c269554e7" />
		        <xsd:attribute name="xpath" type="ST_String" use="required" wbld:cname="xpath" wbld:comment="The xpath to the data source for this databinding" odoc:ID="d2af1566-869b-4b9d-a9ff-1fd2454f260f" />
		        <xsd:attribute name="storeItemID" type="ST_String" use="required" wbld:cname="storeItemID" wbld:comment="This is the itemID of the store item that we are linked to" odoc:ID="d06b6407-1c7b-46a7-a5f4-898afefb3ec7" />
	        </xsd:complexType>
            */

            // Find all content controls
            NameTable nt = new NameTable();
            XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
            string sWdNs = xd.DocumentElement.NamespaceURI;
            nsManager.AddNamespace("w", sWdNs);
            XmlNodeList xnlCC = xd.SelectNodes("//w:sdtPr", nsManager);

            XmlNode xnAlias = null;
            XmlNode xnTag = null;
            XmlNode xnId = null;
            XmlNode xnPrefixMappings = null;
            XmlNode xnStoreId = null;
            XmlNode xnXpath = null;
            string sId = null;
            XmlNode xnDataBinding = null;
            foreach (XmlNode xnCC in xnlCC)
            {
                // Get references to tags
                xnAlias = xnCC.SelectSingleNode("w:alias", nsManager);
                xnTag = xnCC.SelectSingleNode("w:tag", nsManager);
                xnId = xnCC.SelectSingleNode("w:id", nsManager);
                xnDataBinding = xnCC.SelectSingleNode("w:dataBinding", nsManager);
                if (xnDataBinding != null)
                {
                    xnPrefixMappings = xnDataBinding.Attributes.GetNamedItem("w:prefixMappings");
                    xnXpath = xnDataBinding.Attributes.GetNamedItem("w:xpath");
                    xnStoreId = xnDataBinding.Attributes.GetNamedItem("w:storeItemID");
                }

                // Get the id
                if (xnId == null || xnId.Attributes.Count == 0)
                {
                    Debug.Fail("No id on this node:", xnCC.OuterXml);
                    continue;
                }
                sId = xnId.Attributes[0].Value;


                // Find the cc in the DAL and write to the xml of the package dom
                Dal.CC cc = dal.FindContentControl(sId);
                if (cc != null) // If we found it...
                {
                    // -------- <w:tag> ----------- //
                    System.Diagnostics.Trace.WriteLine("handling Tag..");
                    if (!string.IsNullOrEmpty(cc.Tag))
                    {
                        if (xnTag == null) // If node does not exist, create it
                            xnTag = CreateAndAppendNode(xnCC, sWdNs, "w:tag", "w:val");

                        // The element should exist by now so set values
                        // Our special bindings live in Tag only
                        string tagBinding = "";
                        if (cc.Tag.Contains("bindingrole"))
                        {
                            System.Diagnostics.Trace.WriteLine("tag contains bindingrole..");

                            // tmp
                            if (string.IsNullOrEmpty(cc.XmlPartId)) cc.XmlPartId = "foo";

                            // The XSD says that the xpath and storeitemid are required attributes. 
                            // So if either one of those values are empty do not touch the databinding element
                            if (!string.IsNullOrEmpty(cc.XPath) && !string.IsNullOrEmpty(cc.XmlPartId))
                            {
                                // If we have no info for prefix mappings, dont add the attribute
                                if (string.IsNullOrEmpty(cc.PrefixMappings))
                                {
                                    tagBinding = "&w:xpath=" + cc.XPath + "&w:storeItemID=" + cc.XmlPartId;
                                }
                                else
                                {
                                    tagBinding = "&w:xpath=" + cc.XPath + "&w:storeItemID=" + cc.XmlPartId + "&w:prefixMappings=" + cc.PrefixMappings;
                                }
                            }
                            else
                            {
                                System.Diagnostics.Trace.WriteLine("XmlPartId empty?");
                            }
                        }

                        xnTag.Attributes["w:val"].Value = cc.Tag + tagBinding;
                    }
                    // -------- </w:tag> ----------- //

                    // -------- <w:alias> ----------- //
                    if (!string.IsNullOrEmpty(cc.Title))
                    {
                        if (xnAlias == null) // If node does not exist, create it
                            xnAlias = CreateAndAppendNode(xnCC, sWdNs, "w:alias", "w:val");

                        // The element should exist by now so set values
                        xnAlias.Attributes["w:val"].Value = cc.Title;
                    }
                    // -------- </w:alias> ----------- //

                                        
                    // -------- <w:dataBinding> ----------- //

                    // Clear any existing dataBinding node to initialize state
                    // Note that this fixes bug http://www.codeplex.com/dbe/WorkItem/View.aspx?WorkItemId=11562
                    if (xnDataBinding != null)
                    {
                        xnCC.RemoveChild(xnDataBinding);
                    }

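                    // cc.Tag may be null or empty (see the check above), so guard before calling Contains
                    if (string.IsNullOrEmpty(cc.Tag) || !cc.Tag.Contains("bindingrole"))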
                    {

                        // The XSD says that the xpath and storeitemid are required attributes. 
                        // So if either one of those values are empty do not touch the databinding element
                        if (!string.IsNullOrEmpty(cc.XPath) && !string.IsNullOrEmpty(cc.XmlPartId))
                        {
                            // If we have no info for prefix mappings, dont add the attribute
                            if (string.IsNullOrEmpty(cc.PrefixMappings))
                            {
                                // Start from scratch
                                xnDataBinding = CreateAndAppendNode(xnCC, sWdNs, "w:dataBinding", "w:xpath", "w:storeItemID");

                                // The element should exist by now so set values
                                xnDataBinding.Attributes["w:xpath"].Value = cc.XPath;
                                xnDataBinding.Attributes["w:storeItemID"].Value = cc.XmlPartId;
                            }
                            else
                            {
                                // Start from scratch
                                xnDataBinding = CreateAndAppendNode(xnCC, sWdNs, "w:dataBinding", "w:prefixMappings", "w:xpath", "w:storeItemID");

                                // The element should exist by now so set values
                                xnDataBinding.Attributes["w:prefixMappings"].Value = cc.PrefixMappings;
                                xnDataBinding.Attributes["w:xpath"].Value = cc.XPath;
                                xnDataBinding.Attributes["w:storeItemID"].Value = cc.XmlPartId;
                            }
                        }                    
                    }

                    // -------- </w:dataBinding> ----------- //                  



                }
                else
                {
                    Debug.Fail("Could not find cc in dal with id " + sId);
                }
            } // End for each
        }

cheers
Jason

Coordinator
Jul 18, 2010 at 2:15 PM

Thanks Jason for the quick follow up. I'll need some time to review this and consider integration.

Thank you for your contribution to Word content controls!

--Matt

Jul 22, 2010 at 4:31 PM

No worries Matt.  The change above is the core of what is required. Now that I know you are interested in integrating this, please give me a few days so I can consider what else should be done, and perhaps provide some additional changes.

cheers .. Jason

Jul 27, 2010 at 11:34 AM

Hi Matt

I've uploaded a more complete patch to http://dev.plutext.org/dbe_mods_20100727.zip

It is contributed under the Microsoft Limited Permissive License.

cheers .. Jason

Aug 25, 2010 at 12:43 PM

Oops, missing a null check in PkgLoader at line 228

if (cc.Tag != null && cc.Tag.Contains("bindingrole"))

Oct 11, 2010 at 8:36 AM
Edited Oct 12, 2010 at 5:13 AM

Please note that this proposal is now superseded by a v2, which can be found at http://dev.plutext.org/svn/docx4j/trunk/docx4j/sample-docs/databinding/conventions.html

Compared to version 1, there are three main improvements:

1. content in the sdt tag is minimised, which is desirable since Word restricts the tag content to 64 characters

2. supports conditions made up of boolean expressions

3. supports optional interactive processing (ie user gets asked questions); v1 only supported non-interactive processing

I've made a Word Add-In which can be used to set up your content controls as per this v2 proposal.  It is currently rough around the edges, but it gets the job done. You can download it from http://dev.plutext.org/opendope/setup.exe 

cheers .. Jason