Generate Microsoft Word docx Documents in BizTalk

Tuesday, July 5, 2011

In this post I’ll discuss how to generate Word 2007 documents natively from BizTalk 2006 using the Office Open XML System.IO.Packaging API recently released by the Microsoft Office Team under .NET 3.0.


Unless you’ve lived under a rock during the last year, you’ll know that the Office Open XML (OOXML) format is the new XML format for the Office 2007 suite, namely Word, Excel and PowerPoint. OOXML uses a file package conforming to the Open Packaging Convention and contains a number of individual files that form the basis of the document; the package is then zipped to reduce the overall size of the resulting file (either a .docx, .xlsx or .pptx).

Generating Word Documents – Overview

Generating a Word document is relatively simple and only requires a custom send pipeline component that generates our OOXML package.

In this post I will be using a Sales Report scenario, generating a Word document from the output of a fictional ERP system; to that end, I’ll also be mapping from a fictional sales summary XML message to the required OOXML format before generating the final .docx. The final document will look something like the following (note that the areas in red will be replaced with content from our ERP sales summary message):

Before we start, I need to present a quick crash-course in the structure of OOXML packages. A minimal OOXML WordprocessingML document contains three parts: a part that defines the main document body, usually called document.xml; a part detailing the Content Types (which indicates to the consumer what type of content can be expected in the package); and a Relationships part (which ties the document parts and Content Types together). When using the System.IO.Packaging API we only need to concern ourselves with the main document body – the API takes care of creating the Content Types and Relationship parts. It’s this feature of the API that allows us to create Word documents in BizTalk – all we need to do is create the XML for the main document and squirt it at a custom pipeline component which does the packaging stuff for us using the API.
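To see what the API puts into a package, here is a minimal sketch that opens an existing .docx stream and lists its parts and package-level relationships. PackageInspector and Describe are hypothetical names used only for illustration; the calls themselves (Package.Open, GetParts, GetRelationships) come from System.IO.Packaging in WindowsBase.dll.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Packaging;

// Hypothetical helper (not part of the sample solution).
public static class PackageInspector
{
    // Returns one line per part and per package-level relationship.
    // The stream must be readable and seekable.
    public static List<string> Describe(Stream docx)
    {
        List<string> lines = new List<string>();
        using (Package pkg = Package.Open(docx, FileMode.Open, FileAccess.Read))
        {
            foreach (PackagePart part in pkg.GetParts())
            {
                // ContentType is what the API recorded for this part.
                lines.Add("part " + part.Uri + " [" + part.ContentType + "]");
            }
            foreach (PackageRelationship rel in pkg.GetRelationships())
            {
                lines.Add("relationship " + rel.RelationshipType + " -> " + rel.TargetUri);
            }
        }
        return lines;
    }
}
```

Running this against any Word 2007 document should show the main document part alongside whatever other parts Word added, which is a quick way to check a generated package.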

Note that the structure of an OOXML document is outside of the scope of this post (but a good understanding is fundamental when working with these documents) and I would recommend that you read the excellent Open Xml Markup Explained by Wouter van Vugt.

Generating Word Documents – The ‘Main’ Document

The main document body (i.e. document.xml) is the only part that is generated in the BizTalk solution. We don’t actually create a file called document.xml – the packaging API does this for us – instead we simply create a message that conforms to the OOXML schema and pass this into the custom Send pipeline.

In our scenario, we are generating a Sales Report document for distribution to the finance department – we will receive an Xml sales summary document from our fictional ERP system that resembles the following:
<?xml version="1.0" encoding="utf-8"?>
<ns0:SalesReport xmlns:ns0="">
<Author>Nick Heppleston</Author>
<SalesStart>10th January 2008</SalesStart>
<SalesEnd>17th January 2008</SalesEnd>

which needs to be mapped into our OOXML main document body message (I think the layout of the OOXML message is pretty self-explanatory, however I would point you at Open Xml Markup Explained if you’re after a more detailed explanation):
<?xml version="1.0" encoding="utf-8" ?>
<w:document xmlns:w="">
<w:b />
<w:sz w:val="52" />
<w:rFonts w:ascii="Cambria" />
<w:t xml:space="preserve">Sales Summary for: </w:t>
<w:t>Nick Heppleston</w:t>
<w:i />
<w:sz w:val="52" />
<w:rFonts w:ascii="Cambria" />
<w:spacing w:val="15" />
<w:color w:val="48FDB2" />
<w:t xml:space="preserve">Sales from: </w:t>
<w:t>10th January 2008</w:t>
<w:t xml:space="preserve"> to </w:t>
<w:t>17th January 2008</w:t>
<w:t xml:space="preserve"> - </w:t>
<w:t xml:space="preserve">Contact: </w:t>
<w:t>Nick Heppleston</w:t>
<w:t xml:space="preserve"> | </w:t>

This transformation can be performed anywhere: in the sample solution I’ve put the map on the Receive Port. Also, because I can’t think of any way to generate this type of message using a standard BizTalk Map – how do I graphically say ‘map from this source node to this destination node’ when all of the destination nodes simply repeat themselves – I am using custom XSLT to drive the map.
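To make the repeating destination structure concrete, here is a rough C# sketch (not the XSLT used in the sample solution) that writes a main document body with XmlWriter. The wrapper elements w:body, w:p, w:r and w:rPr and the WordprocessingML namespace URI are taken from the OOXML specification rather than from this post, and MainDocumentBuilder, Build and WriteRun are hypothetical names.

```csharp
using System.Text;
using System.Xml;

// Hypothetical helper illustrating the paragraph/run structure.
public static class MainDocumentBuilder
{
    // Assumption: the standard WordprocessingML namespace from the OOXML spec.
    const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    const string XmlNs = "http://www.w3.org/XML/1998/namespace";

    // Writes one run: optional bold formatting, then the text itself.
    static void WriteRun(XmlWriter xw, string text, bool bold)
    {
        xw.WriteStartElement("w", "r", W);
        if (bold)
        {
            xw.WriteStartElement("w", "rPr", W);
            xw.WriteStartElement("w", "b", W);
            xw.WriteEndElement(); // w:b
            xw.WriteEndElement(); // w:rPr
        }
        xw.WriteStartElement("w", "t", W);
        // Keep leading and trailing spaces in the run text.
        xw.WriteAttributeString("xml", "space", XmlNs, "preserve");
        xw.WriteString(text);
        xw.WriteEndElement(); // w:t
        xw.WriteEndElement(); // w:r
    }

    public static string Build(string author, string salesStart, string salesEnd)
    {
        StringBuilder sb = new StringBuilder();
        using (XmlWriter xw = XmlWriter.Create(sb))
        {
            xw.WriteStartElement("w", "document", W);
            xw.WriteStartElement("w", "body", W);

            // Title paragraph
            xw.WriteStartElement("w", "p", W);
            WriteRun(xw, "Sales Summary for: ", true);
            WriteRun(xw, author, true);
            xw.WriteEndElement(); // w:p

            // Date range paragraph
            xw.WriteStartElement("w", "p", W);
            WriteRun(xw, "Sales from: ", false);
            WriteRun(xw, salesStart, false);
            WriteRun(xw, " to ", false);
            WriteRun(xw, salesEnd, false);
            xw.WriteEndElement(); // w:p

            xw.WriteEndElement(); // w:body
            xw.WriteEndElement(); // w:document
        }
        return sb.ToString();
    }
}
```

Every value from the source message ends up in the same kind of w:t element, which is why a graphical map has nothing distinct to link to and a template-style approach such as XSLT is the easier fit.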

Note: I’ve yet to find a satisfactory XSD for the WordprocessingML markup so the solution contains an OOXML schema that was automagically generated from the above destination format. I’m working on sourcing the schema – I have a number of ‘feelers’ out with the Office Team and I hope to be able to provide a reference in the next couple of days.

With our Sales Summary message now mapped and in the necessary OOXML format, we can send it to the custom pipeline / pipeline component for it to do its work and generate our .docx package.

Generating Word Documents – The Custom Pipeline Component

The custom pipeline component is relatively simple. It uses the System.IO.Packaging API introduced in .NET 3.0, which can be found in WindowsBase.dll (C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\windowsbase.dll); full documentation for this namespace can be found online at MSDN. The API is invoked in the pipeline component Execute() method as follows:

public IBaseMessage Execute(IPipelineContext pc, IBaseMessage inmsg)
{
    XmlDocument InputXmlDocument = new XmlDocument();
    InputXmlDocument.XmlResolver = null;

    // Define bodypart instances
    IBaseMessagePart bodyPart = inmsg.BodyPart;

    // Define stream instances
    Stream originalStream = null;
    MemoryStream odfStream = new MemoryStream();

    string docContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
    string docRelationshipType = "";

    if (null != bodyPart)
    {
        // Get a *copy* of the original stream
        originalStream = bodyPart.Data;

        // Check that the original stream is not null
        if (null != originalStream)
        {
            // Load the original message stream into our input xml document
            // to be used as the basis of the OOXML document.
            InputXmlDocument.Load(originalStream);

            try
            {
                // Create a new OOXML package
                Package pkg = Package.Open(odfStream, FileMode.Create, FileAccess.ReadWrite);

                // Create a Uri for the document part
                Uri docPartUri = new Uri("/word/document.xml", UriKind.Relative);
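
The listing stops before the document part is created and written. Below is a minimal sketch of how those remaining packaging steps might look using only System.IO.Packaging; it is not the original component code. DocxPackager and Build are hypothetical names, and the relationship type URI is the standard officeDocument value from the OOXML specification, assumed here because the post does not show the one it used.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml;

// Hypothetical helper covering only the packaging steps.
public static class DocxPackager
{
    // Content type of the main document part, as given in the post.
    const string DocContentType =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";

    // Assumption: standard officeDocument relationship type from the OOXML spec.
    const string DocRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Wraps the main document XML in an OOXML package and returns the .docx bytes.
    public static byte[] Build(XmlDocument mainDocument)
    {
        MemoryStream output = new MemoryStream();
        using (Package pkg = Package.Open(output, FileMode.Create, FileAccess.ReadWrite))
        {
            Uri docPartUri = new Uri("/word/document.xml", UriKind.Relative);

            // Create the main document part and write the XML into it.
            PackagePart docPart = pkg.CreatePart(docPartUri, DocContentType);
            using (Stream partStream = docPart.GetStream(FileMode.Create, FileAccess.Write))
            {
                mainDocument.Save(partStream);
            }

            // Tell consumers which part is the main document.
            pkg.CreateRelationship(docPartUri, TargetMode.Internal, DocRelationshipType, "rId1");
        } // Disposing the package flushes the zip structure to the stream.

        // ToArray() returns everything written to the MemoryStream.
        return output.ToArray();
    }
}
```

In a pipeline component, the returned bytes would presumably be wrapped in a new MemoryStream and assigned back to the message body part before returning the message from Execute().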


1 comment:

  1. Do you have the complete code for the pipeline component?

