
There is no doubt that the new Office Open XML file format is a huge gain for every developer and, as a consequence, for users too.
The new file format is native in Office 2007, but it can also be used in Office 2003 through a plugin, and in earlier Office releases and OpenOffice through external (free) utilities. The first gain is that the file format is an ECMA standard and, more recently, an ISO standard. This means that two committees oversee the stability and evolution of the format.
In order to read and write docx, xlsx and pptx (word processing, spreadsheet and presentation) files, there are a few things to keep in mind:
The OPC file format
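To make the container idea concrete: an OPC package is an ordinary ZIP archive whose entries are the parts (mostly XML files), plus a content-type map and relationship files. The following is a minimal sketch, not what the SDK does internally. It hand-writes the three entries a word processing package needs, using only the .NET base class library. The names `OpcSketch`, `BuildMinimalDocx` and `ListParts` are made up for this illustration.

```csharp
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, for illustration only: builds an OPC package by hand.
public static class OpcSketch
{
    private const string XmlDecl = "<?xml version='1.0' encoding='UTF-8'?>";

    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Writes one part (a ZIP entry) into the package as UTF-8 without a BOM.
    private static void AddPart(ZipArchive zip, string name, string xml)
    {
        ZipArchiveEntry entry = zip.CreateEntry(name);
        using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
        {
            writer.Write(xml);
        }
    }

    // Returns the bytes of a .docx holding a single paragraph of text.
    public static byte[] BuildMinimalDocx(string text)
    {
        // Maps extensions and part names to content types.
        const string contentTypes = XmlDecl +
            "<Types xmlns='http://schemas.openxmlformats.org/package/2006/content-types'>" +
            "<Default Extension='rels' ContentType='application/vnd.openxmlformats-package.relationships+xml'/>" +
            "<Default Extension='xml' ContentType='application/xml'/>" +
            "<Override PartName='/word/document.xml' ContentType='application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml'/>" +
            "</Types>";

        // Points from the package root to the main document part.
        const string rootRels = XmlDecl +
            "<Relationships xmlns='http://schemas.openxmlformats.org/package/2006/relationships'>" +
            "<Relationship Id='rId1' Type='http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument' Target='word/document.xml'/>" +
            "</Relationships>";

        // document > body > p > r > t; XElement escapes the text for us.
        var document = new XElement(W + "document",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            new XElement(W + "body",
                new XElement(W + "p",
                    new XElement(W + "r",
                        new XElement(W + "t", text)))));

        using (var buffer = new MemoryStream())
        {
            // The archive must be disposed before the bytes are read back,
            // because the ZIP directory is written on Dispose.
            using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, true))
            {
                AddPart(zip, "[Content_Types].xml", contentTypes);
                AddPart(zip, "_rels/.rels", rootRels);
                AddPart(zip, "word/document.xml",
                    XmlDecl + document.ToString(SaveOptions.DisableFormatting));
            }
            return buffer.ToArray();
        }
    }

    // Lists the entry names stored in a package.
    public static string[] ListParts(byte[] package)
    {
        using (var zip = new ZipArchive(new MemoryStream(package), ZipArchiveMode.Read))
        {
            return zip.Entries.Select(e => e.FullName).ToArray();
        }
    }
}
```

Saving the returned bytes with a .docx extension (for example with `File.WriteAllBytes`) should produce a file that a word processor can open, and `ListParts` shows the three entries inside it. The SDK hides all of this plumbing behind its typed classes.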
Ok, let's start and create a new Word document without using Word.
    // Namespaces used by the snippets in this article
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    using (var doc = WordprocessingDocument.Create(
        fileName, WordprocessingDocumentType.Document))
    {
        MainDocumentPart mainPart = doc.AddMainDocumentPart();
        // ....
    }

This is the minimum code you need to create the document, but obviously a few more things are needed.
Replace the comment in the snippet above with the following code to create the body of the document:

    mainPart.Document = new Document(new Body());

    Paragraph par = CreateSimpleParagraph();
    mainPart.Document.Body.Append(par);

Finally, create a paragraph with the following code:
    private Paragraph CreateSimpleParagraph()
    {
        // A paragraph contains runs, and a run contains text.
        Paragraph par = new Paragraph(new Run(
            new Text("Welcome to TechDays/WPC 2009")));
        return par;
    }

That's all! Now let's add a couple of methods that create formatted paragraphs, just to show how to add some decoration to our document.
    private Paragraph CreateFormattedParagraph()
    {
        Paragraph par = new Paragraph(new ParagraphProperties());
        par.ParagraphProperties.TextAlignment = new TextAlignment();
        par.ParagraphProperties.TextAlignment.Val = VerticalTextAlignmentValues.Top;

        par.ParagraphProperties.Justification = new Justification();
        par.ParagraphProperties.Justification.Val = JustificationValues.Right;

        par.ParagraphProperties.Indentation = new Indentation();
        par.ParagraphProperties.Indentation.FirstLine = 12;

        Run run = new Run(new Text("Welcome to TechDays/WPC 2009"));
        //run.Append(new Break() { Type = BreakValues.Page }); // optional
        par.Append(run);
        return par;
    }

    private Paragraph CreateBorderParagraph()
    {
        Paragraph par = new Paragraph(new ParagraphProperties());
        par.ParagraphProperties.ParagraphBorders = new ParagraphBorders();
        par.ParagraphProperties.ParagraphBorders.LeftBorder = new LeftBorder();
        par.ParagraphProperties.ParagraphBorders.LeftBorder.Size = 24;
        par.ParagraphProperties.ParagraphBorders.LeftBorder.Val = BorderValues.Single;
        par.ParagraphProperties.ParagraphBorders.LeftBorder.Color = "4F81BD";

        par.Append(new Run(new Text("Hello, world")));
        return par;
    }
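The numbers in these methods are not points. WordprocessingML measures indentation and paragraph spacing in twentieths of a point (twips), border widths in eighths of a point, and font sizes in half-points. The small helper below is only a sketch for doing the arithmetic. `WordUnits` and its methods are hypothetical names and are not part of the SDK.

```csharp
using System;

// Hypothetical helper, for illustration only: converts WordprocessingML
// measurement units to and from typographic points.
public static class WordUnits
{
    // Indentation and paragraph spacing use twentieths of a point (twips).
    public static double TwipsToPoints(long twips)
    {
        return twips / 20.0;
    }

    public static long PointsToTwips(double points)
    {
        return (long)Math.Round(points * 20.0);
    }

    // Border widths use eighths of a point.
    public static double EighthPointsToPoints(long eighths)
    {
        return eighths / 8.0;
    }

    // Font sizes use half-points.
    public static double HalfPointsToPoints(long halfPoints)
    {
        return halfPoints / 2.0;
    }
}
```

With these conversions, the left border `Size = 24` is 3 points wide, and `FirstLine = 12` is an indent of only 0.6 points. A half-inch indent (36 points) would be `WordUnits.PointsToTwips(36)`, that is 720.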
You may now wonder how many tags/classes are available to the developer to create a full-featured document. I can honestly say there are so many that you will need the specification on your lap, or a second monitor showing the specs.
There is a valid alternative that, trust me, you will really love: the DocumentReflector tool, which generates the SDK code for an existing document.
Isn't it wonderful? This also explains why the SDK object model is unusual. It is not strongly typed, so you are free to create invalid hierarchies, that is, invalid documents. The advantage of a one-to-one match between tags and classes is that an existing document can easily be converted into code, as DocumentReflector does.
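Since the object model lets you build invalid hierarchies, it helps to check the result before handing it to Word. The sketch below assumes a release of the SDK that ships the `DocumentFormat.OpenXml.Validation` namespace with its `OpenXmlValidator` class (the final 2.0 release and later; early preview builds may not include it). `DocxChecker` and `FindProblems` are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;

// Hypothetical helper, for illustration only.
public static class DocxChecker
{
    // Opens the package read-only and returns one line per validation error.
    public static List<string> FindProblems(string fileName)
    {
        var problems = new List<string>();
        using (WordprocessingDocument doc = WordprocessingDocument.Open(fileName, false))
        {
            var validator = new OpenXmlValidator();
            foreach (ValidationErrorInfo error in validator.Validate(doc))
            {
                // Path.XPath locates the offending element inside its part.
                string where = error.Path != null ? error.Path.XPath : "(package)";
                problems.Add(where + ": " + error.Description);
            }
        }
        return problems;
    }
}
```

An empty list means the validator found nothing to report. A run appended directly to the body, for instance, is the kind of hierarchy mistake it is meant to catch.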
Take the time to look at the SDK documentation, which is in the C:\Program Files\Open XML Format SDK\V2.0\doc folder. It is organized in two sections: a valuable "how-to" article collection and the file format documentation, which is really more powerful than the official format specifications.
I have one final example to discuss. I took the ASCII text of Julius Caesar's "De Bello Gallico" from the Internet, along with a well-known avatar of the author, and created a book-formatted document. The result is the following:
In this example, the most important innovation is the use of styles. Every word processor user should use styles instead of formatting the text directly. This lets you change the look of the document easily by changing only the styles.
Creating the styles is pretty easy. You typically define one base style and then define the others based on the first one.
    private Styles GetStyles()
    {
        Styles styles = new Styles(
            // Normal
            new Style(
                new StyleName() { Val = "Normal" },
                new PrimaryStyle()
            ) { Type = StyleValues.Paragraph, StyleId = "Normal", Default = true },

            // ParaFancy
            new Style(
                new StyleName() { Val = "ParaFancy" },
                new BasedOn() { Val = "Normal" },
                new NextParagraphStyle() { Val = "ParaFancy" },
                new PrimaryStyle(),
                new StyleParagraphProperties(
                    new Indentation() { FirstLine = 220U },
                    new Justification() { Val = JustificationValues.Both },
                    new SpacingBetweenLines() { After = (UInt32Value)300U, Line = 240, LineRule = LineSpacingRuleValues.Auto },
                    new WidowControl()
                )
            ) { Type = StyleValues.Paragraph, StyleId = "ParaFancy" },

            // Title
            new Style(
                new StyleName() { Val = "Title" },
                new BasedOn() { Val = "Normal" },
                new NextParagraphStyle() { Val = "Normal" },
                new LinkedStyle() { Val = "TitleChar" },
                new PrimaryStyle(),
                new StyleParagraphProperties(
                    new ParagraphBorders(
                        new BottomBorder() { Val = BorderValues.Single, Color = "4F81BD", Size = (UInt32Value)8U, Space = (UInt32Value)4U }),
                    new SpacingBetweenLines() { After = (UInt32Value)300U, Line = 240, LineRule = LineSpacingRuleValues.Auto },
                    new ContextualSpacing()
                ),
                new StyleRunProperties(
                    new Color() { Val = "17365D" },
                    new Spacing() { Val = 5 },
                    new Kern() { Val = (UInt32Value)28U },
                    new FontSize() { Val = (UInt32Value)52U },
                    new FontSizeComplexScript() { Val = (UInt32Value)52U })
            ) { Type = StyleValues.Paragraph, StyleId = "Title" }
        );
        return styles;
    }

Styles must be embedded in another part (an XML file) of the OPC container. This takes only two lines:
    StyleDefinitionsPart stylePart = mainPart.AddNewPart<StyleDefinitionsPart>();
    stylePart.Styles = GetStyles();

Once the styles are defined, you assign the desired style while creating the paragraph:
    private Paragraph CreateParagraph(string str, string style)
    {
        Paragraph par = new Paragraph(
            GetPropertiesForStyle(style),
            new Run(new Text(str)));
        return par;
    }

    // Builds the paragraph properties that reference a style by its StyleId.
    public ParagraphProperties GetPropertiesForStyle(string styleName)
    {
        var element =
            new ParagraphProperties(
                new ParagraphStyleId() { Val = styleName });
        return element;
    }

Formatting the first letter of each paragraph is only a matter of imagination. This is the obvious method implementation:
    private Paragraph CreateParagraphFirstLetterBold(string str, string style)
    {
        Paragraph par = new Paragraph(GetPropertiesForStyle(style));

        if (str.Length == 0)
            return par;
        if (str.Length == 1 || style != "ParaFancy")
        {
            par.Append(new Run(new Text(str)));
            return par;
        }

        string firstChar = str[0].ToString();
        string rest = str.Substring(1);

        par.Append(
            // First run: the initial letter, bold and larger
            new Run(
                new RunProperties(
                    new Bold(),
                    new FontSize() { Val = 32U }
                ),
                new Text(firstChar)
            ),
            // Second run: the rest of the text; keep a leading space if the
            // first word is a single letter
            new Run(
                new Text(rest) { Space = SpaceProcessingModeValues.Preserve })
        );

        return par;
    }

Inserting the image is quite tedious and takes longer, so I simply used the DocumentReflector tool, as you will see in the attached sample.
But there is one last, very important piece of information to include in our document: the document properties.
We live in the metadata age: metadata is fundamental to index, categorize, tag and retrieve documents, and we can add this precious information with a few lines of code.
    using (WordprocessingDocument doc = WordprocessingDocument.Create(document, WordprocessingDocumentType.Document))
    {
        MainDocumentPart mainPart = doc.AddMainDocumentPart();

        doc.PackageProperties.Creator = "Raf";
        doc.PackageProperties.Category = "Sample";
        doc.PackageProperties.Keywords = "Caesar De Bello Gallico Latin";
        doc.PackageProperties.Description = "This is a sample for TechDays/WPC 2009 conference";
        doc.PackageProperties.ContentStatus = "First draft by Raf";
        doc.PackageProperties.Subject = "Office OpenXML";
        doc.PackageProperties.Title = "DocumentProperties Sample";
        // ...
    }

Copyright (c) Raffaele Rialdi 2009, Senior Software Developer, Consultant, p.iva IT01741850992, hosted by Vevy Europe Advanced Technologies Division. Site created by Raffaele Rialdi, 2009 - 2011