YAPS: Random acts of kindness regarding Reports, Excel, XSLT, and you

You've provided some gratifying responses to my unintended continuing series on using custom XSLT to derive better Excel  or other formats for your reports than the standard stuff  you get from SS Reporting Services' Excel renderer.  With the gratifying responses come lots of questions about details and areas that I haven't covered.  So here's a round up of some of those points.

I'm going to thank Beth for most of the items in this catch-all list, but I'd also like to issue a blanket "thank-you" to the astounding number of readers who continue to be interested in this topic.  

What do you use to test the XSLT (Part 1)?

A number of people have asked about my toolset; I've talked about the commercial XML editors that I use, and recommended some others.  Most XML editors give you an opportunity to generate output by apply an XSLT transformation to an XML document, although each is different. 

For those of you who are working in Visual Studio or the BIDS version of Visual Studio that ships with SQL Server as a development tool, you have a very simple choice in your current environment without adding on any tools. 

I didn't realize this was a problem for you, until Beth wrote about it.  As she worked on her custom XSLT for a report, she was deploying the XSLT to her report server to check it out — or possibly exporting XML from within the Report Preview in Visual Studio;  I'm not really sure. 

The point is, running your report to check your transformation is the long way around, especially if your report runs lots of complex queries. 

Let's leave the RDL alone, and stop poking it, shall we?  You're finished developing the thing, you're just supplying an alternative format now.

Here's what I suggest you do:

  1. Export the XML from your report without  adding a custom XSLT, so you have your raw source file.  You needed this anyway, to figure out how the file was constructed, right?
  2. Open the exported XML as an XML file in Visual Studio.
  3. Open the Properties Window for this XML Document; you'll see something like this (in VS 2005; slightly better in VS 2008 in every way):
    Editing an XML Document in VS 2005  

  4. You see, in the Properties window, where there's a place for a Stylesheet name?  Go ahead and navigate to the one you've been working on; there's a little ellipsis for a file-picker dialog to the right, on that line, not showing in the screen shot here.  While you're at it, put an "output" filename in the appropriate row, although this is not required. 
  5. After visiting the Properties window, click back into the XML Document editor, on the left side, and wait for your change-of-focus to register.  You should see an XML context menu pad.  Open it.  Guess what happens when you use the "Show XSLT Output" item on this menu popup?

    No prizes for guessing...
  6. If you're a VS person anyway the above procedure is really all you need.  While VS isn't a great XSLT editor, it's fine for occasional use, and you can do all your work within your accustomed IDE without strain.

What do you use to test the XSLT (Part Deux)?

What if you're not a VS person?  Well, frankly, if I just want to apply an XSLT to an XML, I usually go with the command-line approach, using MSXSL.EXE.  This tiny EXE (it's under 30K) has decent error reporting, allows you to specify XSLT parameters (although you won't be doing that in an RS context), etc.  It's freely   available from Microsoft from the XML developer section of MSDN. You'll find it easily if you search for "command line transformation utility" or "msxsl.exe" in the downloads section.

MSXSL is, of course, just a quick wrapper around the MSXML core libraries, so you can also write yourself a little VBS script that does the same thing.   I've already posted an example of how to do this, and the example script referred to in that post, transformRDL.vbs, is part of the samples that accompany RDL Documenter.

MSXSL.EXE command line options

Put some PowerShell to work on top of this, if you want to!

The raw xml is the page content data, not the page header and footer data

Most banded report writers have to make a distinction between what I call the "formatting bands" and the "content bands".  "Formatting bands" are full of information that requires pagination to be applied, and therefore will be different from renderer to renderer, and different rendering instance even when done by the same renderer, because of different output page sizes.  In the case of the SQL Server Reporting Services engine's XML renderer, there is no concept of pagination, so the XML renderer only gives you the contents of the page body. 

Beth wanted to include some information from the page header in her report (possibly in the sheet tab name, or the print setup information we've just discussed).  Since the XML renderer doesn't give you that information, the usual way to handle this is to include everything you want in the page body instead.  You can make it white-on-white, so it doesn't appear in other renditions, or play other tricks to hide it. 

For what it's worth, I sometimes find myself putting the whole "Page header" into the body of the report for other reasons, and telling it to repeat on page.

Ersatz Page Header information in Report Body, using repeat on page

If you have an issue with getting something like this to work, give me a specific case, and I'll explain in more detail.

What can SSML (Excel 2003 XML Dialect) do? 
What can't it do? 
And how do you know?

Just BTW, and as a caveat, I didn't set out to become an expert in SSML, but it's a neglected art, and people keep coming to me for help in it.

It's no wonder SSML is neglected.  If you look up that acronym in Standards Land, it's already been re-used for another purpose (Speech Synthesis Markup Language).  In Microsoft Land, it's been superseded by Excel 2007's XML format, which is a bit harder to use and create, although quite a bit more capable. 

You certainly can't use XSLT with RS to get Excel 2007 as easily as you can create SSML; you'll need some more manual work than just attaching an XSLT to an RDL definition on the server.  So, perfect it isn't, but I still like Excel 2003 XML.

As a general rule, SSML can do … most things that Excel can do, except for serving as a host to "foreign objects", such as MS Chart and images.  If you want to know exactly what is included in "most things", start with the documentation.

Where in the docs do you find good information about this?  The answer will probably surprise you: try doing a Save As XML Spreadsheet 2003 in Excel 2007.  Even if you're saving a blank workbook, you'll get a warning about features that won't be included in this format.

Saving to SSML from Excel 2007

When you press help, here's what you learn:

  


XML Spreadsheet 2003


This XML Spreadsheet 2003 file format (.xml) does not retain the following features:

  • Auditing tracer arrows
  • Chart and other graphic objects
  • Chart sheets, macro sheets, dialog sheets
  • Custom views
  • Data consolidation references
  • Drawing object layers
  • Outlining and grouping features
  • Password-protected worksheet data
  • Scenarios
  • User-defined function categories
  • VBA projects

New Office Excel 2007 features, such as improved conditional formatting, are not supported in this file format. The new row and column limits of Excel 2007, however, are supported.


I know you'll be shocked to hear this…

The help file entry you see above is far more complete than the information we had about SSML in Excel 2003 documentation, but it's not complete-complete.

So how do you really know?

Most people get started writing SSML by Saving Excel As SSML and then looking at what they get.  They slavishly follow what they got from the internal "decisions" that Excel makes about how to do this translation.

This practice is okay for a while, but it  leads to a lot of extremely bad practices.

For example, Excel uses generated style names that are difficult to maintain, and tends to include many more of them than you actually have used in any Excel workbook, just because you've changed your mind a couple of times while working on it.  For goodness' sakes: figure out which you need, toss the rest out.  Then re-name them with sensible names so you know to which types of cells you should apply them when you write your XSLT.  I've put a small snippet below; you'll see that I've used a reasonable name.

The right place to learn about SSML is from its schema documentation.

SSML is an XML dialect, the schema is published, as it is with most XML dialects, to give you exactly this information. SSML's XSD is available to tell you exactly what is allowed and what is not allowed, what Excel will recognize and render and what it can't.   Save As gives you the "learn from an example" method, but you can never be sure how much in that example was a quirk and how much was required.  The XSD gives you the whole picture and the correct syntax for everything, without ambiguity.

You'll find the full set of Office 2003 XML Reference Schemas are available on the Microsoft site; if the link I've just given you is not correct, just search for that phrase on the MS site and you'll find it.  The organization of the various general and specific schema files, and the contents of the Excel-specific schema, are straightforward and clear, and they're all heavily annotated.

Office 2003 Reference Schemas

Extra tip:

The editor in the screenshot above is a free editor called First Object XML Editor.  By a wierd coincidence, its short name is FOXE!

This app has few frills but some notable benefits:

  • It loads very large files extremely fast.
  • It's a tiny download with no installation routine (it's a C++ program, I believe).
  • It handles file encodings well (including Chinese characters).
  • It searches and formats/indents large files like a dream.  My team uses it when a file has been created without extra white space. 

    Visual Studio tends to balk at the load because of "long line lengths", or crash when you ignore the warnings and try to load the large file anyway.   We load it quickly to FOXE, use the Indent feature to break the lines up, save it, and then Visual Studio will load the file.

Beth's question: a case-in-point for what's missing from the help file and what's hard to find out

If you look at the screen shot above, you'll see that I've zeroed in on the section describing the "PrintType" complex type, which Contains workbook and worksheet print options, according to the schema annotation.  Just as with "regular" binary Excel, you can specify the comments layout, whether you want draft quality, scaling, paper sizes and resolutions, etc. You can specify whether gridlines are printed, and whether columns are printed left-right or top-bottom as a large worksheet is broken into printer-page-size chunks.

Beth wanted to use Excel's facility to repeat a designated set of rows on every printed page, to use as static column headers.  This was something I had never had occasion to do.

SSML will let you freeze a set of columns or rows in the screen-display of a worksheet, just like binary Excel, and it's very easy to find the facility in the SSML schema:

             <xsd:element name="SplitHorizontal" type="xsd:int" default="0" minOccurs="0">
                <xsd:annotation>
                    <xsd:documentation>Contains the number of points from the top of the window that a worksheet is split horizontally.</xsd:documentation>
                </xsd:annotation>
            </xsd:element>
            <xsd:element name="SplitVertical" type="xsd:int" default="0" minOccurs="0">
                <xsd:annotation>
                    <xsd:documentation>Contains the number of points from the left of the window that a worksheet is split vertically.</xsd:documentation>
                </xsd:annotation>
            </xsd:element>
            <xsd:element name="FreezePanes" minOccurs="0">
                <xsd:annotation>
                    <xsd:documentation>Specifies whether the panes of a worksheet window are frozen.</xsd:documentation>
                </xsd:annotation>
            </xsd:element>
            <xsd:element name="FrozenNoSplit" minOccurs="0">
                <xsd:annotation>
                    <xsd:documentation>Specifies whether the frozen panes of a worksheet window are not split.</xsd:documentation>
                </xsd:annotation>
            </xsd:element>
            

… and SSML also gives you the opportunity to include custom content in the page header and footer; this is clearly spelled out in the XSD.  But I couldn't find what Beth was asking for, and this really had me scratching my head.  The schema contains about a zillion different settings to specify attributes of Pivot Tables. No matter how warped Microsoft priorities are, it seemed inconceivable that they would have left out this basic feature.

Well, they didn't leave it out but, it's not something you'll really figure out from the schema, either.

Sigh.  When in doubt, go back to the "learn from an example" strategy.

It turns out that this feature isn't baked into the Excel XSD because it uses one of Excel's "magic" named ranges: the one called Print_Titles.  Anything included in a named range with this magic name gets treated in a special way, and this is just as true in SSML as in binary Excel.

But the magic range names aren't called out in the annotations (where else would they be able to put a secret knowledge thing like this, from an XML perspective?!?).

The Named Range feature is, of course, thoroughly baked into the XSD.  To use it, you define a range with the magic name and tell it what rows to look at…

<Worksheet ss:Name="Sheet1">
  <Names>
   <NamedRange ss:Name="Print_Titles" ss:RefersTo="=Sheet1!R1:R3"/>
  </Names> 

… then you use the NamedCell syntax to call out the items that will actually need to be repeated by Excel. (And don't ask me why it works this way.  I don't know.  That is one of the dangers of learning from examples, remember?)

 <Table>  
   <Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">My major title line</Data><NamedCell
      
ss:Name="Print_Titles"/></Cell>
   </Row>
   <Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">My minor title line</Data><NamedCell
      
ss:Name="Print_Titles"/></Cell>
   </Row>
   <Row ss:AutoFitHeight="0" >
    <Cell ss:StyleID="sHeaderLeft"><NamedCell ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeader"><NamedCell ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeader"><NamedCell ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeader"><NamedCell ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeader"><Data ss:Type="String">Address</Data><<NamedCell ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeader"><Data ss:Type="String">Invoice To</Data><NamedCell
      
ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeader"><Data ss:Type="String">$</Data><NamedCell
      
ss:Name="Print_Titles"/></Cell>
    <Cell ss:StyleID="sHeaderRight"><Data ss:Type="String">Notes</Data><NamedCell
      
ss:Name="Print_Titles"/></Cell>
   </Row>

One more thing:

When you're trying to find a feature like print setup in the Excel schema, remember to distinguish between an item that is specific to a Worksheet and an item that is global to a Workbook.  These items will be in different places in the schema. 

Print_Titles happens to be a "sheet thing", as my example above probably shows.  However, if I am not mistaken, Excel ranges used for other purposes can be global to the book as well.

What have I forgotten?

There is no end to the wonder and glory that is Excel.

Here's an apology if I have not addressed your question in this random list.  Nudge me e-mail or something and I'll top it up.