YAPS on Dynamic PageBreaks: Mixing up the Groups

Karthik asks

My actual task is to display totals of each group at the end of page breaks which are further grouped on count of 1000. I don’t know how to proceed with this issue. I have also queried the same at MSDN forums at the following links but haven’t been getting any responses at all.

social.technet.microsoft.com/…/b3ea0253-bb5a-463d-82e8-4e9d1d9eed3a

Could you please provide any tips/suggestions on the way to deal with this task?

This is actually a tough one, but luckily I brought grenades.  Today’s Karthik’s day in TechSpoken Court…

Why is this so hard?

Let’s  quickly recap the problem.  I will demonstrate Karthik’s scenario using the City table in my standard SqlWorld sample db. 

SELECT id, countrycode, district, name, LEFT(name,1) AS NameInitial, population
FROM city
ORDER BYcountrycode,district,name

Karthik has three “real” groupings and aggregation levels in his report, which we will model using CountryCode, District, and the first initial of the city name (that’s why the ORDER BY clause).

Typically, you put a conditional or dynamic page break instruction on an outer group.  In this case Karthik has used CEILING(RowNumber(NOTHING)/1000) as his outermost group instruction. 

When an outer group breaks, each inner group breaks too; they are “child” groups, or nested aggregates.  Karthik’s problem occurs when one of his “real” groups has more than 1000 rows.  Karthik doesn’t want to see his sums or other aggregates until the “real” groups are actually finished.

Since my sample doesn’t really have more than 1000 rows in any one set, and to test properly, I’ve made the “Break on” figure configurable, with a default value of 3 rows.  I’ve illustrated the problem like this:

 

The AGO country code grouping that starts on page 2 doesn’t finish on page 2.  Therefore, even though a group break is forced by the outer group every 3 detail rows, we shouldn’t have totals for AGO on page 2.

The report is doing what we said, but it isn’t doing what we meant.

What not to try

I first tried to solve this problem using a “remove the groups” strategy, which I’ve used successfully in other scenarios before.  It quickly turned into a nightmare.  I’ll cut to the chase and say… don’t bother.

Inner dynamic group breaks

I realized that the right thing to do was to move the dynamic break group inside the ones that might have more members and shouldn’t show aggregates until they legitimately ended.  This is the right solution, but it turned out to be a little more complicated than expected.

Before I explain, the usual caveats.  In this case, there are two things to keep in mind:

1. My analysis may be better than my solution. Read what’s involved, because I’ve done this part right. Maybe you’ll think of a better way to handle what needs to be done.

2. I’m only moving one level of grouping inwards, because that’s all that’s necessary to show you what’s involved — even though, as you’ll see, the District group could also “overflow” the test break value of three rows that I’m using in these screen shots is so small. However, if Karthik or you have more than one additional level that could have more rows than the explicit page breaks you want to create, it doesn’t change the problem. Whatever solution you use — whether mine or your improved version — the thing-to-fix, and how it works, remain exactly the same.

The unkindest cut

We start, as explained  by moving the outermost group with the special “CEILING” expression one level inward.  You can try taking Karthik’s expression and just moving it, as-is, if you want to play along.  Here’s the error you’ll get:

A group expression for the grouping ‘Standard Page Break’ uses the RowNumber function with a scope that is not valid.  When used in a group expression, the value of the scope parameter of RowNumber must equal the name of the group directly containing the current group.

Ouch. Note: this is the error message I get in RS 2008 using the Report Designer; it might be slightly less explicit using ReportBuilder or RS 2005, but I imagine the engine rules are the same.

You can try to do the obvious thing and use the scope suggested in this error message (in my case, that would be RowNumber(“table1_CountryCode”)).  This may seem to work for you — for me, it didn’t have the desired results.   Maybe I should have perservered, but I think that different sort orders, on different levels, may interfere. So, instead, I thought about how I could mimic the desired behavior of RowNumber(Nothing) without actually using the forbidden syntax.

The second cut

I did a number of things with Code functions, counting rows myself, which seemed to sort-of-work.  But the behavior seemed a bit flakey, especially when rendering to HTML and flipping back and forth between pages. 

In the end, I took the more reliable route of creating my own row counter in SQL, like this: 

SELECT
Name,CountryCode, District, Population, ID,
ROW_NUMBER() ORDER BY CountryCode,District,Name)
FROM City
ORDER BY CountryCode,District,Name

— the new group break expression is:
— =CEILING(Fields!RowNum.Value/Parameters!BreakOn.Value)

… and that sort-of worked. I mean, it worked for most pages.

But now I had a new problem. On pages where the CountryCode (outer) group break and the pagebreak limit (Parameters!BreakOn.Value) coincided exactly, no page break occurred.  Apparently, the fact that the outer group said “no page break” countermanded the inner group’s explicit page break instructions. 

This behavior was consistent in different renderers, and seemed to have nothing to do with margins, sort orders, or anything else I could tweak.  So, clearly, more effort was required.

Giving myself a break

By the time I figured out the pattern of the mysterious pages that didn’t work as expected, I wanted to solve this in a very literal-minded fashion and get it over with already.  I already had a way to say “do a page breaks without unwarranted group header and footers”, that was inner.  But now I needed to say “Also break at the same time as group headers and footers are warranted”.  So, I realized I needed a second way to break — outermost, where it would be the highest-level authority on whether a page break was called for.

 

What would be the proper group break expression for the new group, though?

I knew I wanted to to give the group these instructions: “Only break when (a) you’ve hit the right number of rows and (b) you’ve also hit the CountryCode group break”. 

Again, thinking literally, I knew that the second condition could be satisfied by checking the current CountryCode against the previous row’s CountryCode.  But, in using the Previous function, I ran into similar limitations in group expressions as I’d faced with the RowNumber() function:

A group expression includes the aggregate function ‘Previous’.  Previous cannot be used in group expressions.

Ouch squared.

I tried a lot of ways to get around this expression, as I did earlier (because I am stubborn), by writing code to compare the values myself, but in the end I took the somewhat more controllable path of adding the value directly into my SQL query, like this:

SELECT curr.*, prev.CountryCode AS PrevCountryCode
FROM 
  
(
    S
ELECT TOP 100 PERCENT ROW_NUMBER() Over  
      (ORDER BY CountryCode,District,Name) AS RowNum, 
    RTRIM(Name) AS Name,CountryCode, District, Population, ID  
    FROM City 
    ORDER BY countrycode,district,name
   ) curr
LEFT JOIN
   (
   SELECT TOP 100 PERCENT ROW_NUMBER() Over
     (ORDER BY CountryCode,District,Name) AS RowNum, CountryCode 
   FROM City  
   ORDER BY countrycode,district,name
   ) prev
ON curr.RowNum1 = prev. RowNum

… and, while it may be possible to handle the resulting evaluation in a direct group break expression, I found it easiest to set up a function to do it, like this:

Ignore the Intellisense failure (annoying isn’t it?) which just indicates that Intellisense isn’t very… intelligent… about custom code. 

And you can ignore the ShowIssue parameter, which I only needed to show you the problem that occurs when this break behavior isn’t included in my sample report.

The report script looks like this (as you can see it’s very simple and literal-minded):

Dim CurrBreakNum As Integer = 0  
Dim CurrCeiling As Integer = 0  
 
Function GetBreak( _  
   ByVal showIssue As Boolean, ByVal thisCeiling As Integer , _  
   ByVal currCode As String, ByVal prevCode As String) As Integer
 
   If showIssue Then

      ‘ never break, we want to show the issue


   ElseIf
 prevCode Is Nothing OrElse Len(prevCode) = 0 Then

      ‘ first row, there is no previous value

      CurrBreakNum = 1

   ElseIf
(CurrCeiling <> thisCeiling) AndAlso (currCode <> prevCode) Then

      ‘ if we’ve hit the special situation where
      ‘ a break is called for and the “outer” group
      ‘ breaks on the same row, we have to force 
      ‘ an external break…
      CurrBreakNum += 1

   End
If

   CurrCeiling = thisCeiling

   Return CurrBreakNum  

End Function

See?

Simple, literal-minded, and rather brute-force-ish.  And of course I tried all kinds of ways to be clever first.

But… when you think about it, you’ll see that this approach makes sense and fits “what’s really going on” in the engine. And, more importantly, it works.

24 thoughts on “YAPS on Dynamic PageBreaks: Mixing up the Groups

  1. Hi Lisa,

    First of all thanks a lot for this wonderful and highly informative post. After today’s discussion ,re-working on the solution with the same code I was able to successfully insert the page breaks in the report. I don’t know why it created problems yesterday in the studio. I wish it could have worked yesterday and the misunderstanding could have been avoided. Am sorry again for that. I should have first been on record. 🙂 Here are few things that I have observed.

    (1) Page break was occurring for a group having less than 1000 records. For example, in the Customer Type group ‘Electronic Invoicing’ is appearing on two pages having total records of 41. And in these two pages, the totals displayed were for the page level i.e. for 21 & 20 records respectively.

    (2) Page breaks were also occurring for every change in Customer Type. With a total of 2800 records it should have been total 3 pages whereas I am currently seeing 6 pages. Type1(3 pgs), Type2(1 pg), Type3(2 pgs) and Total Sum in the last page along with the type3 in last page.

    (3) Exporting such report into excel I see total of 9 sheets, the behavior of which I am not able to understand.

    What I have done –

    With the new dataset query of prev and current, first I tested with only the inner grouping with the group expression for page break – =CEILING(Fields!RowNum.Value/1000). Doing only this I was getting the totals as desired for all the groups but the issue (3) persists and I am getting 5 sheets.

    I have then added the OuterGroup with the Code.GetBreak function as Group expression and having only ‘Page break at End’ checkbox value ticked. I hope that is what you have meant when you say – “proper group break expression” and the above issues have been observed.

    Firstly please do let me know if I have gone overboard expressing these observations and I shall try to see on how to resolve taking one at a time. I am also trying in the meantime though.

    Secondly, as explained my real issue is to deal with exporting report data which has more than 65,536 records. On searching we realized the only solution to this is by having a page break expression. In our current environment though we are having only 2800 records. Doing this we saw that the totals were accumulating for individual pages rather than the group as a whole (the essence of the problem). Resolving this we wanted to mimic the same behavior with a page break of 65000.

    Thanks again.

  2. Hi Karthik,

    If you’re getting page breaks where you should not have them, then there is something wrong with the way you’re setting the page breaks. I may have missed a step in this walkthrough/explanation but I don’t think so… or you may have missed a step in the implementation.

    * or * — and this is possible — the problem may be that I was testing in 2008 and you were using 2005. If this is so, then the behavior I’m seeing, where the outer group countermanded the instructions of the StandardGroupBreak, might not be occuring in 2005, and you just don’t need the extra outer group break at all. If that’s the case, you should be able to get exactly what you want without the extra complexity, problem solved.

    I haven’t had time to test in both environments, and didn’t think this was a behavior that would be new in 2008. I don’t have 2005 available to me right now. It would be interesting to see this and I could try next weekend.

    The behavior of Excel makes perfect sense; it will break in every case where an explicit page break is called for, regardless of what that happens to be. Again, I did not see this, but I was working in 2008. It is absolutely true that the Excel renderer changed a lot in 2008 (and is slated to change again with R2).

    Where the real problem is with a particular rendition, the most important thing is that the desired renderer behaves as you want; the rest of the versions of the output really don’t matter. So let’s look at the proper ways to fix your real issue!

    1 – Use XML, not Excel, and use XSLT to get the Excel you want as SSML. This is something I’ve discussed in a number of other posts. Get your page breaks wherever you want, and noplace else, plus name your tabs how you want, etc, etc.

    2 – Use XML and write some code to export to Excel 2007 where the restriction to 65k records will not exist.

    3 – Accept the limitations of Excel and tell your users that this report requires some type of limitation of the contents (I’m not sure if this is possible with your data or your user set, but if they insist on using Excel 2003, these are the limitations). Instead of experimenting with breaks on 1000, ask them what they would prefer, given this limitation.

    You may be surprised at the answer, and it may radically change your implementation. For example, they may say they would prefer one sheet (because of the use they are making of the Excel output), no matter what. Offer them a TOP N parameter in this case, with an appropriate ORDER BY (TOP N does work with a variable in 2005, right? If not, you’ll need a little dynamic SQL).

    With this kind of response, usually users are intent on analysis with Excel, so you can also suggest that they use CSV instead. Never mind the fancy formatting. (
    I don’t know if you’re using ReportViewer or ReportManager or grabbing the report output via code or a schedule, but you *can* hide Excel as an output type if you need to. )

    Alternatively, they might say “We didn’t realize that was a problem, let’s have a page break on every CustomerType group”, in which case you’ll know what to do ;-). You can expose this as a separate report for them, “Excel-Centric”.

    You may have already taken this route, but I want to make sure that you and everybody else who might be reading understand that you should not leave users out of the discussion, just in case.

    IMHO a page break of 65k is not really a useful resolution for most user scenarios, and it’s best to let them think about how they will use the data, before deciding for them. It’s important to understand *why* users export to Excel, not just make sure it doesn’t break — just as it’s important for me to understand *why* you have your requirement before answering your question.

    >L<

  3. Hey Karthik,

    It turns out to be a lot easier in 2005 — and works exactly the way I expected it should. Sigh. I will write a new post and explain.

  4. >l< thats really helped me, ive been stuck with a similar problem to Lisa but id nearly pulled all the hair from my head trying to work around it. Ive got some new ideas now, thankyou 🙂

  5. Yes, I agree that it changes in the year 2008. But unlike all changes it is not for the better. It is harder to understand and use unlike way back in 2005.

  6. Nancy O:
    >>Does this mean that it’s not a problem when you use Excel 2007? Thanks!

    Yes, the number of rows and columns are greatly increased in Excel 2007. Hope this helps.

  7. On searching we realized the only solution to this is by having a page break expression. In our current environment though we are having only 2800 records. Doing this we saw that the totals were accumulating for individual pages rather than the group as a whole (the essence of the problem).

  8. I never found an answer on MSDN forum when I really need a help, unfortunately don`t know the answer on your question but thing the best way of finding them is keep trying to solve it by yourself.

  9. thanks Lisa! I’m just now learning SQL after far too long so it’s good to see some useful techniques that can be used

Leave a Reply

Your email address will not be published. Required fields are marked *