Writing Dynamic SQL in and for RDLs

I know, I know, it's a boring title. But it's a big, exciting, and important subject — and I've now promised people in three different RS threads that I would write about it.

So here goes.

Why do it?

There are a lot of reasons to bind some querying logic very tightly to the report it runs. I said "some".  I know there are a lot of reasons, and a lot of cases, where it's a better idea to bind it tightly to the database instead. But consider these possibilities:

  • Report designer personnel may not have direct access to all the source databases.  The report designer is given a model or a set of views to which the report will have access, and that's it.  It's possible to do a great deal of manipulation on top of those views, but s/he has to put that fancy code somewhere.
  • The fancy report logic may be very tightly coupled with some expressions and behavior (say, visibility or aggregation) that only affect one report.  In such a case it is often better to bind this logic to the RDL, knowing that it is not exposed elsewhere. If the report requirements change you can guarantee that you can do it without side effects in anybody else's reports or database behavior.
  • Building some of the dynamic logic for a reporting query can make good use of VB functionality not found natively in SQL.  (You'll find a typical example in this post, by the way.) Sure, you can write a Split function in T-SQL (I have, everybody has) and sure you can allow CLR integrated code in SQL Server 2005.  But what if one of your sources isn't SQL Server 2005?

How can it be safe?

The first thing that everyone says when you start talking about writing dynamic sql is "security risk — SQL injection".  And everyone is right to be concerned. 

In the RDL context, very often the submissions from the user come from a strictly controlled (by you, the developer) set of options, displayed in radio buttons or drop-down controls; a user isn't allowed to submit straight text. 

Sometimes, yes, a text parameter is used as a filter. Or programmatic access via URL Access or SOAP permits the submission of values that are not so strictly controlled. But I think you'll see that the way I'm constructing the SQL (and the way the parameters submitted are deconstructed in the process)  it is extremely easy to test the contents of the user-submitted text in ways that obviate the possibility of SQL injection. 

If you think otherwise when you see what I'm doing with the parameter values herein, please speak up and provide an example of how straight SQL could be injected and actually executed. I'll want to refine the way I'm testing the submissions, of course.

In the meantime, I'll also point out that I use this technique on intranet sites with integrated Windows authentication.  Report access is typically very tightly controlled (in fact, in some cases the requests are being submitted only by other applications that we've written). If somebody is on these intranets with appropriate network access to submit the parameters, s/he can do a lot more damage elsewhere already.

If you did (say) have an internet-accessible site that (for example) permits a user to send you a list of fields to be SELECTed or an ORDER BY clause as text parameters, so that you were just stringing the clauses together, you could still validate what you got. You could still add tests for keywords within the user-submitted text, and you could still restrict the allowable fields and values within the text to a known set of column names and expressions.  It is still possible to do this safely.

How do I do it?

This is the "walkthrough" part.

Let's take a simple case: You have a report query that shows orders, and one column represents a status value.  Your users would like to filter the list by status value.  Of course you could add a filter to the data region, but then you've pulled all the orders and status values across the network when you only need some of them.  It would be better to put the filter on the query, wouldn't it?  This filter value is optional; they can still show all orders if they want to.

The original query looked like this:

SELECT OrderNo, OrderStatus FROM MyOrdersTable

You build a dropdown showing the available order status values, and you also include the value 'Any'. (If the dropdown values are query-driven, use a UNION with the explicit 'Any' value.)

You want a query that looks like the above, if 'Any' is chosen, but you want a WHERE clause if a status value is specified. 

As a first step, realize that the query in the little Dataset-defining window can be an expression instead of a literal.  The following expression is a valid representation of the one you have already written:

="SELECT OrderNo, OrderStatus FROM MyOrdersTable "

Therefore, it follows that you can write a more complex expression with some VB functionality defining the string, such as this:

="SELECT OrderNo, OrderStatus FROM MyOrdersTable "  &
   IIF(Parameters!SelectedStatus.Value = "Any", " ",
       " WHERE OrderStatus =  '" & Parameters!SelectedStatus.Value  & "' " )

So there's your first dynamic SQL statement.  Notice that we need to delimit the value of the parameter, which will be compared as a literal against the column in question, as we build up the string.

Now suppose you need to change your SelectedStatus parameter to be multi-valued.  You probably want an IN list in the SQL result.

As you may know, and if you don't I've spelled it out in another post about report parameters, multi-valued params are an array of strings.  In that particular post we were receiving the multiple values as a comma-delimited string, and we used the Split function to separate out the values into the string array. 

In this scenario, we have to do the opposite: we have the string array and we need to concatenate that set of values together so it can be read in the resulting SQL.  We use the Join function to do that.  The function allows us to add whatever delimiter(s) we want, and in this case we'll add delimiter quotation marks for each literal value along with the commas that will separate them in the IN list.  Like so:

="SELECT OrderNo, OrderStatus FROM MyOrdersTable "  &
   IIF(Parameters!SelectedStatus.Value = "Any", " ",
       " WHERE OrderStatus IN  ('" & Join(Parameters!SelectedStatus.Value,"','")  & "') " )

I've marked the delimiter values in red and yellow so you don't miss them. Be sure you don't add any delimiting spaces that you don't want in the literal values that will appear in the SQL result.

I've included this particular example because it provides more types of confusion than just about any other form of RS-Dynamic SQL on the forums. Reference this thread.

Now you're getting the idea, right?  You can see that you might also want to add a  Switch or Choose instead of the IIF() to handle the possibility that Parameters!SelectedStatus.Count = 1, if you tested and found that a straight = comparison performed better than the IN clause.

How do I do it more flexibly?  

Sooner or later, you are going to get fed up doing this kind of work in the little Dataset definition window.  It will be "sooner" if you have lots of parameters for each report and you run out of room (that little window has a limit).  It will be "eventually" when you realize how hard those statements are to maintain, especially if you need to do lots of tests for each clause before you decide what the dynamic SQL statement should be.

At that point, you're ready to do it properly.  You can start by copying that long statement you wrote to Visual Notepad or someplace else ultra-safe, so you don't lose it.

Now replace the statement with a call to a code function, like this, which you'll write to return the SQL string:

 =Code.SQLStatementForThisDataSet(Parameters)

Notice I've passed the report parameters object to my function (and no, you don't have to call it  SQLStatementForThisDataSet, sheesh, and yes, you can pass the name of the dataset if you want one function to define more than one SQL statement).  Since we were evaluating the parameters to create our SQL statement before, it stands to reason we're going to need them now.

Start writing your function by going into the Report Properties window (from the Report menu popup ) and choosing the Code tab.  You'll need a function skeleton that looks something like this:

Public Function SQLStatementForThisDataSet(ByVal p as Parameters) As String
  Return ""
End Function

Now things start getting interesting. I bet you have some idea of what we do in here, right?  But I'll give you some concrete examples. 

Define "interesting". 

You can write really smart, generic code in here. For example, I happen to have a lot of reports that feature the ability to create "start date" and "end dates" as custom periods, with an override if they just want a "month to date" report. In some cases there is more than one date pair to be considered.  Typically my code for such a report will look something like this:

Public  testText As String =""
' this item will be explained below! 

Public Sub PrepareSQLWithDateStringParam(ByVal v AS String,  ByVal n AS String, ByVal sb As System.Text.StringBuilder, ByVal start As Boolean)
    If Len(v) > 0 AndAlso IsDate(v) Then
       If sb.Length > 0 Then
          sb.Append(" AND ")
       End If
       If start then
          sb.Append(n & ">= '" & v & "' ")
       Else
          sb.Append(n & "<='" & v & "' ")
       End If
   End If
End Sub

Public Function SQLStatementForThisDataSet(ByVal p as Parameters) As String
  Dim sb As  System.Text.StringBuilder = New System.Text.StringBuilder()
  Dim dummy As String
  If p("MonthToDate").value Then
     dummy = Month(Today).ToString()  & "/1/" & Year(Today).ToString()
     PrepareSQLWithDateStringParam(dummy,"SurveyDate",sb,true)
  Else
     If Not IsNothing(p("EarliestDate"))  Then
        dummy = p("EarliestDate").value
         PrepareSQLWithDateStringParam(dummy,"SurveyDate",sb, true)
     End If
     If Not IsNothing(p("LatestDate"))  Then
        dummy = p("LatestDate").value
         PrepareSQLWithDateStringParam(dummy,"SurveyDate",sb, false)
     End If
  End If

  If Not IsNothing(p("EarliestRunDate"))  Then
     dummy = p("EarliestRunDate").value
      PrepareSQLWithDateStringParam(dummy,"RunDate",sb, true)
  End If

  If Not IsNothing(p("LatestRunDate"))  Then
     dummy = p("LatestRunDate").value
      PrepareSQLWithDateStringParam(dummy,"RunDate",sb, false)
  End If

  If sb.length > 0 Then
     sb.Insert(0," WHERE ",1)
  End If

  sb.Insert( 0," SELECT …",1)
 ' leaving out the rest of the SELECT, the important
 ' thing is that I'm inserting it before my now-fully-formed
 ' WHERE clause

  sb.Append(" ORDER BY …  ") 
 ' leaving out the ORDER clause, ditto…

  testText = sb.ToString()
' this item will be explained below! 

' and, finally, here's the payload: 
  RETURN  sb.ToString()

End Function

Get the picture?  Here's another example, that shows you a multi-valued parameter at work in this type of code.  In this example, the values are of Integer rather than String type, so there are no extra delimiter marks, but you can still see the Join function at work in this, much clearer, context:

Public Sub PrepareSQLWithIntegerParam(ByVal v AS Parameter,  ByVal n AS String, ByVal sb As System.Text.StringBuilder,ByVal omitVal As String)
  If Not(IsNothing(v)) Then
    Dim thisVal AS String = Join(v.Value,",")
    If  (IsNothing(omitVal)  OrElse NOT(thisVal.Contains(omitVal)) )  Then
        If sb.Length > 0 Then
          sb.Append(" AND ")
        End If
        sb.Append(n & " ")
        If thisVal.Contains(",") Then
           sb.Append(" IN (" & thisVal & ") ")
        Else
          sb.Append("=" & thisVal &" ")
        End If
    End If
  End If
End Sub

You may be wondering what the "omitVal" information is doing in this function.  I want to allow for an "Any" case in the multi-select values, which will mean that this clause does not get added to the filter at all.  I add "Any" to the labels available in the dropdown, and give the "Any" choice an impossible value, usually -2.  When this value is chosen, it means that the user wishes to skip the filter entirely. The calling function that is constructing the SQL statement invokes this function as follows:

  If Not IsNothing(p("Q1Answer")) Then
      PrepareSQLWithIntegerParam(p("Q1Answer"), "Q1",sb,"-2")
  End If

  If Not IsNothing(p("Q2Answer")) Then
      PrepareSQLWithIntegerParam(p("Q2Answer"), "Q2",sb,"-2")
  End If

  If Not IsNothing(p("Q3Answer")) Then
      PrepareSQLWithIntegerParam(p("Q3Answer"), "Q3",sb,"-2")
  End If

How do I test it?

You've noticed the "testText" String variable I've added to the code.  I often have a "debug" layout control in my RDL, which is either Hidden based on a Debug variable or removed after report design is complete.  Here's a typical report expression for a "debug" textbox:

="DEBUG: " & User!UserID & " " & Code.TestText

In the early stages of report design, my SQL statement may not even be working as I have constructed with it.  I create a temporary SELECT statement in my Dataset window, providing some stubbed-in contents for the report without all the filters and other dynamic stuff. (You may have to do this anyway to get the right fields list for the data set; see the next section below.)  Meanwhile, the debug textbox shows me what I've constructed for the "real" SQL.  Usually I've forgotten to add a space between keywords, or something obvious.  But, if I don't spot what's wrong immediately, I can copy it out of an export from the report, paste it into a query window, and execute it interactively to find the bug.

How do I get the the right information in my Datasets list ?

If you started this process with a non-dynamic query and you haven't changed your output fields list, you're already all set.  If not, you can create a temporary query using literals, for example:

SELECT   10 AS OrderNo, 'Pending' As OrderStatus

… execute that query once in the Dataset window, press the little refresh button, and you should be all synched.  If not, you can right-click in the Datasets window to explicitly Add fields by name.

How do users understand the results?

A typical problem with "Exported" reports that have a lot of parameters is: What does the data represent?  Users forget what they selected in the report parameters, and how those choices related to the filtered and aggregated data values.

Luckily, you already have the tools you need to help them!

Along with the debug textbox we've already discussed, you can build up a "title" expression, to display users choices directly in the report.  All it takes is a second StringBuilder instance… the logic is already there for you, and the same functions can construct both the SQL version and the English language version at the same time.

 

22 thoughts on “Writing Dynamic SQL in and for RDLs

  1. i have along text it is inserted in data base , so its a long message our logic is these message is inseted in two row depend on length,
    but my requrement is how we retrive these message in single row..

  2. Hi Nasir,

    Do you mean you have the text inserted in two database table rows but you would like them to appear together in the report?

    If so, is it ever more than two rows? And what type of key arrangement ties the two (or more) rows together in the database? (For example, how do you know which one comes first?)

    If I have misunderstood your question, please ask again. But for your information I am in the middle of moving house and will not be on line very much for a couple of weeks! I will try to help, though.

  3. Hi Lisa,

    I read through your posts…

    I know this must be simple but I just can’t seem to get it

    I have a form where users can select a report from a dropdown list and there are a bunch of filters/criteria that they can set for the report.

    Based on their selections I build a query, for example, if they choose Income Summary report where the amount for a charge is greater then $50.00. So the query would look like this:

    SELECT SUM(tblAcctgXref.AcctXref_Amount) AS TotalAmount, tblCharges.Charge_name

    FROM tblAcctgXref INNER JOIN

    tblAcctgCharges ON tblAcctgXref.AcctXref_ChgId = tblAcctgCharges.AcctChg_ID INNER JOIN

    tblCharges ON tblAcctgCharges.AcctChg_ChargeID = tblCharges.Charge_Id

    WHERE (tblAcctgXref.AcctXref_Amount > 50)

    GROUP BY tblCharges.Charge_name

    ORDER BY tblCharges.Charge_name

    I have another form, ReportViewer, which is used for viewing the report. On the load event of this form I reset the viewer and set the ReportEmbeddedResource to the specified report, like this:

    Me.ReportViewer.Reset()

    Me.ReportViewer.LocalReport.ReportEmbeddedResource = “AAM.ReportIncomeSummary.rdlc”

    ‘Fill in Report Data with query

    Me.ReportViewer.RefreshReport()

    Now, how do I populate the report with the results from the above query?

    The report has only two columns, Description (tblCharges.Charge_name) and Amount (TotalAmount)

    Your help is really appreciated

    Thanks

  4. the statement you typed below:

    =”SELECT OrderNo, OrderStatus FROM MyOrdersTable ” &
    IIF(Parameters!SelectedStatus.Value = “Any”, ” “,
    ” WHERE OrderStatus = ‘” & Parameters!SelectedStatus.Value & “‘ ” )

    where is exactly will this be put in the report rdlc? the reason i ask is because i tried putting the code in the report’s properties – Code tab, and it gave me a syntax error with no explanation as to what the error is.

  5. hi Imran,

    Sorry for the delay in reply, we are moving house…

    You have a comment in your code:

    <code>’Fill in Report Data with query</code>

    … is this what you are trying to figure out how to do?

    If you have an RDLC, rather than an RDL, you are preparing the data OUTSIDE the report. I cover this in the related post http://spacefold.com/lisa/Dynamically-loading-reports-in-ReportViewers — there are two parts, not both of which you necessarily need in your situation:

    *1 — You fill a dataset with the data of your choice. In my walkthrough you see this happen in the DisplayReport method, and it is different depending on where the data is coming from. In my case it may be local CE or remote SQL Server, or it may be loaded from an XML file.

    The point is, you fill a dataset just as you would for any non-reporting scenario.

    In my walkthrough, there is one dataset called DataSet1 and one called DataSetMany, because the two RDLCs between which I switch display metadata for one RDL/RDLC or many RDL/RDLC files. (There’s actually only one column’s difference between the datasets, but I deliberately made them separate to illustrate how you could switch datasets for whatever needs you have.)

    There are also two bindingsources in the form: one for DataSet1 and one for DataSetMany.

    At runtime, you’re going to create a datasource and associate it with one of the available bindingsources and its datamember (the dataset of choice).

    *2 — So, having filled your dataset, you create a datasource object and add it to the ones that are bound to the LocalReport object. In my walkthrough, you see this action in the InitializeViewer method.

    As you can see in the method, I create ReportDataSourceX on the fly, and I associate it with the appropriate binding source. Then I issue the line of code you see below.

    <code>Me.ReportViewer1.LocalReport.DataSources.Add(ReportDataSourceX) </code>

    … but in your case you may have to do a .Clear() first, it depends on its lifespan and your sequence of actions, which may vary quite a bit from what you see in my code. The important thing to realize is that InitializeViewer is being called each time we switch between RDLCs. DisplayReport, by contrast, is being called each time we switch between datasets — but we may not be switching RDLCs at the same time.

    For example, somebody can view the metadata for a single RDL and then choose a different RDL to view. This requires calling DisplayReport but not InitializeViewer because the RDLC displaying the metadata has not changed. So DisplayReport will Reset the ReportViewer control and re-fill the dataset, and then Refresh the control but it will not do all the binding stuff in InitializeViewer.

    It is also possible to need to switch RDLCs but <i>not</i> need to refill the dataset. And it is <i>also</i> possible to need to switch data connections when you refill the dataset — something I illustrate in the post using SQL CE as an example.

    I hope I am not making this worse! If so, please write again and I’ll try to explain whatever I’ve missed. I promise to be more prompt this time!

  6. hi Jon,

    The statement doesn’t ordinarily go into the RDLC! RDLCs have their data prepared, by you, outside the report and handed to the report. So if you are preparing SQL dynamically for a report and it’s an RDLC, you will be doing this work (in code) outside the custom code. An RDLC isn’t going to issue a statement and collect data for you; only an RDL, where the data is server-side, will do that.

    See my answer to Imran and the related post on how data can be dynamically attached to RDLCs.

    That being said, it doesn’t mean that there’s no use for custom code in RDLCs — it can manipulate report expressions, keep various calculations, etc.

    FWIW: You probably got a syntax error because you thought this expression should be put directly into code — it wouldn’t be. Custom Code is going to be in functions, it’s like a library. (RDLs have a separate element for storing the statement that is actually executed by Report Server to fill a dataset.)

    If I’ve missed what you’re trying to do or learn about, please write again and I’ll try again. As I said to Imran, we’ve been moving house so I’ve been away but will try to be more prompt in the future.

  7. Lisa,
    I have the following SQL Query which I have typed in the dataset defining window.

    =”select ‘ ALL’ from terr_dim UNION select distinct region from terr_dim WHERE ” & Parameters!WhereParam.Value.ToString

    In the above statment WhereParam is the parameter which holds the where clause values. When I try to preview the report it errors out:

    An eror occurred during local report processing. Cannot set the command text for data set ‘dsRegion’. Error during processing of the CommandText expression of dataset ‘dsRegion’

    I dont know why I am getting this. I read in one of the forums that this has got to do with security policy. Could you guide me?

  8. When I say =Code.GetSQL(Parameters) for the Query String of a DataSet, ‘GetSQL’ gets underlined in Red. Why is this?

    Also, when I run my query following almost all of the steps you suggested in the article, I get “Cannot set the command text for data set ‘DataSet1’. Error during processing of the CommandText expression of dataset ‘DataSet1’.

    Any inputs you could offer for the above two issues would be of great value and help to me. Thank you.

  9. Hi BusOwner,

    Your code gets underlined in red because Intellisense doesn’t recognize it. That’s okay. It still works <g>.

    To help you figure out what you are doing wrong, I am going to suggest you put in a temporary SELECT statement as the return statement in your function (a simple one that won’t fail). The function should still construct the complicated dynamic SELECT statement, as I explain above, but place it in a variable instead of using it as the SELECT. Display the contents of the variable somewhere in your report, so you can debug it.

    I explained this above in the “How do I test it?” section. If this doesn’t help you (if the report still doesn’t run), there is something else going on; perhaps you haven’t set the right datatype for the Parameters argument in your function code, or something like that.

    Hope this helps.

  10. You are exactly right – I had an ampersand instead of a comma in the JOIN statement in the huge SQL Select I was building. I can’t thank you enough for writing this article in such a clear and concise manner.

    Now, why won’t MS bubble up the true exception that is caused by the underlying code? Instead all we get is a very generic and meaningless error message like this one:

    Cannot set the command text for data set ‘DataSet1’. Error during processing of the CommandText expression of dataset ‘DataSet1’.

  11. Hi there Yasser,

    What was the expression you used in the RDL (such as an expression for a textbox, or a filter) that invoked the function? And what was the function as you put it into the custom code portion of the RDL?

  12. Lisa,

    RE setting up your dataset, another way to do it is to right-click on the report in the Solution Explorer and choose XML editor from the options. You can then scroll down to the <DataSets> node and add what you need. Look at a report with a populated dataset to see examples of <Field> elements.

    Thanks for sharing your hard-won knowledge.

  13. Oh, you’re welcome, DLT, and thank *you* for sharing !

    FWIW I am almost always happier to edit the RDL as XML, but I don’t often talk about it because many other people don’t like to do it <g>. I’m glad to hear that you are a kindred soul.

  14. I don’t have a large amount to say in response, I only wanted to comment to say excellent job. Me and my friend really loved your article. It seems that you have placed a good amount of effort into your article and I require a lot more of these on the Internet these days.

  15. I’m having a difficult time, I have the following code:

    protected void btnGenerateReport_Click(object sender, EventArgs e)
    {
    String strItem;
    String sqlStmt;
    int x = 0;
    // Setup Connection string
    cnstr = “server=myserverIP;user=mylogin;password=mypassword;”
    + “database=mydatabase;”;

    // Create a new Sql Connection
    SqlConnection cn = new SqlConnection(cnstr);

    //Create the DataSet
    ds = new DataSet(“ds”);

    sqlStmt = “Select “;

    foreach (ListItem listItem in ColumnListBox.Items)
    {
    if (listItem.Selected)
    {
    strItem = listItem.ToString();
    sqlStmt += strItem + “, “;
    }
    }
    sqlStmt += ” FROM ” + ReportList.SelectedValue.ToString();

    SqlDataAdapter da = new SqlDataAdapter(sqlStmt, cn);

    // Fill the Data Adapter
    da.Fill(ds);
    ReportViewer1.LocalReport.DataSources.Clear();
    ReportViewer1.LocalReport.DataSources.Add(ds);

    The error on the line above is “cannot convert System.Data.DataSet to Microsoft.Reporting.WebForms.ReportDataSource!!

    How can I get the sql to populate into the ReportViewer?

Leave a Reply

Your email address will not be published. Required fields are marked *