
Using (or not using) tabular KPIs

You can create KPIs in the tabular model. They work like KPIs in the multidimensional model. Here is the MSDN tutorial on creating KPIs in case you are interested in the details.

There’s a bug in the SSDT designer in SQL Server 2012: the icon pickers at the bottom of the KPI dialog box do not work. You see the ordinary red, yellow, and green icons for the KPI status regardless of the icon set you choose (the icons work fine in PowerPivot). Also, you cannot specify an expression for KPI trends. The trend field is exposed in the field list, but it always shows the same flat arrow.

Say you’re fine with the stock dots and the lack of KPI trends, so you go ahead and create a KPI. Now try to use it. You can use the KPI in Excel without issue. Don’t try to use it in Power View – KPIs are not supported in Power View in SQL Server 2012. Don’t try to use it in PerformancePoint – there is a bug in PerformancePoint that prevents the import of KPIs. Don’t expect to see the status indicator in Reporting Services – you have to set the performance indicator again there anyway.

So what is the advantage of having a KPI over having a measure that computes a value of –1, 0, or 1 depending on various thresholds? Not much. The KPI UI just saves you the trouble of writing the thresholds out in DAX. It doesn’t buy you icon selection, it doesn’t allow you to set information for trends, and it doesn’t buy you an instant drag/drop of KPIs (including status indicator) in any tool other than Excel. I will leave the determination as to whether or not a tabular KPI is useful in SQL Server 2012 as an exercise for the reader.

PS: If you don’t like the stock dot in Excel for your KPIs, you can change it by applying Conditional Formatting, like so:

[image]

Using multiple connections on a single table

Sometimes it is nice to mash up data from multiple connections in a single table. One case where this comes up is analyzing data feeds. For example, suppose you are nosy (like me) and you like to check on home valuations on Zillow. Zillow has a free data set on the Azure Data Market, called HomeValuationAPIs, that allows you to query the current valuation for any given home.

Let’s take an example. Here’s a house that makes me giggle – a home on offer for a cool $3.4 million in Redmond, WA. Not exactly a shocker that this house has sat on the market for over 8 months with no buyers. Redmond’s nice, but not that nice.

We can keep up to date on what Zillow thinks this property should be priced at using the Zillow API. The service URL to use is https://api.datamarket.azure.com/data.ashx/Zillow/HomeValuationAPIs/GetZestimate?ZillowPropertyID=48817763&$top=100. If you import this feed into SSDT, you see the Zillow estimates for this home.

Of course, one home is not enough for analysis. You need many homes in a single table to do anything interesting. Unfortunately, the tools only allow one connection per table. Because the property ID is encoded in the data feed’s URL, and the URL is what determines the connection string, we are limited to querying a single home per table. This is not very much fun.

A workaround is to create a second partition on the table that uses a second connection. This is not exposed anywhere in the UI, but it can be accomplished via script. Now, what I am about to show you is probably not really supported. It just so happens to work, but asking the product team to fix bugs in it might result in dirty looks and postponement of fixes until the next release. Consider yourself warned.

Here is how you do it:

  1. Deploy the model with the data feed.
  2. Open SSMS and connect to the instance hosting the model.
  3. Right-click the connection for the data feed, then script a create to a new query editor window, as shown:

    [image]

  4. Modify the generated script, changing the three highlighted lines – ID, Name, and Connection String.

    [image]

    I used the ever-creative "DataFeed GetZestimate2" for both the Name and ID, and changed the ZillowPropertyID in the URL to 48817769, which is the house practically next door that has also been languishing on the market for a cool $3.4 million. Press F5 and the new connection is created. Refresh the Object Explorer in SSMS to verify. (If you would rather not run the scripts by hand in SSMS every time, see the PowerShell sketch after these steps.)

  5. Now we need to add the account key for the Azure Data Market so we can process the data from this new connection. To do this:
    • Right-click the newly created connection and select Properties.
    • Select the Connection String and then click … to edit it.
    • Set the Password to your account key on the Azure Data Market, and then click OK.
  6. Now we need to create a partition that uses this new connection. To do this:
    • Right-click the table that contains the data feed and select Partitions.
    • Click the copy button in the Partition Manager to create a copy of the current partition.
    • Press the Script button to script out the partition definition, then Cancel the Copy Partition dialog. We will create the new partition in script.
    • Change the three highlighted lines in the generated script.

      [image]

      Use "DataFeed GetZestimate2" for the DataSourceID. Use whatever unique ID and Name that you like, I used something creative like “Data Feed Content 2”. Press F5 to execute the script.

  7. The new partition should appear in the Partition Manager. Click the Process button and load the data into it.
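
Incidentally, pressing F5 in SSMS is not the only way to run the generated scripts. Here is a sketch (not part of the original walkthrough): if you save the modified connection and partition scripts to files, you can send them to the server with the Invoke-ASCmd cmdlet, which makes the whole trick easier to repeat. The file paths and instance name below are made up; substitute your own.

# Sketch: run the XMLA scripts generated in steps 3 through 6 from PowerShell.
# The file paths are hypothetical; point them at wherever you saved the scripts.
Invoke-ASCmd -Server "localhost\tabular" -InputFile "C:\scripts\create_connection_GetZestimate2.xmla"
Invoke-ASCmd -Server "localhost\tabular" -InputFile "C:\scripts\create_partition_DataFeedContent2.xmla"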

Now you have a table that uses data from multiple connections. Happy house hunting.

Using Reporting Services with DirectQuery models

Power View is the primary reporting client for visualizing query results for DirectQuery models. After all, it is the only graphical client that issues DAX queries.

However, it is possible to use other reporting clients, including Reporting Services running in native or SharePoint integrated mode, to get results from a DirectQuery model. To do this, supply a DAX query in the place where an MDX query should be. For more information about using DAX queries, consult Chris Webb’s six part series on getting started with DAX queries (1, 2, 3, 4, 5, 6) and also Marco Russo’s post on parameterizing DAX queries.

Reporting Services is interesting because it has disabled text entry for Analysis Services data sources. This is a problem, as that means you cannot simply switch the MDX query designer to text and enter a DAX query in there (the “intuitive” way to enter a DAX query, as intuitive goes for DAX query editing in SQL Server 2012). Instead, you must enter the DAX query in the DMX query designer. Since Report Builder doesn’t support the DMX query designer, you must therefore use SSDT to build reports that query DirectQuery models.

To create a report that connects to a DirectQuery model:

  1. Launch SSDT.
  2. Create a new Reporting Services project. I will use the Reporting Services Project Wizard in this example.
  3. Enter the connection information for your DirectQuery model and then click Next.
  4. Click the Query Builder button to construct the DAX query.
  5. Click the Data Mining button to start the DMX query editor:

    [image]

  6. Click Yes to dismiss the warning about losing changes.
  7. Click the funky ruler button to go into text editing mode:

    [image]

  8. Type your DAX in the text box as shown, and then click OK to close the query builder. (A minimal example query is sketched after these steps.)

    [image]

  9. Finish constructing your report as usual.
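
For step 8, a minimal DAX query is all you need. If you want to sanity-check the DAX before pasting it into the DMX designer, you can also send it from PowerShell with Invoke-ASCmd. The sketch below assumes the cmdlet passes a DAX statement through the same way it does MDX, and it borrows the Hans Rosling demo names used later in this post; substitute your own server, model, table, and column names.

# Sketch: test a DAX query from PowerShell before putting it in the report.
# Server, database, table, and column names are illustrative; substitute your own.
Invoke-ASCmd -Server "localhost\tabular" -Database "hans_rosling_project" -Query @"
EVALUATE
SUMMARIZE(
    'DemographicFacts',
    'DemographicFacts'[Country],
    "Average Life Expectancy", AVERAGE('DemographicFacts'[LifeExpectancy])
)
"@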

Now when you run your report, you see results returned directly from the relational data source. Voila, you’ve just cut Power View out of the picture when connecting to DirectQuery models. Not sure if this is entirely supported, but it works.

Upcoming virtual and physical tabular BI sessions this month

Tomorrow morning, at 10 AM Pacific, I will be presenting Building a BI Portal for the Tabular Model in webinar format for the PASS BI virtual chapter. Click here to join the meeting. This is the same talk I will be giving at the regular session of SQLBits X in London at the end of the month.

Next up is SQL Saturday #114 in Vancouver. I was pulled in at the last minute to do a second talk, so I will be doing tabular modeling 101 at 9:30 AM and DirectQuery at 3:45 PM. The morning session is gentle; the afternoon session is more challenging. The DirectQuery talk is cool because I have some tricks to model around DirectQuery limitations. Tabular modeling 101 is conceptually similar to the talk I gave at SQL PASS in November. The DirectQuery talk is an hour-long chunk of the pre-conference session I will be giving in London on Thursday, March 29.

Looks like there are still seats available for the tabular BI pre-con on the 29th. If you’re around, do feel free to grab a seat and come along. It should be pretty technical. There are some great DirectQuery and row security samples – I am pretty excited about those demos. The multidimensional vs tabular hour, though entirely devoid of demos, is entertaining and educational enough that you may well be able to stomach an entire hour of PowerPoint! There is another foray into theory when we talk about perf and processing (and oh, we will talk about processing), but we will quickly jump back into practice, and there will be lots and lots of time devoted to managing tabular models.

Hope to see you there.

Automated backups using PowerShell and the SQL Server Agent

One of the least known features of SQL Server 2012 is the addition of several PowerShell cmdlets for managing Analysis Services servers. See my previous blog post on using PowerShell cmdlets for managing tabular models for more information. When I was fiddling with the SQL Server Agent recently, I noticed that the Agent can execute PowerShell scripts automatically. Just for fun, let’s put these two things together and create a basic SQL Agent job that automatically backs up an Analysis Services database using PowerShell.

First – the prerequisites. Make sure the target server satisfies the lengthy list of prerequisites for using AMO for PowerShell. Next, ensure that the user running the SQL Server Agent service has sufficient privileges to back up the desired Analysis Services database. If you need to, run services.msc, change the user running the SQL Agent service, and restart the service.

Now you can go ahead and create the job. To take a backup of a database on the localhost\tabular instance, paste the following script into a PowerShell step in a SQL Server Agent job:

cd sqlas\localhost\tabular
backup-asdatabase "rosling.abf" hans_rosling_project -ApplyCompression

This script works, but it has a problem: it always succeeds! Try it yourself and see. Because this script does not allow overwriting of the backup file, the backup-asdatabase command will fail on the second try. However, the SQL Agent job still reports success. Only when you drill down into the history of each individual job step do you discover that the backup actually failed.

We definitely want this job to fail when the backup fails so that notifications can be triggered. Here is how to do it. The SQL Agent fails a job step if an unhandled exception is thrown at any time. Since backup-asdatabase does not throw unhandled exceptions, we need to throw one ourselves whenever the cmdlet returns an error. We can capture the error output of the backup-asdatabase cmdlet by supplying a variable name to the built-in -ErrorVariable parameter. If this variable is not empty, we throw an exception.

The revised PowerShell script now looks like this:

cd sqlas\localhost\tabular
backup-asdatabase "rosling.abf" hans_rosling_project -ApplyCompression -ErrorVariable myError
if ($myError) {throw $myError}

If you test this code, you will see that the job now fails at the job step if the backup file already exists, which is nice.

Of course, it would be better if the script succeeded even when a backup file already exists. The -AllowOverwrite parameter is the one we need:

cd sqlas\localhost\tabular
backup-asdatabase "rosling.abf" hans_rosling_project -ApplyCompression -AllowOverwrite -ErrorVariable myError
if ($myError) {throw $myError}

Now we’re done. The script fails when it’s supposed to fail, and succeeds in most cases.
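
One more variation, and this is just a sketch rather than something from the original job: if you would rather keep a history of backups than overwrite a single file, put a date stamp in the file name. The naming scheme below is made up; adjust to taste.

cd sqlas\localhost\tabular
# Build a date-stamped file name, for example rosling_20120313.abf (hypothetical naming scheme)
$backupFile = "rosling_{0:yyyyMMdd}.abf" -f (Get-Date)
backup-asdatabase $backupFile hans_rosling_project -ApplyCompression -AllowOverwrite -ErrorVariable myError
if ($myError) {throw $myError}

Because the file name changes each day, -AllowOverwrite only matters if the job runs more than once on the same day.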

Using views in a tabular model

It is very handy to use views as the table source definition in the tabular model.

The biggest advantage of using views is that columns can be renamed in the underlying data source without breaking your tabular model. If that is not enough to convince you to use views, here are a number of smaller advantages, especially if you are incrementally loading data using ProcessAdd:

  • It is easy to limit the number of rows imported into the workspace database during development – simply add a where clause to your view and change the view definition later.
  • You can safely redeploy over the top of a model in production without worrying about retaining partitions, because the partition definition has not changed.
  • Because the partition definition has not changed, the underlying partition will never be marked as unprocessed and will not ever be reprocessed if a Process Default command is issued on the partition.
  • The script out of the partition definition will always accurately reflect the query to be used for processing.

Interested in using a view? Read on for the implementation details.

Creating a table based on a view

To create a table that is based on a view, go to SSDT, open your tabular project, and then launch the Import Wizard. You have the choice of using table preview mode or entering a SQL query to import a table. Both of these can use views. The SQL query method is self-explanatory; Table Preview mode could use a picture.
Views are shown in the list with a special icon that’s rather difficult to describe. It looks like a table with a checkbox with dotted lines on top (not sure how this represents a view). I have circled the view in this picture:

[image]

Once the view is selected, you can import as usual. If you made a mistake at import time and want to update the source of a table in your model to refer to a view instead of a SQL table, this is easy to correct. Simply select the table in either the grid or diagram view, press F4 to view the property grid, then click to edit the Source Data property. You can change the table definition to refer to the view instead of the table. If you have multiple partitions, you should update the partitions in the Partition Manager as well. Redeploy your model after making this change.

Creating a partition based on a view

You can use the Partition Manager in SSDT or SSMS to create a partition based on a view, even if the other partitions in the table are not based on views. Again, this is done using a SQL query (in both SSMS and SSDT) or in preview mode in SSDT.

Processing data incrementally into a partition based on a view

Processing data incrementally into partitions that are based on views is also pretty straightforward. First, update the definition of the view in SQL to restrict the rows returned to just the rows you want to load, taking care to ensure that no duplicate rows are introduced when processing. Then do a ProcessAdd in SSMS, PowerShell, AMO, or SSIS to load the data into the partition.
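
To make that concrete, here is a rough PowerShell sketch of the whole sequence. It is not a tested script: Invoke-Sqlcmd comes from the SQL Server PowerShell module, the ALTER VIEW text is the same statement used in the AMO and SSIS examples below, the relational instance and database names are taken from the AMO example’s connection string, and the MeasureGroupID and PartitionID in the XMLA are placeholders that you would script out of SSMS (more on finding those IDs at the end of this post).

# Step 1: point the view at just the rows to be added (same ALTER VIEW as in the
# AMO and SSIS examples below).
Invoke-Sqlcmd -ServerInstance ".\PowerPivot" -Database "hans_rosling_project" -Query @"
ALTER VIEW [dbo].[vDemographicFacts] AS
SELECT ID, Country, Year, LifeExpectancy, GDPPerCapita, ChildMortalityPerThousand,
       ChildrenPerWoman, CountryKey, ArmsImport, ArmsExport
FROM dbo.DemographicFacts
WHERE (Year = 1867)
"@

# Step 2: ProcessAdd the partition. The MeasureGroupID and PartitionID are
# placeholders; script a Process action from the Partition Manager in SSMS
# to get the real IDs for your model.
Invoke-ASCmd -Server "localhost\tabular" -Query @"
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Type>ProcessAdd</Type>
  <Object>
    <DatabaseID>hans_rosling_project</DatabaseID>
    <CubeID>Model</CubeID>
    <MeasureGroupID>YourMeasureGroupID</MeasureGroupID>
    <PartitionID>YourPartitionID</PartitionID>
  </Object>
</Process>
"@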

Using AMO to incrementally load data into partitions based on views

If you are using AMO, the C# program I previously showed for incrementally loading data changes a little bit. The only change in the AMO code is that you no longer need to specify a binding when calling ProcessAdd. This means you no longer need to care about that mysterious “Sandbox” data source view, which is nice. Your AMO code looks a bit nicer; however, you do need some more code overall to deal with updating the SQL view.

Here is the code:

using System.Data.OleDb;
using Microsoft.AnalysisServices;

//update the SQL Server view so that it returns only the rows to be added (the year 1867)
OleDbConnection conn = new OleDbConnection("Provider=SQLNCLI11;Data Source=.\\PowerPivot;Initial Catalog=hans_rosling_project;Integrated Security=SSPI;Persist Security Info=false");
String sql = "ALTER VIEW [dbo].[vDemographicFacts] AS SELECT ID, Country, Year, LifeExpectancy, GDPPerCapita, ChildMortalityPerThousand, ChildrenPerWoman, CountryKey, ArmsImport, ArmsExport FROM dbo.DemographicFacts WHERE (Year = 1867)";
OleDbCommand cmd = new OleDbCommand(sql, conn);
conn.Open();
cmd.ExecuteNonQuery();
conn.Close();

//connect to Analysis Services
Server s = new Server();
s.Connect(".\\TABULAR");

//navigate to the partition we want to process
Database db = s.Databases.FindByName("hans_rosling_project");
Cube c = db.Cubes.FindByName("Model");
MeasureGroup mg = c.MeasureGroups.FindByName("Demographic Facts");
Partition p = mg.Partitions.FindByName("DemographicFacts");

//incrementally add data to the pre-existing partition
p.Process(ProcessType.ProcessAdd);
s.Disconnect();

Using SSIS to incrementally load data into partitions based on views

The Integration Services package that I previously showed for loading data changes as well. Now, you need an Execute SQL task to run before the Analysis Services Processing Task. Also, just as in the AMO example above, you no longer need to configure the Analysis Services Processing Task to use a query binding when doing the Process Add.

The new procedure for adding data incrementally looks like this:

  1. Create a new Integration Services project.
  2. Create two connection managers in the Connection Manager pane – one to the Analysis Services database you want to process, and an OLE DB connection to the data source you want to modify.
  3. Drag an Execute SQL Task from the SSIS toolbox to the Control Flow pane of the SSIS designer.
  4. Configure the Execute SQL Task to use the OLE DB connection defined in step 2, and set the SQL to ALTER VIEW [dbo].[vDemographicFacts] AS SELECT ID, Country, Year, LifeExpectancy, GDPPerCapita, ChildMortalityPerThousand, ChildrenPerWoman, CountryKey, ArmsImport, ArmsExport FROM dbo.DemographicFacts WHERE (Year = 1867).
  5. Drag an Analysis Services Processing Task from the SSIS toolbox to the Control Flow pane of the SSIS designer. Connect the output of the SQL task to the Analysis Services task.
  6. Double-click to configure the Analysis Services Processing Task. Use the Analysis Services connection you defined in step 2, then, as described in my previous post, add the partition to the list of objects to be processed by drilling down through the cube and the measure group.
  7. Set the Processing Option to Process Add for the partition, then click OK.

You can now run this package and see the view updated and data loaded incrementally.

Calling ProcessData in SSIS to incrementally load data into tabular partitions

Previously, I showed a pattern for incrementally loading data by creating a partition, doing a ProcessData on the partition, then doing a ProcessRecalc on the database all in a single transaction. It is possible to do the same thing in an Integration Services package. Let’s take a look.

We cannot use the Analysis Services Processing task for this chore because the task was not updated for SQL Server 2012, so ProcessRecalc is not an option. Therefore, if we want to use a built-in Analysis Services task, we must use the Analysis Services Execute DDL task to do this work.

We also need to do this work transactionally. Even though it is something of a design antipattern to use transactions in an SSIS package, we must use one here because we cannot leave the database in an unqueryable state while processing (recall that ProcessData renders a database unqueryable until a recalc is performed). In theory, you could try to use the built-in transaction support in SSIS, but those transactions do not work properly with Analysis Services. The easiest way to send the commands in a single transaction is to use one Analysis Services Execute DDL task and wrap all three commands we need inside an XMLA Batch statement.

Imagine we want to create a partition for the year 1812 on the DemographicFacts table on the Hans Rosling demo project database. Here is how to do the incremental data load:

  1. From SSDT, create a new Integration Services project.
  2. Right click in the Connection Managers pane, add a new connection to your tabular server.
  3. Drop the Analysis Services Execute DDL task into the Control Flow area (it is under “Other Tasks” in the SSIS Toolbox).
  4. Double-click to configure the task.
  5. In the dialog box that appears, click DDL.
  6. In the DDL properties, set the Connection property to the connection manager you created in step 2.
  7. Click the … button in the SourceDirect property to launch the DDL dialog box.
  8. Paste in the following XMLA command:
    
    <Batch xmlns='http://schemas.microsoft.com/analysisservices/2003/engine' Transaction='true'>    
    <!-- Create the partition for the year 1812 on the DemographicFacts table -->  
    <!-- To create the partition, alter the measure group-->  
    <Alter AllowCreate="true" ObjectExpansion="ExpandFull" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
       <Object>
        <DatabaseID>hans_rosling_project</DatabaseID>
        <CubeID>Model</CubeID>
        <MeasureGroupID>DemographicFacts_c2655f3e-3570-4a34-92bf-3f9ed44ad24b</MeasureGroupID>
        <PartitionID>Partition 1812</PartitionID>
       </Object>
       <ObjectDefinition>
     <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200">
         <ID>Partition 1812</ID>
         <Name>Partition 1812</Name>
         <Source xsi:type="QueryBinding">
          <DataSourceID>bc946c5c-ec5b-4fbb-87d9-5e4f721e6adc</DataSourceID>
          <QueryDefinition>SELECT [dbo].[DemographicFacts].*   FROM [dbo].[DemographicFacts]  WHERE ([Year] = 1812)</QueryDefinition>
         </Source>
         <StorageMode valuens="ddl200_200">InMemory</StorageMode>
        </Partition>
       </ObjectDefinition>
      </Alter>
    
    <!-- Now do a ProcessData on the newly created partition -->
      <Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
       <Type>ProcessData</Type>
       <Object>
        <DatabaseID>hans_rosling_project</DatabaseID>
        <CubeID>Model</CubeID>
        <MeasureGroupID>DemographicFacts_c2655f3e-3570-4a34-92bf-3f9ed44ad24b</MeasureGroupID>
        <PartitionID>Partition 1812</PartitionID>
       </Object>
      </Process>
    
    <!-- Finally, do a ProcessRecalc on the database to make the DB queryable -->
      <Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
       <Type>ProcessRecalc</Type>
       <Object>
        <DatabaseID>hans_rosling_project</DatabaseID>
       </Object>
      </Process>
     </Batch>
    
  9. Press OK in all dialog boxes, then press F5 to run the package. The run should succeed, and you should be able to verify in SSMS that the partition was created, populated, and is queryable.

Let’s take a closer look at that XMLA and at how the scripts were generated. There are three commands in those 43 lines. The commands were generated by scripting out actions from three different places in the SSMS UI. I manually wrapped them in a batch and inserted some comments, which I hope are helpful. Use the comments to find the start of each command.

The first command, the Alter, creates a partition by modifying the DemographicFacts measure group. Notice that you must refer to the measure group by ID. The ID for the measure group was automatically generated by the SSDT UI when the table was created using the Import Wizard. The partition definition is again bound using a query binding. The data source ID is the GUID for the magical “Sandbox” object, which was also automatically created by the SSDT UI. None of these IDs are exposed in the UI in SSMS or SSDT, so you will need to discover them by scripting out.

To generate script for creating a partition, do the following:

  1. Launch the Partition Manager in SSMS on the desired table.
  2. Select a partition that is bound via query binding, then click the Copy button in the toolbar.
  3. From the New Partition dialog, click the Script button. The generated script contains both the data source ID and the measure group ID you need.

That takes care of creating the partition. The next command is a ProcessData on the partition. Again, you must pass the measure group ID to do the processing.

To generate script for processing a partition:

  1. Launch the Partition Manager in SSMS on the desired table.
  2. Select a partition, click Process.
  3. Set the desired processing option in the dialog (in this case ProcessData), then script out. This script will contain the measure group ID you need for processing.

The last Process command is the ProcessRecalc on the database. Fortunately the database ID is quite readable, and also exposed via the properties page on the database, so you have some hope of actually being able to write this script yourself. Nevertheless, it is easier to generate the script by right-clicking the database in the SSMS object explorer, selecting Process, choosing ProcessRecalc from the processing options, then scripting out.

Simple, huh? Compare this to the similar code in AMO. The AMO snippet I posted was 26 lines and doesn’t contain any horrible GUIDs, so stylistically it may be preferable to the DDL approach I outlined above. You can call AMO from Integration Services via a Script Task. However, from a performance standpoint, you might want to stick with the ugly DDL to avoid the VSTA dependency and overhead associated with the Script Task.
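
One more option, and again this is a sketch rather than something I have packaged up: if all you need is the scheduled processing and not the rest of an SSIS package, the same Batch can be pushed at the server with Invoke-ASCmd, for example from a SQL Agent PowerShell step like the backup job earlier in this post. The file path below is made up.

# Sketch: run the create/ProcessData/ProcessRecalc batch from PowerShell instead of SSIS.
# Save the XMLA batch shown above to a file first; the path below is hypothetical.
Invoke-ASCmd -Server "localhost\tabular" -InputFile "C:\scripts\add_partition_1812.xmla" -ErrorVariable myError
if ($myError) {throw $myError}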

Also compare this to the approach for doing a ProcessAdd in SSIS. ProcessAdd is a lot neater, although there are scenario tradeoffs to be made. Comparing the two processing approaches merits a future post.
