SQL11UPD06-TSCRIPT-01
This wiki is a transcript of a previously recorded video.
Welcome to this presentation introducing SQL Server 2012 Analysis Services (SSAS) Improvements. My name is Peter Myers. I am a SQL Server MVP and a Mentor with SolidQ.
This is one of two presentations focusing on what’s new in Analysis Services. We’ll take this opportunity in this presentation to introduce what’s new and to specifically focus on the BI Semantic Model (BISM). That will raise a new dilemma for developers, as in, what will be the right approach to producing a BI Semantic Model? We’ll then focus on the multidimensional model improvements and then finish up with the installation and management discussion, followed by a summary.
SQL Server 2012 Analysis Services is the fifth release of the product if you count the first release as being SQL Server 2000. Of course, Analysis Services was really born in SQL Server 7.0 as OLAP Services but wasn’t named Analysis Services until the 2000 release. Now, the new features and enhancements in SQL Server 2012 are based on a vision to expand the reach to a broader user base. In fact, we saw the beginning of this in SQL Server 2008 R2 with the release of PowerPivot, essentially providing data modeling capabilities to Excel and business users. This is referred to as tabular model development; in fact, PowerPivot is a tabular approach. And so this release very much embraces the tabular data model by making the capability of tabular model development available to IT professionals also. It brings together tabular and multidimensional, and we are not forgetting the fact that we have cubes, dimensions, and the multidimensional approach that has been with the product since its first release. Collectively, tabular and multidimensional models are now referred to as BI Semantic Models. Do note that while data mining is still a component of Analysis Services, there are no significant changes to it in the 2012 release, so it is not covered in these presentations.
So what is this BI Semantic Model? Well, the BI Semantic Model is expressed in terms of a data model, business logic and queries, and data access. Essentially, what it provides is very convenient access to data that is intuitive to business users, enabling them to go ahead and explore and create ad hoc analytic experiences and reports. The data sources it can consume from can include relational databases, line-of-business applications, OData feeds, spreadsheets, and text files. And client tools could range from the familiar Excel with PivotTables and PivotCharts, to Reporting Services, to PerformancePoint Services. So this is what the BI Semantic Model is. We’ll explore it in much more detail in this module.
Now when we take a look at this, the marketing message is very much about one model for all end user experiences. We have already seen with PowerPivot, in its SQL Server 2008 R2 release, that it delivers personal BI. It delivers the capability for business users to go ahead, using Excel with the PowerPivot add-in – which incidentally is free to download – and create reasonably sophisticated data models over potentially large volumes of data, and to have these models perform very, very well at query time. Now, the solutions produced here we refer to as personal because the solutions may well just be for the desktop. We go ahead, we produce the model, and we use it locally on the desktop and in the workbook. Now, as an example, it could be that someone produces a BI solution but colleagues in the team or department discover it and decide it is also of use to them. It’s not very appropriate to copy that workbook around the organization: this is not great for version control, and it’s not great for data and data access security. So the appropriate path to take is with SharePoint Server 2010. If you publish a workbook that has an embedded PowerPivot model to SharePoint, then some magical things can happen. First of all, the business user can go ahead and secure it to ensure only the right people have access to it. They can also configure data refresh to ensure that the data is kept up to date. On top of that, we have an IT monitoring story that comes in, so IT can keep track of who is using what, what resources are being used, how frequently queries are being sent, by how many users, and how long it is taking for those queries to be returned. So this provides visibility for IT and also ensures the data is secure and is being backed up along with the other content inside SharePoint. This is referred to as the team BI story.
Corporate BI is where IT go ahead and develop, maintain and, of course, sanction data modeling solutions. And so we have this concept of a single data model that actually works across all three scenarios. In fact, it is very interesting to consider that a personal BI solution, simply by publishing the workbook to SharePoint, becomes a team BI solution. And if IT discover, for example, that this solution is being heavily used and is therefore likely to be critical to the business, they can take that workbook, and there are paths for them to ensure that it is deployed and managed as an IT solution. It supports greater growth than PowerPivot Excel documents ever could, along with other features that we’ll be exploring in the next module. So we can see this path and transition from personal BI to team BI, and potentially to corporate BI. Essentially, it is one data model that can be used for all of the end user experiences, as we’ll explore in much more detail in this presentation.
So in this slide let’s explore the BI Semantic Model in some more detail, because this term requires a very, very good definition – there’s a lot involved. What we’ll discover is that a BI Semantic Model is really a generic term for a data model that is hosted in Analysis Services. There are three project types, or approaches, used to develop these models: the PowerPivot workbook, as was introduced in SQL Server 2008 R2; the tabular project, which is new in SQL Server 2012; and the multidimensional project, which is what we’ve had since 2005, simply renamed to distinguish it from the two tabular approaches. So let’s take a look at these three different approaches, all used to produce a BI Semantic Model. We’ll take a look in terms of design type, development tool, business logic, data access and deployment. Starting with the PowerPivot workbook, we’ll see that this is a tabular design approach, and it typically targets business users. The development tool is Excel 2010 with the PowerPivot add-in installed. Business logic in the form of calculations, like new columns and measures, is implemented with Data Analysis Expressions, or DAX. DAX is an expression language that has evolved, in a sense, from the Excel functions but has been enhanced with many capabilities that Excel was never expected to support – for example, time intelligence, context modification and relationship navigation.
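As a brief illustration of those three capabilities, here is a sketch of some DAX definitions. The table and column names (Sales, Product, 'Date') are hypothetical and are not part of this presentation:

```dax
-- Time intelligence: year-to-date revenue over a hypothetical Date table
Revenue YTD := TOTALYTD ( SUM ( Sales[SalesAmount] ), 'Date'[Date] )

-- Context modification: CALCULATE restricts the filter context to red products
Red Revenue := CALCULATE ( SUM ( Sales[SalesAmount] ), Product[Color] = "Red" )

-- Relationship navigation: RELATED in a calculated column on the Sales table
-- follows the relationship to the Product table
Unit Margin := Sales[UnitPrice] - RELATED ( Product[StandardCost] )
```

None of these would be practical to express with plain Excel worksheet formulas, which is exactly the kind of enhancement DAX brings.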
Now for data access, PowerPivot only supports in-memory, and so it’s worthwhile pointing out that the engine known as the VertiPaq engine (recently renamed xVelocity), when we load data into PowerPivot, takes this data, compresses it and optimizes it for use at query time. At query time, essentially what happens is brute-force scanning of this in-memory data, which produces lightning-fast query results. For deployment, there may well not be a story. It might well be that this is a personal BI solution and the user simply opens the workbook and interacts with it through local reports within the workbook. However, to implement that team BI scenario, it could well be that we publish the workbook to SharePoint, with SharePoint having the PowerPivot add-in installed, and effectively that is deployment. A request by URL for that workbook will actually engage the PowerPivot System Service to take the embedded data model and deploy it to an Analysis Services instance in SharePoint mode within the SharePoint farm. So, effectively, we can understand that PowerPivot can be used to create the data model in Excel. The publication of that workbook to SharePoint then allows us to configure data refresh, it supports monitoring and, effectively, the deployment process happens when the request is made for the workbook.
Now, what is new in the SQL Server 2012 release is a new tabular design approach. Effectively, you can think of it as PowerPivot for the IT professional but with some additional enhancements. With the focus being on the IT professional, the development tool is SQL Server Data Tools (SSDT). It uses for its business logic the exact same DAX library of functions – there is no difference between DAX for a PowerPivot workbook and for a tabular project. Now, in addition to in-memory, where the same engine is used, there is a pass-through mode: DirectQuery, as it is known, allows us to query a model in this mode, and Analysis Services will translate that into a query against the underlying data source. Therefore there is no need to cache the data; the data can be passed through. This can deliver low-latency results.
Now, a tabular project being developed in SQL Server Data Tools must be deployed to a new mode of Analysis Services known as Tabular mode. So, from a deployment point of view, that must take place, but we can also consider at this stage what happens to a PowerPivot workbook. That workbook can either be restored directly to a Tabular mode instance of Analysis Services, or it can be imported into a SQL Server Data Tools project and then enhanced, checked in to source control, and deployed to a Tabular mode instance of Analysis Services. So there is the hand-off story there from the business user or power user across to the IT professional.
The multidimensional project is really just a rename of the project type we have had since SQL Server 2005. Its official name is actually Multidimensional and Data Mining Project, because this is where cubes, dimensions and data mining models will continue to be developed. It uses SQL Server Data Tools also, but its business logic has always been implemented with Multidimensional Expressions (MDX), and data access is achieved for caching through MOLAP and for pass-through with ROLAP. No fundamental changes here. And we must deploy these projects to the corresponding mode of Analysis Services – that is, Multidimensional mode. So what we see there on the deployment line is that there are actually three modes of Analysis Services: the SharePoint mode for PowerPivot; the Tabular mode for tabular project deployment; and the Multidimensional mode for our cubes, dimensions, and data mining models. And so I am hoping now that you have a better appreciation of what the BI Semantic Model is, with its three different project types and approaches.
So now let’s consider the clients that can connect to and consume the data models. We have this full range across the top. Let’s first talk about third-party applications that you may well develop, Reporting Services reports, Excel, PowerPivot and SharePoint Insights (and I think what we are referring to there is PerformancePoint Services). All of these clients have been issuing MDX queries to cubes, and the fact is, regardless of the project type, all BI Semantic Models can be queried with MDX (providing they are not using DirectQuery – we’ll talk about this later). So, what this means is that a tabular model exposes a multidimensional interface and it understands MDX. The great news for us here is that it truly is one data model for all experiences – existing OLAP clients will just work against tabular data models.
Now, what is new in SQL Server 2012 is that we have DAX queries. Essentially, a DAX query is the evaluation of a table expression that returns a result set. Power View, a new report authoring tool delivered in Reporting Services in SQL Server 2012 (and one that will be discussed in a later presentation), uses DAX queries. Third-party applications could also choose to use DAX queries. What we need to appreciate in the first release of SQL Server 2012 is that DAX queries may only be submitted to tabular data models. Therefore, we can now understand why Power View has the restriction that it may only be based on top of a tabular data model.
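To make that concrete, a minimal DAX query looks like the following sketch. It assumes a deployed tabular model with a hypothetical Sales table; the EVALUATE statement takes a table expression and returns its rows as the result set:

```dax
-- A minimal DAX query against a hypothetical tabular model
EVALUATE
SUMMARIZE (
    Sales,
    Sales[ProductCategory],
    "Total Sales", SUM ( Sales[SalesAmount] )
)
ORDER BY [Total Sales] DESC
```

This is the style of request that Power View generates behind the scenes, and it is why a DAX query can only be answered by a tabular model.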
The final build for this slide, then, is to consider the different data sources that we can use. For multidimensional, we are restricted to relational data sources only – specifically, SQL Server, Access, Oracle, DB2, Teradata, and SQL Azure also. This has been a little bit problematic: if you have data in other formats, there is the requirement to use a product like Integration Services to extract, transform, and load (ETL) that data into a relational database and then build your data model upon it. Now, the flexibility that we get with tabular development is that, in addition to relational sources, we can work with files, OData feeds, cloud services, and other deployed BI Semantic Models. So it’s far less restrictive in the data sources that can be accessed. And that completes the final build for this slide that describes, in some detail, what the BI Semantic Model is.
The deliverables of the BI Semantic Model include flexibility, richness and scalability. From a flexibility point of view, you have choice. Is it tabular? Is it multidimensional? Your choice will determine how business logic is implemented – DAX or MDX – and both cached and pass-through data access are supported: in-memory or DirectQuery for tabular, and MOLAP or ROLAP for multidimensional. So this all delivers choice to the developer and also choice to the end user because, as we now appreciate, OLAP clients work across all data model types. For richness, all types support rich modeling capabilities. Rich and sophisticated business logic can be implemented with both DAX and MDX, although we will note that MDX can support far more sophisticated scenarios than DAX. Both support fine-grained security, so rows in tabular models or cells in multidimensional models can be hidden. There are enterprise capabilities including multi-language and perspectives. Note that perspectives are supported in both model types; however, multi-language is only supported in multidimensional. For scalability, we see excellent scalability for in-memory, providing that you have sufficient resources in terms of memory. MOLAP has traditionally worked with extremely large amounts of data and can scale well across different drives, even across different servers. Be aware that in-memory will not support this and will require adequate memory to host the entire data model. That’s the benefit of in-memory: once you have hosted the entire data model in memory, you get lightning-fast query responses from it. Both types, tabular and multidimensional, support pass-through as well as cached modes – DirectQuery and ROLAP. And state-of-the-art compression is employed by both cached modes, in-memory and MOLAP. Note that in-memory typically provides much higher compression, on average around 10x, whereas MOLAP is generally around 3x.
And, finally, it scales to the largest of enterprise servers. So whatever hardware you can throw at either approach, Analysis Services can work with.
So this creates a dilemma, because now we have two approaches for creating enterprise data models: tabular and multidimensional. The model developer needs to choose the right development approach, and we’ll provide you some guidance at this point. These are the project templates that are now available through SQL Server Data Tools: a multidimensional and data mining project, a new tabular project, an import from a PowerPivot workbook, or an import from an existing tabular database.
Choosing the right development approach is somewhat of a challenging topic to address in just a handful of slides. It could depend on many things, including your requirements at the different layers – data model, business logic, data access and storage. So the model developer will need to consider multiple things. First of all, the supported capabilities and features. I’ve already mentioned that multi-language support is not available in tabular. So, for example, if that is a requirement in your project, clearly there is no decision to be made; it needs to be multidimensional. But if you don’t have limitations here, you move on to consider scalability and performance, given the volumes of data, the concurrency, the number of queries, and so on. You are also going to factor in the skills that you have. Generally speaking, producing a multidimensional data model requires more skill – to develop objects like dimensions and cubes and to implement business logic in MDX – than does a tabular data model. So if you are coming from a background where you don’t know multidimensional, I would probably suggest, if you see that the feature set is compatible, going ahead and commencing with tabular data models, because the learning curve is less steep. Generally speaking, also, development takes longer for a multidimensional project than for a tabular project. And finally, a multidimensional project requires care and feeding, with the design and processing of aggregations, to maintain good query performance.
On to the data model. Tabular is really familiar: anyone familiar with relational database design will grasp the concepts of tabular design very, very easily. Tables, columns and relationships are the foundations of this design. Some advanced concepts are not supported natively – for example, many-to-many relationships – but they may be simulated using calculations. So be aware that there could be a workaround, and it could be quite simple. It is easy to wrap a data model over a raw database: within a matter of minutes, we could create a tabular project that imports table definitions, configure it for DirectQuery so that there is no caching, deploy it to the server, and then use Power View to go and explore the data – very, very quickly and easily. Now, multidimensional: this data modeling approach is far more sophisticated and typically involves a higher learning curve. Advanced concepts are baked in, so we have native support for parent-child and many-to-many relationships. Typically, multidimensional will be ideally suited to OLAP types of applications, like planning, budgeting, and forecasting.
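As one example of simulating a many-to-many relationship with a calculation, the common pattern uses CALCULATE over a bridge table. This is only a sketch: the Customer, BridgeCustomerAccount, Account and Transactions tables are hypothetical, with the model relating Customer to the bridge table and the bridge table to Account:

```dax
-- Hypothetical many-to-many workaround (tabular has no native support in 2012):
-- the SUMMARIZE over the bridge table propagates the customer filter
-- to the Account table, which in turn filters Transactions.
Account Balance := CALCULATE (
    SUM ( Transactions[Amount] ),
    SUMMARIZE ( BridgeCustomerAccount, Account[AccountKey] )
)
```

So while the modeling feature is absent, a few lines of DAX can usually recover the behavior.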
When it comes to business logic, DAX has already been explained. It is based on Excel formulas, so it’s usually very simple to get started. Complex solutions involve a steeper learning curve, but it is generally easier than MDX. There is no provision to create named sets or calculated members using DAX. Now MDX, providing far more sophisticated capabilities, involves a higher initial learning curve. Complex solutions require steeper learning curves still, and MDX is ideally suited for applications that need the power of multidimensional calculations involving things like scopes, assignments, and calculated members.
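To illustrate constructs that MDX supports but DAX does not, here is a sketch of an MDX script fragment. The cube, dimension and measure names are hypothetical:

```mdx
-- A calculated member and a named set (neither can be authored in DAX)
CREATE MEMBER CURRENTCUBE.[Measures].[Profit Margin] AS
    [Measures].[Profit] / [Measures].[Sales Amount],
    FORMAT_STRING = "Percent";

CREATE SET CURRENTCUBE.[Top 10 Products] AS
    TopCount ( [Product].[Product].[Product].Members, 10,
               [Measures].[Sales Amount] );

-- A scoped assignment: overwrite a measure within a slice of the cube
SCOPE ( [Date].[Calendar Year].&[2012], [Measures].[Sales Quota] );
    THIS = [Measures].[Sales Amount] * 1.1;
END SCOPE;
```

Scoped assignments like the one above are central to planning and budgeting solutions, which is why those applications tend to favor multidimensional.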
When it comes to data access and storage, we have already considered the caching approaches – in-memory for tabular and MOLAP for multidimensional – and I have already talked about the compression ratios. Both have great compression, but tabular stands out as being more than great – it is actually fantastic, and I have seen examples with more than 100x compression, which all depends on the data that you have: the cardinality, the distinct values, and so on. For pass-through, we have DirectQuery and ROLAP. We’ll talk about DirectQuery in more detail later; there are certain limitations, and we’ll cover those in the following presentation when we talk about tabular project development.
Let’s switch now to multidimensional modeling improvements. Multidimensional is still fully supported by Analysis Services – no change there except for the terminology. We don’t refer to it as a Unified Dimensional Model (UDM) anymore; it’s a multidimensional BI Semantic Model. There are over 300 improvements. Of course, we are not going to list them all in this presentation, but we will note that about 100 of them were reported directly by customers, so it’s great to get that feedback. To highlight some major new features: the designers have been upgraded to SQL Server Data Tools (SSDT), the 4GB string store limit for attributes has been removed, there are new events for monitoring lock usage and contention, and there is support for XEvents and for PowerShell.
In this topic we’ll take a look at installation and management. There is a subtle change here: when you are configuring the Analysis Services feature during installation, you now need to make a decision using a radio option – whether the instance is in Multidimensional and Data Mining mode or the new Tabular mode. It can only be one or the other. Be aware that you may install multiple instances of Analysis Services side by side, so a single server can host both modes, and that’s totally acceptable. For the PowerPivot for SharePoint mode, there is a PowerPivot Configuration Tool that we’ll talk about in more detail in the next presentation.
Working with PowerShell. SQL Server 2012 Analysis Services introduces PowerShell cmdlets for Analysis Management Objects (AMO), so this provides command-line connectivity, navigation, and discovery of Analysis Services databases. It’s supported for both multidimensional and tabular.
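As a minimal sketch of what this looks like, the following session assumes the SQL Server 2012 PowerShell module (sqlps) is installed; the server, database and cube names are hypothetical, and the exact parameters should be checked against the cmdlet documentation:

```powershell
# Load the SQL Server 2012 module (brings in the SQLAS provider and AMO cmdlets)
Import-Module sqlps -DisableNameChecking

# Navigation and discovery: list databases on a local default instance
Get-ChildItem SQLSERVER:\SQLAS\localhost\DEFAULT\Databases

# Management via AMO cmdlets: back up a database, then fully process a cube
Backup-ASDatabase -BackupFile "AdventureWorks.abf" -Name "AdventureWorks"
Invoke-ProcessCube -Name "Sales" -Database "AdventureWorks" -ProcessType ProcessFull
```

The same cmdlets work against both multidimensional and tabular instances, which is the command-line story this slide is describing.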
Managing in Object Explorer. I like to include this slide because, if you connect to an instance of Analysis Services, the visual clue that you have in Object Explorer in SQL Server Management Studio (SSMS) is the little icon next to the instance name. The first icon shows a cube, indicating Multidimensional mode; the next shows a table, indicating Tabular mode; and the third indicates SharePoint mode.
Now a discussion about upgrading existing applications. Any investment you have made in Analysis Services since 2005 will upgrade automatically. We can take those existing applications, and if you open them in SQL Server Data Tools, every Unified Dimensional Model simply becomes a multidimensional BI Semantic Model. It will not change much of what you have done. What we do get is a different design experience, but the design of your solution will be preserved. The dilemma, as we have already discussed, is what you are going to do about new data models. Now you have a choice: will it be a tabular project, or will it be a new multidimensional project?
So, in summary, the BI Semantic Model delivers a single model for all user experiences, and also delivers choice, flexibility, richness and scalability. The tabular development approach introduced in SQL Server 2008 R2 by using PowerPivot is now available to IT professionals. There are numerous improvements for multidimensional models. PowerShell cmdlets for AMO are now supported and existing UDMs can easily be upgraded to SQL Server 2012 multidimensional projects.