Presentation Transcript: Introducing SQL Server 2012 Transact-SQL Improvements

SQL11UPD03-TSCRIPT-01

This wiki is a transcript of a previously recorded video.


Introducing SQL Server 2012 Transact-SQL Improvements


Hi folks. I’m Greg Low. I’m a SQL Server MVP and Microsoft Regional Director. In this session I want to take a look at a number of the things that have changed in T-SQL in SQL Server 2012.

 

First up we’ll look at query pagination. Now this is a common issue where perhaps you are building a website and you need to be able to show a list of products, or transactions from a bank, things like that. What we need to do is show them a page at a time. It’s been a fairly common request to say, look, I want to be able to deliver ten items to a page and I want page number 12 to be displayed. Now there were ways you could do that previously. The most common is probably to do it with a common table expression: you create some sort of row ordering, perhaps using the ROW_NUMBER function, and then you build a WHERE clause that selects from that common table expression and picks out the rows of interest.
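
A minimal sketch of that traditional approach (the Production.Product table and its columns are assumptions for illustration):

-- Traditional paging: ten rows per page, so page 12 = rows 111 through 120
WITH NumberedProducts AS
(
    SELECT ProductID,
           Name,
           ROW_NUMBER() OVER (ORDER BY Name) AS RowNum
    FROM Production.Product
)
SELECT ProductID, Name
FROM NumberedProducts
WHERE RowNum BETWEEN 111 AND 120
ORDER BY RowNum;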

 

This change in SQL Server 2012 allows us to do this in a much clearer way. The changes that have been made here: in the ORDER BY clause there is now an OFFSET clause, so we can indicate the number of rows that we want to skip, and after that we specify the number of rows that we then want to fetch.
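
The same page of rows with the new syntax, under the same assumptions:

-- OFFSET/FETCH paging: skip 110 rows, then return the next 10
SELECT ProductID, Name
FROM Production.Product
ORDER BY Name
OFFSET 110 ROWS
FETCH NEXT 10 ROWS ONLY;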

 

I’ve had a number of people ask me if this is a performance optimization, and at this stage that’s not the case. But it’s really important to understand that wherever you’re being declarative about what you want, saying to SQL Server this is what I want the query to achieve rather than this is how I want you to achieve it, you’re in a much better position going forward. The people writing the query optimizer, for example, can then take much more advantage of this because they know what you’re trying to achieve rather than just how you want to try and achieve it.

 

The next section, though, relates to things that are often performance related. It’s very common when working with a set of rows in a query to need to refer to previous rows. In this example query what I’m doing is selecting from the reseller sales table, and I’ve said go and select the ProductKey, the OrderDateKey, the SalesOrderNumber, but then what I want to know is how long it has been since that reseller made a previous order. You can imagine that this is the sort of query that doesn’t optimize all that well. What you’re saying is, I’m wandering through a file and as I get to each row, I then need to go off and look up the previous time that occurred. The irony in this situation is that if I’m moving through a file based on an index, often the row I’m looking for is the row immediately prior to the one I’m currently working on. It seems really wasteful to go and execute another entire query to pick up the row that I had just a moment ago. In the traditional approach, what I’d do is a sub-select to find that particular row. If you think about it, that’s probably doing a lot more work than it needs to.
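
A sketch of that traditional approach, assuming the FactResellerSales table from the AdventureWorks data warehouse samples (the transcript doesn’t show the exact query):

-- Traditional approach: a correlated sub-select per row to find the prior order date
SELECT ProductKey,
       OrderDateKey,
       SalesOrderNumber,
       (SELECT MAX(prev.OrderDateKey)
        FROM dbo.FactResellerSales AS prev
        WHERE prev.ResellerKey = frs.ResellerKey
          AND prev.OrderDateKey < frs.OrderDateKey) AS PreviousOrderDateKey
FROM dbo.FactResellerSales AS frs;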

 

In the new windowed approach we have a new set of verbs, or clauses, that have been added to the language. In this case I can say select the ProductKey, the OrderDateKey and the SalesOrderNumber, but then I want to take the OrderDateKey and LAG that same key. The default is one row, but I could also specify a number of rows and say look, give me the previous time this happened based on that same ordering.
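
The windowed version of the same query, under the same assumptions:

-- LAG reaches back one row (the default) within each reseller's ordered window
SELECT ProductKey,
       OrderDateKey,
       SalesOrderNumber,
       LAG(OrderDateKey) OVER (PARTITION BY ResellerKey
                               ORDER BY OrderDateKey) AS PreviousOrderDateKey
FROM dbo.FactResellerSales;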

 

The thing that’s really interesting with this is just how profound it can be in terms of performance. Instead of executing sub-queries, you’ll find when we compare these that there’s a huge difference between the two types of queries. This also applies to a number of other things. We have, for example, a FIRST_VALUE function that allows us to say, let’s not just lag one row or two rows or five rows, but let me see how long the customer has been ordering from us. So as we go through each order, let’s also refer back to the first one they ever made with us. That same sort of thing applies in the windowing clauses.
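
A hedged sketch of FIRST_VALUE against the same assumed table:

-- FIRST_VALUE reaches back to the first row in each reseller's window
SELECT ProductKey,
       OrderDateKey,
       SalesOrderNumber,
       FIRST_VALUE(OrderDateKey) OVER (PARTITION BY ResellerKey
                                       ORDER BY OrderDateKey) AS FirstOrderDateKey
FROM dbo.FactResellerSales;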

 

Another important addition in this version of the product is sequences. Now, a lot of people have been asking about these when they migrate from other database engines. I won’t mention which one in particular, but it’s fairly common to ask, how do we implement a sequence when moving to SQL Server? The traditional answer has been to use an identity column or something like that. But identity columns aren’t the same as sequences, because a sequence is an object that is unrelated to a specific table, whereas an identity column is part of a table. So with a sequence in SQL Server 2012 I can say CREATE SEQUENCE, and I’ve called it BookingID. Note these live in a schema as well, so they are schema-bound objects. In this case I’ve said AS INT; the default is actually bigint, but I can specify an integer here. Then I’ve said START WITH 20001 and INCREMENT BY 10. This is a completely separate object from something like an identity column that’s part of a table. As for how I then refer to it: notice I’m creating a table called FlightBooking, and I’ll show you more of this in a later demo. I’ve said that the default value for the column is the next value for the BookingID sequence. The idea here is I could have three or four tables that all use that same sequence of values.
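
Reconstructing that example as code (the dbo schema and the FlightBooking columns are assumptions):

-- A sequence is a schema-bound object, independent of any table
CREATE SEQUENCE dbo.BookingID AS INT
    START WITH 20001
    INCREMENT BY 10;

-- Several tables could draw their defaults from the same sequence
CREATE TABLE dbo.FlightBooking
(
    FlightBookingID INT NOT NULL
        DEFAULT (NEXT VALUE FOR dbo.BookingID),
    CustomerName NVARCHAR(100) NOT NULL
);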

 

Another very common question I get, and I see on newsgroups and things all the time, is: if I have a piece of arbitrary SQL, or even a stored procedure, how can I know what types of columns will come back from it? The answer has been that you can’t, because what you’d really need to do is read the code of the stored procedure to determine that. There has been a SET FMTONLY option in SQL Server, but it has some fairly severe limitations, in that it returns a set of metadata for each possible code path through the procedure. That’s often not what you want. In this version they’ve improved the situation for metadata discovery. A number of things have been added. First up, sp_describe_first_result_set takes a piece of SQL code – it can even be a piece of dynamic SQL – does a fairly deep parsing of it, and then returns you the metadata about the first result set that would come back from that code, if it’s possible to determine that.
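
A quick example of the call (the query text is illustrative):

-- Ask SQL Server to describe the first result set of a batch
EXEC sp_describe_first_result_set
    @tsql = N'SELECT ProductKey, SalesOrderNumber FROM dbo.FactResellerSales;';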

 

If, for example, I said if some variable is some value, return one set of columns, but then said else return something completely and utterly different, what you’ll get back is a message telling you, look, there’s no simple way to determine what will come back. But it does a very fine job of going through the code and trying to determine that, and in most cases it’s going to give you what you’re after. Now, rather than system stored procs, I really prefer to have these things available in a queryable form. The nice thing here is they’ve also given us dynamic management views for this. So sys.dm_exec_describe_first_result_set returns that same detail in a dynamic management view, so I can query what columns are going to come back, the ordinals, the order of the columns, even things like the underlying data types and the collation; all that sort of detail is returned by these DMVs. The second version of the DMV is sys.dm_exec_describe_first_result_set_for_object; the difference with that one is that it takes an object ID. So if I have something like a stored procedure, I can get the object ID for it, pass that to the call, and get back the same sort of list of columns, where the first one takes a piece of SQL as actual text, a SQL batch.
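
A sketch of both forms (the dbo.GetBookings procedure name is an assumption):

-- Query the metadata as a rowset rather than a proc result
SELECT name, column_ordinal, system_type_name, collation_name
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT ProductKey, SalesOrderNumber FROM dbo.FactResellerSales;', NULL, 0);

-- Or describe an existing object by its object ID
SELECT name, column_ordinal, system_type_name
FROM sys.dm_exec_describe_first_result_set_for_object(
    OBJECT_ID(N'dbo.GetBookings'), 0);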

 

Then finally, the other option here: you may have a piece of code that calls something like sp_executesql, and generally you will have gone through and spelled out where the parameters are, but it is possible with that sort of code to have left some parameters looking like they’re part of the text and not declared separately as parameters. Another system stored proc has been provided, sp_describe_undeclared_parameters, which parses your code and identifies any of those situations.
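
A small illustration (the table and parameter names are assumed):

-- @BookingID is used but never declared; the proc reports its inferred type
EXEC sp_describe_undeclared_parameters
    @tsql = N'SELECT * FROM dbo.FlightBooking WHERE FlightBookingID = @BookingID;';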

 

Now, another thing in terms of being able to write reliable code: it’s fairly common to want to execute a piece of dynamic SQL, or perhaps a procedure with an INSERT ... EXEC. In that case, one of the concerns that often comes up is: what if the underlying code or procedure changes and now returns a different set of columns than it did before? In SQL Server 2012, when we do an EXECUTE statement we can now say WITH RESULT SETS and spell out the signature that we are expecting to come back from those result sets. The idea is that if what is returned can’t be implicitly cast to what we’re expecting, an error is thrown instead.
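
A sketch of the syntax (the procedure name and column list are illustrative):

-- Fail loudly if the procedure's result shape ever drifts from this signature
EXEC dbo.GetBookings
WITH RESULT SETS
(
    (
        FlightBookingID INT,
        CustomerName    NVARCHAR(100)
    )
);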

 

Apart from this, there’s also been quite a lot of work done on the basic function library. First up, in terms of data conversion we have a PARSE function. This is very much like wrapping underlying .NET code. We have a PARSE function that says go off and take this string and parse it as though it’s, say, a date. I can also say USING a specific culture; I don’t have to set my connection to that culture or language or regional settings, I can say parse this using a specific culture. Similarly, if I don’t know whether the parse is going to work, I can now do a TRY_PARSE. So instead of having to wrap it in a TRY ... CATCH block in case something goes wrong, I can say try and parse this, and if it doesn’t work it will just return NULL, so we can tell that the parsing failed. Similarly there are TRY_CONVERT and TRY_CAST.
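
For example (the culture strings and literals are illustrative):

-- PARSE with an explicit culture, independent of session settings
SELECT PARSE('25/12/2012' AS date USING 'en-GB') AS ChristmasDay;

-- TRY_PARSE returns NULL instead of raising an error on failure
SELECT TRY_PARSE('not a date' AS date USING 'en-GB') AS ReturnsNull;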

 

Another requirement that comes up from time to time is a good way of specifying how to construct a date or time value. There has been a lot of discussion about the best string formats for declaring dates and times, and it has been a little confusing as to the best options. A lot of people who have come from earlier languages had functions like DateSerial, where they could say build me a date, and here’s the year, the month, and the day. Similarly, we now have DATEFROMPARTS, TIMEFROMPARTS, DATETIMEFROMPARTS and so on. These functions allow us to specify: this is the year, this is the month, this is the day, and so on; please give me back a date based on that, without having to worry about regional settings.
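
For example:

-- Build date and time values from their numeric parts
SELECT DATEFROMPARTS(2012, 12, 25) AS TheDate,
       TIMEFROMPARTS(14, 30, 0, 0, 0) AS TheTime,
       DATETIMEFROMPARTS(2012, 12, 25, 14, 30, 0, 0) AS TheDateTime;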

 

Finally in that category, similar to what we had in Excel, is EOMONTH, which is end of month. It just makes it easy to determine the end-of-month date for a given day.
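
For example:

-- End-of-month date for a given day (2012 was a leap year)
SELECT EOMONTH('2012-02-10') AS EndOfFebruary;   -- returns 2012-02-29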

 

Similarly, among functions that have come from other languages, we have a CHOOSE function. What it does is say, give me an index and then what is effectively an array of values; if the index is three, it returns the third value from the list.
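
For example:

-- Returns 'Mar', the third value in the list
SELECT CHOOSE(3, 'Jan', 'Feb', 'Mar', 'Apr') AS ThirdValue;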

 

IIF is very much like the IF in Excel or something along those lines. What it does is test a Boolean condition: if it’s true, take the first answer, otherwise take the second answer.
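
For example:

-- IIF(condition, value_if_true, value_if_false)
SELECT IIF(2 > 1, 'bigger', 'smaller') AS Answer;   -- returns 'bigger'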

 

CONCAT is interesting. It concatenates values together to form a string. I’ve had a number of people ask, well, what’s the difference? I could just use plus signs. But it’s not the same. If I had the word hello, and I said plus NULL and then plus there, the return value would be NULL in that case, not hello there. What the CONCAT function does is take a list of values, implicitly cast all of them to strings, cast any NULL values to empty strings, and then concatenate the values.
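
For example:

-- + propagates NULL; CONCAT treats NULL as an empty string
SELECT 'hello' + NULL + ' there'       AS PlusResult,    -- NULL
       CONCAT('hello', NULL, ' there') AS ConcatResult;  -- 'hello there'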

 

Then finally in this category there is the FORMAT function, which is basically a wrapper around the .NET Format function. I’m now able to specify a format string and then apply values to it like I would in a .NET language.
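
For example (format strings follow .NET conventions; the values are illustrative):

-- .NET-style format strings, with an optional culture argument
SELECT FORMAT(GETDATE(), 'dd MMMM yyyy') AS TodayLong,
       FORMAT(1234567.891, 'N2', 'en-US') AS WithSeparators;  -- 1,234,567.89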

 

SQL Server 2005 introduced the ranking functions: ROW_NUMBER, RANK, DENSE_RANK and NTILE. There are a number of these in the SQL standard, and in this version a whole lot more have been added: cumulative distribution, percentile functions and so on. Earlier on I mentioned LAG; there’s also LEAD. I also mentioned FIRST_VALUE; there’s also LAST_VALUE. Finally, the LOG function now allows you to specify a base.
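
The LOG change is the easiest to show:

-- Before SQL Server 2012, LOG only computed natural logarithms
SELECT LOG(8, 2) AS LogBase2;   -- returns 3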

 

Service Broker hasn’t been forgotten in this version. Service Broker is our transacted queue that lives in the database, and a couple of things have changed. First up, we now have the ability to have a time recorded when a message arrives in a destination queue. This allows us to work out things like how long a message has been sitting in a queue. Another is that Service Broker has a built-in mechanism for dealing with poison messages. A poison message is where I take a message out of a queue, processing it causes an error, and that causes it to be thrown back into the queue. Then of course I take it back out and the same thing happens again. You can’t have that situation go on forever, so by default, if that message is rolled back into the queue five times, Service Broker stops the queue at that point and we have to do something about it. Usually you try to write code to trap that before it happens. You can now disable the standard built-in poison message handling and build your own.
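
The switch is made per queue; a sketch, with the queue name assumed:

-- Take over poison message handling for this queue
ALTER QUEUE dbo.BookingQueue
    WITH POISON_MESSAGE_HANDLING (STATUS = OFF);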

 

Service Broker also works with the new AlwaysOn high availability features. Another big plus is that if we have multiple receivers for messages, we can now do the equivalent of a multicast by specifying multiple conversations that messages are sent on.
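
As a fragment, the multicast form of SEND looks like this (the handles would come from earlier BEGIN DIALOG calls, and the message type name is illustrative):

-- One SEND, several conversation handles: the multicast equivalent
SEND ON CONVERSATION (@handle1, @handle2, @handle3)
    MESSAGE TYPE [//Bookings/NewBooking] (N'<booking/>');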

 

Finally, a few general enhancements. One significant one is improved Unicode support within the product via what are called SC collations, or supplementary character collations. You may or may not realize it, but when you look at the possible characters, there was a project back in the 1980s that asked: if I want to store every single possible character from every language, stored in their correct groupings, what do I need? And the answer was 3 bytes. At the time that wasn’t an answer anybody wanted to hear, given that we had 16-bit, 2-byte machines and so on. So the initial 2-byte version of Unicode was sort of a compromise that squashed that down into smaller values. There are a number of things that can be represented, and that are part of the Unicode standard, that don’t fit inside 2 bytes. Windows has supported those in recent versions, but SQL Server hasn’t, so this has now been added. It allows for all sorts of interesting characters that are not part of typical computing applications today; for example, you could now have musical notes as part of the data stored in a database. The SC collations can’t be used for things like database or server collations, but you can use them as column collations and then store and process that sort of detail.
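
As a sketch, a column-level SC collation might be declared like this (the table is illustrative; Latin1_General_100_CI_AS_SC is one of the SC collations that ships with SQL Server 2012):

-- The _SC suffix makes the collation supplementary-character aware
CREATE TABLE dbo.SongTitles
(
    SongTitleID INT IDENTITY PRIMARY KEY,
    Title NVARCHAR(200) COLLATE Latin1_General_100_CI_AS_SC
);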

 

For user errors we’ve had RAISERROR, allowing us to raise an error. It has been inconsistent in the type of action that occurs. For example, you would think an action would be mapped to a severity, but that’s not the case: there are different errors that are all severity 16, some of which are batch-aborting and some of which are only statement-aborting. To help with this we now have the THROW statement. THROW is unrelated to the sys.messages table, and it always uses severity 16. It has predictable behavior every single time it occurs. Another advantage of THROW is that if we include it in a CATCH block, we can re-throw the error that caused us to end up in the CATCH block, including system errors.
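
For example (the error number and message are illustrative; user-defined error numbers must be 50000 or higher):

BEGIN TRY
    -- Raise a user-defined error with predictable, severity-16 behavior
    THROW 50001, N'Booking not found.', 1;
END TRY
BEGIN CATCH
    -- A parameterless THROW re-throws the original error, even a system error
    THROW;
END CATCH;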

 

Finally, Extended Events were introduced in SQL Server 2008 and allowed us to get deep insight into what’s going on inside the product. They’ve been enhanced with a whole lot of new events that we can now track, most of which relate to memory page allocation.

 

In the demo that follows, we’ll have a look at simplified result set paging in SQL Server 2012.


Return to SQL Server 2012 Developer Training Kit BOM (en-US)
