SQL11UPD03-TSCRIPT-04

This wiki is a transcript of a previously recorded video.

Related content assets:

  • Demo: Implementing a ‘Nearest Neighbor’ Spatial Query with SQL Server 2012 (SQL11UPD03-DEMO-02)
  • Video: Implementing a ‘Nearest Neighbor’ Spatial Query with SQL Server 2012 (SQL11UPD03-REC-04)



 

Implementing a ‘Nearest Neighbor’ Spatial Query with SQL Server 2012

Hi folks, I’m Greg Low. I’m SQL Server MVP and part of the Microsoft RD program. Just want to show you one of the changes that’s occurred in SQL Server Spatial for SQL Server 2012 – in particular, the types of queries that benefit from the new nearest neighbor algorithm.

 

Let’s have a look in Management Studio. Now in this case I’m imagining that I have an employee that’s moving to a new home, and in this case I want to find the other employees that live in the vicinity of where this person is going to move to. From the AdventureWorks database I’m issuing a query. In this case I’m going to just initially create myself a list of where it is that the employees live. What I’m doing here is saying I’m going to go off and retrieve a set of employee addresses. I’m joining in a common table expression details of where these addresses come from, from a variety of tables. Then I’m putting a BusinessEntityID, a FullName, a SpatialLocation and the type of address; just dropping that in a table called EmployeeLocations. So let’s execute that.

 

Once I’ve got this I’ll just do a select top 10 from that table just to show you the type of data that’s currently sitting in there. There’s BusinessEntityID, the name of the person, and this is the location. In this case that’s been shown as a binary representation of the SQL CLR object.  I could also get that as a well known text. For example, let’s just add another column on here. So if I said SpatialLocation_ToString, and then we execute that as well, scroll across and you can now see that this is the additional location and that would show us where that person lives in terms of latitude and longitude. But in this case what we’re interested in is the spatial location.

 

To do further work on this table, the first thing I’m going to do is add a PrimaryKey because I just used a Select Into create the table so I don’t actually have that in place yet. Let’s base that on the BusinessEntityID. The next thing I’m going to do to improve the performance of the spatial queries is to create a spatial index on the table. A few things have changed in SQL Server 2012 compared to the previous version with this. I’m creating a spatial index and giving it a name. I’ve said on the EmployeeLocations table and it’s the SpatialLocation column that I want it based on, all that’s exactly that same. In previous versions what I would then have to do is start allocating the density that I want at four different levels of the grid that would allow it to quickly get to where the data is. Now a lot of people found that difficult to work out what that should be. So in this version what’s been added is an autogrid option. In this case I’ve said using Geography_Grid, that’s the automatic grid option. The other option that I’ve made use of here is that I’ve turned on Data_Compression to the page level. This is another thing added in 2012, is that spatial indexes now support page level compression. An added bonus is that they now build four or five times faster than they did before.

 

Now in this case we’ll then go look at a query that would retrieve details of where Paul lives, and who the other employees are that live nearby. In this case I’ve simply created a geography data type and said this is where Paul lives, I’ve identified that. Then this is the type of query that now lends itself really well to a new algorithm that’s part of 2012 that’s called the nearest neighbor algorithm. The key elements for this to work, if we wander down and take a look at this query, I’ve said SELECT TOP and then asterisk in this case. The important parts of this query is that there are a couple of different formats this can take. If we look here you can see TOP 5 in the case of both of these. I’ve also specified in my WHERE clause that I’m looking for a specific distance. In this case I’ve just said find all the ones where it’s not null, but this algorithm also works for situations where I say that the distance is less than some specific distance. The other important aspect here is I need to say order by, and it needs to be based on that distance as well. So for the nearest neighbor algorithm to work, the things I need are to be doing a TOP, I need to be basing my queries on distances not null, or distances less than a certain value, and then I need to be doing an ORDER BY that same distance. So if I execute this we can – in fact, let’s execute the whole thing including defining the variable – what this then shows in order of how far away these people live, here are the people that live nearest to Paul.


Return to SQL Server 2012 Developer Training Kit BOM (en-US)