NOTE: This article is being phased out in favor of Sharding with SQL Azure. Please replace links to this article with links to Sharding with SQL Azure. Author: Dinakar Nethi
Technical Reviewers: Cihan Biyikoglu, Michael Thomassy, Wayne Walter Berry
Summary SQL Azure Database is a cloud database service from Microsoft. SQL Azure provides web-facing database functionality as a utility service. Cloud-based database solutions such as SQL Azure can provide many benefits, including rapid provisioning, cost-effective scalability, high availability, and reduced management overhead. This paper provides an overview on some scale out strategies, challenges with scaling out on-premise and how you can benefit with scaling out with SQL Azure.
Scaling out with SQL Azure
Partitioning is a technique of splitting up data into smaller subsets across multiple databases for better manageability, availability and scalability. As customers’ data needs grow, they have the option to either scale up or scale out.
The primary advantage of scaling out vs. scaling up is that you can get more work done by parallelizing the workload over distributed servers and improving throughput. Scaling up always has a ceiling as opposed to Scale out with virtually no limits. Depending on how the application is architected, scale out techniques require minimal downtime where you can provision servers while managing the availability of servers through a configuration file. While hot add memory and processor options exist in high end systems, typically they do end up taking downtime as a precaution. Scale out techniques are being successfully applied in several applications such as eBay, Flickr, YouTube, Facebook etc.
Common Partitioning Techniques
Partitioning can be achieved in several ways, the two major categories being Horizontal Partitioning and Vertical Partitioning. While the benefits of partitioning seem well understood, successfully implementing a partitioning technique and achieving the partitioning benefits depends on the nature of workload of businesses and a very carefully planned and tested partitioning key.
Vertical Partitioning: In this technique, data is split across tables with fewer columns through the process of Normalization with infrequently used columns in a separate table. A table is split into multiple smaller tables with fewer columns and can be linked back together by primary key and foreign key relationships. Another way of vertical partitioning is to put a single table/entity into its own database. For example, some tables may be so large that putting them to a different database makes sense. For instance, taking the entire customer table into another database is a form of vertical partitioning. Typically, rows from all tables using unique rowid(s) are re-constituted to form an entire row. For example, if a customer has multiple shipping addresses in his or her profile, the table can be partitioned by putting the primary address in one table and secondary addresses in another table. Most queries will directly be done against the primary address table and very few requiring the application to display all the addresses.
Horizontal Partitioning: In this technique, data is distributed across tables by a key, similar to Database Partitioning in SQL Server 2005. Typically, there is a master table that maps the data to the partition. Depending on the query that the user submits, the application queries this master table to find out which partition has the required data and routes the query to the particular database. Typically, horizontal partitioning splits data within an instance of a database server. Sometimes this approach is taken further where these partitions are distributed across multiple instances. Each of these partitions can be in a separate database on a physically separate server. The database schema is similar across all partitions.
For example, if you consider a typical online store, all inventories belonging to a particular country can be put into one database. Similarly, sales, customer information belonging can be distributed based on the country, country being the partitioning key here. Another partitioning key in the above scenario could be an alphabetical list of countries. Further partitioning can also be done by using a state as a partitioning key. A combination of criteria can also be considered for a partitioning key, depending on the type of application and the nature of workload.
Typically, data is normalized for better performance. In horizontal partitioning, logical groups of data are stored together. There are some key considerations to achieve the benefits of horizontal database partitioning.
Benefits of Scaling out with SQL Azure
SQL Azure is a cloud based service offering relational database capabilities as part of the Azure platform. SQL Azure is based on Microsoft SQL Server and offers a subset of the features available on on-premise SQL Server. It provides a highly available, scalable multi-tenant database service hosted By Microsoft.
Currently, SQL Azures is offered in two editions – a Web edition and a Business edition. The latest offerings and the database size limits with the offerings is available at Pricing Overview. You can use the SQL Azure portal to create one SQL Azure server, per subscription. The SQL Azure portal provides a user interface that you can use to provision servers and databases.
A SQL Azure server is not a physical server like an on-premise instance of SQL Server. Instead, it’s a logical grouping of user databases. Each SQL Azure server comes with a read only master database. The master database maintains a list of databases and keeps track of logins, and their permission levels. The master database also contains usage metrics.
Applications that need tens or hundreds of databases for a short period of time due to a seasonal/anticipated increase in demand can do so seamlessly and these additional databases can be de-provisioned when the usage drops. This kind of scale out capability with SQL Azure greatly benefits customers from having to purchase high end hardware to sustain these occasional peaks in demand and helps them save infrastructure costs from under-utilized hardware that they would otherwise incur when scaling out on-premise.
Billing with SQL Azure
The pricing structure for SQL Azure is available at Azure Pricing. When scaling out with SQL Azure customers can accurately calculate the costs they would incur and also very accurately estimate the cost savings when they need to scale down. Customers can also get real-time cost estimates of the number of databases they have and their usage charges from the portal with very little effort.
Conclusion
Although database partitioning can help improve performance, scalability and costs, it can be a challenging effort to implement successful partitioning scheme. Some applications have a natural partitioning scheme that the applications can take advantage of while some applications may have to be re-architected to become scalable, although the benefits may be well worth it. SQL Azure provides the platform where applications can scale out from one or two databases to tens or hundreds of databases very seamlessly without incurring heavy infrastructure costs. Customers also do not lose any infrastructure investments if they need to scale down due to changing usage patterns.
· SQL Azure Portal
· SQL Azure Developer Center
· SQL Azure Team Blog