Sharding databases for performance

There are a few scenarios where it may be appropriate to partition data horizontally across several servers, with performance being the most obvious reason. This technique is known as sharding and is used in many large-scale applications, such as Facebook.

In this recipe, we'll show you how to use NHibernate.Shards to split our data set across three databases.

Getting ready

In SQL Server, create three new, blank databases named Shard1, Shard2, and Shard3. Complete the Getting Ready instructions at the beginning of Chapter 4, Queries.

How to do it...

  1. Add a reference to NHibernate.Shards using NuGet Package Manager Console:
    Install-Package NHibernate.Shards -Project QueryRecipes
    
  2. Add a new folder named Sharding to the project.
  3. Add a new embedded resource named ShardedProduct.hbm.xml:
    <?xml version="1.0" encoding="utf-8" ?>
    <hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
      assembly="QueryRecipes"
      namespace="QueryRecipes.Sharding">
      <class name="ShardedProduct">
        <id name="Id">
          <generator class="NHibernate.Shards.Id.ShardedUUIDGenerator,NHibernate.Shards" />
        </id>
        <property name="Name" />
      </class>
    </hibernate-mapping>
  4. Add an App.config file with the following connection strings (or add to your existing file):
    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>
      <connectionStrings>
        <add name="Shard1" connectionString="Server=.\SQLExpress; Database=Shard1; Trusted_Connection=SSPI"/>
        <add name="Shard2" connectionString="Server=.\SQLExpress; Database=Shard2; Trusted_Connection=SSPI"/>
        <add name="Shard3" connectionString="Server=.\SQLExpress; Database=Shard3; Trusted_Connection=SSPI"/>
      </connectionStrings>
    </configuration>
  5. Add a new class named ShardedProduct:
    namespace QueryRecipes.Sharding
    {
      public class ShardedProduct
      {
        public virtual string Id { get; protected set; }
        public virtual string Name { get; set; }
      }
    }
  6. Add a new class named ShardStrategyFactory with the following code:
    using System.Collections.Generic;
    using NHibernate.Shards;
    using NHibernate.Shards.LoadBalance;
    using NHibernate.Shards.Strategy;
    using NHibernate.Shards.Strategy.Access;
    using NHibernate.Shards.Strategy.Resolution;
    using NHibernate.Shards.Strategy.Selection;
    
    namespace QueryRecipes.Sharding
    {
      public class ShardStrategyFactory : IShardStrategyFactory
      {
        public IShardStrategy NewShardStrategy(
          IEnumerable<ShardId> shardIds)
        {
          var loadBalancer =
            new RoundRobinShardLoadBalancer(shardIds);
          return new ShardStrategyImpl(
            new RoundRobinShardSelectionStrategy(loadBalancer),
            new AllShardsShardResolutionStrategy(shardIds),
            new SequentialShardAccessStrategy());
        }
      }
    }
  7. Add a new class named Recipe:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using NH4CookbookHelpers;
    using NHibernate.Cfg;
    using NHibernate.Dialect;
    using NHibernate.Driver;
    using NHibernate.Shards;
    using NHibernate.Shards.Cfg;
    using NHibernate.Shards.Session;
    using NHibernate.Shards.Tool;
    
    namespace QueryRecipes.Sharding
    {
        public class Recipe : BaseRecipe
        {
            private IShardedSessionFactory _sessionFactory;
        }
    }
  8. Add a new method Initialize to the class:
    public override void Initialize()
    {
        var connStrNames = new List<string> {
            "Shard1", "Shard2", "Shard3"
        };
    
        var shardConfigs = connStrNames.Select((x, index) => 
          new ShardConfiguration
          {
            ShardId = (short)index,
            ConnectionStringName = x
          }
        );
    
        var protoConfig = new Configuration()
            .DataBaseIntegration(
                x =>
                {
                    x.Dialect<MsSql2012Dialect>();
                    x.Driver<Sql2008ClientDriver>();
                })
            .AddResource("QueryRecipes.Sharding.ShardedProduct.hbm.xml",
                GetType().Assembly);
    
        var shardedConfig = new ShardedConfiguration(
            protoConfig,
            shardConfigs,
            new ShardStrategyFactory()
        );
    
        CreateSchema(shardedConfig);
    
        try
        {
            _sessionFactory =
                shardedConfig.BuildShardedSessionFactory();
        }
        catch
        {
            DropSchema(shardedConfig);
            throw;
        }
    }
  9. Add two methods for creating and dropping the schema:
    private void CreateSchema(ShardedConfiguration shardedConfiguration)
    {
        new ShardedSchemaExport(shardedConfiguration)
              .Create(false, true);
    }
    
    private void DropSchema(ShardedConfiguration shardedConfiguration)
    {
        new ShardedSchemaExport(shardedConfiguration)
              .Drop(false, true);
    }
  10. Finally, add a new Run method, which inserts data and queries it:
    public override void Run()
    {
        using (var session = _sessionFactory.OpenSession())
        {
            using (var tx = session.BeginTransaction())
            {
                for (var i = 0; i < 100; i++)
                {
                    var product=new ShardedProduct()
                    {
                        Name = "Product" + i,
                    };
                    session.Save(product);
                }
                tx.Commit();
            }
            // Sharded sessions expect an explicit Close() before disposal.
            session.Close();
        }
                
        using (var session = _sessionFactory.OpenSession())
        {
            using (var tx = session.BeginTransaction())
            {
                var query = @"from ShardedProduct p 
                                where upper(p.Name)
                                like '%1%'";
                var products = session.CreateQuery(query)
                    .List<ShardedProduct>();
    
                foreach (var p in products)
                {
                    Console.WriteLine(
                        "Product Id: {0}, Name: {1}", p.Id, p.Name);
                }
                tx.Commit();
            }
            session.Close();
        }
    }
  11. Run the application and start the Sharding recipe.
  12. Inspect the product table in each of the three databases. You should find roughly one third of the 100 products (33 or 34 rows) in each.

How it works...

NHibernate Shards allows you to split your data across several databases, named shards, while hiding this additional complexity behind the familiar NHibernate APIs. In this recipe, we use the sharded UUID POID generator, which generates UUIDs with a four-digit shard ID, followed by a 28 hexadecimal digit unique ID. A typical ID looks similar to 0001000069334c47a07afd3f6f46d587. You can provide your own POID generator, provided the shard ID is somehow encoded in the persistent object's IDs.
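
To make the ID layout described above concrete, here is a minimal sketch that composes a 32-character ID from a shard ID and a GUID. The class and method names are hypothetical, and this is not the algorithm used by ShardedUUIDGenerator; it only assumes the layout stated above (four hexadecimal digits of shard ID followed by 28 hexadecimal digits of unique data).

```csharp
using System;

// Hypothetical helper, for illustration only. It is not part of NHibernate.Shards.
public static class ShardedIdSketch
{
    // Builds "<4 hex digits of shard ID><28 hex digits of unique data>".
    public static string Compose(short shardId, Guid unique)
    {
        // "x4" pads the shard ID to four lowercase hexadecimal digits.
        // The "N" format yields 32 hex digits without dashes; we keep the
        // first 28, which trades some uniqueness for the fixed length.
        return shardId.ToString("x4") + unique.ToString("N").Substring(0, 28);
    }
}
```

Calling ShardedIdSketch.Compose(1, Guid.NewGuid()) produces a 32-character string that starts with 0001, matching the shape of the typical ID shown above.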

The ShardConfiguration class configures a session factory for each shard. These session factories are grouped together with an implementation of IShardStrategyFactory to build an IShardedSessionFactory. A sharded session factory implements the familiar ISessionFactory interface, so the impact on larger applications is minimal.

An implementation of IShardStrategyFactory must return three strategies to control the operation of NHibernate Shards.

First, the IShardSelectionStrategy assigns each new entity to a shard. In this recipe, we use a simple round-robin technique that spreads the data evenly across the shards. The first entity is assigned to the first shard, the second to the second shard, the third to the third shard, the fourth to the first shard again, and so on.
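
The round-robin idea can be sketched in a few lines of plain C#. The class below is a hypothetical stand-in, not the RoundRobinShardLoadBalancer from NHibernate.Shards, and it is not thread-safe; it only illustrates why 100 entities end up split almost evenly across three shards.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of round-robin shard selection.
public sealed class RoundRobinSketch
{
    private readonly IReadOnlyList<short> _shardIds;
    private int _next; // index of the shard that receives the next entity

    public RoundRobinSketch(IReadOnlyList<short> shardIds)
    {
        if (shardIds == null || shardIds.Count == 0)
            throw new ArgumentException("At least one shard is required.");
        _shardIds = shardIds;
    }

    // Returns the current shard ID and advances to the next one, wrapping around.
    public short NextShardId()
    {
        var id = _shardIds[_next];
        _next = (_next + 1) % _shardIds.Count;
        return id;
    }

    // Counts how many of entityCount new entities each shard would receive.
    public static Dictionary<short, int> Distribute(
        int entityCount, IReadOnlyList<short> shardIds)
    {
        var selector = new RoundRobinSketch(shardIds);
        var counts = new Dictionary<short, int>();
        foreach (var id in shardIds) counts[id] = 0;
        for (var i = 0; i < entityCount; i++) counts[selector.NextShardId()]++;
        return counts;
    }
}
```

With the shard IDs 0, 1, and 2 used in this recipe, RoundRobinSketch.Distribute(100, new short[] { 0, 1, 2 }) gives 34 entities to shard 0 and 33 to each of the other two.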

Next, the IShardResolutionStrategy is used to determine the correct shard, given an entity name and entity ID. In this example, we use the AllShardsShardResolutionStrategy, which doesn't attempt to determine the correct shard. Instead, all shards are queried for an entity. We could provide our own implementation to get the shard ID from the first four characters of the entity ID. This would allow us to determine which shard contains the entity we want and query only that shard, reducing the load on each database. In this recipe, though, we use an ID generator that implements IShardEncodingIdentifierGenerator, and that is enough to solve the resolution issue.
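
The core of such a custom resolution step could look like the following sketch. It deliberately avoids the NHibernate.Shards interfaces, whose exact signatures are not shown in this recipe; the class and method names are hypothetical, and it assumes that the first four characters of the ID are the shard ID in hexadecimal. When the prefix cannot be read, it falls back to all shards, which mirrors the behavior of AllShardsShardResolutionStrategy.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical illustration of shard resolution from an encoded entity ID.
public static class ShardResolutionSketch
{
    // Returns the single shard encoded in the ID, or every shard if the
    // prefix is missing, malformed, or names an unknown shard.
    public static IReadOnlyList<short> Resolve(
        string entityId, IReadOnlyList<short> allShardIds)
    {
        short shardId;
        if (entityId != null
            && entityId.Length >= 4
            && short.TryParse(
                entityId.Substring(0, 4),      // assumed shard prefix
                NumberStyles.AllowHexSpecifier,
                CultureInfo.InvariantCulture,
                out shardId)
            && allShardIds.Contains(shardId))
        {
            return new[] { shardId };
        }

        return allShardIds;
    }
}
```

For the typical ID 0001000069334c47a07afd3f6f46d587 and the shard IDs 0, 1, and 2, Resolve returns only shard 1, so a lookup by ID would touch one database instead of three.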

Finally, the IShardAccessStrategy determines how the shards will be accessed. In this example, we use the SequentialShardAccessStrategy, so the first shard will be queried, then the next, and so on. NHibernate Shards also includes a parallel strategy.
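
The difference between the two access styles can be sketched without any NHibernate types. In the hypothetical code below, each shard is represented by a delegate that returns its matching rows; this is only a model of the idea, not how NHibernate Shards implements its access strategies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical illustration of sequential versus parallel shard access.
public static class ShardAccessSketch
{
    // Queries one shard after another and concatenates the results.
    public static List<T> QuerySequentially<T>(
        IEnumerable<Func<IEnumerable<T>>> shardQueries)
    {
        var results = new List<T>();
        foreach (var query in shardQueries)
            results.AddRange(query());
        return results;
    }

    // Starts all shard queries at once, waits for all of them,
    // then concatenates the results in shard order.
    public static List<T> QueryInParallel<T>(
        IEnumerable<Func<IEnumerable<T>>> shardQueries)
    {
        var tasks = shardQueries
            .Select(query => Task.Run(() => query().ToList()))
            .ToArray();
        Task.WaitAll(tasks);
        return tasks.SelectMany(task => task.Result).ToList();
    }
}
```

Both methods return the same merged list. The sequential version takes roughly the sum of the per-shard query times, while the parallel version is bounded by the slowest shard at the cost of holding several connections open at once.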

Once we've built a sharded session factory, the application code looks similar to that of any other NHibernate application. However, there are a few caveats. The major one is probably that LINQ and QueryOver are currently not supported. Many of the lesser-used features of NHibernate are also unavailable; for example, session.Delete("from Products") throws a NotImplementedException. Additionally, sharded sessions expect to be explicitly closed before being disposed.

NHibernate Shards doesn't support object graphs spread across shard boundaries. The idea of well-defined boundaries between object graphs fits well with the domain-driven design pattern of aggregate roots and is generally considered a good NHibernate practice, even without sharding.
