There are a few scenarios where it may be appropriate to partition data horizontally across several servers, with performance being the most obvious reason. The concept is known as sharding and is used in many large-scale applications, such as Facebook.
In this recipe, we'll show you how to use NHibernate.Shards to split our data set across three databases.
In SQL Server, create three new, blank databases named Shard1, Shard2, and Shard3. Complete the Getting Ready instructions at the beginning of Chapter 4, Queries.
Install NHibernate.Shards using the NuGet Package Manager Console:
Install-Package NHibernate.Shards -Project QueryRecipes
Add a new folder named Sharding to the project.
Add an embedded resource named ShardedProduct.hbm.xml:
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
    assembly="QueryRecipes"
    namespace="QueryRecipes.Sharding">
  <class name="ShardedProduct">
    <id name="Id">
      <generator class="NHibernate.Shards.Id.ShardedUUIDGenerator,NHibernate.Shards" />
    </id>
    <property name="Name" />
  </class>
</hibernate-mapping>
Add an App.config file with the following connection strings (or add them to your existing file):
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <connectionStrings>
    <add name="Shard1" connectionString="Server=.\SQLExpress; Database=Shard1; Trusted_Connection=SSPI"/>
    <add name="Shard2" connectionString="Server=.\SQLExpress; Database=Shard2; Trusted_Connection=SSPI"/>
    <add name="Shard3" connectionString="Server=.\SQLExpress; Database=Shard3; Trusted_Connection=SSPI"/>
  </connectionStrings>
</configuration>
Add a class named ShardedProduct:
namespace QueryRecipes.Sharding
{
    public class ShardedProduct
    {
        public virtual string Id { get; protected set; }
        public virtual string Name { get; set; }
    }
}
Add a class named ShardStrategyFactory with the following code:
using System.Collections.Generic;
using NHibernate.Shards;
using NHibernate.Shards.LoadBalance;
using NHibernate.Shards.Strategy;
using NHibernate.Shards.Strategy.Access;
using NHibernate.Shards.Strategy.Resolution;
using NHibernate.Shards.Strategy.Selection;

namespace QueryRecipes.Sharding
{
    public class ShardStrategyFactory : IShardStrategyFactory
    {
        public IShardStrategy NewShardStrategy(IEnumerable<ShardId> shardIds)
        {
            var loadBalancer = new RoundRobinShardLoadBalancer(shardIds);
            return new ShardStrategyImpl(
                new RoundRobinShardSelectionStrategy(loadBalancer),
                new AllShardsShardResolutionStrategy(shardIds),
                new SequentialShardAccessStrategy());
        }
    }
}
Add a class named Recipe:
using System;
using System.Collections.Generic;
using System.Linq;
using NH4CookbookHelpers;
using NHibernate.Cfg;
using NHibernate.Dialect;
using NHibernate.Driver;
using NHibernate.Shards;
using NHibernate.Shards.Cfg;
using NHibernate.Shards.Session;
using NHibernate.Shards.Tool;

namespace QueryRecipes.Sharding
{
    public class Recipe : BaseRecipe
    {
        private IShardedSessionFactory _sessionFactory;
    }
}
Add an Initialize method, together with its two schema helper methods, to the class:
public override void Initialize()
{
    var connStrNames = new List<string> { "Shard1", "Shard2", "Shard3" };

    // One ShardConfiguration per connection string; the list index becomes the shard ID.
    var shardConfigs = connStrNames.Select((x, index) =>
        new ShardConfiguration
        {
            ShardId = (short)index,
            ConnectionStringName = x
        });

    var protoConfig = new Configuration()
        .DataBaseIntegration(x =>
        {
            x.Dialect<MsSql2012Dialect>();
            x.Driver<Sql2008ClientDriver>();
        })
        .AddResource("QueryRecipes.Sharding.ShardedProduct.hbm.xml", GetType().Assembly);

    var shardedConfig = new ShardedConfiguration(
        protoConfig,
        shardConfigs,
        new ShardStrategyFactory());

    CreateSchema(shardedConfig);
    try
    {
        _sessionFactory = shardedConfig.BuildShardedSessionFactory();
    }
    catch
    {
        // Clean up the schema if the session factory cannot be built.
        DropSchema(shardedConfig);
        throw;
    }
}

private void CreateSchema(ShardedConfiguration shardedConfiguration)
{
    new ShardedSchemaExport(shardedConfiguration)
        .Create(false, true);
}

private void DropSchema(ShardedConfiguration shardedConfiguration)
{
    new ShardedSchemaExport(shardedConfiguration)
        .Drop(false, true);
}
Add a Run method, which inserts data and queries it:
public override void Run()
{
    // Insert 100 products; the selection strategy decides which shard each one lands in.
    using (var session = _sessionFactory.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {
            for (var i = 0; i < 100; i++)
            {
                var product = new ShardedProduct
                {
                    Name = "Product" + i
                };
                session.Save(product);
            }
            tx.Commit();
        }
        // Sharded sessions expect to be closed explicitly before being disposed.
        session.Close();
    }

    // Query the products back with HQL.
    using (var session = _sessionFactory.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {
            var query = @"from ShardedProduct p where upper(p.Name) like '%1%'";
            var products = session.CreateQuery(query)
                .List<ShardedProduct>();
            foreach (var p in products)
            {
                Console.WriteLine("Product Id: {0}, Name: {1}", p.Id, p.Name);
            }
            tx.Commit();
        }
        session.Close();
    }
}
Run the application and start the Sharding recipe.
NHibernate Shards allows you to split your data across several databases, named shards, while hiding this additional complexity behind the familiar NHibernate APIs. In this recipe, we use the sharded UUID POID generator, which generates UUIDs consisting of a four-digit shard ID followed by a unique ID of 28 hexadecimal digits. A typical ID looks similar to 0001000069334c47a07afd3f6f46d587. You can provide your own POID generator, provided the shard ID is somehow encoded in the persistent object's IDs.
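To make the ID layout concrete, here is a minimal sketch that reads the shard ID back out of such an identifier. It assumes the layout described above (four hexadecimal digits of shard ID followed by 28 hexadecimal digits of unique ID). The class and method names are hypothetical and are not part of NHibernate.Shards.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of NHibernate.Shards.
public static class ShardedIdSketch
{
    // Assumed layout: 4 hex digits of shard ID + 28 hex digits of unique ID.
    public static short ExtractShardId(string entityId)
    {
        if (entityId == null || entityId.Length != 32)
        {
            throw new ArgumentException(
                "Expected a 32-character hexadecimal ID.", nameof(entityId));
        }

        // Parse the first four characters as a hexadecimal number.
        return short.Parse(
            entityId.Substring(0, 4),
            NumberStyles.HexNumber,
            CultureInfo.InvariantCulture);
    }
}
```

Called with the sample ID 0001000069334c47a07afd3f6f46d587, ExtractShardId returns 1.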
The ShardConfiguration class configures a session factory for each shard. These session factories are grouped together with an implementation of IShardStrategyFactory to build an IShardedSessionFactory. A sharded session factory implements the familiar ISessionFactory interface, so the impact on larger applications is minimal.
An implementation of IShardStrategyFactory must return three strategies to control the operation of NHibernate Shards.
First, the IShardSelectionStrategy assigns each new entity to a shard. In this recipe, we use a simple round-robin technique that spreads the data evenly across the shards. The first entity is assigned to the first shard, the second to the second shard, the third to the third shard, the fourth to the first shard again, and so on.
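The round-robin idea can be sketched in isolation as follows. This is only an illustration of the technique, not the code behind RoundRobinShardSelectionStrategy; the class name RoundRobinSketch and its members are hypothetical.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical illustration of round-robin selection, independent of NHibernate.Shards.
public sealed class RoundRobinSketch
{
    private readonly IReadOnlyList<short> _shardIds;
    private int _counter = -1;

    public RoundRobinSketch(IReadOnlyList<short> shardIds)
    {
        _shardIds = shardIds;
    }

    // Returns the shard ID for the next new entity, cycling through the list.
    public short NextShardId()
    {
        // Interlocked keeps the counter consistent if several threads ask at once.
        int n = Interlocked.Increment(ref _counter);

        // The unsigned cast keeps the index valid even after the counter overflows.
        int index = (int)((uint)n % (uint)_shardIds.Count);
        return _shardIds[index];
    }
}
```

With the shard IDs 0, 1, and 2 used in this recipe, successive calls to NextShardId return 0, 1, 2, 0, and so on.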
Next, the IShardResolutionStrategy is used to determine the correct shard, given an entity name and entity ID. In this example, we use the AllShardsShardResolutionStrategy, which doesn't attempt to determine the correct shard. Instead, all shards are queried for an entity. We could provide our own implementation to get the shard ID from the first four characters of the entity ID. This would allow us to determine which shard contains the entity we want and query only that shard, reducing the load on each database. In this recipe, though, we used an ID generator implementing IShardEncodingIdentifierGenerator, and that is enough to solve the resolution issue.
Finally, the IShardAccessStrategy determines how the shards will be accessed. In this example, we use the SequentialShardAccessStrategy, so the first shard will be queried, then the next, and so on. NHibernate Shards also includes a parallel strategy.
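The difference between sequential and parallel access can be sketched without any NHibernate types. In the illustration below, each shard is represented by a delegate that returns its rows; the class ShardAccessSketch and both method names are hypothetical and do not reflect how NHibernate Shards implements its access strategies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical illustration; each Func stands in for "run the query on one shard".
public static class ShardAccessSketch
{
    // Sequential access: query one shard after another and append the results.
    public static List<T> QuerySequentially<T>(
        IEnumerable<Func<IEnumerable<T>>> shardQueries)
    {
        var results = new List<T>();
        foreach (var query in shardQueries)
        {
            results.AddRange(query());
        }
        return results;
    }

    // Parallel access: start every shard query at once, then merge in shard order.
    public static List<T> QueryInParallel<T>(
        IEnumerable<Func<IEnumerable<T>>> shardQueries)
    {
        var tasks = shardQueries
            .Select(query => Task.Run(() => query().ToList()))
            .ToArray();
        Task.WaitAll(tasks);
        return tasks.SelectMany(task => task.Result).ToList();
    }
}
```

Both methods return the same merged list. The parallel variant trades extra concurrent connections for lower overall latency when the individual shard queries are slow.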
Once we've built a sharded session factory, the application code looks similar to any other NHibernate application. However, there are a few caveats. The major one is probably that LINQ and QueryOver are currently not supported. Also, many of the lesser-used features of NHibernate cannot be used. For example, session.Delete("from Products") throws a NotImplementedException. Additionally, sharded sessions expect to be explicitly closed before being disposed.
NHibernate Shards doesn't support object graphs spread across shard boundaries. The idea of well-defined boundaries between object graphs fits well with the domain-driven design pattern of aggregate roots and is generally considered a good NHibernate practice, even without sharding.