NHibernate Forge
The official new home for the NHibernate for .NET community

NHibernate POID Generators revealed

(Disclaimer: This post will be more or less a paraphrase of Fabio Maulo’s post, and I hope I can improve it a bit)

I wanted to write about this topic because I wasn’t aware of the drawbacks of the “native/identity” generators until Fabio told me. Now it is my turn to spread the information to those who aren’t aware of them either. I even ran a small poll on Twitter to see who uses what, and it turned out that the majority of people use identity/native for one reason or another.

NHibernate has several object identifier generators for entities. Each of them has its pros and cons, like anything else.

We can basically separate generators into two groups: PostInsertGenerator and ORM style generators (you can also call them identity style vs ORM style generators). I will investigate them category by category.

ORM Style Generators 

An ORM style generator can generate the identifier before the object is sent to the database. This is advantageous because you don’t need to go to the database to get the ID before setting a relation based on it. It also promotes Unit of Work, since you don’t need to go to the database every time an object is added or updated; instead, that work happens at commit. These generators are what WE SUGGEST.

Currently NHibernate provides several ORM style generators, some of them are listed below.

  • Guid
    Generates ids by calling Guid.NewGuid(). The main drawback is with indexes. Guids are more or less random (or pseudo-random, let’s say), and this randomness causes fragmentation in the database index. If the field is also the PK, it becomes more dramatic, since the rows are stored sorted by that key.
  • Guid.Comb
    A very clever improvement over the Guid way. It creates the guid based on the system time, so the guid it creates is database friendly and doesn’t cause fragmentation in the table.

    [Image not shown; taken from Pamir Erdem’s blog]
    Using NEWSEQUENTIALID() as the column default value in SQL Server has more or less the same effect as Guid.Comb.
  • HiLo/Sequence HiLo
    This is the one I like the most. It is both index friendly and user friendly. A HiLo id has two parts, as the name suggests: Hi and Lo. Each session factory gets a Hi value from the database (with locking enabled) and manages the Lo values on its own. This algorithm also scales really well: a factory fetches a Hi value once per block of ids rather than once per entity, which reduces the database traffic spent on getting Hi values.
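
To make the Hi/Lo split more concrete, here is a minimal sketch of the idea in C#. It is not NHibernate's actual implementation: the class name HiLoSketch and the fetchNextHi delegate (which stands in for the locked database round trip) are hypothetical names introduced only for illustration.

```csharp
using System;

// Illustrative sketch of the hi/lo idea; NOT NHibernate's own generator.
public sealed class HiLoSketch
{
    // Hypothetical stand-in for "read and increment the Hi value in the database".
    private readonly Func<long> fetchNextHi;
    private readonly long maxLo;
    private readonly object gate = new object();
    private long hi;
    private long lo;

    public HiLoSketch(Func<long> fetchNextHi, long maxLo)
    {
        if (fetchNextHi == null) throw new ArgumentNullException(nameof(fetchNextHi));
        if (maxLo < 1) throw new ArgumentOutOfRangeException(nameof(maxLo));
        this.fetchNextHi = fetchNextHi;
        this.maxLo = maxLo;
        this.lo = maxLo + 1; // forces a Hi fetch on the first call
    }

    public long Next()
    {
        lock (gate)
        {
            if (lo > maxLo)
            {
                // One database round trip buys a whole block of (maxLo + 1) ids.
                hi = fetchNextHi() * (maxLo + 1);
                lo = 0;
            }
            return hi + lo++;
        }
    }
}
```

With maxLo set to 2, for example, the delegate is called once for every three ids handed out; every other call to Next() is answered from memory without touching the database.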

Post Insert Generators / Identity Style Generators

Post insert generators, as the name suggests, assign the ids after the entity is stored in the database; a select statement is then executed against the database to read the generated id back. They have many drawbacks, and in my opinion they should be used only on brownfield projects. These generators are what WE DO NOT SUGGEST as the NH Team.

Some of the drawbacks are the following

  1. Unit of Work is broken with these strategies. It doesn’t matter if you’re using FlushMode.Commit; each Save results in an insert statement against the DB. As a best practice, we should defer insertions until commit, but a post insert generator forces the insert to happen on Save (which is not what UoW does).
  2. These strategies nullify the batcher: you can’t take advantage of sending multiple queries at once, since each insert must go to the database at the time of Save.

There are several Post Insert Generator strategies (hey, 2.1 has even more!), some of which are listed below (there are many; check Fabio’s post here).

  1. Identity
    The identity generator uses the value generated by the database’s identity column (the MsSQL “identity” stuff). The closely related native generator changes its meaning depending on the dialect: if the database supports MsSQL-like identity, then that is used; if it supports sequences, then sequences are used, etc. Something I learnt today from the NHUsers group is that MSSQL may sometimes return an invalid SCOPE_IDENTITY() value.
  2. Guid.Native
    In MsSQL terms, it uses the NEWID() function to get a uniqueidentifier.

Comparison

I hear you say “you talk too much, all of that doesn’t tell me much, show me the code!” Here it is: the comparison of post insert generators vs ORM style generators.

I will first start with demonstrating how they break UoW, then continue with Batcher! (did you know that NH uses NonBatchingBatcher by default? ;) )

The code under test is simple

[Test]
public void Should_not_insert_entity_in_a_transaction_HiLo()
{
    var post = new PostWithHiLo { Title = "Identity Generators Revealed" };
    var postComment = new PostCommentWithHiLo { Post = post, Comment = "Comment" };
    using (ISession session = factory.OpenSession())
    using (var tran = session.BeginTransaction())
    {
        session.Save(post); // No commit here
        session.Save(postComment);
        // The counter is only maintained when statistics are enabled on the factory.
        long insertCount = factory.Statistics.EntityInsertCount;
        Assert.That(insertCount, Is.EqualTo(0), "Shouldn't insert entity in a transaction before commit.");
    }
}

[Test]
public void Should_not_insert_entity_in_a_transaction_Identity()
{
    var post = new PostWithIdentity { Title = "Identity Generators Revealed" };
    var postComment = new PostCommentWithIdentity { Post = post, Comment = "Comment" };
    using (ISession session = factory.OpenSession())
    using (var tran = session.BeginTransaction())
    {
        session.Save(post);
        session.Save(postComment);
        long insertCount = factory.Statistics.EntityInsertCount;
        Assert.That(insertCount, Is.EqualTo(0), "Shouldn't insert entity in a transaction before commit.");
    }
}

Now, let’s try it. What do you expect in both cases? Should both tests pass? The test with the identity strategy fails, because the entity is inserted before commit is ever called.
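
For context, the two tests differ only in how the entities are mapped. The fragment below is a rough sketch of what such mappings might look like. The assembly and namespace names, the Id property name, and the max_lo value are assumptions made for illustration; they are not the mappings actually used for these tests.

```xml
<!-- Sketch only: assembly, namespace and max_lo are assumed values. -->
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   assembly="Blog" namespace="Blog.Domain">

  <!-- ORM style: the id is assigned in memory, so Save does not hit the database. -->
  <class name="PostWithHiLo">
    <id name="Id">
      <generator class="hilo">
        <param name="max_lo">100</param>
      </generator>
    </id>
    <property name="Title"/>
  </class>

  <!-- Post insert style: the id comes from the database, so Save must insert immediately. -->
  <class name="PostWithIdentity">
    <id name="Id">
      <generator class="identity"/>
    </id>
    <property name="Title"/>
  </class>

</hibernate-mapping>
```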

Now here is the explanation for the batcher:

using (ISession session = factory.OpenSession())
using (var tran = session.BeginTransaction())
{
    for (int i = 0; i < 3; i++)
    {
        var post = new PostWithHiLo { Title = string.Format("Identity Generators Revealed {0}", i) };
        session.Save(post);
    }
    tran.Commit();
}

The code above sends its queries to the database only once, at commit. However, if you’re using the Identity style generators, you’re in trouble: every Save goes to the database on its own.
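
To actually get several inserts sent together, batching has to be switched on, since NH uses the NonBatchingBatcher by default. A minimal configuration sketch, assuming a driver that supports batching (SQL Server, for example) and an arbitrarily chosen batch size of 20:

```xml
<!-- Sketch only: the batch size of 20 is an arbitrary example value. -->
<hibernate-configuration xmlns="urn:nhibernate-configuration-2.2">
  <session-factory>
    <!-- A value greater than 0 enables ADO.NET batching where the driver supports it. -->
    <property name="adonet.batch_size">20</property>
  </session-factory>
</hibernate-configuration>
```

Even with this setting, entities mapped with a post insert generator still go to the database one by one at Save time.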

Conclusion

You should know what you’re gaining and what you’re losing when choosing an identifier strategy. In a greenfield application, my choice would be HiLo, as it is more user friendly (and this is actually what the NH team suggests), and Guid.Comb in case some kind of replication is required. Most probably I wouldn’t use Identity. However, in a brownfield application where you can’t really change the DB schema for some reason, Identity should be used as a last resort.

I’d like to end this post with two sayings that I hear/see from Fabio

Human knowledge belongs to the world!
Quality is not achieved by chance!


Posted mar 20 2009, 09:49 a.m. by Tuna Toksoz

Comments

Ayende @ Rahien wrote NHibernate: Avoid identity generator when possible
on 03-20-2009 7:32

NHibernate: Avoid identity generator when possible

Paulo Roberto Quicoli wrote re: NHibernate POID Generators revealed
on 03-20-2009 8:43

Great explanation!

mhnyborg wrote re: NHibernate POID Generators revealed
on 03-20-2009 12:16

I have googled for an implementation of seqhilo without success.

Can you show me the table structure and the extra code you need to write to make it work?

Thanks

Tuna Toksoz wrote re: NHibernate POID Generators revealed
on 03-20-2009 18:43

@Paulo Thanks

@mhnyborg Try setting your generator class in the mapping to   <generator class="hilo"/>. NH should create the table.

Tolomaus wrote re: NHibernate POID Generators revealed
on 03-21-2009 20:06

Hi Tuna,

> each Save results in an insert statement against DB

I don't have the NHibernate code around at the moment, but I'm under the impression that in one of the recent versions a Persist() method was added to the session API, in line with the Java version. Calling this method instead of Save() on a transient entity should delay the insert statement to where it actually belongs: at commit time.

Here is the link to the java API: www.hibernate.org/.../Session.html

The identity being the default PK in many databases, it should be easy to use in NHibernate in my opinion.

Kind regards,

Tolomaüs

Tobin Harris wrote re: NHibernate POID Generators revealed
on 03-23-2009 11:33

Clear post, and I loved the extra detail on the index fragmentation. I hadn't considered that.

ro.ferraris wrote re: NHibernate POID Generators revealed
on 04-07-2009 6:36

Hi Tuna,

First of all thanks for the very interesting post.

After that a question about HiLo.

I've used it in a project where the DBMS is MySQL, and all works well as long as I use NHibernate.ISession.BeginTransaction, but when, in one particular case, I use a System.Transactions.TransactionScope, HiLo doesn't work.

In fact, during a save operation I receive an InvalidOperationException: "Nested transactions are not supported".

Is it possible to use HiLo in this context?

In the NH source code I found that the TableGenerator.Generate method makes an exception for SQLite when creating a new transaction. Do you think it's possible to add a check for the use of TransactionScope in this method?

Best Regards

Tuna Toksoz wrote re: NHibernate POID Generators revealed
on 04-07-2009 12:21

Hi ro.ferraris

Which version of NH are you using? I don't have a MySQL instance at hand, but we haven't heard of any problems so far.

ro.ferraris wrote re: NHibernate POID Generators revealed
on 04-08-2009 5:20

Hi Tuna,

Currently I'm using 1.2.0.3002.

The problem is MySQL in conjunction with the use of a TransactionScope.

MySQL doesn't support nested transactions, and a new connection created inside a current TransactionScope is automatically enlisted in it, so the new transaction that follows is nested in the one used by the TransactionScope; hence the error from MySQL.

Yesterday I implemented a custom generator by copying HiloGenerator and TableGenerator and simply turning off the connection creation in the Generate method, as is done for SQLite.

I'm trying to find out whether there is a way to exclude a connection from a TransactionScope, because I understand that theoretically it is right to generate new ids in a separate transaction, but I don't know if that is possible.

Thanks for the answer.

Roby

Tuna Toksoz wrote re: NHibernate POID Generators revealed
on 04-08-2009 12:21

1.2 is pretty old, and there have been some improvements in NH regarding transaction scope. Can you check whether it is still a problem with 2.1 (the trunk)?

BTW, in your generator, you can access the connection provider (I don't remember the exact name), or you can get the connection and open a new connection so that it will be outside of the transaction thingy.

ro.ferraris wrote re: NHibernate POID Generators revealed
on 04-09-2009 4:49

Thanks, Tuna, for the answer.

At the moment I'm using this custom generator workaround because I have a deadline, but I think that next week I can make some other attempts (also trying NH 2.1), and I'll let you know about it as soon as possible.

Kind Regards,

  Roby

ro.ferraris wrote re: NHibernate POID Generators revealed
on 04-17-2009 6:33

Hi Tuna,

I've produced a sample solution that uses NH 2.0.1, and the behaviour is the same.

The use of a TransactionScope is incompatible with the HiLo generator when using MySQL as the database.

To solve this problem I've changed the Generate method of NHibernate.Id.TableGenerator, changing

bool isSQLite = session.Factory.Dialect is SQLiteDialect;

to

bool dontUseNewConnection = session.Factory.Dialect is SQLiteDialect ||
    (session.Factory.Dialect is MySQLDialect &&
     System.Transactions.Transaction.Current != null);

I don't know if this could be a general solution, but in my case it works.

Do you think it is better to open a new issue on NHibernate?

Kind Regards,

 Roby

Tuna Toksoz wrote re: NHibernate POID Generators revealed
on 04-17-2009 13:20

Hi Roby,

Yes, please create a jira for that.

new ThoughtStream("Derick Bailey"); wrote Database ID: Int vs. BigInt vs. GUID
on 07-14-2009 11:30

I’ve been hearing a lot of talk about using a GUID as a database row ID, in recent months… last night

Steve Strong wrote re: NHibernate POID Generators revealed
on 07-14-2009 15:19

We normally use GUIDs, and aim for sequential ones where possible since the index fragmentation is a real killer on high throughput systems.  However, does anyone have a good GUID generation strategy when there are multiple app servers?  Obviously a post-insert approach would work, but we'd like to avoid that for all the reasons stated above.  Currently, we have a GUID generation service that is shared between all of the app servers, but it is clearly a single point of failure...

new ThoughtStream("Derick Bailey"); wrote Storage Size And Performance Implications Of A GUID PK
on 07-15-2009 11:00

I sent the same Guid vs. Int. vs BigInt question to a group of coworkers yesterday. One of the responses

Maxim wrote re: NHibernate POID Generators revealed
on 11-07-2011 19:49

I would very much like to use the "hilo" generator, but there isn't any example of how to create the "specific" table and which values to pass to it.

<id name="Id">
    <column name="CatId" sql-type="Int64" not-null="true"/>
    <generator class="hilo"/>
</id>

Doesn't create anything in the database.

Guid.Comb gives 96% fragmentation when I add 20K cat objects in a loop (from your tutorial). Could you please provide an example of a "hilo" implementation? Thanks.
